As usual, the problem is not that I have to process 1.1 million "simulated universes," each with its own set of outcomes for a given set of input parameters of a model beyond the Standard Model. I have 2000 cores available at a keystroke, which makes that an easy problem. The problem is that 10 of those points fail. And then I have to find them. And then I have to resubmit them. And then I have to debug them if they continue to fail. And I don't like grepping through 50,000 log files (20 model hypotheses per job). Well . . . to be specific . . . the Lustre filesystem doesn't like me grepping through 50,000 log files.