Hi all,
In another thread I described a long-running issue with Firedrake on an HPC machine. I *think* I found part of the problem.
Every time I submit a job script to a compute node, it has to (re)compile all the FFC/OP2 forms. To work around this, I modified the job script so that it runs my Firedrake program twice: once with a very small mesh (to compile everything that is needed) and again to simulate my actual finite element problem. In the first run, I compiled the code on a single MPI process, so if the subsequent run is performed on that same node, I have no issues. However, if it has to run on two different compute nodes, my program freezes. I suspect that the ranks on the original node have the cached/compiled forms whereas the ranks on the other node do not, and that this is why my program hangs.
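If the suspicion is right, one possible workaround would be to do the small-mesh warm-up once on every node rather than on a single process. Below is a minimal sketch of how one rank per node could be picked for that. It is plain Python, not Firedrake code; the names `warmup_ranks` and `should_warm_up` are made up for illustration, and it assumes each node has its own local cache and that the list of hostnames has already been gathered from all ranks (for example with mpi4py's `comm.allgather(socket.gethostname())`).

```python
import socket


def warmup_ranks(hostnames):
    """Return the lowest rank on each distinct host, in ascending order.

    hostnames[i] is the hostname reported by MPI rank i. In a real job this
    list would be gathered from all ranks (assumption: e.g. via mpi4py).
    """
    first_rank_on_host = {}
    for rank, host in enumerate(hostnames):
        # setdefault keeps the first (lowest) rank seen for each host
        first_rank_on_host.setdefault(host, rank)
    return sorted(first_rank_on_host.values())


def should_warm_up(rank, hostnames):
    """True if this rank is the one chosen to compile on its node."""
    return rank in warmup_ranks(hostnames)


if __name__ == "__main__":
    # Single-process demonstration: only one host, so rank 0 is chosen.
    print(warmup_ranks([socket.gethostname()]))
```

For example, with four ranks spread over two nodes as `["node1", "node1", "node2", "node2"]`, ranks 0 and 2 would be chosen, so each node gets its own warm-up run. Whether that actually cures the hang depends on the cache being per-node, which I have not confirmed.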