Dear Henrik,
On 27 Mar 2017, at 10:13, Buesing, Henrik <HBuesing@eonerc.rwth-aachen.de> wrote:
Dear all,
I am having problems running Firedrake in parallel on more than one node. If I run on 16 cores (2x8 cores, 1 node) everything is fine. If I go to more than one node (2, 4, …) I get an error (see also attached log):
OSError: /w0/tmp/lsf_user.35589703.0/pyop2-cache-uid17470/f51f6074c31bf0cd78d85bcb40381923.so: cannot open shared object file: No such file or directory
Do you have an idea what could be wrong? Thank you!
Right now, Firedrake compiles code only on rank zero, and therefore requires that all processes can see the directory it writes to. The cache path in your error message is under /w0/tmp, which looks like node-local scratch space; if so, ranks on the other nodes cannot open the compiled shared object. Assuming you have access to a shared filesystem, set the environment variable PYOP2_CACHE_DIR to point to a temporary directory that all ranks can see.

We are working on node-local compilation, which will be more scalable and will make this problem go away.

Thanks,
Lawrence
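
P.S. In case it helps, here is a minimal sketch of one way to set the variable from inside the run script itself. The helper name use_shared_pyop2_cache and the default location under your home directory are placeholders I made up for illustration; I am assuming your home directory is mounted on every compute node and that the variable has to be set before Firedrake is imported. Substitute whatever shared path your cluster provides.

```python
import os

# Hypothetical default: a directory under $HOME, assuming $HOME lives on a
# filesystem that every compute node mounts. Replace with your shared path.
SHARED_CACHE_DIR = os.path.join(os.path.expanduser("~"), "pyop2-cache")


def use_shared_pyop2_cache(path=SHARED_CACHE_DIR):
    """Point PYOP2_CACHE_DIR at `path`, creating the directory if needed."""
    # exist_ok=True keeps this safe when several ranks call it at once.
    os.makedirs(path, exist_ok=True)
    os.environ["PYOP2_CACHE_DIR"] = path
    return path


# Intended use at the very top of the run script, before the Firedrake import
# (assumption: the variable is read when Firedrake/PyOP2 is first imported):
#
#     use_shared_pyop2_cache()
#     from firedrake import *
```

Exporting PYOP2_CACHE_DIR in the batch job script should have the same effect, provided your MPI launcher forwards the environment to all ranks.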