Old Open MPI versions, especially those running over InfiniBand
(OpenFabrics), don't cope well with fork() either; see, for example,
https://www.open-mpi.org/faq/?category=openfabrics#ofa-fork



I replaced my Open MPI 1.10.4 with MPICH 3.2, and the problem with parallel execution on more than one node is gone. My Open MPI probably really did not like the fork().

Thank you!
Henrik