Hello firedrake,
I'm having problems running a Firedrake test script across multiple nodes. PETSc crashes with segmentation faults when the job spans more than one node, but the script runs fine when restricted to a single node.
The cluster has 4 nodes, each with 16 cores and 64 GB of memory, running Ubuntu 16.04.2 LTS with InfiniBand (IB) interconnects. We use SLURM, but the errors also occur with plain mpirun and a hostfile. I've run the openmpi and openib tests, and they indicate no problems with either subsystem. Firedrake installs cleanly.
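To see where each rank lands without involving Firedrake or PETSc, a stdlib-only script along these lines could be launched with the same mpirun command. This is only a sketch: the function name describe_process is made up for illustration, and it relies on the OMPI_COMM_WORLD_* environment variables that Open MPI's mpirun exports to each process it starts.

```python
import os
import socket


def describe_process(env=None):
    """Return a one-line summary of where this process was placed.

    Reads the per-process variables exported by Open MPI's mpirun and
    falls back to '?' when the script is run outside mpirun.
    (describe_process is a hypothetical helper, not part of any library.)
    """
    env = os.environ if env is None else env
    rank = env.get("OMPI_COMM_WORLD_RANK", "?")
    size = env.get("OMPI_COMM_WORLD_SIZE", "?")
    local_rank = env.get("OMPI_COMM_WORLD_LOCAL_RANK", "?")
    host = socket.gethostname()
    return "rank {}/{} (local rank {}) on {}".format(rank, size, local_rank, host)


if __name__ == "__main__":
    print(describe_process())
```

Run under mpirun, each process prints one line, so the hosts holding the failing ranks can be read off directly.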
pturner@ubuntu-0-0:~/mpitest$ which python
/home/pturner/firedrake/bin/python
>>> print firedrake.__version__
0.13.0+1303.g9070020
A simple Firedrake test script, run with mpirun:
pturner@ubuntu-0-0:~/mpitest$ cat firedrake_proj.py
from firedrake import *
mesh = UnitSquareMesh(10, 10)
p1 = FunctionSpace(mesh, 'CG', 1)
f = Function(p1, name='function')
x, y = SpatialCoordinate(mesh)
expr = sin(2*pi*x)*(1 + y)
f.project(expr)
n = norm(f)
if mesh.comm.rank == 0:
   print('Norm {:}'.format(n))
   print('SUCCESS')
Execute with 16 cores (single node):
pturner@ubuntu-0-0:~/mpitest$ mpirun --mca btl openib,sm,self --mca mpi_warn_on_fork 0 -n 16 -hostfile ~/hostfile python firedrake_proj.py
Norm 1.07999261002
SUCCESS
Execute with 32 cores (2 nodes):
pturner@ubuntu-0-0:~/mpitest$ mpirun --mca btl openib,sm,self --mca mpi_warn_on_fork 0 -n 32 -hostfile ~/hostfile python firedrake_proj.py
Consistent failures on ranks 4 and 20:
[20]PETSC ERROR: ------------------------------------------------------------------------
[20]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[...]
[4]PETSC ERROR: ------------------------------------------------------------------------
[4]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[...]
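Assuming the hostfile (not shown here) gives each node 16 slots and mpirun fills one node before moving to the next, the two failing ranks sit at the same local position on each node. The arithmetic can be sketched as follows; locate_rank and slots_per_node are illustrative names, not part of any MPI API.

```python
def locate_rank(rank, slots_per_node=16):
    """Map a global MPI rank to (node index, local slot).

    Assumes nodes are filled in order, slots_per_node ranks at a time.
    """
    return divmod(rank, slots_per_node)


for rank in (4, 20):
    node, slot = locate_rank(rank)
    print("rank {} -> node {}, local slot {}".format(rank, node, slot))
# rank 4 -> node 0, local slot 4
# rank 20 -> node 1, local slot 4
```

If that placement assumption holds, the failure follows the local slot rather than one particular node.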
Any idea what might be going wrong?
Thx,
--Paul
Paul J Turner
OHSU/CMOP