I suspect there is some kind of race condition in Firedrake when running on more than one node. Until this is fixed, I am very unlikely to get the 64-core case running.
Hmm, we are running Firedrake in parallel with no problems here. What is the error?
[Buesing, Henrik] See [1] for the error message and the three attached logs (for the 32-core case, 2 of 5 runs succeeded and 3 of 5 crashed).
This is just for running the compiled code. During the compile stage I had problems, too. What I did is the following: 1) Run Firedrake on 1 node (this works). Now all the *.so files are in place. 2) Run Firedrake on more than one node. This crashes more often the more processes I use.
I suspect a race condition because on 17 cores (1 node + 1 core) my problem runs fine, on 32 cores it sometimes runs, and on 64 cores it has so far never run.
But if you are not having these problems, and if the provided code reproduces the MatCreateSubMats problem, then you can do tests on your own. Well, a lot of ifs, but better than nothing.
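The traceback in [1] ends inside subprocess.Popen with errno 14, so one way to narrow this down might be to check whether plain process spawning already fails under MPI on several nodes, without Firedrake involved. The following is only a minimal sketch under that assumption: the function name try_spawn and the script layout are hypothetical, and it has not been run on the cluster in question.

```python
import errno
import platform
import subprocess
import sys

try:
    # Importing mpi4py.MPI initialises MPI, as a Firedrake run would.
    from mpi4py import MPI
    rank = MPI.COMM_WORLD.Get_rank()
except ImportError:
    # Fallback so the script still runs without MPI (it then tests much less).
    rank = 0


def try_spawn(repeats=5):
    """Spawn a trivial child process `repeats` times.

    Returns the symbolic errno names (e.g. 'EFAULT') of any spawns that
    raised OSError; an empty list means every spawn succeeded.
    """
    failures = []
    for _ in range(repeats):
        try:
            subprocess.call([sys.executable, "-c", "pass"],
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
        except OSError as exc:
            failures.append(errno.errorcode.get(exc.errno, str(exc.errno)))
    return failures


if __name__ == "__main__":
    failed = try_spawn()
    print(f"rank {rank} on {platform.node()}: {len(failed)} failed spawns {failed}")
```

Launched the same way as the failing job on 17, 32 and 64 cores, this would show whether the spawn failures depend on node or process count. If every spawn succeeds, the problem more likely lies elsewhere in the stack.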
Thank you!
Henrik
[1]
Traceback (most recent call last):
  File "/work/hb111949/Firedrake/
    solver = NonlinearVariationalSolver(
  File "/rwthfs/rz/cluster/work/
    pre_function_callback=pre_f_
  File "/rwthfs/rz/cluster/work/
    form_compiler_parameters=fcp)
  File "/rwthfs/rz/cluster/work/
    collect_loops=True)
  File "<decorator-gen-279>", line 2, in _assemble
  File "/rwthfs/rz/cluster/work/
    return f(*args, **kwargs)
  File "/rwthfs/rz/cluster/work/
    kernels = tsfc_interface.compile_form(f, "form", parameters=form_compiler_
  File "/rwthfs/rz/cluster/work/
    number_map).kernels
  File "/rwthfs/rz/cluster/work/
    obj = make_obj()
  File "/rwthfs/rz/cluster/work/
    obj.__init__(*args, **kwargs)
  File "/rwthfs/rz/cluster/work/
    kernels.append(KernelInfo(
  File "/rwthfs/rz/cluster/work/
    obj = make_obj()
  File "/rwthfs/rz/cluster/work/
    obj.__init__(*args, **kwargs)
  File "/rwthfs/rz/cluster/work/
    self._code = self._ast_to_c(self._ast, opts)
  File "/rwthfs/rz/cluster/work/
    ast_handler.plan_cpu(self._
  File "/rwthfs/rz/cluster/work/
    loop_opt.rewrite(rewrite)
  File "/rwthfs/rz/cluster/work/
    ew.sharing_graph_rewrite()
  File "/rwthfs/rz/cluster/work/
    prob.solve(ilp.GLPK(msg=0))
  File "/rwthfs/rz/cluster/work/
    status = solver.actualSolve(self, **kwargs)
  File "/rwthfs/rz/cluster/work/
    rc = subprocess.call(proc, stdout = pipe, stderr = pipe)
  File "/rwthfs/rz/cluster/work/
    p = Popen(*popenargs, **kwargs)
  File "/rwthfs/rz/cluster/work/
    restore_signals, start_new_session)
  File "/rwthfs/rz/cluster/work/
    raise child_exception_type(errno_
OSError: [Errno 14] Bad address
Thanks,
    Matt
--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener