Problem after parallelization

11 Jul 2016

Hi all,

I was recently made aware of a possible problem with how Nektar++ had been
compiled on the cluster I am trying to run it on. Apparently Nektar++ had
been compiled without MPI turned on, with the result that it was not
parallelizing properly.
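
For anyone wanting to check the same thing on their own system, one rough way is to see whether the solver executable is dynamically linked against an MPI library. The following is only a minimal sketch: it assumes a Linux system with `ldd` available and a dynamically linked build, the helper names are my own, and the solver path in the usage note is hypothetical.

```python
import subprocess


def mpi_libraries(ldd_output: str) -> list:
    """Return library names from `ldd` output that look like MPI libraries."""
    found = []
    for line in ldd_output.splitlines():
        parts = line.split()
        if not parts:
            continue
        # The first token on each line is the library name (or its full path).
        name = parts[0].rsplit("/", 1)[-1]
        # Assumption: MPI shared libraries are named libmpi*, e.g. libmpi.so.12
        # or libmpich.so.
        if name.startswith("libmpi"):
            found.append(name)
    return found


def is_linked_against_mpi(binary_path: str) -> bool:
    """Run `ldd` on a binary and report whether any MPI library is listed."""
    result = subprocess.run(
        ["ldd", binary_path], capture_output=True, text=True, check=True
    )
    return bool(mpi_libraries(result.stdout))
```

Calling `is_linked_against_mpi("/path/to/solver")` (a placeholder path) returning `False` would suggest a serial build, although a statically linked MPI would not show up this way.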

After I pointed this out to the system administrator, he fixed the issue.
Since the fix, however, whenever I try to submit a job I get the following
error:

    mpirun has exited due to process rank 18 with PID 8273 on
    node pod16a5.ibb exiting improperly. There are two reasons this could occur:

    1. this process did not call "init" before exiting, but others in
    the job did. This can cause a job to hang indefinitely while it waits
    for all processes to call "init". By rule, if one process calls "init",
    then ALL processes must call "init" prior to termination.

    2. this process called "init", but exited without calling "finalize".
    By rule, all processes that call "init" MUST call "finalize" prior to
    exiting or it will be considered an "abnormal termination"

    This may have caused other processes in the application to be
    terminated by signals sent by mpirun (as reported here).

Note, however, that when I run the same problem in serial mode there are no
problems.
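
The mpirun message itself is generic, but it does name the rank, PID and node of the process that exited, which is usually the place to start looking in the job output. As a minimal sketch (the function name and pattern are my own, and the pattern assumes the message wording shown above), those values can be pulled out of a job log like this:

```python
import re

# Matches the first line of the mpirun abort message quoted above.
# `\s*` between "on" and "node" tolerates a line break or lost whitespace.
ABORT_PATTERN = re.compile(
    r"process rank (?P<rank>\d+) with PID (?P<pid>\d+) on\s*node (?P<node>\S+)"
)


def find_failed_rank(log_text: str):
    """Return (rank, pid, node) for the first process reported, or None."""
    match = ABORT_PATTERN.search(log_text)
    if match is None:
        return None
    return int(match["rank"]), int(match["pid"]), match["node"]


message = (
    "mpirun has exited due to process rank 18 with PID 8273 on\n"
    "node pod16a5.ibb exiting improperly."
)
print(find_failed_rank(message))  # (18, 8273, 'pod16a5.ibb')
```

With the rank and node known, the output written by that process just before the abort is the most likely place to find the underlying error.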

I was hoping someone could shed light on any well-known issues regarding
this.

Sincerely,
-- 

*Amitvikram Dutta*

*MASc Candidate*

*Graduate Research Assistant*

*Okanagan CFD Laboratory*

*University of British Columbia | Okanagan Campus*
