Hi all,
I'm trying to run a few scaling tests on the cluster I have access to.
I'm using a mesh with 10,800 elements and an expansion order of 5. The simulation is set to run for 10,000 time steps. The issue I'm running into is that doubling the number of processors increases the total wall-clock time instead of reducing it:
Procs   Wall Time
1       201 s
20      209 s
40      242 s
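To put numbers on this, here is a minimal Python sketch (separate from the solver itself) that computes speedup and parallel efficiency relative to the smallest run, along with the number of elements each process ends up with. The name scaling_table is my own and only for illustration, and an even split of elements across processes is an assumption.

```python
# Wall times from the table above, keyed by number of MPI processes.
wall_time_s = {1: 201.0, 20: 209.0, 40: 242.0}
n_elements = 10_800  # mesh size quoted above


def scaling_table(times, elements):
    """Return (procs, speedup, efficiency, elements_per_proc) rows.

    Speedup and efficiency are measured relative to the run with the
    fewest processes. Elements per process assumes an even partition.
    """
    base_procs = min(times)
    base_time = times[base_procs]
    rows = []
    for procs in sorted(times):
        speedup = base_time / times[procs]
        efficiency = speedup * base_procs / procs
        rows.append((procs, speedup, efficiency, elements / procs))
    return rows


print(f"{'procs':>5} {'speedup':>8} {'efficiency':>10} {'elems/proc':>12}")
for procs, speedup, eff, per_proc in scaling_table(wall_time_s, n_elements):
    print(f"{procs:>5} {speedup:8.3f} {eff:10.3%} {per_proc:12.0f}")
```

With these timings the speedup stays below 1 for both parallel runs, so adding processes is not giving any benefit at all.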
I believe this is due to the overhead of writing checkpoint files (each parallel process seems to write a separate checkpoint file). I have reduced the output frequency so that only one checkpoint should be written over the entire simulation. However, this still requires n checkpoint files to be written, where n is the number of processors the case is parallelised on.
In all cases I use the mpirun command, for example:
mpirun -np n IncNavierStokesSolver case.xml
Could I have some pointers for proceeding further with this issue?
Sincerely,
--