Hi Syavash,
Could you run ldd on the IncNavierStokesSolver executable, add "module list" as an extra line in your submission script before the solver execution, run again, and send through the output?
It looks like you've compiled at least SpatialDomains with the Intel compiler, and the Intel-specific runtime routines it references can't be found at runtime.
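In practice that could look something like the following in your submission script, just before the solver is launched (the executable path, process count and session file below are only placeholders for your actual ones):

    module list
    ldd /path/to/IncNavierStokesSolver
    mpirun -np 32 /path/to/IncNavierStokesSolver session.xml

That tells us which modules are actually loaded on the compute node and which shared libraries (in particular which MPI and BLAS) the solver picks up at runtime.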
Kind regards,
James.
From: nektar-users-bounces@imperial.ac.uk <nektar-users-bounces@imperial.ac.uk>
On Behalf Of Ehsan Asgari
Sent: Saturday, March 11, 2023 8:32 AM
To: nektar-users <nektar-users@imperial.ac.uk>
Subject: [Nektar-users] Installing Nektar++ 5.2.0 on a cluster
Hi Parv
Thank you for your kind response.
In the end, I managed to install v5.3 using the Intel compiler and OpenMPI 4.0.3. However, I am still getting MPI-related issues when running a medium-sized mesh with 760K cells (a smaller mesh runs successfully in parallel):
=======================================================================
EquationType: UnsteadyNavierStokes
Session Name: clusteredToGmsh
Spatial Dim.: 3
Max SEM Exp. Order: 4
Num. Processes: 32
Expansion Dim.: 3
Projection Type: Continuous Galerkin
Advect. advancement: explicit
Diffuse. advancement: implicit
Time Step: 0.0001
No. of Steps: 500000
Checkpoints (steps): 10000
Integration Type: IMEX
Splitting Scheme: Velocity correction (strong press. form)
=======================================================================
Initial Conditions:
- Field u: 1.0
- Field v: 0.0
- Field w: 0.0
- Field p: 0.0
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node cra1080 exited on signal 9 (Killed).
I suspected it might be a problem with the Intel compiler, so I switched to GCC 9.3 for a fresh installation. However, GCC also seems to run into problems, and I get the following error at some point:
../../SpatialDomains/libSpatialDomains.so.5.3.0: error: undefined reference to '__intel_sse2_strcpy'
../../SpatialDomains/libSpatialDomains.so.5.3.0: error: undefined reference to '_intel_fast_memmove'
../../SpatialDomains/libSpatialDomains.so.5.3.0: error: undefined reference to '__intel_sse2_strlen'
../../SpatialDomains/libSpatialDomains.so.5.3.0: error: undefined reference to '_intel_fast_memcpy'
../../SpatialDomains/libSpatialDomains.so.5.3.0: error: undefined reference to '_intel_fast_memset'
collect2: error: ld returned 1 exit status
make[2]: *** [library/Demos/SpatialDomains/CMakeFiles/PartitionAnalyse.dir/build.make:127: library/Demos/SpatialDomains/PartitionAnalyse] Error 1
make[1]: *** [CMakeFiles/Makefile2:3681: library/Demos/SpatialDomains/CMakeFiles/PartitionAnalyse.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
I discussed the mesh problem with James Slaughter prior to this, and he suggested it might be due to a bug in the older version (5.0.3) I was working with. That is why I decided to move to the most recent version.
In any case, I am still struggling with my parallel simulations!
Kind regards
syavash
On Thu, Mar 9, 2023 at 4:57 PM Khurana, Parv <p.khurana22@imperial.ac.uk> wrote:
Hi Syavash,
A few questions come to mind on seeing this:
- What compilers are you using (GCC or Intel?)
- Are you loading the modules which are compatible with the compiler you are using?
- Do you have a version of OpenBLAS or MKL already loaded as one of the modules on your cluster?
As is often the case, the problem may be more involved, so it would be great to see the modules and cmake commands you are using for your installation in order to debug this properly. Happy to hop on a call if needed!
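For example, the output of something like the following, run from your Nektar++ build directory (the grep pattern is just a rough filter), would already tell us a lot, together with the exact cmake/ccmake command line you used:

    module list
    grep -iE 'compiler|blas|lapack|mkl' CMakeCache.txt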
Best
Parv
From: nektar-users-bounces@imperial.ac.uk <nektar-users-bounces@imperial.ac.uk> On Behalf Of Ehsan Asgari
Sent: 09 March 2023 09:54
To: nektar-users <nektar-users@imperial.ac.uk>
Subject: [Nektar-users] Installing Nektar++ 5.2.0 on a cluster
Hi Everyone,
I am trying to install the latest version of Nektar on a cluster. However, I get the following error at some point:
../../library/SpatialDomains/libSpatialDomains.so.5.3.0: error: undefined reference to '__intel_sse2_strlen'
../../library/SpatialDomains/libSpatialDomains.so.5.3.0: error: undefined reference to '_intel_fast_memcpy'
../../library/SpatialDomains/libSpatialDomains.so.5.3.0: error: undefined reference to '_intel_fast_memset'
collect2: error: ld returned 1 exit status
make[2]: *** [utilities/NekMesh/CMakeFiles/NekMesh.dir/build.make:106: utilities/NekMesh/NekMesh] Error 1
make[1]: *** [CMakeFiles/Makefile2:1647: utilities/NekMesh/CMakeFiles/NekMesh.dir/all] Error 2
make: *** [Makefile:141: all] Error 2
I had "NEKTAR_USE_SYSTEM_BLAS_LAPACK:BOOL=ON " and "THIRDPARTY_BUILD_BLAS_LAPACK:BOOL=ON" in the ccmake as per suggested in the user archives.
I appreciate your kind help.
Kind regards
syavash