Re: [Nektar-users] FW: Problem while installing nektar++ with lapack
Hi all,
I had to use a slight workaround because the platform I was compiling on could not download the third-party files correctly and always reported a hash mismatch error. I downloaded the third-party files on my home machine and uploaded them into the $NEKTAR_HOME folder. As a result, when I begin the build process the ThirdParty folder is already populated with the zipped versions of the third-party software. Is this what might be causing the problem? Is there any way to edit the cmake file to get the compilation process working correctly?
On Fri, Oct 12, 2018 at 2:57 PM Amitvikram Dutta <amitvikram.dutta@uwaterloo.ca> wrote:
*From:* nektar-users-bounces@imperial.ac.uk On Behalf Of Jeremy Cohen *Sent:* October 12, 2018 2:56:43 PM (UTC-05:00) Eastern Time (US & Canada)
*To:* Amitvikram Dutta *Cc:* nektar-users *Subject:* Re: [Nektar-users] Problem while installing nektar++ with lapack
Hi Amitvikram,
Have you attempted to build lapack separately at any point? It's probably worth clearing out your build directory and also all the contents of the ThirdParty directory in the base nektar++ source directory, which I'll call $NEKTAR_HOME, so $NEKTAR_HOME/ThirdParty and then re-running cmake in an empty $NEKTAR_HOME/build directory and trying the build again.
It looks like the build step is encountering a previous source tree in the location where it's trying to build which seems strange.
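The clean-up suggested above can be sketched as a small shell function. This is only an illustration under assumptions: `reset_build_tree` is a hypothetical helper name, and its single argument is the top-level source directory (what this thread calls $NEKTAR_HOME). It deletes everything under build/ and ThirdParty/, so anything worth keeping there should be copied elsewhere first.

```shell
# Hypothetical helper: wipe stale build output so cmake starts from a clean state.
# $1 is the top-level nektar++ source directory ($NEKTAR_HOME in this thread).
reset_build_tree() {
    local nektar_home="$1"
    # Refuse to run on an empty or non-existent path.
    [ -n "$nektar_home" ] && [ -d "$nektar_home" ] || {
        echo "no such directory: $nektar_home" >&2
        return 1
    }
    # Remove the whole build directory, then recreate it empty.
    rm -rf "$nektar_home/build"
    mkdir -p "$nektar_home/build"
    # Empty ThirdParty in the source tree but keep the directory itself.
    if [ -d "$nektar_home/ThirdParty" ]; then
        find "$nektar_home/ThirdParty" -mindepth 1 -delete
    fi
}
```

After `reset_build_tree "$NEKTAR_HOME"`, re-running cmake and make from inside `$NEKTAR_HOME/build` starts from a clean state; which cmake options to pass depends on the configuration being tested.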
I've just had a look at the log from my clean build and I see exactly the same messages as you in relation to lapack-3.7.0 in the same order as far as "Checking whether /usr/bin/f95 supports Fortran 90 -- yes", however, I then see "-- Looking for Python greater than 2.6 - " and the build of lapack completes successfully.
Just to confirm, I am running cmake and make in a separate build directory under the main nektar++ source tree directory, so I'm building in $NEKTAR_HOME/build - I assume you're doing something similar? You should see a ThirdParty directory in $NEKTAR_HOME and another ThirdParty directory in $NEKTAR_HOME/build/
I believe that the initial download of the lapack-3.7.0.tar.gz should be placed in $NEKTAR_HOME/ThirdParty and unpacked there. Then, when the build succeeds or stops, in $NEKTAR_HOME/build/ThirdParty, you should see lapack-3.7.0/ where I think the build actually takes place, and then a separate $NEKTAR_HOME/build/ThirdParty/lapack-3.7.0-tmp which should contain a couple of lapack-3.7.0-cfgcmd.txt files that I think contain the build command that is used - you could perhaps paste the contents of the lapack-3.7.0-cfgcmd.txt into an email if you're still having issues and we can see if that looks correct.
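To collect the configure command files mentioned above for pasting into an email, something like the following could be used. It is a sketch only: `show_cfgcmd` is a hypothetical helper, its argument is the build directory, and the directory and file names are taken from the description above rather than checked against every Nektar++ version.

```shell
# Hypothetical helper: print the lapack configure-command files under
# <build>/ThirdParty/lapack-3.7.0-tmp. $1 is the build directory.
show_cfgcmd() {
    local build_dir="$1"
    local tmp_dir="$build_dir/ThirdParty/lapack-3.7.0-tmp"
    [ -d "$tmp_dir" ] || { echo "missing: $tmp_dir" >&2; return 1; }
    local f found=1
    # The glob also picks up variants such as a ".in" template, if present.
    for f in "$tmp_dir"/lapack-3.7.0-cfgcmd.txt*; do
        [ -f "$f" ] || continue
        found=0
        echo "== $f"
        cat "$f"
    done
    # Status 0 only if at least one file was printed.
    return "$found"
}
```

For example, `show_cfgcmd "$NEKTAR_HOME/build" > cfgcmd.log` writes the contents to a file that can be attached to a reply.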
It is, of course, possible that this is something related to the specific configuration of the platform that you're building on, but I think the third party lapack build should be straightforward and it sounds like for some reason, it's attempting to build in the wrong location, or a location where an existing source tree has ended up for some reason.
I'm afraid I don't have a very detailed knowledge of the build system beyond this so if none of the suggestions so far help you to resolve the problem, maybe someone with more knowledge of the build system can provide some advice.
Cheers, Jeremy
On 12 Oct 2018, at 19:34, Amitvikram Dutta <amitvdutta23@gmail.com> wrote:
Hi all,
I tried to compile nektar using Jeremy's latest suggestions, with both THIRDPARTY_BUILD_BLAS_LAPACK and NEKTAR_USE_SYSTEM_BLAS_LAPACK turned on. The following error occurred. It seems that I might have to compile lapack separately. Is this unusual?
<image.png>
On Fri, Oct 12, 2018 at 1:16 PM Amitvikram Dutta < amitvikram.dutta@uwaterloo.ca> wrote:
*From:* nektar-users-bounces@imperial.ac.uk On Behalf Of Jeremy Cohen *Sent:* October 12, 2018 1:16:14 PM (UTC-05:00) Eastern Time (US & Canada)
*To:* Amitvikram Dutta *Cc:* nektar-users *Subject:* Re: [Nektar-users] Problem while installing nektar++ with lapack
Hi Amitvikram,
As Chris suggests, it's probably better to use vendor supplied libraries if you can get those working.
In addition to the further information Chris has asked for, one thing you could check is whether there are any files in your nektar++/build/ThirdParty/lapack-3.7.0 directory (if that directory exists at all).
I've been trying to see if I can recreate the problem and I was able to see something similar when setting THIRDPARTY_BUILD_BLAS_LAPACK=ON and NEKTAR_USE_SYSTEM_BLAS_LAPACK=OFF. In this case, I can configure successfully and start the build but it fails with a large number of undefined references that are similar to, and include, the dtpmv_ symbol that you mentioned. When I look in nektar++/build/ThirdParty/lapack-3.7.0, the directory is empty so it looks like the build system has configured on the basis of building its own blas/lapack but the build hasn't been carried out and therefore LibUtilities can't be linked against it.
As a test, you could try running the build with both THIRDPARTY_BUILD_BLAS_LAPACK=ON and NEKTAR_USE_SYSTEM_BLAS_LAPACK=ON, if this isn't the setting you've been using already. When I tried this, the build of blas/lapack is carried out successfully and the linking is fine with the full build of Nektar++ completing successfully. I removed the system blas/lapack on my test system to be sure it was linking against the correct instance.
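The two checks described in this message (is the third-party lapack build directory empty, and configuring with both options ON) could be scripted roughly as follows. The function names are hypothetical; only the two cmake option names come from this thread, and the configure function is defined but not called here.

```shell
# Hypothetical helper: succeed if the directory exists and has at least one entry.
dir_has_files() {
    [ -d "$1" ] && [ -n "$(ls -A "$1")" ]
}

# Hypothetical helper: configure in <source>/build with both BLAS/LAPACK
# options ON, as in the test described above. $1 is the source directory.
# Runs in a subshell so the caller's working directory is unchanged.
configure_both_on() {
    (
        mkdir -p "$1/build" && cd "$1/build" &&
        cmake -DTHIRDPARTY_BUILD_BLAS_LAPACK=ON \
              -DNEKTAR_USE_SYSTEM_BLAS_LAPACK=ON ..
    )
}
```

A quick check after a failed build might then be `dir_has_files nektar++/build/ThirdParty/lapack-3.7.0 || echo "lapack was configured but never built"`.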
Cheers, Jeremy
On 12 Oct 2018, at 17:50, Chris Cantwell <c.cantwell@imperial.ac.uk> wrote:
Hi Amitvikram,
Could you send us your CMakeCache.txt file from your build directory and the output from running: make VERBOSE=1 for both cases.
In the case of using ThirdParty LAPACK, it seems not to be linking to it. You should probably be using vendor-supplied libraries if possible though, so it would be better if we can get those working.
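One way to gather the requested information is sketched below. `blas_lapack_cache_entries` is a hypothetical helper that prints only the BLAS/LAPACK-related entries of CMakeCache.txt, which is handy for a first look, although the full file is what was asked for.

```shell
# Hypothetical helper: print the non-comment CMakeCache.txt entries that
# mention BLAS or LAPACK. $1 is the build directory.
blas_lapack_cache_entries() {
    local cache="$1/CMakeCache.txt"
    [ -f "$cache" ] || { echo "missing: $cache" >&2; return 1; }
    # Comment lines in the cache start with '#' or '//'; drop them first.
    grep -v '^[#/]' "$cache" | grep -i -E 'blas|lapack'
}
```

The verbose build output can be saved alongside it from the build directory with `make VERBOSE=1 2>&1 | tee make-verbose.log` (the log file name is arbitrary).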
Thanks, Chris
On 12 October 2018 14:08:55 BST, Amitvikram Dutta <amitvdutta23@gmail.com> wrote:
Hi Jeremy,
I'm actually trying to build nektar++ on a BGQ cluster similar to Mira.
I'm trying to build nektar++-4.4.1 and the system lapack version is 3.4.2.
Sincerely,
On Fri, Oct 12, 2018 at 4:24 AM Amitvikram Dutta <amitvikram.dutta@uwaterloo.ca> wrote:
*From:* nektar-users-bounces@imperial.ac.uk On Behalf Of Jeremy Cohen *Sent:* October 12, 2018 4:24:33 AM (UTC-05:00) Eastern Time (US & Canada)
*To:* Amitvikram Dutta *Cc:* nektar-users *Subject:* Re: [Nektar-users] Problem while installing nektar++ with lapack
Hi Amitvikram,
Can you provide some further details of the problem you're encountering?
Specifically, can you confirm what platform (including version) you're building on and, if it is Linux, which I assume it is, which distribution?
Can you also confirm what version of Nektar++ you're trying to build, and the version of the system Lapack distribution that you're using?
Thanks,
Jeremy
On 12 Oct 2018, at 01:05, Amitvikram Dutta <amitvdutta23@gmail.com> wrote:
Hi all,
I keep having the same problem while trying to install nektar++ with regards to the Lapack libraries.
When I try to use the system Lapack installation I get the following message
*/scinet/bgq/Libraries/lapack/lib/liblapack.so: undefined reference to `_xlfEndIO@XLF_1.0'*
while when I try to install using the ThirdParty Lapack supplied with the nektar++ source directory I get the following error
*../../library/LibUtilities/libLibUtilities.so.4.4.1: undefined reference to `dtpmv_'*
I have a feeling these errors have been encountered by the community at large before. Could someone point out where I'm going wrong?
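When the linker reports an undefined reference such as `dtpmv_`, it can help to check whether the library being linked actually exports that symbol. The sketch below assumes GNU binutils `nm` is available and can read the target's shared objects (on a cross-compiled platform the cross toolchain's `nm` may be needed instead); both function names are hypothetical.

```shell
# Hypothetical helper: read `nm` output on stdin and succeed if the given
# symbol name appears as the last field of some line.
symbol_listed() {
    awk -v s="$1" '$NF == s { found = 1 } END { exit !found }'
}

# Hypothetical helper: succeed if shared library $1 defines dynamic symbol $2.
# -D lists dynamic symbols; --defined-only hides undefined ones (GNU nm options).
lib_defines_symbol() {
    nm -D --defined-only "$1" | symbol_listed "$2"
}
```

For example, `lib_defines_symbol build/ThirdParty/dist/lib/libblas.so dtpmv_ && echo "symbol present"` would show whether the library itself provides the symbol, in which case the link line rather than the library would be the thing to inspect.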
Sincerely, --
*Amitvikram Dutta*
Graduate Research Assistant
Fluid Mechanics Research Lab
Multi-Physics Interaction Lab
University of Waterloo
_______________________________________________
Nektar-users mailing list
Nektar-users@imperial.ac.uk
https://mailman.ic.ac.uk/mailman/listinfo/nektar-users
-- Chris Cantwell Imperial College London South Kensington Campus London SW7 2AZ Email: c.cantwell@imperial.ac.uk www.imperial.ac.uk/people/c.cantwell
Hi Amitvikram,
Some sites block non-SSL enabled HTTP traffic, returning a webpage reporting the error rather than the actual file (hence the hash mismatch). You could try turning on the THIRDPARTY_USE_SSL option to see if that is allowed.
Cheers, Chris
On Fri, 12 Oct 2018 15:15:10 -0400, Amitvikram Dutta <amitvdutta23@gmail.com> wrote:
Hi all,
I had to use a slight workaround because the platform I was compiling on could not download the third-party files correctly and always reported a hash mismatch error. I downloaded the third-party files on my home machine and uploaded them into the $NEKTAR_HOME folder. As a result, when I begin the build process the ThirdParty folder is already populated with the zipped versions of the third-party software. Is this what might be causing the problem? Is there any way to edit the cmake file to get the compilation process working correctly?
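A quick way to tell whether a failed download is really an error page rather than an archive is to look at its first two bytes, since gzip data always begins with 0x1f 0x8b. A minimal sketch, with `looks_like_gzip` as a hypothetical helper name:

```shell
# Hypothetical helper: succeed if the file starts with the gzip magic
# bytes 0x1f 0x8b, i.e. it is plausibly a .tgz and not an HTML error page.
looks_like_gzip() {
    local magic
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}
```

Something like `looks_like_gzip ThirdParty/lapack-3.7.0.tgz || head -c 200 ThirdParty/lapack-3.7.0.tgz` would print the start of the returned page when the check fails. The SSL option mentioned above is passed at configure time as `-DTHIRDPARTY_USE_SSL=ON`.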
Hi Amitvikram,
I would certainly try Chris's suggestion. However, something else to check is where you're getting the third party downloads from. If you take a clean Nektar++ source tree and place the standard netlib lapack-3.7.0.tgz source file that the build system downloads into $NEKTAR_HOME/ThirdParty (i.e. the download from http://www.netlib.org/lapack/lapack-3.7.0.tgz), the build should proceed successfully.
It looks like the lapack tar file that you're using may already have some build artefacts in it. Did you tar the content from $NEKTAR_HOME/build/ThirdParty/lapack-3.7.0 into a lapack-3.7.0.tgz file, or are you working with the standard .tgz file from the netlib.org site?
Cheers, Jeremy
On 13 Oct 2018, at 21:19, Chris Cantwell <c.cantwell@imperial.ac.uk> wrote:
Hi Amitvikram,
Some sites block non-SSL enabled HTTP traffic, returning a webpage reporting the error rather than the actual file (hence the hash mismatch).
You could try turning on the THIRDPARTY_USE_SSL option to see if that is allowed.
Cheers, Chris
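Jeremy's question about build artefacts inside the tar file can be checked without unpacking it. The sketch below lists archive members that look like CMake or compiler output; the pattern is an assumption about what counts as an artefact (CMakeCache.txt, CMakeFiles/, and .o, .a or .so files), and `tarball_build_artifacts` is a hypothetical helper name.

```shell
# Hypothetical helper: list members of a gzip-compressed tarball that look
# like build output. Succeeds (status 0) only if at least one is found.
tarball_build_artifacts() {
    tar -tzf "$1" |
        grep -E '(^|/)CMakeCache\.txt$|(^|/)CMakeFiles/|\.(o|a|so)$'
}
```

If `tarball_build_artifacts lapack-3.7.0.tgz` prints anything, the archive was probably created from a build directory rather than downloaded unmodified.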
Hello everyone,
I am working on the same task as Amitvikram, on the same cluster, and currently have the exact same problem: *undefined reference to lapack libs even though they are compiled successfully*. I've read through this thread as well as some others, so here is a brief summary of what I've done so far before asking some questions.
-- System info: Using cmake-2.8.12, cross-compiled gcc-4.8 and mpich-2 on a Blue Gene Q cluster.
-- Nektar version: Decided to use the git repo.
-- Added the "-dynamic" flag to the "CMakeLists.txt" as suggested here: https://www.nektar.info/nektar-on-mira-cluster/
-- Boost: I initially used the system-installed boost but then decided to stick to the third-party version shipped with nektar, because some of the required libs (for instance boost_iostreams) weren't part of the central installation. To deal with that, I first set up a partial build by referencing each individual library file explicitly in the cmake command. It seems to build the required libs successfully but later fails during the nektar compilation. I think it messes up the environment and links to the wrong files. So I am using "ThirdParty/boost_1_57_0".
-- Lapack: The reason I am not using the system lapack is simply that cmake says "dgemm_" is not found in the system blas version. Therefore, I am compiling the "ThirdParty/Lapack-3.7.0" which I downloaded from "http://www.netlib.org/lapack/lapack-3.7.0.tgz". *Note that compilation fails with the same error even when I use ThirdParty/lapack.*
-- FFTW: Using the system-installed version.
-- Download process: I disabled the MD5 checks and am downloading with "wget" because of the similar SSL error mentioned before. This is an easy workaround though, and probably has nothing to do with the error. I download all packages to "nektar/ThirdParty" and copy them to "nektar/build/ThirdParty" as well. The reason for this copy operation is that when nektar extracts the downloaded packages, the uncompressed folders are somehow empty. I don't know if that's a cmake bug or a problem on my side. That's why I download, extract and copy the third-party sources to "nektar/build/ThirdParty" manually.
-- CMake command:
cmake $src \
  -DCMAKE_INSTALL_PREFIX=$prf \
  -DNEKTAR_USE_MPI=ON \
  -DNEKTAR_USE_SYSTEM_BLAS_LAPACK=ON \
  -DNEKTAR_USE_FFTW=ON \
  -DFFTW_INCLUDE_DIR=${SCINET_FFTW_INC} \
  -DFFTW_LIBRARY=${SCINET_FFTW_LIB}/libfftw3.a \
  -DTHIRDPARTY_BUILD_BLAS_LAPACK=ON
As you can see, I enabled both "-DNEKTAR_USE_SYSTEM_BLAS_LAPACK" and "-DTHIRDPARTY_BUILD_BLAS_LAPACK" as suggested; however, this didn't make a difference for me. Compilation fails at the same step whether both are enabled or not.
-- Build process for Third-Party: In general they are compiled without any errors. In particular, I checked the cmake files for each package, and Lapack is configured with "-DBUILD_SHARED_LIBS:STRING=ON". I can see that objects are compiled with the "-fPIC" option, which is set in the cmake files. However, "lapack/CMakeLists.txt" contains the line "option(BUILD_SHARED_LIBS "Build shared libraries" OFF)", which I set to "ON" in my build script. This is how the libraries look in the "nektar/build/ThirdParty/dist/lib" directory after compiling the ThirdParty libraries:
bgqdev-fen1-$ ls nektar/build/ThirdParty/dist/lib/
cmake                          libboost_program_options.so         libgsmpi.a          libtinyxml.a
*libblas.so*                   libboost_program_options.so.1.57.0  *liblapack.so*      libxxt.a
libblas.so.3                   libboost_regex.so                   liblapack.so.3      libz.a
libblas.so.3.7.0               libboost_regex.so.1.57.0            liblapack.so.3.7.0  libz.so
libboost_filesystem.so         libboost_system.so                  libscotch.a         libz.so.1
libboost_filesystem.so.1.57.0  libboost_system.so.1.57.0           libscotcherr.a      libz.so.1.2.7
libboost_iostreams.so          libboost_thread.so                  libscotcherrexit.a  pkgconfig
libboost_iostreams.so.1.57.0   libboost_thread.so.1.57.0           libscotchmetis.a
This folder is about 1.5GB by the way. However, "nektar/build/ThirdParty/dist/include" folder doesn't have lapack related headers:
bgqdev-fen1-$ ls
boost  scotchf.h  scotch.h  tinystr.h  tinyxml.h  zconf.h  zlib.h
bgqdev-fen1-$ pwd
/scinet/bgq/Applications/nektar/nektar/build/ThirdParty/dist/include
Also, I can share the initial parts of the lapack build - in this version I tried to reference to the system blas for lapack installation:
[ 6%] Performing configure step for 'lapack-3.7.0'
cd /scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/lapack-3.7.0 && /gpfs/home/scinet/bgq/tools/cmake/2.8.12.1/bin/cmake -G "Unix Makefiles" -DCMAKE_Fortran_COMPILER:FILEPATH=/scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -DCMAKE_INSTALL_PREFIX:PATH=/scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/dist -DCMAKE_INSTALL_LIBDIR:PATH=/scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/dist/lib -DBUILD_SHARED_LIBS:STRING=ON -DBUILD_TESTING:STRING=OFF /scinet/bgq/Applications/nektar/nektar/ThirdParty/lapack-3.7.0
Re-run cmake no build system arguments
-- Setting build type to 'Release' as none was specified.
-- The Fortran compiler identification is GNU
-- Check for working Fortran compiler: /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90
-- Check for working Fortran compiler: /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -- works
-- Detecting Fortran compiler ABI info
-- Detecting Fortran compiler ABI info - done
-- Checking whether /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 supports Fortran 90
-- Checking whether /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 supports Fortran 90 -- yes
-- Looking for Python greater than 2.6 -
-- Could NOT find PythonInterp: Found unsuitable version "2.6.6", but required is at least "2.7" (found /usr/bin/python2)
-- No suitable Python version found, so skipping summary tests.
-- Reducing RELEASE optimization level to O2
-- Looking for Fortran NONE - found
-- Looking for Fortran INT_CPU_TIME - found
-- Looking for Fortran EXT_ETIME - not found
-- Looking for Fortran EXT_ETIME_ - not found
-- Looking for Fortran INT_ETIME - found
-- --> Will use second_INT_ETIME.f and dsecnd_INT_ETIME.f as timing function.
*-- Using supplied NETLIB BLAS implementation*
*-- Using supplied NETLIB LAPACK implementation*
-- Building Single Precision
-- Building Double Precision
-- Building Complex Precision
-- Building Double Complex Precision
-- BUILD TESTING : OFF
-- Configuring done
-- Generating done
-- Build files have been written to: /scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/lapack-3.7.0
Additionally, I can see "dgemm" in the log.make:
bgqdev-fen1-$ grep -rn "dgemm" nektar/build-gcc/log.make.2
13756:[ 3%] Building Fortran object BLAS/SRC/CMakeFiles/blas.dir/dgemm.f.o
13757:cd /scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/lapack-3.7.0/BLAS/SRC && /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -Dblas_EXPORTS -O2 -fPIC -c /scinet/bgq/Applications/nektar/nektar/ThirdParty/lapack-3.7.0/BLAS/SRC/dgemm.f -o CMakeFiles/blas.dir/dgemm.f.o
14018:/scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -fPIC -O2 -Wl,-rpath=/bgsys/drivers/ppcfloor/comm/lib/libmpichf90-gcc.so.8 -shared -Wl,-soname,libblas.so.3 -o ../../lib/libblas.so.3.7.0 CMakeFiles/blas.dir/isamax.f.o CMakeFiles/blas.dir/sasum.f.o CMakeFiles/blas.dir/saxpy.f.o CMakeFiles/blas.dir/scopy.f.o
...............
CMakeFiles/blas.dir/dgemm.f.o
This is where the compilation fails:
[ 34%] Building CXX object utilities/NekMesh/CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/ElUtil.cpp.o
......... .........
/........./bgq/compilers/gcc/4.8.1/bin/../lib/gcc/powerpc64-bgq-linux/4.8.1/../../../../powerpc64-bgq-linux/bin/ld: warning: libmpichf90-gcc.so.8, needed by /scinet/bgq/Applications/nektar/nektar/build/ThirdParty/dist/lib/libblas.so, not found (try using -rpath or -rpath-link)
......... NodeOpti.cpp:(.text+0x1784): undefined reference to `dgeev_'
......... NodeOpti.cpp:(.text+0x25b4): undefined reference to `dgemm_'
......... ../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgetri_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpptrs_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgbtrs_'
collect2: error: ld returned 1 exit status
make[2]: *** [utilities/NekMesh/NekMesh] Error 1
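Since libblas.so is built but the final link still reports `dgemm_` as undefined, one thing that can be checked directly is whether the shared library exports the symbol at all. The following is only a minimal sketch and not part of the Nektar++ build system: the helper name defines_symbol is made up for illustration, and it assumes a binutils nm that understands the target's objects. It reads the output of "nm -D --defined-only" on stdin and deliberately ignores the symbol type column, because the type letter printed for functions is not the same on every platform.

```shell
# Hypothetical helper: succeed if the symbol named in $1 appears as a
# defined symbol in `nm -D --defined-only <library>` output read from stdin.
defines_symbol() {
    sym=$1
    # The symbol name is the last field of each nm output line.
    awk -v s="$sym" '$NF == s { found = 1 } END { exit !found }'
}

# Example (path taken from the listing above, adjust as needed):
#   nm -D --defined-only nektar/build-gcc/ThirdParty/dist/lib/libblas.so \
#       | defines_symbol dgemm_ && echo "dgemm_ is exported"
```

If the symbol is exported, the undefined references are more likely to come from how the library is found or ordered at link time than from the lapack build itself, although that is a guess rather than a diagnosis.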
I've tried many things:
xx enabled the configure option "build_shared_libs" in CMakeLists.txt in "ThirdParty/lapack"
xx made a copy of "make.inc.example" in ThirdParty/lapack and reduced the optimization levels
xx since this is a Blue Gene environment, referenced ESSL instead of BLAS
But none of it seems to make a difference. It always fails at exactly the same step.
This "libmpichf90-gcc.so.8" warning seems a bit odd to me, and I am not sure whether it has anything to do with the undefined reference errors. I created a symlink to this library and added it to "LD_LIBRARY_PATH" as well, but then it failed with the message "undefined symbol: _cnkspi_MemoryRegionCacheLastAccessedElementNumber" from "libpami-gcc.so", where PAMI is a lower-level messaging API by IBM. Also, "cnkspi" sounds far too low level, because "CNK" is the kernel on the compute nodes and "SPI" is the implementation that allows communication with that kernel. I added a linker flag "-Wl,-rpath", but I guess it only makes things uglier.
bgqdev-fen1-$ readelf -d nektar/build-gcc/ThirdParty/dist/lib/libblas.so | grep NEEDED
0x0000000000000001 (NEEDED) Shared library: [libmpichf90-gcc.so.8]
0x0000000000000001 (NEEDED) Shared library: [libmpich-gcc.so.8]
0x0000000000000001 (NEEDED) Shared library: [libopa-gcc.so.0]
0x0000000000000001 (NEEDED) Shared library: [libmpl-gcc.so.1]
0x0000000000000001 (NEEDED) Shared library: [libpami-gcc.so]
0x0000000000000001 (NEEDED) Shared library: [librt.so.1]
0x0000000000000001 (NEEDED) Shared library: [libpthread.so.0]
0x0000000000000001 (NEEDED) Shared library: [libstdc++.so.6]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x0000000000000001 (NEEDED) Shared library: [libnss_files.so.2]
0x0000000000000001 (NEEDED) Shared library: [libnss_dns.so.2]
0x0000000000000001 (NEEDED) Shared library: [libresolv.so.2]
0x0000000000000001 (NEEDED) Shared library: [libgfortran.so.3]
0x0000000000000001 (NEEDED) Shared library: [libm.so.6]
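To compare the dependency list of libblas.so between two machines, it can help to reduce the readelf output to bare library names. The sketch below does that; the function name needed_libs is hypothetical, and it assumes the usual "(NEEDED) ... Shared library: [name]" layout that readelf -d prints. One further observation, offered with caution: -rpath expects directories, while the libblas link line quoted from log.make passes "-Wl,-rpath=/bgsys/drivers/ppcfloor/comm/lib/libmpichf90-gcc.so.8", which ends in a file name, so that flag may be worth double-checking.

```shell
# Hypothetical helper: read `readelf -d <library>` output on stdin and print
# one NEEDED library name per line, without the surrounding brackets.
needed_libs() {
    sed -n 's/.*(NEEDED).*\[ *\([^] ]*\) *\].*/\1/p'
}

# Example (path taken from the session above):
#   readelf -d nektar/build-gcc/ThirdParty/dist/lib/libblas.so | needed_libs | sort > needed.txt
```

Two such lists, one from the Blue Gene build and one from a working build, can then be compared with diff.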
xx As an alternative, I switched to static linking. I initially changed "NEKTAR_LIBRARY_TYPE" to "STATIC" in the "CMakeLists.txt".
xx It seems that some of the ThirdParty libraries are configured on the assumption of shared objects, so I changed them as well. For instance, boost is configured with the options "link=shared" and "runtime-link=shared", which I set to static. I can see that all required boost libs are successfully compiled and written to "build/ThirdParty/dist/lib".
Now this is the cmake command:
cmake $src \
  -DCMAKE_INSTALL_PREFIX=$prf \
  -DNEKTAR_USE_MPI=ON \
  -DNEKTAR_USE_SYSTEM_BLAS_LAPACK=ON \
  -DNATIVE_BLAS:FILEPATH=${SCINET_LAPACK_LIB}/libblas.a \
  -DNATIVE_BLAS_LIB_DIR:FILEPATH=${SCINET_LAPACK_LIB} \
  -DNATIVE_LAPACK:FILEPATH=${SCINET_LAPACK_LIB}/liblapack.a \
  -DNATIVE_LAPACK_LIB_DIR:FILEPATH=${SCINET_LAPACK_LIB} \
  -DNEKTAR_USE_FFTW=ON \
  -DFFTW_INCLUDE_DIR=${SCINET_FFTW_INC} \
  -DFFTW_LIBRARY=${SCINET_FFTW_LIB}/libfftw3.a \
  -DBoost_NO_SYSTEM_PATHS:BOOL=TRUE \
  -DZLIB_INCLUDE_DIR:PATH=${SCINET_ZLIB_INC} \
  -DZLIB_LIBRARY:FILEPATH=${SCINET_ZLIB_LIB}/libz.a
The issue now is that the build seems to ignore "-DBoost_NO_SYSTEM_PATHS:BOOL=TRUE" and searches locations other than "BOOST_ROOT", which I set to "nektar/build-gcc/dist". See for instance:
[ 5%] Building CXX object library/LibUtilities/CMakeFiles/LibUtilities.dir/BasicUtils/ArrayEqualityComparison.cpp.o
cd /scinet/bgq/Applications/nektar/nektar++-4.4.1/build-gcc-static/library/LibUtilities && /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpicxx -DLIB_UTILITIES_EXPORTS -DNEKTAR_MEMORY_POOL_ENABLED -DNEKTAR_USE_MPI -DNEKTAR_USING_BLAS -DNEKTAR_USING_LAPACK -DNEKTAR_VERSION=\"4.4.1\" -DTIXML_USE_STL -O3 -DNDEBUG -Wall -Wno-deprecated -Wno-sign-compare -DNEKTAR_RELEASE -isystem /scinet/bgq/Applications/nektar/nektar++-4.4.1/build-gcc-static/ThirdParty/dist/include -isystem /scinet/bgq/Libraries/zlib-1.2.7-gcc4.8.1/include -isystem /scinet/bgq/Applications/nektar/nektar++-4.4.1/ThirdParty/dist/include -isystem /scinet/bgq/Libraries/fftw-3.3.5-gcc/include -I/scinet/bgq/Applications/nektar/nektar++-4.4.1 -I/scinet/bgq/Applications/nektar/nektar++-4.4.1/library -o CMakeFiles/LibUtilities.dir/BasicUtils/ArrayEqualityComparison.cpp.o -c /scinet/bgq/Applications/nektar/nektar++-4.4.1/library/LibUtilities/BasicUtils/ArrayEqualityComparison.cpp
In file included from /bgsys/linux/ionfloor/usr/include/boost/config.hpp:57:0,
                 from /bgsys/linux/ionfloor/usr/include/boost/cstdint.hpp:26,
                 from /scinet/bgq/Applications/nektar/nektar++-4.4.1/library/LibUtilities/BasicConst/NektarUnivTypeDefs.hpp:40,
So the main question is: why does it check "/usr/include/boost" when "cstdint.hpp" already exists in "build/dist/include/boost/"?
bgqdev-fen1-$ ls build-gcc/dist/include/boost/cstdint.hpp -l
-rw-r--r-- 1 fertinaz scinet 18017 Nov 14 19:00 build-gcc/dist/include/boost/cstdint.hpp
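One way to narrow this down is to ask the compiler which directories it actually searches and to see which of them is the first to contain boost/cstdint.hpp. Note that the compile line above passes "-isystem .../build-gcc-static/ThirdParty/dist/include" and "-isystem .../nektar++-4.4.1/ThirdParty/dist/include", whereas the ls is run against "build-gcc/dist/include", so it may be that none of the directories given to the compiler contains the freshly built boost headers. The sketch below is an assumption-laden illustration: both function names are made up, and it relies on the "#include <...> search starts here:" block that GCC prints with -v.

```shell
# Hypothetical helper: read the stderr of `<compiler> -E -x c++ -v ... -`
# on stdin and print the angle-bracket include search directories in order.
include_search_dirs() {
    sed -n '/^#include <\.\.\.> search starts here:/,/^End of search list\./p' \
        | sed -e '1d' -e '$d' -e 's/^ *//'
}

# Hypothetical helper: read directories on stdin and print the first one
# that contains the header given as $1 (for example boost/cstdint.hpp).
first_dir_with() {
    while read -r d; do
        if [ -f "$d/$1" ]; then
            printf '%s\n' "$d"
            return 0
        fi
    done
    return 1
}

# Example, reusing the -isystem and -I flags from the failing compile line:
#   mpicxx -E -x c++ -v <same -isystem/-I flags> - < /dev/null 2>&1 \
#       | include_search_dirs | first_dir_with boost/cstdint.hpp
```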
This is how it finally fails:
/bgsys/linux/ionfloor/usr/include/boost/archive/iterators/binary_from_base64.hpp:52:9: warning: narrowing conversion of ‘-1’ from ‘int’ to ‘const char’ inside { } is ill-formed in C++11 [-Wnarrowing]
make[2]: *** [library/LibUtilities/CMakeFiles/LibUtilities.dir/BasicUtils/CompressData.cpp.o] Error 1
It doesn't help to change the boost code from "const char lookup_table" to "signed char lookup_table", because then the "switch-case" statement that returns the endianness information fails in the following file: "nektar/library/LibUtilities/BasicUtils/CompressData.cpp". As you can guess, I disabled the switch-case block and returned the value directly, but it fails anyway.
Sorry for the long message; I hope you could follow. I've run out of ideas, and any suggestion is highly appreciated.
// Fatih
On Sun, Oct 14, 2018 at 7:42 AM Jeremy Cohen <jeremy.cohen@imperial.ac.uk> wrote:
Hi Amitvikram,
I would certainly try Chris's suggestion. However, something else to check is where you're getting the third party downloads from.
If you take a clean Nektar++ source tree and place the standard netlib lapack-3.7.0.tgz source file that the build system downloads into $NEKTAR_HOME/ThirdParty (i.e. the download from http://www.netlib.org/lapack/lapack-3.7.0.tgz), the build should proceed successfully.
It looks like the lapack tar file that you're using may already have some build artefacts in it. Did you tar the content from $NEKTAR_HOME/build/ThirdParty/lapack-3.7.0 into a lapack-3.7.0.tgz file, or are you working with the standard .tgz file from the netlib.org site?
Cheers, Jeremy
On 13 Oct 2018, at 21:19, Chris Cantwell <c.cantwell@imperial.ac.uk>
wrote:
Hi Amitvikram,
Some sites block non-SSL enabled HTTP traffic, returning a webpage
reporting the error rather than the actual file (hence the hash mismatch).
You could try turning on the THIRDPARTY_USE_SSL option to see if that
is allowed.
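A quick way to tell whether a blocked download left an error page on disk instead of the real archive is to test the file as a gzip stream. This is a minimal sketch under that assumption; the function name is_gzip_archive is made up for illustration and the paths in the comments are only examples.

```shell
# Hypothetical helper: succeed only if the file is a valid gzip stream.
# A genuine lapack-3.7.0.tgz passes; an HTML error page saved under the
# same name does not.
is_gzip_archive() {
    gzip -t "$1" 2>/dev/null
}

# Example (illustrative path):
#   f="$NEKTAR_HOME/ThirdParty/lapack-3.7.0.tgz"
#   is_gzip_archive "$f" || head -c 300 "$f"   # peek at what was actually saved
#   md5sum "$f"                                # compare with the hash in the cmake error
```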
Cheers, Chris
On Fri, 12 Oct 2018 15:15:10 -0400, Amitvikram Dutta <
amitvdutta23@gmail.com> wrote:
Hi all,
I had to use a slight workaround because the platform I was compiling on could not download the third-party files correctly and would always give a hash mismatch error. I downloaded the third-party files on my home machine and re-uploaded them into the $NEKTAR_HOME folder. As a result, when I begin the build process the ThirdParty folder is populated with the zipped versions of the third-party software. Is this what might be causing the problem? Is there any way to edit the cmake file to get the compilation process working correctly?
On Fri, Oct 12, 2018 at 2:57 PM Amitvikram Dutta < amitvikram.dutta@uwaterloo.ca> wrote:
*From:* nektar-users-bounces@imperial.ac.uk On Behalf Of Jeremy Cohen
*Sent:* October 12, 2018 2:56:43 PM (UTC-05:00) Eastern Time (US & Canada)
*To:* Amitvikram Dutta
*Cc:* nektar-users
*Subject:* Re: [Nektar-users] Problem while installing nektar++ with lapack
Hi Amitvikram,
Have you attempted to build lapack separately at any point? It's probably worth clearing out your build directory and also all the contents of the ThirdParty directory in the base nektar++ source directory, which I'll call $NEKTAR_HOME, so $NEKTAR_HOME/ThirdParty and then re-running cmake in an empty $NEKTAR_HOME/build directory and trying the build again.
It looks like the build step is encountering a previous source tree in the location where it's trying to build which seems strange.
I've just had a look at the log from my clean build and I see exactly the same messages as you in relation to lapack-3.7.0 in the same order as far as "Checking whether /usr/bin/f95 supports Fortran 90 -- yes", however, I then see "-- Looking for Python greater than 2.6 - " and the build of lapack completes successfully.
Just to confirm, I am running cmake and make in a separate build directory under the main nektar++ source tree directory, so I'm building in $NEKTAR_HOME/build - I assume you're doing something similar? You should see a ThirdParty directory in $NEKTAR_HOME and another ThirdParty directory in $NEKTAR_HOME/build/
I believe that the initial download of the lapack-3.7.0.tar.gz should be placed in $NEKTAR_HOME/ThirdParty and unpacked there. Then, when the build succeeds or stops, in $NEKTAR_HOME/build/ThirdParty, you should see lapack-3.7.0/ where I think the build actually takes place, and then a separate $NEKTAR_HOME/build/ThirdParty/lapack-3.7.0-tmp which should contain a couple of lapack-3.7.0-cfgcmd.txt files that I think contain the build command that is used - you could perhaps paste the contents of the lapack-3.7.0-cfgcmd.txt into an email if you're still having issues and we can see if that looks correct.
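For anyone wanting to collect the configure command described above, the following sketch prints every lapack cfgcmd file found under the build tree. It is only an illustration: the function name show_cfgcmd is hypothetical, and the file name pattern "lapack-3.7.0-cfgcmd.txt*" is an assumption based on the file names mentioned in this message.

```shell
# Hypothetical helper: print the lapack configure-command files found under
# <build dir>/ThirdParty/lapack-3.7.0-tmp, each preceded by its path.
show_cfgcmd() {
    build_dir=$1
    find "$build_dir/ThirdParty/lapack-3.7.0-tmp" -type f \
         -name 'lapack-3.7.0-cfgcmd.txt*' | sort | while read -r f; do
        printf '== %s ==\n' "$f"
        cat "$f"
    done
}

# Example: show_cfgcmd "$NEKTAR_HOME/build" > lapack-cfgcmd.log
```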
It is, of course, possible that this is something related to the specific configuration of the platform that you're building on, but I think the third party lapack build should be straightforward and it sounds like for some reason, it's attempting to build in the wrong location, or a location where an existing source tree has ended up for some reason.
I'm afraid I don't have a very detailed knowledge of the build system beyond this so if none of the suggestions so far help you to resolve the problem, maybe someone with more knowledge of the build system can provide some advice.
Cheers, Jeremy
On 12 Oct 2018, at 19:34, Amitvikram Dutta <amitvdutta23@gmail.com> wrote:
Hi all,
I tried to compile nektar using Jeremy's latest suggestions, with both THIRDPARTY_BUILD_BLAS_LAPACK and NEKTAR_USE_SYSTEM_BLAS_LAPACK turned on. The following error occurred. It seems that I might have to compile lapack separately. Is this unusual?
<image.png>
On Fri, Oct 12, 2018 at 1:16 PM Amitvikram Dutta < amitvikram.dutta@uwaterloo.ca> wrote:
*From:* nektar-users-bounces@imperial.ac.uk On Behalf Of Jeremy Cohen
*Sent:* October 12, 2018 1:16:14 PM (UTC-05:00) Eastern Time (US & Canada)
*To:* Amitvikram Dutta
*Cc:* nektar-users
*Subject:* Re: [Nektar-users] Problem while installing nektar++ with lapack
Hi Amitvikram,
As Chris suggests, it's probably better to use vendor supplied libraries if you can get those working.
In addition to the further information Chris has asked to take a look at, one thing you could check is whether there are any files in your nektar++/build/ThirdParty/lapack-3.7.0 directory (if that directory exists at all).
I've been trying to see if I can recreate the problem and I was able to see something similar when setting THIRDPARTY_BUILD_BLAS_LAPACK=ON and NEKTAR_USE_SYSTEM_BLAS_LAPACK=OFF. In this case, I can configure successfully and start the build but it fails with a large number of undefined references that are similar to, and include, the dtpmv_ symbol that you mentioned. When I look in nektar++/build/ThirdParty/lapack-3.7.0, the directory is empty so it looks like the build system has configured on the basis of building its own blas/lapack but the build hasn't been carried out and therefore LibUtilities can't be linked against it.
As a test, you could try running the build with both THIRDPARTY_BUILD_BLAS_LAPACK=ON and NEKTAR_USE_SYSTEM_BLAS_LAPACK=ON, if this isn't the setting you've been using already. When I tried this, the build of blas/lapack is carried out successfully and the linking is fine with the full build of Nektar++ completing successfully. I removed the system blas/lapack on my test system to be sure it was linking against the correct instance.
Cheers, Jeremy
On 12 Oct 2018, at 17:50, Chris Cantwell <c.cantwell@imperial.ac.uk> wrote:
Hi Amitvikram,
Could you send us your CMakeCache.txt file from your build directory and the output from running: make VERBOSE=1 for both cases.
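When only the BLAS/LAPACK-related settings are of interest, the relevant cache entries can be pulled out of CMakeCache.txt before posting. This is a rough sketch: the function name blas_lapack_cache_entries is made up, and the list of variable names is limited to the options mentioned in this thread, so other relevant entries may exist.

```shell
# Hypothetical helper: print the BLAS/LAPACK-related entries of a
# CMakeCache.txt file (lines of the form NAME:TYPE=VALUE).
blas_lapack_cache_entries() {
    grep -E '^(NEKTAR_USE_SYSTEM_BLAS_LAPACK|THIRDPARTY_BUILD_BLAS_LAPACK|NATIVE_BLAS|NATIVE_LAPACK)[^=]*=' "$1"
}

# Example, run from the build directory:
#   blas_lapack_cache_entries CMakeCache.txt
#   make VERBOSE=1 2>&1 | tee make-verbose.log
```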
In the case of using ThirdParty LAPACK, it seems not to be linking to it. You should probably be using vendor-supplied libraries if possible, though, so it would be better if we can get those working.
Thanks, Chris
On 12 October 2018 14:08:55 BST, Amitvikram Dutta < amitvdutta23@gmail.com> wrote:
Hi Jeremy,
I'm actually trying to build nektar++ on a BGQ cluster similar to Mira.
I'm trying to build nektar++-4.4.1 and the system lapack version is 3.4.2
Sincerely,
On Fri, Oct 12, 2018 at 4:24 AM Amitvikram Dutta < amitvikram.dutta@uwaterloo.ca> wrote:
> ------------------------------
> *From:* nektar-users-bounces@imperial.ac.uk On Behalf Of Jeremy Cohen
> *Sent:* October 12, 2018 4:24:33 AM (UTC-05:00) Eastern Time (US & Canada)
> *To:* Amitvikram Dutta
> *Cc:* nektar-users
> *Subject:* Re: [Nektar-users] Problem while installing nektar++ with lapack
>
> Hi Amitvikram,
>
> Can you provide some further details of the problem you're encountering.
>
> Specifically, can you confirm what platform (including version) you're building on, and if Linux, which I assume is the platform you're using, which distribution.
>
> Can you also confirm what version of Nektar++ you're trying to build, and the version of the system Lapack distribution that you're using.
>
> Thanks,
>
> Jeremy
>
> On 12 Oct 2018, at 01:05, Amitvikram Dutta <amitvdutta23@gmail.com> wrote:
>
> Hi all,
>
> I keep having the same problem while trying to install nektar++ with regards to the Lapack libraries.
>
> When I try to use the system Lapack installation I get the following message
>
> */scinet/bgq/Libraries/lapack/lib/liblapack.so: undefined reference to `_xlfEndIO@XLF_1.0'*
>
> while when I try to install using the ThirdParty Lapack supplied with the nektar++ source directory I get the following error
>
> *../../library/LibUtilities/libLibUtilities.so.4.4.1: undefined reference to `dtpmv_'*
>
> I have a feeling these errors have been encountered by the community at large before. Could someone point out where I'm going wrong?
>
> Sincerely,
> --
>
> *Amitvikram Dutta*
>
> Graduate Research Assistant
>
> Fluid Mechanics Research Lab
>
> Multi-Physics Interaction Lab
>
> University of Waterloo
> _______________________________________________
> Nektar-users mailing list
> Nektar-users@imperial.ac.uk
> https://mailman.ic.ac.uk/mailman/listinfo/nektar-users
>
--
*Amitvikram Dutta*
Graduate Research Assistant
Fluid Mechanics Research Lab
Multi-Physics Interaction Lab
University of Waterloo
-- Chris Cantwell Imperial College London South Kensington Campus London SW7 2AZ Email: c.cantwell@imperial.ac.uk www.imperial.ac.uk/people/c.cantwell
_______________________________________________ Nektar-users mailing list Nektar-users@imperial.ac.uk https://mailman.ic.ac.uk/mailman/listinfo/nektar-users
Hi Fatih,
Sorry for the delay in getting back to you on this. I'm afraid I don't have an immediate answer to the problem you're experiencing, but I've done some investigation into the issues you're having, so hopefully some feedback on these may provide some helpful information:
On 15 Nov 2018, at 06:35, Fatih Ertinaz <fertinaz@gmail.com> wrote:
Hello everyone,
I am working on the same task as Amitvikram, on the same cluster and currently having the exact same problem: undefined reference to lapack libs even though they are compiled successfully.
I've read through this thread as well as some others, so here is a brief summary about what I've done so far before asking some questions.
-- System info: Using cmake-2.8.12, cross-compiled gcc-4.8 and mpich-2 on a Blue Gene Q cluster.
-- Nektar version: Decided to use the git repo.
While I can't replicate the environment you're working in directly, I've set up a clean Ubuntu linux container with only a very basic initial set of packages installed. (All of my build attempts using the details provided below complete successfully.) Within my base container, I've installed gcc-4.8 from packages (gcc-5 is the default version for the Ubuntu version I'm using - 16.04) built and installed cmake 2.8.12 from source and built and installed MPICH2 (1.5) from source. I'm working with Nektar++ from source, using the master branch.
-- Added "-dynamic" flag to the "CMakeLists.txt" as it was suggested here: https://www.nektar.info/nektar-on-mira-cluster/
-- Boost: I initially used the system-installed boost but then decided to stick to the third-party version shipped with nektar, because some of the required libs (for instance boost_iostreams) weren't part of the central installation. To deal with that, I first set up a partial build by referencing each individual library file explicitly in the cmake command. It does seem to build the required libs successfully, but it later fails during the nektar compilation. I think it messes up the environment and basically links to the wrong files. So anyway, I am using "ThirdParty/boost_1_57_0".
-- Lapack: The reason that I am not using system lapack is simply because cmake says "dgemm_" is not found in the system blas version. Therefore, I am compiling the "ThirdParty/Lapack-3.7.0" which I downloaded from "http://www.netlib.org/lapack/lapack-3.7.0.tgz".
Note that compilation fails with the same error even when I use ThirdParty/lapack.
I initially tried without using the dynamic flag but have subsequently tried with the -dynamic flag too. I'm using ThirdParty boost 1.57 and ThirdParty lapack 3.7.0. I'm also using ThirdParty Scotch. TinyXML and GSMPI are also built from source as ThirdParty dependencies during the Nektar++ build.
-- FFTW: Using system installed version.
I'm using a system installed FFTW from packages.
-- Download process: I disabled the MD5 checks and am downloading with "wget", due to an SSL error similar to the one mentioned before. This is an easy workaround, though, and probably has nothing to do with the error. I download all packages to "nektar/ThirdParty" and copy them to "nektar/build/ThirdParty" as well. The reason for this copy operation is that when nektar extracts the downloaded packages, I see that the uncompressed folders are somehow empty. I don't know if that's a cmake bug or a problem on my side. So that's why I download, extract and copy the third-party sources to "nektar/build/ThirdParty" manually.
Following the discussion on the list with Amitvikram about building when there is no external Internet access to download third party dependencies, I wrote up some of the points that I made about downloading dependencies manually - if you hadn't already seen this, it's on the Nektar++ website at https://www.nektar.info/building-nektar-offline-deps/
-- CMake command:
cmake $src \
  -DCMAKE_INSTALL_PREFIX=$prf \
  -DNEKTAR_USE_MPI=ON \
  -DNEKTAR_USE_SYSTEM_BLAS_LAPACK=ON \
  -DNEKTAR_USE_FFTW=ON \
  -DFFTW_INCLUDE_DIR=${SCINET_FFTW_INC} \
  -DFFTW_LIBRARY=${SCINET_FFTW_LIB}/libfftw3.a \
  -DTHIRDPARTY_BUILD_BLAS_LAPACK=ON
As you can see, I enabled both "DNEKTAR_USE_SYSTEM_BLAS_LAPACK" and "DTHIRDPARTY_BUILD_BLAS_LAPACK" following the suggestions; however, this didn't seem to make a difference for me. Compilation fails at the same step whether or not both are enabled.
I tried configuring using a similar cmake command to that which you've shown here - the only difference for me was that I didn't need to set the FFTW include directory since my FFTW install is in the system include path. I explicitly specified the path to FFTW_LIBRARY although this shouldn't be necessary since the library is, again, in the standard system library path.
-- Build process for Third-Party: In general they are compiled without any errors. In particular, I checked cmake files for each package and Lapack is configured with "-DBUILD_SHARED_LIBS:STRING=ON". I can see that objects are compiled with "-fPIC" option, it is in the cmake. However, "lapack/CMakeLists.txt" contains this line: "option(BUILD_SHARED_LIBS "Build shared libraries" OFF)" which I set to "ON" in my build script.
This is how the libraries look in the "nektar/build/ThirdParty/dist/lib" directory after compiling the ThirdParty libraries:
bgqdev-fen1-$ ls nektar/build/ThirdParty/dist/lib/
cmake                          libboost_program_options.so         libgsmpi.a          libtinyxml.a
libblas.so                     libboost_program_options.so.1.57.0  liblapack.so        libxxt.a
libblas.so.3                   libboost_regex.so                   liblapack.so.3      libz.a
libblas.so.3.7.0               libboost_regex.so.1.57.0            liblapack.so.3.7.0  libz.so
libboost_filesystem.so         libboost_system.so                  libscotch.a         libz.so.1
libboost_filesystem.so.1.57.0  libboost_system.so.1.57.0           libscotcherr.a      libz.so.1.2.7
libboost_iostreams.so          libboost_thread.so                  libscotcherrexit.a  pkgconfig
libboost_iostreams.so.1.57.0   libboost_thread.so.1.57.0           libscotchmetis.a
This folder is about 1.5GB by the way.
I have exactly the same contents in my ThirdParty/dist/lib directory after building of the third party dependencies. The resulting files are nowhere near as large as yours, I assume the very large size of the folder is something to do with the static libraries being very large although I'm not sure why they would be so big. I think the point you make about building of shared libraries being set to OFF in the CMakeLists.txt file for lapack shouldn't be an issue. If you look in $src/cmake/ThirdPartyBlasLapack.cmake, you should see in the EXTERNAL_PROJECT_ADD command that it is configuring lapack using CMake and specifying -DBUILD_SHARED_LIBS:STRING=ON. You should also be able to verify that lapack was, indeed, configured with this parameter by looking in the CMakeCache.txt file in $build/ThirdParty/lapack-3.7.0/.
However, "nektar/build/ThirdParty/dist/include" folder doesn't have lapack related headers:
bgqdev-fen1-$ ls
boost scotchf.h scotch.h tinystr.h tinyxml.h zconf.h zlib.h
bgqdev-fen1-$ pwd
/scinet/bgq/Applications/nektar/nektar/build/ThirdParty/dist/include
I also see exactly the same in my include folder, there are no lapack related headers.
Also, I can share the initial parts of the lapack build - in this version I tried to reference to the system blas for lapack installation:
[ 6%] Performing configure step for 'lapack-3.7.0'
cd /scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/lapack-3.7.0 && /gpfs/home/scinet/bgq/tools/cmake/2.8.12.1/bin/cmake -G "Unix Makefiles" -DCMAKE_Fortran_COMPILER:FILEPATH=/scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -DCMAKE_INSTALL_PREFIX:PATH=/scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/dist -DCMAKE_INSTALL_LIBDIR:PATH=/scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/dist/lib -DBUILD_SHARED_LIBS:STRING=ON -DBUILD_TESTING:STRING=OFF /scinet/bgq/Applications/nektar/nektar/ThirdParty/lapack-3.7.0
Re-run cmake no build system arguments
-- Setting build type to 'Release' as none was specified.
-- The Fortran compiler identification is GNU
-- Check for working Fortran compiler: /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90
-- Check for working Fortran compiler: /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -- works
-- Detecting Fortran compiler ABI info
-- Detecting Fortran compiler ABI info - done
-- Checking whether /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 supports Fortran 90
-- Checking whether /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 supports Fortran 90 -- yes
-- Looking for Python greater than 2.6 -
-- Could NOT find PythonInterp: Found unsuitable version "2.6.6", but required is at least "2.7" (found /usr/bin/python2)
-- No suitable Python version found, so skipping summary tests.
-- Reducing RELEASE optimization level to O2
-- Looking for Fortran NONE - found
-- Looking for Fortran INT_CPU_TIME - found
-- Looking for Fortran EXT_ETIME - not found
-- Looking for Fortran EXT_ETIME_ - not found
-- Looking for Fortran INT_ETIME - found
-- --> Will use second_INT_ETIME.f and dsecnd_INT_ETIME.f as timing function.
-- Using supplied NETLIB BLAS implementation
-- Using supplied NETLIB LAPACK implementation
-- Building Single Precision
-- Building Double Precision
-- Building Complex Precision
-- Building Double Complex Precision
-- BUILD TESTING : OFF
-- Configuring done
-- Generating done
-- Build files have been written to: /scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/lapack-3.7.0
Again, I see exactly the same output for configuration of lapack. However, when I initially ran this, the build system was picking up my standard C/C++/Fortran compilers so it was using gfortran rather than the MPI version. I reconfigured/rebuilt from scratch specifically telling the build system to use mpicc and mpic++ and setting -DCMAKE_Fortran_COMPILER to point to mpif90, after this I see the same as you have shown above and build again completes successfully.
Additionally, I can see "dgemm" in the log.make:
bgqdev-fen1-$ grep -rn "dgemm" nektar/build-gcc/log.make.2
13756:[ 3%] Building Fortran object BLAS/SRC/CMakeFiles/blas.dir/dgemm.f.o
13757:cd /scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/lapack-3.7.0/BLAS/SRC && /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -Dblas_EXPORTS -O2 -fPIC -c /scinet/bgq/Applications/nektar/nektar/ThirdParty/lapack-3.7.0/BLAS/SRC/dgemm.f -o CMakeFiles/blas.dir/dgemm.f.o
14018:/scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -fPIC -O2 -Wl,-rpath=/bgsys/drivers/ppcfloor/comm/lib/libmpichf90-gcc.so.8 -shared -Wl,-soname,libblas.so.3 -o ../../lib/libblas.so.3.7.0 CMakeFiles/blas.dir/isamax.f.o CMakeFiles/blas.dir/sasum.f.o CMakeFiles/blas.dir/saxpy.f.o CMakeFiles/blas.dir/scopy.f.o
...............
CMakeFiles/blas.dir/dgemm.f.o
This is the part that compilation fails:
[ 34%] Building CXX object utilities/NekMesh/CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/ElUtil.cpp.o
......... .........
/........./bgq/compilers/gcc/4.8.1/bin/../lib/gcc/powerpc64-bgq-linux/4.8.1/../../../../powerpc64-bgq-linux/bin/ld: warning: libmpichf90-gcc.so.8, needed by /scinet/bgq/Applications/nektar/nektar/build/ThirdParty/dist/lib/libblas.so, not found (try using -rpath or -rpath-link)
......... NodeOpti.cpp:(.text+0x1784): undefined reference to `dgeev_'
......... NodeOpti.cpp:(.text+0x25b4): undefined reference to `dgemm_'
......... ../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgetri_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpptrs_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgbtrs_'
collect2: error: ld returned 1 exit status
make[2]: *** [utilities/NekMesh/NekMesh] Error 1
I've tried many things:
xx enabled the configure option "build_shared_libs" in CMakeLists.txt in "ThirdParty/lapack"
xx made a copy of "make.inc.example" in ThirdParty/lapack and reduced the optimization levels
xx since this is a Blue Gene environment, referenced ESSL instead of BLAS
But none of it seems to make a difference. It always fails at exactly the same step.
This "libmpichf90-gcc.so.8" warning seems a bit odd to me, and I am not sure whether it has anything to do with the undefined reference errors. I created a symlink to this library and added it to "LD_LIBRARY_PATH" as well, but then it failed with the message "undefined symbol: _cnkspi_MemoryRegionCacheLastAccessedElementNumber" from "libpami-gcc.so", where PAMI is a lower-level messaging API by IBM. Also, "cnkspi" sounds far too low level, because "CNK" is the kernel on the compute nodes and "SPI" is the implementation that allows communication with that kernel. I added a linker flag "-Wl,-rpath", but I guess it only makes things uglier.
bgqdev-fen1-$ readelf -d nektar/build-gcc/ThirdParty/dist/lib/libblas.so | grep NEEDED
0x0000000000000001 (NEEDED) Shared library: [libmpichf90-gcc.so.8]
0x0000000000000001 (NEEDED) Shared library: [libmpich-gcc.so.8]
0x0000000000000001 (NEEDED) Shared library: [libopa-gcc.so.0]
0x0000000000000001 (NEEDED) Shared library: [libmpl-gcc.so.1]
0x0000000000000001 (NEEDED) Shared library: [libpami-gcc.so]
0x0000000000000001 (NEEDED) Shared library: [librt.so.1]
0x0000000000000001 (NEEDED) Shared library: [libpthread.so.0]
0x0000000000000001 (NEEDED) Shared library: [libstdc++.so.6]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x0000000000000001 (NEEDED) Shared library: [libnss_files.so.2]
0x0000000000000001 (NEEDED) Shared library: [libnss_dns.so.2]
0x0000000000000001 (NEEDED) Shared library: [libresolv.so.2]
0x0000000000000001 (NEEDED) Shared library: [libgfortran.so.3]
0x0000000000000001 (NEEDED) Shared library: [libm.so.6]
xx As an alternative, I switched to static linking. I initially changed "NEKTAR_LIBRARY_TYPE" to "STATIC" in the "CMakeLists.txt".
This is the one area where I have a number of differences from you. I'm not sure that switching to static linking is likely to make much difference (although I could be wrong); however, I note that your blas library seems to require various mpich libraries. My libblas.so library only lists libm.so.6 and libgfortran.so.3 as "NEEDED". Can you also provide the RPATH value that you get from readelf -d for this library? If you could also provide your output of readelf -d for library/LibUtilities/libLibUtilities.so, that would be useful. My libLibUtilities.so needs a few boost libraries as well as libz, libblas, liblapack, libmpich, libpthread, libgcc_s, libc, libstdc++ and libm.
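As a convenience for gathering the RPATH value requested here, the sketch below extracts the run-time search path entries from readelf -d output and prints one per line. It is only an illustration: the function name rpath_entries is hypothetical, and it assumes the "(RPATH) ... [..]" or "(RUNPATH) ... [..]" lines that readelf prints for the dynamic section.

```shell
# Hypothetical helper: read `readelf -d <library>` output on stdin and print
# each RPATH/RUNPATH entry on its own line.
rpath_entries() {
    sed -n -e 's/.*(RPATH).*\[\(.*\)\].*/\1/p' \
           -e 's/.*(RUNPATH).*\[\(.*\)\].*/\1/p' | tr ':' '\n'
}

# Example:
#   readelf -d nektar/build-gcc/ThirdParty/dist/lib/libblas.so | rpath_entries
#   readelf -d library/LibUtilities/libLibUtilities.so | rpath_entries
```

Each printed entry should be a directory; an entry that names a file is a sign that an -rpath flag was given a library path by mistake.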
xx It seems that some of the ThirdParty libraries are configured with the assumption of shared objects so I changed them as well. For instance, boost is configured with options "link=shared" and "runtime-link=shared" which I set to static. I can see all required boost libs are successfully compiled and written to "build/ThirdParty/dist/lib".
Now this is the cmake command:
cmake $src \
  -DCMAKE_INSTALL_PREFIX=$prf \
  -DNEKTAR_USE_MPI=ON \
  -DNEKTAR_USE_SYSTEM_BLAS_LAPACK=ON \
  -DNATIVE_BLAS:FILEPATH=${SCINET_LAPACK_LIB}/libblas.a \
  -DNATIVE_BLAS_LIB_DIR:FILEPATH=${SCINET_LAPACK_LIB} \
  -DNATIVE_LAPACK:FILEPATH=${SCINET_LAPACK_LIB}/liblapack.a \
  -DNATIVE_LAPACK_LIB_DIR:FILEPATH=${SCINET_LAPACK_LIB} \
  -DNEKTAR_USE_FFTW=ON \
  -DFFTW_INCLUDE_DIR=${SCINET_FFTW_INC} \
  -DFFTW_LIBRARY=${SCINET_FFTW_LIB}/libfftw3.a \
  -DBoost_NO_SYSTEM_PATHS:BOOL=TRUE \
  -DZLIB_INCLUDE_DIR:PATH=${SCINET_ZLIB_INC} \
  -DZLIB_LIBRARY:FILEPATH=${SCINET_ZLIB_LIB}/libz.a
The issue now is that the installer seems to ignore "-DBoost_NO_SYSTEM_PATHS:BOOL=TRUE" and searches locations other than "BOOST_ROOT", which I set to "nektar/build-gcc/dist".
See for instance:
[ 5%] Building CXX object library/LibUtilities/CMakeFiles/LibUtilities.dir/BasicUtils/ArrayEqualityComparison.cpp.o
cd /scinet/bgq/Applications/nektar/nektar++-4.4.1/build-gcc-static/library/LibUtilities && /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpicxx -DLIB_UTILITIES_EXPORTS -DNEKTAR_MEMORY_POOL_ENABLED -DNEKTAR_USE_MPI -DNEKTAR_USING_BLAS -DNEKTAR_USING_LAPACK -DNEKTAR_VERSION=\"4.4.1\" -DTIXML_USE_STL -O3 -DNDEBUG -Wall -Wno-deprecated -Wno-sign-compare -DNEKTAR_RELEASE -isystem /scinet/bgq/Applications/nektar/nektar++-4.4.1/build-gcc-static/ThirdParty/dist/include -isystem /scinet/bgq/Libraries/zlib-1.2.7-gcc4.8.1/include -isystem /scinet/bgq/Applications/nektar/nektar++-4.4.1/ThirdParty/dist/include -isystem /scinet/bgq/Libraries/fftw-3.3.5-gcc/include -I/scinet/bgq/Applications/nektar/nektar++-4.4.1 -I/scinet/bgq/Applications/nektar/nektar++-4.4.1/library -o CMakeFiles/LibUtilities.dir/BasicUtils/ArrayEqualityComparison.cpp.o -c /scinet/bgq/Applications/nektar/nektar++-4.4.1/library/LibUtilities/BasicUtils/ArrayEqualityComparison.cpp
In file included from /bgsys/linux/ionfloor/usr/include/boost/config.hpp:57:0,
                 from /bgsys/linux/ionfloor/usr/include/boost/cstdint.hpp:26,
                 from /scinet/bgq/Applications/nektar/nektar++-4.4.1/library/LibUtilities/BasicConst/NektarUnivTypeDefs.hpp:40,
So the main question is: why does it check "/usr/include/boost" when "cstdint.hpp" already exists in "build/dist/include/boost/"?
bgqdev-fen1-$ ls build-gcc/dist/include/boost/cstdint.hpp -l
-rw-r--r-- 1 fertinaz scinet 18017 Nov 14 19:00 build-gcc/dist/include/boost/cstdint.hpp
This is how it finally fails:
/bgsys/linux/ionfloor/usr/include/boost/archive/iterators/binary_from_base64.hpp:52:9: warning: narrowing conversion of ‘-1’ from ‘int’ to ‘const char’ inside { } is ill-formed in C++11 [-Wnarrowing]
make[2]: *** [library/LibUtilities/CMakeFiles/LibUtilities.dir/BasicUtils/CompressData.cpp.o] Error 1
It doesn't help to change the boost code from "const char lookup_table" to "signed char lookup_table" because then "switch-case" statement that returns the endianness information fails in the following file: "nektar/library/LibUtilities/BasicUtils/CompressData.cpp"
As you can guess, I disabled the switch-case block, and returned the value, but it fails anyway...
Sorry for the long message, hope you could follow. I've run out of ideas and any suggestion is highly appreciated....
// Fatih
I'd be inclined to stick with the third party boost and lapack and see if we can find a solution to that. I can't see what C/C++ compilers you're using, but have you tried forcing the use of mpicc and mpic++ as shown when running cmake in the instructions at https://www.nektar.info/nektar-on-mira-cluster/? (In fact, I see above that in your log output for the build command for ArrayEqualityComparison.cpp, it looks like it's using mpicxx.)

I'm not sure why libblas.so is linking in libmpichf90-gcc.so.8 but I'm assuming this is the core of the problem. Can you confirm what settings you're using to get the additional logging output that you're showing (which includes the build commands) - is it just -vv? I can then try and run the same and see if I can provide any further suggestions.

I'm not clear at the moment, but I'm assuming the undefined reference errors are a result of trying to link in libblas.so, with that library itself having an undefined reference to libmpichf90. It might be that the rpath settings can be modified to take account of this. Can you provide the output of running ldd on libblas.so, liblapack.so and libLibUtilities.so?

Thanks, Jeremy
On Sun, Oct 14, 2018 at 7:42 AM Jeremy Cohen <jeremy.cohen@imperial.ac.uk> wrote:
Hi Amitvikram,
I would certainly try Chris's suggestion. However, something else to check is where you're getting the third party downloads from.
If you take a clean Nektar++ source tree and place the standard netlib lapack-3.7.0.tgz source file that the build system downloads into $NEKTAR_HOME/ThirdParty (i.e. the download from http://www.netlib.org/lapack/lapack-3.7.0.tgz), the build should proceed successfully.
It looks like the lapack tar file that you're using may already have some build artefacts in it - did you tar the content from $NEKTAR_HOME/build/ThirdParty/lapack-3.7.0 into a lapack-3.7.0.tgz file, or are you working with the standard .tgz file from the netlib.org site?
Cheers, Jeremy
On 13 Oct 2018, at 21:19, Chris Cantwell <c.cantwell@imperial.ac.uk> wrote:
Hi Amitvikram,
Some sites block non-SSL enabled HTTP traffic, returning a webpage reporting the error rather than the actual file (hence the hash mismatch).
You could try turning on the THIRDPARTY_USE_SSL option to see if that is allowed.
Cheers, Chris
On Fri, 12 Oct 2018 15:15:10 -0400, Amitvikram Dutta <amitvdutta23@gmail.com> wrote:
Hi all,
I had to use a slight workaround because the platform I was compiling on could not download the third-party files correctly and would always give a hash mismatch error. I downloaded the third-party files on my home machine and re-uploaded them into the $NEKTAR_HOME folder. As a result, when I begin the build process the ThirdParty folder is populated with the zipped versions of the third-party software. Is this what might be causing the problem? Is there any way to edit the cmake file to get the compilation process working correctly?
On Fri, Oct 12, 2018 at 2:57 PM Amitvikram Dutta < amitvikram.dutta@uwaterloo.ca> wrote:
*From:* nektar-users-bounces@imperial.ac.uk On Behalf Of Jeremy Cohen *Sent:* October 12, 2018 2:56:43 PM (UTC-05:00) Eastern Time (US & Canada)
*To:* Amitvikram Dutta *Cc:* nektar-users *Subject:* Re: [Nektar-users] Problem while installing nektar++ with lapack
Hi Amitvikram,
Have you attempted to build lapack separately at any point? It's probably worth clearing out your build directory and also all the contents of the ThirdParty directory in the base nektar++ source directory, which I'll call $NEKTAR_HOME, so $NEKTAR_HOME/ThirdParty and then re-running cmake in an empty $NEKTAR_HOME/build directory and trying the build again.
It looks like the build step is encountering a previous source tree in the location where it's trying to build which seems strange.
I've just had a look at the log from my clean build and I see exactly the same messages as you in relation to lapack-3.7.0 in the same order as far as "Checking whether /usr/bin/f95 supports Fortran 90 -- yes", however, I then see "-- Looking for Python greater than 2.6 - " and the build of lapack completes successfully.
Just to confirm, I am running cmake and make in a separate build directory under the main nektar++ source tree directory, so I'm building in $NEKTAR_HOME/build - I assume you're doing something similar? You should see a ThirdParty directory in $NEKTAR_HOME and another ThirdParty directory in $NEKTAR_HOME/build/
I believe that the initial download of the lapack-3.7.0.tar.gz should be placed in $NEKTAR_HOME/ThirdParty and unpacked there. Then, when the build succeeds or stops, in $NEKTAR_HOME/build/ThirdParty, you should see lapack-3.7.0/ where I think the build actually takes place, and then a separate $NEKTAR_HOME/build/ThirdParty/lapack-3.7.0-tmp which should contain a couple of lapack-3.7.0-cfgcmd.txt files that I think contain the build command that is used - you could perhaps paste the contents of the lapack-3.7.0-cfgcmd.txt into an email if you're still having issues and we can see if that looks correct.
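The clean-up described in this message (empty build directory, empty ThirdParty directory, then a fresh cmake run) can be sketched as a small shell helper. The function name reset_nektar_tree is hypothetical, and the sketch assumes the layout described here, with build/ and ThirdParty/ directly under the source root. It also deletes any tarballs that were uploaded into ThirdParty by hand, so keep copies elsewhere if the machine cannot download them again.

```shell
# Remove stale build output and previously unpacked third-party sources.
# $1 is the Nektar++ source root (called $NEKTAR_HOME in this thread).
# reset_nektar_tree is an illustrative name, not a Nektar++ tool.
reset_nektar_tree() {
    root=${1:?usage: reset_nektar_tree NEKTAR_HOME}
    [ -d "$root" ] || { echo "no such directory: $root" >&2; return 1; }
    rm -rf "$root/build"
    # Empty ThirdParty without deleting the directory itself.
    find "$root/ThirdParty" -mindepth 1 -delete 2>/dev/null
    mkdir -p "$root/build" "$root/ThirdParty"
}

# Usage:
#   reset_nektar_tree "$NEKTAR_HOME"
#   cd "$NEKTAR_HOME/build" && cmake .. && make
```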
It is, of course, possible that this is something related to the specific configuration of the platform that you're building on, but I think the third party lapack build should be straightforward and it sounds like for some reason, it's attempting to build in the wrong location, or a location where an existing source tree has ended up for some reason.
I'm afraid I don't have a very detailed knowledge of the build system beyond this so if none of the suggestions so far help you to resolve the problem, maybe someone with more knowledge of the build system can provide some advice.
Cheers, Jeremy
On 12 Oct 2018, at 19:34, Amitvikram Dutta <amitvdutta23@gmail.com> wrote:
Hi all,
I tried to compile nektar using Jeremy's latest suggestions, with both THIRDPARTY_BUILD_BLAS_LAPACK and NEKTAR_USE_SYSTEM_BLAS_LAPACK turned on. The following error occurred. It seems that I might have to compile lapack separately. Is this unusual?
<image.png>
On Fri, Oct 12, 2018 at 1:16 PM Amitvikram Dutta < amitvikram.dutta@uwaterloo.ca> wrote:
*From:* nektar-users-bounces@imperial.ac.uk On Behalf Of Jeremy Cohen *Sent:* October 12, 2018 1:16:14 PM (UTC-05:00) Eastern Time (US & Canada)
*To:* Amitvikram Dutta *Cc:* nektar-users *Subject:* Re: [Nektar-users] Problem while installing nektar++ with lapack
Hi Amitvikram,
As Chris suggests, it's probably better to use vendor supplied libraries if you can get those working.
In addition to the further information Chris has asked to take a look at, one thing you could check is whether there are any files in your nektar++/build/ThirdParty/lapack-3.7.0 directory (if that directory exists at all).
I've been trying to see if I can recreate the problem and I was able to see something similar when setting THIRDPARTY_BUILD_BLAS_LAPACK=ON and NEKTAR_USE_SYSTEM_BLAS_LAPACK=OFF. In this case, I can configure successfully and start the build but it fails with a large number of undefined references that are similar to, and include, the dtpmv_ symbol that you mentioned. When I look in nektar++/build/ThirdParty/lapack-3.7.0, the directory is empty so it looks like the build system has configured on the basis of building its own blas/lapack but the build hasn't been carried out and therefore LibUtilities can't be linked against it.
As a test, you could try running the build with both THIRDPARTY_BUILD_BLAS_LAPACK=ON and NEKTAR_USE_SYSTEM_BLAS_LAPACK=ON, if this isn't the setting you've been using already. When I tried this, the build of blas/lapack is carried out successfully and the linking is fine with the full build of Nektar++ completing successfully. I removed the system blas/lapack on my test system to be sure it was linking against the correct instance.
Cheers, Jeremy
On 12 Oct 2018, at 17:50, Chris Cantwell <c.cantwell@imperial.ac.uk> wrote:
Hi Amitvikram,
Could you send us your CMakeCache.txt file from your build directory and the output from running: make VERBOSE=1 for both cases.
In the case of using ThirdParty LAPACK, it seems not to be linking to it. You should probably be using vendor-supplied libraries if possible though, so it would be better if we can get those working.
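Before sending the whole CMakeCache.txt, it can help to look at just the entries that decide which BLAS/LAPACK gets used. The filter below is a sketch only: blas_lapack_cache is a made-up name, and it simply assumes the relevant cache variables have BLAS or LAPACK in their names (as NEKTAR_USE_SYSTEM_BLAS_LAPACK and THIRDPARTY_BUILD_BLAS_LAPACK do).

```shell
# Print cache entries whose variable name contains BLAS or LAPACK.
# $1 is the path to CMakeCache.txt in the build directory.
# Comment lines (starting with # or //) are skipped, as are entries that
# only mention BLAS/LAPACK in their value.
blas_lapack_cache() {
    grep -E '^[^#/=]*(BLAS|LAPACK)[^=]*=' "${1:?path to CMakeCache.txt}"
}

# Usage, together with the verbose build log requested above:
#   blas_lapack_cache build/CMakeCache.txt
#   make VERBOSE=1 2>&1 | tee make-verbose.log
```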
Thanks, Chris
On 12 October 2018 14:08:55 BST, Amitvikram Dutta <amitvdutta23@gmail.com> wrote:
>
> Hi Jeremy,
>
> I'm actually trying to build nektar++ on a BGQ cluster similar to Mira.
>
> I'm trying to build nektar++-4.4.1 and the system lapack version is 3.4.2
>
> Sincerely,
> On Fri, Oct 12, 2018 at 4:24 AM Amitvikram Dutta <amitvikram.dutta@uwaterloo.ca> wrote:
>
>> ------------------------------
>> *From:* nektar-users-bounces@imperial.ac.uk On Behalf Of Jeremy Cohen
>> *Sent:* October 12, 2018 4:24:33 AM (UTC-05:00) Eastern Time (US & Canada)
>> *To:* Amitvikram Dutta
>> *Cc:* nektar-users
>> *Subject:* Re: [Nektar-users] Problem while installing nektar++ with lapack
>>
>> Hi Amitvikram,
>>
>> Can you provide some further details of the problem you're encountering.
>>
>> Specifically, can you confirm what platform (including version) you're building on, and if Linux, which I assume is the platform you're using, which distribution.
>>
>> Can you also confirm what version of Nektar++ you're trying to build, and the version of the system Lapack distribution that you're using.
>>
>> Thanks,
>>
>> Jeremy
>>
>> On 12 Oct 2018, at 01:05, Amitvikram Dutta <amitvdutta23@gmail.com> wrote:
>>
>> Hi all,
>>
>> I keep having the same problem while trying to install nektar++ with regards to the Lapack libraries.
>>
>> When I try to use the system Lapack installation I get the following message
>>
>> */scinet/bgq/Libraries/lapack/lib/liblapack.so: undefined reference to `_xlfEndIO@XLF_1.0'*
>>
>> while when I try to install using the ThirdParty Lapack supplied with the nektar++ source directory I get the following error
>>
>> *../../library/LibUtilities/libLibUtilities.so.4.4.1: undefined reference to `dtpmv_'*
>>
>> I have a feeling these errors have been encountered by the community at large before. Could someone point out where I'm going wrong?
>>
>> Sincerely,
>> --
>>
>> *Amitvikram Dutta*
>>
>> Graduate Research Assistant
>>
>> Fluid Mechanics Research Lab
>>
>> Multi-Physics Interaction Lab
>>
>> University of Waterloo
>> _______________________________________________
>> Nektar-users mailing list
>> Nektar-users@imperial.ac.uk
>> https://mailman.ic.ac.uk/mailman/listinfo/nektar-users
>>
> --
>
> *Amitvikram Dutta*
>
> Graduate Research Assistant
>
> Fluid Mechanics Research Lab
>
> Multi-Physics Interaction Lab
>
> University of Waterloo
>
-- Chris Cantwell Imperial College London South Kensington Campus London SW7 2AZ Email: c.cantwell@imperial.ac.uk www.imperial.ac.uk/people/c.cantwell
Hello Jeremy,

I appreciate your detailed response, and sorry for my late reply. I came back to this issue over the weekend, made certain changes and achieved some progress, so I wanted to share the current status.

I am convinced that the problem is related to linking BLAS and LAPACK, and that it is probably specific to this platform (Blue Gene Q), because I have managed to run nektar successfully on many different platforms and never encountered this issue. I am using the latest git repo for nektar, by the way.

I compiled some of the third-party libraries (boost, scotch and blas & lapack) separately under a directory I created, "nektar/ThirdParty_compiled". This resolved the earlier problems with boost. Also, when I check the blas & lapack functions that are used by nektar, I can find their symbols in the libraries:
bgqdev-fen1-$ nm libblas.a | grep -i dgemm
dgemm.f.o:
0000000000000000 D dgemm
bgqdev-fen1-$ nm liblapack.a | grep -i dgeev
dgeev.f.o:
0000000000000000 D dgeev
dgeevx.f.o:
0000000000000000 D dgeevx
As you see those are static libs because when shared objects are used, cmake doesn't detect BLAS (don't know why -- maybe BGQ) even though full-paths are provided. Same thing with static libs seems to be at least detected by cmake (prints out BLAS API found -- see below). On the other hand, I compiled boost with shared libs, and cmake recognizes them correctly. I also want to emphasize that, mangled names don't appear in the BLAS-LAPACK libs. So for instance "dgemm_" doesn't exist. And this is why the installation fails. This is from cmake -- I hacked cmake to seek "dgemm" as well, but it is also not found:
-- Looking for dgemm_
-- Looking for dgemm_ - not found
-- Looking for Fortran sgemm
-- Looking for Fortran sgemm - found
-- A library with BLAS API found.
-- Looking for Fortran cheev
-- Looking for Fortran cheev - found
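One way to check which naming convention a BLAS/LAPACK library uses is to look for a routine both with and without the trailing underscore. gfortran appends an underscore to external names by default, which is why the link step asks for dgemm_; a library that only exposes dgemm was presumably built with a different convention (for example gfortran's -fno-underscoring, or another Fortran compiler whose default omits the underscore). That last part is an assumption about this system, not something the output above proves. The helper name fortran_symbol_form is made up for illustration.

```shell
# Report how a Fortran routine is named in `nm` output read from stdin.
# $1 is the routine name without any underscore, e.g. dgemm.
# Prints "underscore", "plain" or "missing".
# fortran_symbol_form is a hypothetical helper name.
fortran_symbol_form() {
    name=${1:?routine name}
    awk -v n="$name" '
        { sym = tolower($NF) }          # the symbol is the last field of an nm line
        sym == n "_" { under = 1 }
        sym == n     { plain = 1 }
        END {
            if (under)      print "underscore"
            else if (plain) print "plain"
            else            print "missing"
        }'
}

# Usage:
#   nm libblas.a | fortran_symbol_form dgemm
```

Run against the nm output quoted above, this reports "plain", which matches the observation that dgemm_ is absent from the library.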
Before going forward, I'd like to make a suggestion for these types of checks. CMake documentation says, "Prefer using CHECK_SYMBOL_EXISTS instead of this module..." referring to CHECK_FUNCTION_EXISTS which is used by Nektar at the moment. There are certain types of implementations which cannot be detected by CHECK_FUNCTION_EXISTS. I didn't test it though. For further details: https://cmake.org/cmake/help/v3.8/module/CheckFunctionExists.html Anyway this is where installation fails:
[ 26%] Built target LocalRegions
Linking CXX executable NekMesh
/ess01/homebgq/scinet/bgq/compilers/gcc/4.8.1/bin/../lib/gcc/powerpc64-bgq-linux/4.8.1/../../../../powerpc64-bgq-linux/bin/ld: warning: libboost_atomic.so.1.57.0, needed by /scinet/bgq/Applications/nektar/ThirdParty_compiled/boost_1_57_0/install/lib/libboost_thread.so, not found (try using -rpath or -rpath-link)
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*) [clone .part.27]':
NodeOpti.cpp:(.text+0x1784): undefined reference to `dgeev_'
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `std::ctype<char>::widen(char) const [clone .part.33]':
NodeOpti.cpp:(.text+0x1e6c): undefined reference to `dgeev_'
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `boost::exception_detail::error_info_injector<std::domain_error>::~error_info_injector()':
NodeOpti.cpp:(.text+0x25b4): undefined reference to `dgemm_'
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `boost::exception_detail::error_info_injector<std::overflow_error>::~error_info_injector()':
NodeOpti.cpp:(.text+0x44a4): undefined reference to `dgemm_'
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `boost::system::system_error::what() const':
NodeOpti.cpp:(.text._ZN6Nektar9Utilities8NodeOpti13GetFunctionalILi2EEEdRdb[_ZN6Nektar9Utilities8NodeOpti13GetFunctionalILi2EEEdRdb]+0x2e8): undefined reference to `dgemm_'
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `boost::io::basic_altstringbuf<char, std::char_traits<char>, std::allocator<char> >::~basic_altstringbuf()':
NodeOpti.cpp:(.text._ZN6Nektar9Utilities8NodeOpti13GetFunctionalILi3EEEdRdb[_ZN6Nektar9Utilities8NodeOpti13GetFunctionalILi3EEEdRdb]+0x318): undefined reference to `dgemm_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dtptrs_'
../../library/LocalRegions/libLocalRegions.so.4.5.0: undefined reference to `daxpy_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpbtrf_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dsptrf_'
../../library/LocalRegions/libLocalRegions.so.4.5.0: undefined reference to `dgemv_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgbmv_'
../../library/LocalRegions/libLocalRegions.so.4.5.0: undefined reference to `dscal_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dtpmv_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgetrf_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgbtrf_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpbtrs_'
../../library/StdRegions/libStdRegions.so.4.5.0: undefined reference to `dcopy_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dsptrs_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgetrs_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpptrf_'
../../library/LocalRegions/libLocalRegions.so.4.5.0: undefined reference to `ddot_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dspmv_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dsptri_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgetri_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpptrs_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgbtrs_'
collect2: error: ld returned 1 exit status
make[2]: *** [utilities/NekMesh/NekMesh] Error 1
make[1]: *** [utilities/NekMesh/CMakeFiles/NekMesh.dir/all] Error 2
make: *** [all] Error 2
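A long linker log like this is easier to compare against nm output once it is reduced to the distinct missing symbols. The filter below is a sketch under two assumptions: the helper name undefined_symbols is made up, and each message sits on its own line in the usual GNU ld form (undefined reference to `name'). A message that has been wrapped across two lines is not picked up.

```shell
# List each distinct symbol reported as an undefined reference in a
# linker log read from stdin, one per line, sorted.
# undefined_symbols is a hypothetical helper name.
undefined_symbols() {
    sed -n "s/.*undefined reference to \`\([^']*\)'.*/\1/p" | sort -u
}

# Usage:
#   make VERBOSE=1 2>&1 | undefined_symbols
```

Every name it prints here ends in an underscore, so the whole list points at the same naming mismatch rather than at individually missing routines.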
However implementation for dgemm:
grep -irn "dgemm" *
nektar/utilities/NekMesh/ProcessModules/ProcessVarOpti/Evaluator.hxx: *Blas::Dgemm*('N', 'N', pts, DIM * nElmt, ptsStd, 1.0,
By the way, as suggested in earlier messages in this thread, I enabled both -DTHIRDPARTY_BUILD_BLAS_LAPACK and -DNEKTAR_USE_SYSTEM_BLAS_LAPACK. Just in case, I've attached the CMakeLists.txt file as well for the other settings.

So the question is: why is it trying to reference the mangled names? Is it cmake causing the mess, or the compiler? And how can this be tackled? When I grep for "dgemm_", I get nothing except the object files.

Moreover, in earlier versions of Nektar (4.0.0 and 4.3.5, as far as I checked), there is a preprocessor definition called "NEKTAR_LIB_UTILITIES_LINEAR_ALGEBRA_DGEMM_OVERRIDE_HPP". This definition doesn't exist in the most recent git repo. However, I can still find Blas.hpp with the wrapper functions under "LibUtilities/LinearAlgebra". "DgemmOverride.hpp" doesn't exist either. So could it be that portability was broken at some point?

Thanks very much
// Fatih

On Mon, Nov 19, 2018 at 8:12 AM Jeremy Cohen <jeremy.cohen@imperial.ac.uk> wrote:
Hi Fatih,
Sorry for the delay in getting back to you on this. I'm afraid I don't have an immediate answer to the problem you're experiencing but I've done some investigations into the issues you're having so hopefully some feedback on these may provide some helpful information:
On 15 Nov 2018, at 06:35, Fatih Ertinaz <fertinaz@gmail.com> wrote:
Hello everyone,
I am working on the same task as Amitvikram, on the same cluster and currently having the exact same problem: *undefined reference to lapack libs even though they are compiled successfully*.
I've read through this thread as well as some others, so here is a brief summary about what I've done so far before asking some questions.
-- System info: Using cmake-2.8.12, cross-compiled gcc-4.8 and mpich-2 on a Blue Gene Q cluster.
-- Nektar version: Decided to use the git repo.
While I can't replicate the environment you're working in directly, I've set up a clean Ubuntu linux container with only a very basic initial set of packages installed. (All of my build attempts using the details provided below complete successfully.)
Within my base container, I've installed gcc-4.8 from packages (gcc-5 is the default version for the Ubuntu version I'm using - 16.04) built and installed cmake 2.8.12 from source and built and installed MPICH2 (1.5) from source.
I'm working with Nektar++ from source, using the master branch.
-- Added "-dynamic" flag to the "CMakeLists.txt" as it was suggested here: https://www.nektar.info/nektar-on-mira-cluster/
-- Boost: I initially used system installed boost but then decided to stick to the third-party version shipped with nektar. It is because, some of the required libs (for instance boost_iostreams) weren't part of the central installation. To deal with that, I firstly set up a partial build by referencing each individual library file explicitly in cmake command. In fact, it seems to build the required libs successfully but later fails during the nektar compilation. I think it messes up the environment and basically links to the wrong files. So anyway, I am using "ThirdParty/boost_1_57_0".
-- Lapack: The reason that I am not using system lapack is simply because cmake says "dgemm_" is not found in the system blas version. Therefore, I am compiling the "ThirdParty/Lapack-3.7.0" which I downloaded from "http://www.netlib.org/lapack/lapack-3.7.0.tgz".
*Note that compilation fails with the same error even when I use ThirdParty/lapack.*
I initially tried without using the dynamic flag but have subsequently tried with the -dynamic flag too.
I'm using ThirdParty boost 1.57 and ThirdParty lapack 3.7.0.
I'm also using ThirdParty Scotch. TinyXML and GSMPI are also built from source as ThirdParty dependencies during the Nektar++ build.
-- FFTW: Using system installed version.
I'm using a system installed FFTW from packages.
-- Download process: I cancelled MD5 checks and downloading with "wget" due to the similar ssl error mentioned before. This is an easy workaround though, and probably has nothing to do with the error. I download all packages to the "nektar/ThirdParty" and copy them to "nektar/build/ThirdParty" as well. The reason of this copy operation is that when nektar extracts the downloaded packages, I see that uncompressed folders are somehow empty. I don't know if that's a cmake bug, or a problem from my side. So that's why I download, extract and copy third party sources to "nektar/build/ThirdParty" manually.
Following the discussion on the list with Amitvikram about building when there is no external Internet access to download third party dependencies, I wrote up some of the points that I made about downloading dependencies manually - if you hadn't already seen this, it's on the Nektar++ website at https://www.nektar.info/building-nektar-offline-deps/
-- CMake command:
cmake $src \
  -DCMAKE_INSTALL_PREFIX=$prf \
  -DNEKTAR_USE_MPI=ON \
  -DNEKTAR_USE_SYSTEM_BLAS_LAPACK=ON \
  -DNEKTAR_USE_FFTW=ON \
  -DFFTW_INCLUDE_DIR=${SCINET_FFTW_INC} \
  -DFFTW_LIBRARY=${SCINET_FFTW_LIB}/libfftw3.a \
  -DTHIRDPARTY_BUILD_BLAS_LAPACK=ON
As you can see, I enabled both "NEKTAR_USE_SYSTEM_BLAS_LAPACK" and "THIRDPARTY_BUILD_BLAS_LAPACK" as suggested; however, this didn't seem to make a difference for me. Compilation fails at the same step whether or not both are enabled.
I tried configuring using a similar cmake command to that which you've shown here - the only difference for me was that I didn't need to set the FFTW include directory since my FFTW install is in the system include path. I explicitly specified the path to FFTW_LIBRARY although this shouldn't be necessary since the library is, again, in the standard system library path.
-- Build process for Third-Party: In general they are compiled without any errors. In particular, I checked cmake files for each package and Lapack is configured with "-DBUILD_SHARED_LIBS:STRING=ON". I can see that objects are compiled with "-fPIC" option, it is in the cmake. However, " lapack/CMakeLists.txt" contains this line: "option(BUILD_SHARED_LIBS "Build shared libraries" OFF)" which I set to "ON" in my build script.
This is how libraries look in the "nektar/build/ThirdParty/dist/lib" directory after compiling ThirdParty libraries:
bgqdev-fen1-$ ls nektar/build/ThirdParty/dist/lib/
cmake
*libblas.so*
libblas.so.3
libblas.so.3.7.0
libboost_filesystem.so
libboost_filesystem.so.1.57.0
libboost_iostreams.so
libboost_iostreams.so.1.57.0
libboost_program_options.so
libboost_program_options.so.1.57.0
libboost_regex.so
libboost_regex.so.1.57.0
libboost_system.so
libboost_system.so.1.57.0
libboost_thread.so
libboost_thread.so.1.57.0
libgsmpi.a
*liblapack.so*
liblapack.so.3
liblapack.so.3.7.0
libscotch.a
libscotcherr.a
libscotcherrexit.a
libscotchmetis.a
libtinyxml.a
libxxt.a
libz.a
libz.so
libz.so.1
libz.so.1.2.7
pkgconfig
This folder is about 1.5GB by the way.
I have exactly the same contents in my ThirdParty/dist/lib directory after building of the third party dependencies. The resulting files are nowhere near as large as yours, I assume the very large size of the folder is something to do with the static libraries being very large although I'm not sure why they would be so big.
I think the point you make about building of shared libraries being set to OFF in the CMakeLists.txt file for lapack shouldn't be an issue. If you look in $src/cmake/ThirdPartyBlasLapack.cmake, you should see in the EXTERNAL_PROJECT_ADD command that it is configuring lapack using CMake and specifying -DBUILD_SHARED_LIBS:STRING=ON. You should also be able to verify that lapack was, indeed, configured with this parameter by looking in the CMakeCache.txt file in $build/ThirdParty/lapack-3.7.0/.
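The CMakeCache.txt check mentioned here can be done with a one-line lookup. This is a sketch only: cmake_cache_value is a made-up helper name, and it assumes the usual NAME:TYPE=value layout of cache entries.

```shell
# Print the value of one variable from a CMakeCache.txt file.
# $1 is the cache file, $2 the variable name (e.g. BUILD_SHARED_LIBS).
# cmake_cache_value is a hypothetical helper name.
cmake_cache_value() {
    sed -n "s/^${2:?variable name}:[A-Z]*=//p" "${1:?path to CMakeCache.txt}"
}

# Usage ($build is the Nektar++ build directory, as in this thread):
#   cmake_cache_value "$build/ThirdParty/lapack-3.7.0/CMakeCache.txt" BUILD_SHARED_LIBS
```

If this prints ON, the OFF default in lapack's own CMakeLists.txt was overridden on the command line as expected.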
However, "nektar/build/ThirdParty/dist/include" folder doesn't have lapack related headers:
bgqdev-fen1-$ ls
boost scotchf.h scotch.h tinystr.h tinyxml.h zconf.h zlib.h
bgqdev-fen1-$ pwd
/scinet/bgq/Applications/nektar/nektar/build/ThirdParty/dist/include
I also see exactly the same in my include folder, there are no lapack related headers.
Also, I can share the initial parts of the lapack build - in this version I tried to reference to the system blas for lapack installation:
[ 6%] Performing configure step for 'lapack-3.7.0'
cd /scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/lapack-3.7.0 && /gpfs/home/scinet/bgq/tools/cmake/2.8.12.1/bin/cmake -G "Unix Makefiles" -DCMAKE_Fortran_COMPILER:FILEPATH=/scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -DCMAKE_INSTALL_PREFIX:PATH=/scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/dist -DCMAKE_INSTALL_LIBDIR:PATH=/scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/dist/lib -DBUILD_SHARED_LIBS:STRING=ON -DBUILD_TESTING:STRING=OFF /scinet/bgq/Applications/nektar/nektar/ThirdParty/lapack-3.7.0
Re-run cmake no build system arguments
-- Setting build type to 'Release' as none was specified.
-- The Fortran compiler identification is GNU
-- Check for working Fortran compiler: /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90
-- Check for working Fortran compiler: /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -- works
-- Detecting Fortran compiler ABI info
-- Detecting Fortran compiler ABI info - done
-- Checking whether /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 supports Fortran 90
-- Checking whether /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 supports Fortran 90 -- yes
-- Looking for Python greater than 2.6 -
-- Could NOT find PythonInterp: Found unsuitable version "2.6.6", but required is at least "2.7" (found /usr/bin/python2)
-- No suitable Python version found, so skipping summary tests.
-- Reducing RELEASE optimization level to O2
-- Looking for Fortran NONE - found
-- Looking for Fortran INT_CPU_TIME - found
-- Looking for Fortran EXT_ETIME - not found
-- Looking for Fortran EXT_ETIME_ - not found
-- Looking for Fortran INT_ETIME - found
-- --> Will use second_INT_ETIME.f and dsecnd_INT_ETIME.f as timing function.
*-- Using supplied NETLIB BLAS implementation*
*-- Using supplied NETLIB LAPACK implementation*
-- Building Single Precision
-- Building Double Precision
-- Building Complex Precision
-- Building Double Complex Precision
-- BUILD TESTING : OFF
-- Configuring done
-- Generating done
-- Build files have been written to: /scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/lapack-3.7.0
Again, I see exactly the same output for configuration of lapack. However, when I initially ran this, the build system was picking up my standard C/C++/Fortran compilers so it was using gfortran rather than the MPI version. I reconfigured/rebuilt from scratch specifically telling the build system to use mpicc and mpic++ and setting -DCMAKE_Fortran_COMPILER to point to mpif90, after this I see the same as you have shown above and build again completes successfully.
Additionally, I can see "dgemm" in the log.make:
bgqdev-fen1-$ grep -rn "dgemm" nektar/build-gcc/log.make.2
13756:[ 3%] Building Fortran object *BLAS/SRC/CMakeFiles/blas.dir/dgemm.f.o*
13757:cd /scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/lapack-3.7.0/BLAS/SRC && /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -Dblas_EXPORTS -O2 -fPIC -c /scinet/bgq/Applications/nektar/*nektar/ThirdParty/lapack-3.7.0/BLAS/SRC/dgemm.f* -o CMakeFiles/blas.dir/*dgemm.f.o*
14018:/scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -fPIC -O2 -Wl,-rpath=/bgsys/drivers/ppcfloor/comm/lib/libmpichf90-gcc.so.8 -shared -Wl,-soname,libblas.so.3 -o ../../lib/libblas.so.3.7.0 CMakeFiles/blas.dir/isamax.f.o CMakeFiles/blas.dir/sasum.f.o CMakeFiles/blas.dir/saxpy.f.o CMakeFiles/blas.dir/scopy.f.o
...............
*CMakeFiles/blas.dir/dgemm.f.o *
This is the part that compilation fails:
[ 34%] Building CXX object utilities/NekMesh/CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/ElUtil.cpp.o
.........
.........
/........./bgq/compilers/gcc/4.8.1/bin/../lib/gcc/powerpc64-bgq-linux/4.8.1/../../../../powerpc64-bgq-linux/bin/ld: warning: *libmpichf90-gcc.so.8, needed by /scinet/bgq/Applications/nektar/nektar/build/ThirdParty/dist/lib/libblas.so, not found (try using -rpath or -rpath-link)*
.........
NodeOpti.cpp:(.text+0x1784): undefined reference to `dgeev_'
.........
NodeOpti.cpp:(.text+0x25b4): undefined reference to `dgemm_'
.........
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgetri_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpptrs_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgbtrs_'
collect2: error: ld returned 1 exit status
make[2]: *** [utilities/NekMesh/NekMesh] Error 1
I've tried many things:
xx enabled configure option "build_shared_libs" in CMakeLists.txt in "ThirdParty/lapack"
xx made a copy of "make.inc.example" in ThirdParty/lapack and reduced optimization levels
xx since this is a Blue Gene environment, *made reference to ESSL instead of BLAS*
*But none of it seems to make a difference. It always fails at the exact same step.*
This "*libmpichf90-gcc.so.8*" warning seems a bit odd to me and I am not sure whether it has anything to do with the undefined reference errors. I created a symlink to this library and added it to "LD_LIBRARY_PATH" as well, but then it failed with the message "undefined symbol: _cnkspi_MemoryRegionCacheLastAccessedElementNumber" from "*libpami-gcc.so*", where PAMI is a lower-level messaging API by IBM. Also, "cnkspi" sounds far too low level, because "CNK" is the kernel on the compute nodes and "SPI" is the implementation that allows communication with that kernel. I added a linker flag "-Wl,-rpath" but I guess it only makes things uglier.
bgqdev-fen1-$ readelf -d nektar/build-gcc/ThirdParty/dist/lib/libblas.so | grep NEEDED
0x0000000000000001 (NEEDED) Shared library: [*libmpichf90-gcc.so.8*]
0x0000000000000001 (NEEDED) Shared library: [libmpich-gcc.so.8]
0x0000000000000001 (NEEDED) Shared library: [libopa-gcc.so.0]
0x0000000000000001 (NEEDED) Shared library: [libmpl-gcc.so.1]
0x0000000000000001 (NEEDED) Shared library: [*libpami-gcc.so*]
0x0000000000000001 (NEEDED) Shared library: [librt.so.1]
0x0000000000000001 (NEEDED) Shared library: [libpthread.so.0]
0x0000000000000001 (NEEDED) Shared library: [libstdc++.so.6]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x0000000000000001 (NEEDED) Shared library: [libnss_files.so.2]
0x0000000000000001 (NEEDED) Shared library: [libnss_dns.so.2]
0x0000000000000001 (NEEDED) Shared library: [libresolv.so.2]
0x0000000000000001 (NEEDED) Shared library: [libgfortran.so.3]
0x0000000000000001 (NEEDED) Shared library: [libm.so.6]
xx *As an alternative, I switched to static linking*. I initially changed "NEKTAR_LIBRARY_TYPE" to "STATIC" in the "CMakeLists.txt".
This is the one area where I have a number of differences to you. I'm not sure that switching to static linking is likely to make much difference (although I could be wrong) however I note that your blas library seems to require various mpich libraries. My libblas.so library only lists libm.so.6 and libgfortran.so.3 as "NEEDED". Can you also provide the RPATH value that you get from readelf -d for this library...
If you could also provide your output of readelf -d for library/LibUtilities/libLibUtilities.so, that would be useful. My libLibUtilities.so needs a few boost libraries as well as libz, libblas, liblapack, libmpich, libpthread, libgcc_s libc, libstdc++ and libm.
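To make the comparison of "NEEDED" entries between two machines less error-prone, something like the following could be used. It is only a sketch and not part of the Nektar++ build system: the function name needed_libs is made up for illustration, and it assumes nothing beyond the "(NEEDED) Shared library: [name]" line layout that readelf -d prints.

```sh
#!/bin/sh
# needed_libs: hypothetical helper. Reads `readelf -d` output on stdin and
# prints the name inside [...] for every (NEEDED) entry, one per line.
needed_libs() {
    sed -n 's/.*(NEEDED).*\[ *\([^] ]*\) *\].*/\1/p'
}

# Example use (library path as given earlier in this message):
#   readelf -d nektar/build-gcc/ThirdParty/dist/lib/libblas.so | needed_libs
```

Saving the output on each system and running diff on the two files would show directly which extra libraries (for example the mpich ones) have been pulled in.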
xx It seems that some of the ThirdParty libraries are configured with the assumption of shared objects so I changed them as well. For instance, boost is configured with options "link=shared" and "runtime-link=shared" which I set to static. *I can see all required boost libs are successfully compiled and written to "build/ThirdParty/dist/lib".*
Now this is the cmake command:
cmake $src \
    -DCMAKE_INSTALL_PREFIX=$prf \
    -DNEKTAR_USE_MPI=ON \
    -DNEKTAR_USE_SYSTEM_BLAS_LAPACK=ON \
    -DNATIVE_BLAS:FILEPATH=${SCINET_LAPACK_LIB}/libblas.a \
    -DNATIVE_BLAS_LIB_DIR:FILEPATH=${SCINET_LAPACK_LIB} \
    -DNATIVE_LAPACK:FILEPATH=${SCINET_LAPACK_LIB}/liblapack.a \
    -DNATIVE_LAPACK_LIB_DIR:FILEPATH=${SCINET_LAPACK_LIB} \
    -DNEKTAR_USE_FFTW=ON \
    -DFFTW_INCLUDE_DIR=${SCINET_FFTW_INC} \
    -DFFTW_LIBRARY=${SCINET_FFTW_LIB}/libfftw3.a \
    -DBoost_NO_SYSTEM_PATHS:BOOL=TRUE \
    -DZLIB_INCLUDE_DIR:PATH=${SCINET_ZLIB_INC} \
    -DZLIB_LIBRARY:FILEPATH=${SCINET_ZLIB_LIB}/libz.a
The issue now is that the installer seems to ignore "-DBoost_NO_SYSTEM_PATHS:BOOL=TRUE" and searches locations other than "BOOST_ROOT", which I set to "nektar/build-gcc/dist".
See for instance:
[ 5%] Building CXX object library/LibUtilities/CMakeFiles/LibUtilities.dir/BasicUtils/ArrayEqualityComparison.cpp.o
cd /scinet/bgq/Applications/nektar/nektar++-4.4.1/build-gcc-static/library/LibUtilities && /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpicxx -DLIB_UTILITIES_EXPORTS -DNEKTAR_MEMORY_POOL_ENABLED -DNEKTAR_USE_MPI -DNEKTAR_USING_BLAS -DNEKTAR_USING_LAPACK -DNEKTAR_VERSION=\"4.4.1\" -DTIXML_USE_STL -O3 -DNDEBUG -Wall -Wno-deprecated -Wno-sign-compare -DNEKTAR_RELEASE -isystem /scinet/bgq/Applications/nektar/nektar++-4.4.1/build-gcc-static/ThirdParty/dist/include -isystem /scinet/bgq/Libraries/zlib-1.2.7-gcc4.8.1/include -isystem /scinet/bgq/Applications/nektar/nektar++-4.4.1/ThirdParty/dist/include -isystem /scinet/bgq/Libraries/fftw-3.3.5-gcc/include -I/scinet/bgq/Applications/nektar/nektar++-4.4.1 -I/scinet/bgq/Applications/nektar/nektar++-4.4.1/library -o CMakeFiles/LibUtilities.dir/BasicUtils/ArrayEqualityComparison.cpp.o -c /scinet/bgq/Applications/nektar/nektar++-4.4.1/library/LibUtilities/BasicUtils/ArrayEqualityComparison.cpp
In file included from */bgsys/linux/ionfloor/usr/include/boost/config.hpp:57:0*,
                 from */bgsys/linux/ionfloor/usr/include/boost/cstdint.hpp:26*,
                 from /scinet/bgq/Applications/nektar/nektar++-4.4.1/library/LibUtilities/BasicConst/NektarUnivTypeDefs.hpp:40,
So the main question is: *Why does it check "/usr/include/boost" when "cstdint.hpp" already exists in "build/dist/include/boost/"?*
bgqdev-fen1-$ ls build-gcc/dist/include/boost/cstdint.hpp -l
-rw-r--r-- 1 fertinaz scinet 18017 Nov 14 19:00 build-gcc/dist/include/boost/cstdint.hpp
This is how it finally fails:
*/bgsys/linux/ionfloor/usr/include/boost/archive/iterators/binary_from_base64.hpp*:52:9: warning: narrowing conversion of ‘-1’ from ‘int’ to ‘const char’ inside { } is ill-formed in C++11 [-Wnarrowing]
make[2]: *** [library/LibUtilities/CMakeFiles/LibUtilities.dir/BasicUtils/CompressData.cpp.o] Error 1
It doesn't help to change the boost code from "const char lookup_table" to "signed char lookup_table", because then the "switch-case" statement that returns the endianness information fails in the following file: "nektar/library/LibUtilities/BasicUtils/CompressData.cpp"
As you can guess, I disabled the switch-case block, and returned the value, but it fails anyway...
Sorry for the long message, hope you could follow. I've run out of ideas and any suggestion is highly appreciated....
// Fatih
I'd be inclined to stick with the third party boost and lapack and see if we can find a solution to that. I can't see what C/C++ compilers you're using but have you tried forcing the use of mpicc and mpic++ as shown when running cmake in the instructions at https://www.nektar.info/nektar-on-mira-cluster/? (in fact, I see above that in your log output for the build command for ArrayEqualityComparison.cpp, it looks like it's using mpicxx)
I'm not sure why libblas.so is linking in libmpichf90-gcc.so.8 but I'm assuming this is the core of the problem. Can you confirm what settings you're using to get the additional logging output that you're showing (which includes the build commands) - is it just -vv? I can then try and run the same and see if I can provide any further suggestions. I'm not clear at the moment but I'm assuming the undefined reference errors are a result of trying to link in libblas.so and that library itself having an unresolved dependency on libmpichf90. It might be that the rpath settings can be modified to take account of this.
Can you provide the output of running ldd on libblas.so, liblapack.so and libLibUtilities.so?
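When the ldd output is long, the unresolved entries are the interesting part. A minimal sketch for pulling them out is below; missing_libs is a hypothetical name, and the only assumption is the usual "name => not found" form that ldd prints for a dependency it cannot locate.

```sh
#!/bin/sh
# missing_libs: hypothetical helper. Reads `ldd` output on stdin and prints
# the name of every dependency that ldd reported as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example use:
#   ldd libblas.so | missing_libs
```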
Thanks,
Jeremy
On Sun, Oct 14, 2018 at 7:42 AM Jeremy Cohen <jeremy.cohen@imperial.ac.uk> wrote:
Hi Amitvikram,
I would certainly try Chris's suggestion. However, something else to check is where you're getting the third party downloads from.
If you take a clean Nektar++ source tree and place the standard netlib lapack-3.7.0.tgz source file that the build system downloads into $NEKTAR_HOME/ThirdParty (i.e. the download from http://www.netlib.org/lapack/lapack-3.7.0.tgz), the build should proceed successfully.
It looks like the lapack tar file that you're using may already have some build artefacts in it - did you tar the content from $NEKTAR_HOME/build/ThirdParty/lapack-3.7.0 into a lapack-3.7.0.tgz file, or are you working with the standard .tgz file from the netlib.org site?
Cheers, Jeremy
On 13 Oct 2018, at 21:19, Chris Cantwell <c.cantwell@imperial.ac.uk> wrote:
Hi Amitvikram,
Some sites block non-SSL enabled HTTP traffic, returning a webpage reporting the error rather than the actual file (hence the hash mismatch).
You could try turning on the THIRDPARTY_USE_SSL option to see if that is allowed.
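One quick way to tell whether a download was replaced by an error page is to look at the first two bytes of the file: a gzip-compressed tarball starts with the magic bytes 1f 8b, while an HTML page does not. The following is only a sketch; is_gzip is a made-up helper name and the file path in the usage comment is just an example.

```sh
#!/bin/sh
# is_gzip: hypothetical helper. Succeeds if the file starts with the gzip
# magic bytes 1f 8b, and fails otherwise (e.g. for an HTML error page).
is_gzip() {
    magic=$(od -An -tx1 -N2 "$1" | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Example use:
#   is_gzip ThirdParty/lapack-3.7.0.tgz || echo "not a gzip archive"
```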
Cheers, Chris
On Fri, 12 Oct 2018 15:15:10 -0400, Amitvikram Dutta <amitvdutta23@gmail.com> wrote:
Hi all,
I had to use a slight workaround because the platform I was compiling on could not download the thirdparty files correctly and would always give a hash mismatch error. I downloaded the third party files on my home machine and re-uploaded them into the $NEKTAR_HOME folder. As a result, when I begin the build process the ThirdParty folder is populated with the zipped versions of the third party software. Is this what might be causing the problem? Is there any way to edit the cmake file and to get the compilation process working correctly?
On Fri, Oct 12, 2018 at 2:57 PM Amitvikram Dutta <amitvikram.dutta@uwaterloo.ca> wrote:
*From:* nektar-users-bounces@imperial.ac.uk On Behalf Of Jeremy Cohen *Sent:* October 12, 2018 2:56:43 PM (UTC-05:00) Eastern Time (US & Canada)
*To:* Amitvikram Dutta *Cc:* nektar-users *Subject:* Re: [Nektar-users] Problem while installing nektar++ with lapack
Hi Amitvikram,
Have you attempted to build lapack separately at any point? It's probably worth clearing out your build directory and also all the contents of the ThirdParty directory in the base nektar++ source directory, which I'll call $NEKTAR_HOME, so $NEKTAR_HOME/ThirdParty and then re-running cmake in an empty $NEKTAR_HOME/build directory and trying the build again.
It looks like the build step is encountering a previous source tree in the location where it's trying to build which seems strange.
I've just had a look at the log from my clean build and I see exactly the same messages as you in relation to lapack-3.7.0 in the same order as far as "Checking whether /usr/bin/f95 supports Fortran 90 -- yes", however, I then see "-- Looking for Python greater than 2.6 - " and the build of lapack completes successfully.
Just to confirm, I am running cmake and make in a separate build directory under the main nektar++ source tree directory, so I'm building in $NEKTAR_HOME/build - I assume you're doing something similar? You should see a ThirdParty directory in $NEKTAR_HOME and another ThirdParty directory in $NEKTAR_HOME/build/
I believe that the initial download of the lapack-3.7.0.tar.gz should be placed in $NEKTAR_HOME/ThirdParty and unpacked there. Then, when the build succeeds or stops, in $NEKTAR_HOME/build/ThirdParty, you should see lapack-3.7.0/ where I think the build actually takes place, and then a separate $NEKTAR_HOME/build/ThirdParty/lapack-3.7.0-tmp which should contain a couple of lapack-3.7.0-cfgcmd.txt files that I think contain the build command that is used - you could perhaps paste the contents of the lapack-3.7.0-cfgcmd.txt into an email if you're still having issues and we can see if that looks correct.
It is, of course, possible that this is something related to the specific configuration of the platform that you're building on, but I think the third party lapack build should be straightforward and it sounds like for some reason, it's attempting to build in the wrong location, or a location where an existing source tree has ended up for some reason.
I'm afraid I don't have a very detailed knowledge of the build system beyond this so if none of the suggestions so far help you to resolve the problem, maybe someone with more knowledge of the build system can provide some advice.
Cheers, Jeremy
On 12 Oct 2018, at 19:34, Amitvikram Dutta <amitvdutta23@gmail.com> wrote:
Hi all,
I tried to compile nektar using Jeremy's latest suggestions, with both THIRDPARTY_BUILD_BLAS_LAPACK and NEKTAR_USE_SYSTEM_BLAS_LAPACK turned on. The following error occurred. It seems that I might have to compile lapack separately. Is this unusual?
<image.png>
On Fri, Oct 12, 2018 at 1:16 PM Amitvikram Dutta <amitvikram.dutta@uwaterloo.ca> wrote:
*From:* nektar-users-bounces@imperial.ac.uk On Behalf Of Jeremy Cohen *Sent:* October 12, 2018 1:16:14 PM (UTC-05:00) Eastern Time (US & Canada)
*To:* Amitvikram Dutta *Cc:* nektar-users *Subject:* Re: [Nektar-users] Problem while installing nektar++ with lapack
Hi Amitvikram,
As Chris suggests, it's probably better to use vendor supplied libraries if you can get those working.
In addition to the further information Chris has asked to take a look at, one thing you could check is whether there are any files in your nektar++/build/ThirdParty/lapack-3.7.0 directory (if that directory exists at all).
I've been trying to see if I can recreate the problem and I was able to see something similar when setting THIRDPARTY_BUILD_BLAS_LAPACK=ON and NEKTAR_USE_SYSTEM_BLAS_LAPACK=OFF. In this case, I can configure successfully and start the build but it fails with a large number of undefined references that are similar to, and include, the dtpmv_ symbol that you mentioned. When I look in nektar++/build/ThirdParty/lapack-3.7.0, the directory is empty so it looks like the build system has configured on the basis of building its own blas/lapack but the build hasn't been carried out and therefore LibUtilities can't be linked against it.
As a test, you could try running the build with both THIRDPARTY_BUILD_BLAS_LAPACK=ON and NEKTAR_USE_SYSTEM_BLAS_LAPACK=ON, if this isn't the setting you've been using already. When I tried this, the build of blas/lapack is carried out successfully and the linking is fine with the full build of Nektar++ completing successfully. I removed the system blas/lapack on my test system to be sure it was linking against the correct instance.
Cheers, Jeremy
On 12 Oct 2018, at 17:50, Chris Cantwell <c.cantwell@imperial.ac.uk> wrote:
Hi Amitvikram,
Could you send us your CMakeCache.txt file from your build directory and the output from running: make VERBOSE=1 for both cases.
In the case of using ThirdParty LAPACK, it seems to not be linking to it. Probably you should be using vendor-supplied libraries if possible though, so better if we can get those working.
Thanks, Chris
On 12 October 2018 14:08:55 BST, Amitvikram Dutta <amitvdutta23@gmail.com> wrote:
>
> Hi Jeremy,
>
> I'm actually trying to build nektar++ on a BGQ cluster similar to Mira.
>
> I'm trying to build nektar++-4.4.1 and the system lapack version is
> 3.4.2
>
> Sincerely,
> On Fri, Oct 12, 2018 at 4:24 AM Amitvikram Dutta <
> amitvikram.dutta@uwaterloo.ca> wrote:
>
>>
>> ------------------------------
>> *From:* nektar-users-bounces@imperial.ac.uk On Behalf Of Jeremy Cohen
>> *Sent:* October 12, 2018 4:24:33 AM (UTC-05:00) Eastern Time (US &
>> Canada)
>> *To:* Amitvikram Dutta
>> *Cc:* nektar-users
>> *Subject:* Re: [Nektar-users] Problem while installing nektar++ with
>> lapack
>>
>> Hi Amitvikram,
>>
>> Can you provide some further details of the problem you're encountering.
>>
>> Specifically, can you confirm what platform (including version) you're
>> building on, and if Linux, which I assume is the platform you're using,
>> which distribution.
>>
>> Can you also confirm what version of Nektar++ you're trying to build,
>> and the version of the system Lapack distribution that you're using.
>>
>> Thanks,
>>
>> Jeremy
>>
>> On 12 Oct 2018, at 01:05, Amitvikram Dutta <amitvdutta23@gmail.com>
>> wrote:
>>
>> Hi all,
>>
>> I keep having the same problem while trying to install nektar++ with
>> regards to the Lapack libraries.
>>
>> When I try to use the system Lapack installation I get the following
>> message
>>
>> */scinet/bgq/Libraries/lapack/lib/liblapack.so: undefined reference to
>> `_xlfEndIO@XLF_1.0'*
>>
>> while when I try to install using the ThirdParty Lapack supplied with
>> the nektar++ source directory I get the following error
>>
>> *../../library/LibUtilities/libLibUtilities.so.4.4.1: undefined
>> reference to `dtpmv_'*
>>
>> I have a feeling these errors have been encountered by the community at
>> large before. Could someone point out where I'm going wrong?
>>
>> Sincerely,
>> --
>>
>> *Amitvikram Dutta*
>>
>> Graduate Research Assistant
>>
>> Fluid Mechanics Research Lab
>>
>> Multi-Physics Interaction Lab
>>
>> University of Waterloo
>> _______________________________________________
>> Nektar-users mailing list
>> Nektar-users@imperial.ac.uk
>> https://mailman.ic.ac.uk/mailman/listinfo/nektar-users
>>
>>
>> --
>
> *Amitvikram Dutta*
>
> Graduate Research Assistant
>
> Fluid Mechanics Research Lab
>
> Multi-Physics Interaction Lab
>
> University of Waterloo
>
-- Chris Cantwell Imperial College London South Kensington Campus London SW7 2AZ Email: c.cantwell@imperial.ac.uk www.imperial.ac.uk/people/c.cantwell
--
*Amitvikram Dutta*
Graduate Research Assistant
Fluid Mechanics Research Lab
Multi-Physics Interaction Lab
University of Waterloo
_______________________________________________ Nektar-users mailing list Nektar-users@imperial.ac.uk https://mailman.ic.ac.uk/mailman/listinfo/nektar-users
Hi Fatih,
As you suggest, it looks like this is a name mangling problem and I suspect it must be something specific to the BGQ platform. There are others on the nektar list who are much more experienced with CMake than me and have more knowledge of the history of the codebase so maybe someone else can explain the now missing "NEKTAR_LIB_UTILITIES_LINEAR_ALGEBRA_DGEMM_OVERRIDE_HPP" preprocessor directive that you mention and the missing DgemmOverride.hpp header.
Aside from this, I don't know if you've seen the following: https://github.com/mfem/mfem/issues/397
While this is for a completely unrelated library, it looks like it's describing a similar problem to what you're experiencing. That thread also links to the hypre repository, providing an example of using "configureable macros" for name mangling. Don't know if this is of any help but it might be worth investigating.
Regards, Jeremy
On 12 Feb 2019, at 22:03, Fatih Ertinaz <fertinaz@gmail.com> wrote:
Hello Jeremy
I appreciate your detailed response, and sorry for my late reply. I came back to this issue over the weekend, made certain changes and achieved some progress, so I wanted to share the current status.
I am convinced that the problem is related to linking BLAS and LAPACK, and that it is probably specific to this platform -- Blue Gene Q -- because I have managed to run nektar successfully on many different platforms and never encountered this issue. I am using the latest git repo for nektar, by the way.
I compiled some of the third-party libraries (boost, scotch and blas & lapack) separately under the directory I created "nektar/ThirdParty_compiled". This resolved earlier problems with boost. Also, when I check blas & lapack functions that are used by nektar, I can find their references in their libraries:
bgqdev-fen1-$ nm libblas.a | grep -i dgemm
dgemm.f.o:
0000000000000000 D dgemm
bgqdev-fen1-$ nm liblapack.a | grep -i dgeev
dgeev.f.o:
0000000000000000 D dgeev
dgeevx.f.o:
0000000000000000 D dgeevx
As you can see, those are static libs, because when shared objects are used cmake doesn't detect BLAS (don't know why -- maybe BGQ) even though full paths are provided. With static libs, BLAS at least seems to be detected by cmake (it prints out BLAS API found -- see below).
On the other hand, I compiled boost with shared libs, and cmake recognizes them correctly.
I also want to emphasize that, mangled names don't appear in the BLAS-LAPACK libs. So for instance "dgemm_" doesn't exist. And this is why the installation fails.
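Since "grep -i dgemm" matches both "dgemm" and "dgemm_", an exact-match check makes the difference explicit. The helper below is only an illustration (has_symbol is a made-up name, not an existing tool): it reads nm output on stdin and succeeds only when the last field of some line is exactly the requested symbol.

```sh
#!/bin/sh
# has_symbol: hypothetical helper. Reads `nm` output on stdin and succeeds
# only if some line's last field is exactly the given symbol name, so that
# "dgemm" and "dgemm_" are told apart.
has_symbol() {
    awk -v sym="$1" '$NF == sym { found = 1 } END { exit !found }'
}

# Example use:
#   nm libblas.a | has_symbol dgemm_ || echo "no trailing-underscore dgemm_"
```

If the plain name is found but the underscored one is not, that would be consistent with the library having been built with a Fortran compiler or flags that do not append a trailing underscore, although that is an inference rather than something this check can prove.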
This is from cmake -- I hacked cmake to seek "dgemm" as well, but it is also not found:
-- Looking for dgemm_
-- Looking for dgemm_ - not found
-- Looking for Fortran sgemm
-- Looking for Fortran sgemm - found
-- A library with BLAS API found.
-- Looking for Fortran cheev
-- Looking for Fortran cheev - found
Before going forward, I'd like to make a suggestion for these types of checks. CMake documentation says, "Prefer using CHECK_SYMBOL_EXISTS instead of this module..." referring to CHECK_FUNCTION_EXISTS which is used by Nektar at the moment. There are certain types of implementations which cannot be detected by CHECK_FUNCTION_EXISTS. I didn't test it though. For further details: https://cmake.org/cmake/help/v3.8/module/CheckFunctionExists.html
Anyway this is where installation fails:
[ 26%] Built target LocalRegions
Linking CXX executable NekMesh
/ess01/homebgq/scinet/bgq/compilers/gcc/4.8.1/bin/../lib/gcc/powerpc64-bgq-linux/4.8.1/../../../../powerpc64-bgq-linux/bin/ld: warning: libboost_atomic.so.1.57.0, needed by /scinet/bgq/Applications/nektar/ThirdParty_compiled/boost_1_57_0/install/lib/libboost_thread.so, not found (try using -rpath or -rpath-link)
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*) [clone .part.27]':
NodeOpti.cpp:(.text+0x1784): undefined reference to `dgeev_'
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `std::ctype<char>::widen(char) const [clone .part.33]':
NodeOpti.cpp:(.text+0x1e6c): undefined reference to `dgeev_'
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `boost::exception_detail::error_info_injector<std::domain_error>::~error_info_injector()':
NodeOpti.cpp:(.text+0x25b4): undefined reference to `dgemm_'
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `boost::exception_detail::error_info_injector<std::overflow_error>::~error_info_injector()':
NodeOpti.cpp:(.text+0x44a4): undefined reference to `dgemm_'
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `boost::system::system_error::what() const':
NodeOpti.cpp:(.text._ZN6Nektar9Utilities8NodeOpti13GetFunctionalILi2EEEdRdb[_ZN6Nektar9Utilities8NodeOpti13GetFunctionalILi2EEEdRdb]+0x2e8): undefined reference to `dgemm_'
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `boost::io::basic_altstringbuf<char, std::char_traits<char>, std::allocator<char> >::~basic_altstringbuf()':
NodeOpti.cpp:(.text._ZN6Nektar9Utilities8NodeOpti13GetFunctionalILi3EEEdRdb[_ZN6Nektar9Utilities8NodeOpti13GetFunctionalILi3EEEdRdb]+0x318): undefined reference to `dgemm_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dtptrs_'
../../library/LocalRegions/libLocalRegions.so.4.5.0: undefined reference to `daxpy_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpbtrf_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dsptrf_'
../../library/LocalRegions/libLocalRegions.so.4.5.0: undefined reference to `dgemv_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgbmv_'
../../library/LocalRegions/libLocalRegions.so.4.5.0: undefined reference to `dscal_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dtpmv_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgetrf_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgbtrf_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpbtrs_'
../../library/StdRegions/libStdRegions.so.4.5.0: undefined reference to `dcopy_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dsptrs_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgetrs_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpptrf_'
../../library/LocalRegions/libLocalRegions.so.4.5.0: undefined reference to `ddot_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dspmv_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dsptri_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgetri_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpptrs_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgbtrs_'
collect2: error: ld returned 1 exit status
make[2]: *** [utilities/NekMesh/NekMesh] Error 1
make[1]: *** [utilities/NekMesh/CMakeFiles/NekMesh.dir/all] Error 2
make: *** [all] Error 2
However implementation for dgemm:
grep -irn "dgemm" *
nektar/utilities/NekMesh/ProcessModules/ProcessVarOpti/Evaluator.hxx: Blas::Dgemm('N', 'N', pts, DIM * nElmt, ptsStd, 1.0,
By the way, as suggested in earlier messages in this thread, I enabled both -DTHIRDPARTY_BUILD_BLAS_LAPACK and -DNEKTAR_USE_SYSTEM_BLAS_LAPACK. Just in case, I've attached the CMakeLists.txt file as well for other settings.
So the question is, why is it trying to reference the mangled names? Is it cmake causing the mess or the compiler? And how can this be tackled? When I grep "dgemm_", I get nothing except the object files.
Moreover, in earlier versions of Nektar (4.0.0 and 4.3.5 as far as I checked), there is a preprocessor definition called "NEKTAR_LIB_UTILITIES_LINEAR_ALGEBRA_DGEMM_OVERRIDE_HPP". This definition doesn't exist in the most recent git repo. However, I can still find Blas.hpp for the wrapper functions under "LibUtilities/LinearAlgebra". "DgemmOverride.hpp" doesn't exist either. So could it be that portability was broken at some point?
Thanks very much
// Fatih
On Mon, Nov 19, 2018 at 8:12 AM Jeremy Cohen <jeremy.cohen@imperial.ac.uk> wrote:
Hi Fatih,
Sorry for the delay in getting back to you on this. I'm afraid I don't have an immediate answer to the problem you're experiencing but I've done some investigations into the issues you're having so hopefully some feedback on these may provide some helpful information:
On 15 Nov 2018, at 06:35, Fatih Ertinaz <fertinaz@gmail.com> wrote:
Hello everyone,
I am working on the same task as Amitvikram, on the same cluster and currently having the exact same problem: undefined reference to lapack libs even though they are compiled successfully.
I've read through this thread as well as some others, so here is a brief summary about what I've done so far before asking some questions.
-- System info: Using cmake-2.8.12, cross-compiled gcc-4.8 and mpich-2 on a Blue Gene Q cluster.
-- Nektar version: Decided to use the git repo.
While I can't replicate the environment you're working in directly, I've set up a clean Ubuntu linux container with only a very basic initial set of packages installed. (All of my build attempts using the details provided below complete successfully.)
Within my base container, I've installed gcc-4.8 from packages (gcc-5 is the default version for the Ubuntu version I'm using - 16.04) built and installed cmake 2.8.12 from source and built and installed MPICH2 (1.5) from source.
I'm working with Nektar++ from source, using the master branch.
-- Added "-dynamic" flag to the "CMakeLists.txt" as it was suggested here: https://www.nektar.info/nektar-on-mira-cluster/
-- Boost: I initially used system installed boost but then decided to stick to the third-party version shipped with nektar. This is because some of the required libs (for instance boost_iostreams) weren't part of the central installation. To deal with that, I first set up a partial build by referencing each individual library file explicitly in the cmake command. In fact, it seems to build the required libs successfully but later fails during the nektar compilation. I think it messes up the environment and basically links to the wrong files. So anyway, I am using "ThirdParty/boost_1_57_0".
-- Lapack: The reason that I am not using system lapack is simply because cmake says "dgemm_" is not found in the system blas version. Therefore, I am compiling the "ThirdParty/Lapack-3.7.0" which I downloaded from "http://www.netlib.org/lapack/lapack-3.7.0.tgz".
Note that compilation fails with the same error even when I use ThirdParty/lapack.
I initially tried without using the dynamic flag but have subsequently tried with the -dynamic flag too.
I'm using ThirdParty boost 1.57 and ThirdParty lapack 3.7.0.
I'm also using ThirdParty Scotch. TinyXML and GSMPI are also built from source as ThirdParty dependencies during the Nektar++ build.
-- FFTW: Using system installed version.
I'm using a system installed FFTW from packages.
-- Download process: I cancelled MD5 checks and downloading with "wget" due to the similar ssl error mentioned before. This is an easy workaround though, and probably has nothing to do with the error. I download all packages to "nektar/ThirdParty" and copy them to "nektar/build/ThirdParty" as well. The reason for this copy operation is that when nektar extracts the downloaded packages, I see that the uncompressed folders are somehow empty. I don't know if that's a cmake bug, or a problem on my side. So that's why I download, extract and copy third party sources to "nektar/build/ThirdParty" manually.
Following the discussion on the list with Amitvikram about building when there is no external Internet access to download third party dependencies, I wrote up some of the points that I made about downloading dependencies manually - if you hadn't already seen this, it's on the Nektar++ website at https://www.nektar.info/building-nektar-offline-deps/
-- CMake command:
cmake $src \
    -DCMAKE_INSTALL_PREFIX=$prf \
    -DNEKTAR_USE_MPI=ON \
    -DNEKTAR_USE_SYSTEM_BLAS_LAPACK=ON \
    -DNEKTAR_USE_FFTW=ON \
    -DFFTW_INCLUDE_DIR=${SCINET_FFTW_INC} \
    -DFFTW_LIBRARY=${SCINET_FFTW_LIB}/libfftw3.a \
    -DTHIRDPARTY_BUILD_BLAS_LAPACK=ON
As you can see, I enabled both "DNEKTAR_USE_SYSTEM_BLAS_LAPACK" and "DTHIRDPARTY_BUILD_BLAS_LAPACK" following the suggestions; however, this didn't seem to make a difference for me. Compilation fails at the same step whether or not both are enabled.
I tried configuring using a similar cmake command to that which you've shown here - the only difference for me was that I didn't need to set the FFTW include directory since my FFTW install is in the system include path. I explicitly specified the path to FFTW_LIBRARY although this shouldn't be necessary since the library is, again, in the standard system library path.
-- Build process for Third-Party: In general they are compiled without any errors. In particular, I checked cmake files for each package and Lapack is configured with "-DBUILD_SHARED_LIBS:STRING=ON". I can see that objects are compiled with "-fPIC" option, it is in the cmake. However, "lapack/CMakeLists.txt" contains this line: "option(BUILD_SHARED_LIBS "Build shared libraries" OFF)" which I set to "ON" in my build script.
This is how libraries look in the "nektar/build/ThirdParty/dist/lib" directory after compiling ThirdParty libraries:
bgqdev-fen1-$ ls nektar/build/ThirdParty/dist/lib/
cmake                          libboost_program_options.so         libgsmpi.a          libtinyxml.a
libblas.so                     libboost_program_options.so.1.57.0  liblapack.so        libxxt.a
libblas.so.3                   libboost_regex.so                   liblapack.so.3      libz.a
libblas.so.3.7.0               libboost_regex.so.1.57.0            liblapack.so.3.7.0  libz.so
libboost_filesystem.so         libboost_system.so                  libscotch.a         libz.so.1
libboost_filesystem.so.1.57.0  libboost_system.so.1.57.0           libscotcherr.a      libz.so.1.2.7
libboost_iostreams.so          libboost_thread.so                  libscotcherrexit.a  pkgconfig
libboost_iostreams.so.1.57.0   libboost_thread.so.1.57.0           libscotchmetis.a
This folder is about 1.5GB by the way.
I have exactly the same contents in my ThirdParty/dist/lib directory after building the third party dependencies. The resulting files are nowhere near as large as yours. I assume the very large size of the folder has something to do with the static libraries being very large, although I'm not sure why they would be so big.
I think the point you make about building of shared libraries being set to OFF in the CMakeLists.txt file for lapack shouldn't be an issue. If you look in $src/cmake/ThirdPartyBlasLapack.cmake, you should see in the EXTERNAL_PROJECT_ADD command that it is configuring lapack using CMake and specifying -DBUILD_SHARED_LIBS:STRING=ON. You should also be able to verify that lapack was, indeed, configured with this parameter by looking in the CMakeCache.txt file in $build/ThirdParty/lapack-3.7.0/.
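The CMakeCache.txt check described above can be scripted. This is a minimal sketch, not part of the Nektar++ build system: cache_value is a hypothetical helper name, and it assumes the usual KEY:TYPE=VALUE layout of CMakeCache.txt entries.

```shell
# cache_value FILE KEY: print the value stored for KEY in a CMakeCache.txt-style
# file, whose entries have the form KEY:TYPE=VALUE.
# Hypothetical helper for illustration only.
cache_value() {
    sed -n "s/^$2:[^=]*=//p" "$1"
}

# Example use (path taken from the message above; adjust $build to your tree):
#   cache_value "$build/ThirdParty/lapack-3.7.0/CMakeCache.txt" BUILD_SHARED_LIBS
```

If the third party lapack was configured as expected, the example call should print ON.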
However, "nektar/build/ThirdParty/dist/include" folder doesn't have lapack related headers:
bgqdev-fen1-$ ls
boost  scotchf.h  scotch.h  tinystr.h  tinyxml.h  zconf.h  zlib.h
bgqdev-fen1-$ pwd
/scinet/bgq/Applications/nektar/nektar/build/ThirdParty/dist/include
I also see exactly the same in my include folder, there are no lapack related headers.
Also, I can share the initial parts of the lapack build - in this version I tried to reference to the system blas for lapack installation:
[ 6%] Performing configure step for 'lapack-3.7.0'
cd /scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/lapack-3.7.0 && /gpfs/home/scinet/bgq/tools/cmake/2.8.12.1/bin/cmake -G "Unix Makefiles" -DCMAKE_Fortran_COMPILER:FILEPATH=/scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -DCMAKE_INSTALL_PREFIX:PATH=/scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/dist -DCMAKE_INSTALL_LIBDIR:PATH=/scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/dist/lib -DBUILD_SHARED_LIBS:STRING=ON -DBUILD_TESTING:STRING=OFF /scinet/bgq/Applications/nektar/nektar/ThirdParty/lapack-3.7.0
Re-run cmake no build system arguments
-- Setting build type to 'Release' as none was specified.
-- The Fortran compiler identification is GNU
-- Check for working Fortran compiler: /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90
-- Check for working Fortran compiler: /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -- works
-- Detecting Fortran compiler ABI info
-- Detecting Fortran compiler ABI info - done
-- Checking whether /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 supports Fortran 90
-- Checking whether /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 supports Fortran 90 -- yes
-- Looking for Python greater than 2.6 -
-- Could NOT find PythonInterp: Found unsuitable version "2.6.6", but required is at least "2.7" (found /usr/bin/python2)
-- No suitable Python version found, so skipping summary tests.
-- Reducing RELEASE optimization level to O2
-- Looking for Fortran NONE - found
-- Looking for Fortran INT_CPU_TIME - found
-- Looking for Fortran EXT_ETIME - not found
-- Looking for Fortran EXT_ETIME_ - not found
-- Looking for Fortran INT_ETIME - found
-- --> Will use second_INT_ETIME.f and dsecnd_INT_ETIME.f as timing function.
-- Using supplied NETLIB BLAS implementation
-- Using supplied NETLIB LAPACK implementation
-- Building Single Precision
-- Building Double Precision
-- Building Complex Precision
-- Building Double Complex Precision
-- BUILD TESTING : OFF
-- Configuring done
-- Generating done
-- Build files have been written to: /scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/lapack-3.7.0
Again, I see exactly the same output for configuration of lapack. However, when I initially ran this, the build system was picking up my standard C/C++/Fortran compilers, so it was using gfortran rather than the MPI version. I reconfigured and rebuilt from scratch, specifically telling the build system to use mpicc and mpic++ and setting -DCMAKE_Fortran_COMPILER to point to mpif90. After this I see the same as you have shown above, and the build again completes successfully.
Additionally, I can see "dgemm" in the log.make:
bgqdev-fen1-$ grep -rn "dgemm" nektar/build-gcc/log.make.2
13756:[ 3%] Building Fortran object BLAS/SRC/CMakeFiles/blas.dir/dgemm.f.o
13757:cd /scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/lapack-3.7.0/BLAS/SRC && /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -Dblas_EXPORTS -O2 -fPIC -c /scinet/bgq/Applications/nektar/nektar/ThirdParty/lapack-3.7.0/BLAS/SRC/dgemm.f -o CMakeFiles/blas.dir/dgemm.f.o
14018:/scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -fPIC -O2 -Wl,-rpath=/bgsys/drivers/ppcfloor/comm/lib/libmpichf90-gcc.so.8 -shared -Wl,-soname,libblas.so.3 -o ../../lib/libblas.so.3.7.0 CMakeFiles/blas.dir/isamax.f.o CMakeFiles/blas.dir/sasum.f.o CMakeFiles/blas.dir/saxpy.f.o CMakeFiles/blas.dir/scopy.f.o ............... CMakeFiles/blas.dir/dgemm.f.o
This is the part that compilation fails:
[ 34%] Building CXX object utilities/NekMesh/CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/ElUtil.cpp.o
.........
.........
/........./bgq/compilers/gcc/4.8.1/bin/../lib/gcc/powerpc64-bgq-linux/4.8.1/../../../../powerpc64-bgq-linux/bin/ld: warning: libmpichf90-gcc.so.8, needed by /scinet/bgq/Applications/nektar/nektar/build/ThirdParty/dist/lib/libblas.so, not found (try using -rpath or -rpath-link)
.........
NodeOpti.cpp:(.text+0x1784): undefined reference to `dgeev_'
.........
NodeOpti.cpp:(.text+0x25b4): undefined reference to `dgemm_'
.........
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgetri_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpptrs_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgbtrs_'
collect2: error: ld returned 1 exit status
make[2]: *** [utilities/NekMesh/NekMesh] Error 1
I've tried many things;
xx enabled configure option "build_shared_libs" in CMakeLists.txt in "ThirdParty/lapack"
xx made a copy of "make.inc.example" in ThirdParty/lapack and reduced optimization levels
xx since this is a Blue Gene environment made reference to ESSL instead of BLAS
But none of it seems to make a difference. It always fails at the exact same step.
This "libmpichf90-gcc.so.8" warning seems a bit odd to me and I am not sure if it has anything to do with the undefined reference error. I created a symlink to this library and added it to "LD_LIBRARY_PATH" as well, but then it failed with the message "undefined symbol: _cnkspi_MemoryRegionCacheLastAccessedElementNumber" from "libpami-gcc.so", where PAMI is a lower level messaging API by IBM. Also, "cnkspi" sounds far too low level, because "CNK" is the kernel on the compute nodes and "SPI" is the implementation that allows communication with that kernel. I added a linker flag "-Wl,-rpath" but I guess it only makes things uglier.
bgqdev-fen1-$ readelf -d nektar/build-gcc/ThirdParty/dist/lib/libblas.so | grep NEEDED
0x0000000000000001 (NEEDED) Shared library: [libmpichf90-gcc.so.8]
0x0000000000000001 (NEEDED) Shared library: [libmpich-gcc.so.8]
0x0000000000000001 (NEEDED) Shared library: [libopa-gcc.so.0]
0x0000000000000001 (NEEDED) Shared library: [libmpl-gcc.so.1]
0x0000000000000001 (NEEDED) Shared library: [libpami-gcc.so]
0x0000000000000001 (NEEDED) Shared library: [librt.so.1]
0x0000000000000001 (NEEDED) Shared library: [libpthread.so.0]
0x0000000000000001 (NEEDED) Shared library: [libstdc++.so.6]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x0000000000000001 (NEEDED) Shared library: [libnss_files.so.2]
0x0000000000000001 (NEEDED) Shared library: [libnss_dns.so.2]
0x0000000000000001 (NEEDED) Shared library: [libresolv.so.2]
0x0000000000000001 (NEEDED) Shared library: [libgfortran.so.3]
0x0000000000000001 (NEEDED) Shared library: [libm.so.6]
xx As an alternative, I switched to static linking. I initially changed "NEKTAR_LIBRARY_TYPE" to "STATIC" in the "CMakeLists.txt".
This is the one area where I have a number of differences to you. I'm not sure that switching to static linking is likely to make much difference (although I could be wrong) however I note that your blas library seems to require various mpich libraries. My libblas.so library only lists libm.so.6 and libgfortran.so.3 as "NEEDED". Can you also provide the RPATH value that you get from readelf -d for this library...
If you could also provide your output of readelf -d for library/LibUtilities/libLibUtilities.so, that would be useful. My libLibUtilities.so needs a few boost libraries as well as libz, libblas, liblapack, libmpich, libpthread, libgcc_s libc, libstdc++ and libm.
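To make the readelf comparison easier to share on the list, the dynamic section can be reduced to bare library names and search paths. The following is only a sketch: needed_libs and rpath_entries are hypothetical helper names, and both assume the bracketed "(NEEDED) ... [name]" and "(RPATH)" / "(RUNPATH)" line format that readelf -d prints.

```shell
# needed_libs: read `readelf -d` output on stdin and print one NEEDED
# library name per line. Hypothetical helper for illustration only.
needed_libs() {
    sed -n 's/.*(NEEDED).*\[\(.*\)\].*/\1/p'
}

# rpath_entries: read `readelf -d` output on stdin and print the RPATH and
# RUNPATH values, if any are recorded.
rpath_entries() {
    sed -n -e 's/.*(RPATH).*\[\(.*\)\].*/\1/p' \
           -e 's/.*(RUNPATH).*\[\(.*\)\].*/\1/p'
}

# Example use (library path is an assumption based on the build tree above):
#   readelf -d ThirdParty/dist/lib/libblas.so | needed_libs
#   readelf -d ThirdParty/dist/lib/libblas.so | rpath_entries
```

Running both filters on libblas.so and libLibUtilities.so gives two short lists that can be compared directly against the ones from a working build.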
xx It seems that some of the ThirdParty libraries are configured with the assumption of shared objects so I changed them as well. For instance, boost is configured with options "link=shared" and "runtime-link=shared" which I set to static. I can see all required boost libs are successfully compiled and written to "build/ThirdParty/dist/lib".
Now this is the cmake command:
cmake $src \
  -DCMAKE_INSTALL_PREFIX=$prf \
  -DNEKTAR_USE_MPI=ON \
  -DNEKTAR_USE_SYSTEM_BLAS_LAPACK=ON \
  -DNATIVE_BLAS:FILEPATH=${SCINET_LAPACK_LIB}/libblas.a \
  -DNATIVE_BLAS_LIB_DIR:FILEPATH=${SCINET_LAPACK_LIB} \
  -DNATIVE_LAPACK:FILEPATH=${SCINET_LAPACK_LIB}/liblapack.a \
  -DNATIVE_LAPACK_LIB_DIR:FILEPATH=${SCINET_LAPACK_LIB} \
  -DNEKTAR_USE_FFTW=ON \
  -DFFTW_INCLUDE_DIR=${SCINET_FFTW_INC} \
  -DFFTW_LIBRARY=${SCINET_FFTW_LIB}/libfftw3.a \
  -DBoost_NO_SYSTEM_PATHS:BOOL=TRUE \
  -DZLIB_INCLUDE_DIR:PATH=${SCINET_ZLIB_INC} \
  -DZLIB_LIBRARY:FILEPATH=${SCINET_ZLIB_LIB}/libz.a
The issue now is that the installer seems to ignore "-DBoost_NO_SYSTEM_PATHS:BOOL=TRUE" and searches locations other than "BOOST_ROOT", which I set to "nektar/build-gcc/dist".
See for instance:
[ 5%] Building CXX object library/LibUtilities/CMakeFiles/LibUtilities.dir/BasicUtils/ArrayEqualityComparison.cpp.o
cd /scinet/bgq/Applications/nektar/nektar++-4.4.1/build-gcc-static/library/LibUtilities && /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpicxx -DLIB_UTILITIES_EXPORTS -DNEKTAR_MEMORY_POOL_ENABLED -DNEKTAR_USE_MPI -DNEKTAR_USING_BLAS -DNEKTAR_USING_LAPACK -DNEKTAR_VERSION=\"4.4.1\" -DTIXML_USE_STL -O3 -DNDEBUG -Wall -Wno-deprecated -Wno-sign-compare -DNEKTAR_RELEASE -isystem /scinet/bgq/Applications/nektar/nektar++-4.4.1/build-gcc-static/ThirdParty/dist/include -isystem /scinet/bgq/Libraries/zlib-1.2.7-gcc4.8.1/include -isystem /scinet/bgq/Applications/nektar/nektar++-4.4.1/ThirdParty/dist/include -isystem /scinet/bgq/Libraries/fftw-3.3.5-gcc/include -I/scinet/bgq/Applications/nektar/nektar++-4.4.1 -I/scinet/bgq/Applications/nektar/nektar++-4.4.1/library -o CMakeFiles/LibUtilities.dir/BasicUtils/ArrayEqualityComparison.cpp.o -c /scinet/bgq/Applications/nektar/nektar++-4.4.1/library/LibUtilities/BasicUtils/ArrayEqualityComparison.cpp
In file included from /bgsys/linux/ionfloor/usr/include/boost/config.hpp:57:0,
                 from /bgsys/linux/ionfloor/usr/include/boost/cstdint.hpp:26,
                 from /scinet/bgq/Applications/nektar/nektar++-4.4.1/library/LibUtilities/BasicConst/NektarUnivTypeDefs.hpp:40,
So the main question is: Why does it check "/usr/include/boost" when "cstdint.hpp" already exists in "build/dist/include/boost/"?
bgqdev-fen1-$ ls build-gcc/dist/include/boost/cstdint.hpp -l
-rw-r--r-- 1 fertinaz scinet 18017 Nov 14 19:00 build-gcc/dist/include/boost/cstdint.hpp
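One way to narrow this down is to walk the include directories from the compile line and see which of them actually holds the boost header. A minimal sketch, assuming the caller lists the directories in the order the compiler searches them (for GCC: -I directories, then -isystem directories, then the standard system directories); first_dir_with_header is a hypothetical helper name.

```shell
# first_dir_with_header HEADER DIR...: print the first DIR that contains
# HEADER and return 0; return 1 if none of the directories has it.
# Hypothetical helper for illustration only.
first_dir_with_header() {
    header=$1
    shift
    for dir in "$@"; do
        if [ -f "$dir/$header" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Example use (directories copied from the compile command above):
#   first_dir_with_header boost/cstdint.hpp \
#       /scinet/bgq/Applications/nektar/nektar++-4.4.1 \
#       /scinet/bgq/Applications/nektar/nektar++-4.4.1/library \
#       /scinet/bgq/Applications/nektar/nektar++-4.4.1/build-gcc-static/ThirdParty/dist/include \
#       /scinet/bgq/Applications/nektar/nektar++-4.4.1/ThirdParty/dist/include
```

If the function finds nothing in the directories passed on the command line, the compiler falls through to its system include path, which would be consistent with the header being picked up from /bgsys/linux/ionfloor/usr/include.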
This is how it finally fails:
/bgsys/linux/ionfloor/usr/include/boost/archive/iterators/binary_from_base64.hpp:52:9: warning: narrowing conversion of ‘-1’ from ‘int’ to ‘const char’ inside { } is ill-formed in C++11 [-Wnarrowing]
make[2]: *** [library/LibUtilities/CMakeFiles/LibUtilities.dir/BasicUtils/CompressData.cpp.o] Error 1
It doesn't help to change the boost code from "const char lookup_table" to "signed char lookup_table" because then "switch-case" statement that returns the endianness information fails in the following file: "nektar/library/LibUtilities/BasicUtils/CompressData.cpp"
As you can guess, I disabled the switch-case block, and returned the value, but it fails anyway...
Sorry for the long message, hope you could follow. I've run out of ideas and any suggestion is highly appreciated....
// Fatih
I'd be inclined to stick with the third party boost and lapack and see if we can find a solution to that. I can't see what C/C++ compilers you're using but have you tried forcing the use of mpicc and mpic++ as shown when running cmake in the instructions at https://www.nektar.info/nektar-on-mira-cluster/? (in fact, I see above that in your log output for the build command for ArrayEqualityComparison.cpp, it looks like it's using mpicxx)
I'm not sure why libblas.so is linking in libmpichf90-gcc.so.8 but I'm assuming this is the core of the problem. Can you confirm what settings you're using to get the additional logging output that you're showing (with the build commands) - is it just -vv? I can then try and run the same and see if I can provide any further suggestions. I'm not clear at the moment but I'm assuming the undefined reference errors are a result of trying to link in libblas.so and that library itself having an unresolved dependency on libmpichf90. It might be that the rpath settings can be modified to take account of this.
Can you provide the output of running ldd on libblas.so, liblapack.so and libLibUtilities.so?
Thanks,
Jeremy
On Sun, Oct 14, 2018 at 7:42 AM Jeremy Cohen <jeremy.cohen@imperial.ac.uk> wrote:
Hi Amitvikram,
I would certainly try Chris's suggestion. However, something else to check is where you're getting the third party downloads from.
If you take a clean Nektar++ source tree and place the standard netlib lapack-3.7.0.tgz source file that the build system downloads into $NEKTAR_HOME/ThirdParty (i.e. the download from http://www.netlib.org/lapack/lapack-3.7.0.tgz), the build should proceed successfully.
It looks like the lapack tar file that you're using may already have some build artefacts in it - did you tar the content from $NEKTAR_HOME/build/ThirdParty/lapack-3.7.0 into a lapack-3.7.0.tgz file, or are you working with the standard .tgz file from the netlib.org site?
Cheers, Jeremy
On 13 Oct 2018, at 21:19, Chris Cantwell <c.cantwell@imperial.ac.uk> wrote:
Hi Amitvikram,
Some sites block non-SSL enabled HTTP traffic, returning a webpage reporting the error rather than the actual file (hence the hash mismatch).
You could try turning on the THIRDPARTY_USE_SSL option to see if that is allowed.
Cheers, Chris
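The situation Chris describes (an error page saved under the tarball's name) can be checked before starting the build. The sketch below is only an illustration and not part of the Nektar++ build system: is_gzip is a hypothetical helper name, and it assumes a POSIX shell with od and tr. It tests whether a file starts with the gzip magic bytes 1f 8b, which an HTML error page would not.

```shell
# is_gzip FILE: succeed if FILE begins with the gzip magic bytes 1f 8b.
# Hypothetical helper for illustration only.
is_gzip() {
    magic=$(od -An -tx1 -N2 "$1" | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Example use (the path is an assumption about where the tarball was placed):
#   is_gzip "$NEKTAR_HOME/ThirdParty/lapack-3.7.0.tgz" || echo "not a gzip archive"
```

A file that fails this check will also fail the hash comparison, so it is a quick way to tell a blocked download apart from a genuinely corrupted one.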
On Fri, 12 Oct 2018 15:15:10 -0400, Amitvikram Dutta <amitvdutta23@gmail.com> wrote:
Hi all,
I had to use a slight workaround because the platform I was compiling on could not download the thirdparty files correctly and would always give a hash mismatch error. I downloaded the third party files on my home machine and re-uploaded them into the $NEKTAR_HOME folder. As a result, when I begin the build process the ThirdParty folder is populated with the zipped versions of the third party software. Is this what might be causing the problem? Is there any way to edit the cmake file to get the compilation process working correctly?
On Fri, Oct 12, 2018 at 2:57 PM Amitvikram Dutta < amitvikram.dutta@uwaterloo.ca> wrote:
*From:* nektar-users-bounces@imperial.ac.uk On Behalf Of Jeremy Cohen *Sent:* October 12, 2018 2:56:43 PM (UTC-05:00) Eastern Time (US & Canada)
*To:* Amitvikram Dutta *Cc:* nektar-users *Subject:* Re: [Nektar-users] Problem while installing nektar++ with lapack
Hi Amitvikram,
Have you attempted to build lapack separately at any point? It's probably worth clearing out your build directory and also all the contents of the ThirdParty directory in the base nektar++ source directory, which I'll call $NEKTAR_HOME, so $NEKTAR_HOME/ThirdParty and then re-running cmake in an empty $NEKTAR_HOME/build directory and trying the build again.
It looks like the build step is encountering a previous source tree in the location where it's trying to build which seems strange.
I've just had a look at the log from my clean build and I see exactly the same messages as you in relation to lapack-3.7.0 in the same order as far as "Checking whether /usr/bin/f95 supports Fortran 90 -- yes", however, I then see "-- Looking for Python greater than 2.6 - " and the build of lapack completes successfully.
Just to confirm, I am running cmake and make in a separate build directory under the main nektar++ source tree directory, so I'm building in $NEKTAR_HOME/build - I assume you're doing something similar? You should see a ThirdParty directory in $NEKTAR_HOME and another ThirdParty directory in $NEKTAR_HOME/build/
I believe that the initial download of the lapack-3.7.0.tar.gz should be placed in $NEKTAR_HOME/ThirdParty and unpacked there. Then, when the build succeeds or stops, in $NEKTAR_HOME/build/ThirdParty, you should see lapack-3.7.0/ where I think the build actually takes place, and then a separate $NEKTAR_HOME/build/ThirdParty/lapack-3.7.0-tmp which should contain a couple of lapack-3.7.0-cfgcmd.txt files that I think contain the build command that is used - you could perhaps paste the contents of the lapack-3.7.0-cfgcmd.txt into an email if you're still having issues and we can see if that looks correct.
It is, of course, possible that this is something related to the specific configuration of the platform that you're building on, but I think the third party lapack build should be straightforward and it sounds like for some reason, it's attempting to build in the wrong location, or a location where an existing source tree has ended up for some reason.
I'm afraid I don't have a very detailed knowledge of the build system beyond this so if none of the suggestions so far help you to resolve the problem, maybe someone with more knowledge of the build system can provide some advice.
Cheers, Jeremy
On 12 Oct 2018, at 19:34, Amitvikram Dutta <amitvdutta23@gmail.com> wrote:
Hi all,
I tried to compile nektar using Jeremy's latest suggestions, with both THIRDPARTY_BUILD_BLAS_LAPACK and NEKTAR_USE_SYSTEM_BLAS_LAPACK turned on. The following error occurred. It seems that I might have to compile lapack separately. Is this unusual?
<image.png>
On Fri, Oct 12, 2018 at 1:16 PM Amitvikram Dutta < amitvikram.dutta@uwaterloo.ca> wrote:
> *From:* nektar-users-bounces@imperial.ac.uk On Behalf Of Jeremy Cohen
> *Sent:* October 12, 2018 1:16:14 PM (UTC-05:00) Eastern Time (US & Canada)
> *To:* Amitvikram Dutta
> *Cc:* nektar-users
> *Subject:* Re: [Nektar-users] Problem while installing nektar++ with lapack
>
> Hi Amitvikram,
>
> As Chris suggests, it's probably better to use vendor supplied libraries if you can get those working.
>
> In addition to the further information Chris has asked to take a look at, one thing you could check is whether there are any files in your nektar++/build/ThirdParty/lapack-3.7.0 directory (if that directory exists at all).
>
> I've been trying to see if I can recreate the problem and I was able to see something similar when setting THIRDPARTY_BUILD_BLAS_LAPACK=ON and NEKTAR_USE_SYSTEM_BLAS_LAPACK=OFF. In this case, I can configure successfully and start the build but it fails with a large number of undefined references that are similar to, and include, the dtpmv_ symbol that you mentioned. When I look in nektar++/build/ThirdParty/lapack-3.7.0, the directory is empty so it looks like the build system has configured on the basis of building its own blas/lapack but the build hasn't been carried out and therefore LibUtilities can't be linked against it.
>
> As a test, you could try running the build with both THIRDPARTY_BUILD_BLAS_LAPACK=ON and NEKTAR_USE_SYSTEM_BLAS_LAPACK=ON, if this isn't the setting you've been using already. When I tried this, the build of blas/lapack is carried out successfully and the linking is fine with the full build of Nektar++ completing successfully. I removed the system blas/lapack on my test system to be sure it was linking against the correct instance.
>
> Cheers,
> Jeremy
>
> On 12 Oct 2018, at 17:50, Chris Cantwell <c.cantwell@imperial.ac.uk> wrote:
>
> Hi Amitvikram,
>
> Could you send us your CMakeCache.txt file from your build directory and the output from running:
> make VERBOSE=1
> for both cases.
>
> In the case of using ThirdParty LAPACK, it seems to not be linking to it. Probably you should be using vendor-supplied libraries if possible though so better if we can get those working.
>
> Thanks,
> Chris
>
> On 12 October 2018 14:08:55 BST, Amitvikram Dutta <amitvdutta23@gmail.com> wrote:
>>
>> Hi Jeremy,
>>
>> I'm actually trying to build nektar++ on a BGQ cluster similar to Mira.
>>
>> I'm trying to build nektar++-4.4.1 and the system lapack version is 3.4.2
>>
>> Sincerely,
>> On Fri, Oct 12, 2018 at 4:24 AM Amitvikram Dutta <amitvikram.dutta@uwaterloo.ca> wrote:
>>
>>> ------------------------------
>>> *From:* nektar-users-bounces@imperial.ac.uk On Behalf Of Jeremy Cohen
>>> *Sent:* October 12, 2018 4:24:33 AM (UTC-05:00) Eastern Time (US & Canada)
>>> *To:* Amitvikram Dutta
>>> *Cc:* nektar-users
>>> *Subject:* Re: [Nektar-users] Problem while installing nektar++ with lapack
>>>
>>> Hi Amitvikram,
>>>
>>> Can you provide some further details of the problem you're encountering.
>>>
>>> Specifically, can you confirm what platform (including version) you're building on, and if Linux, which I assume is the platform you're using, which distribution.
>>>
>>> Can you also confirm what version of Nektar++ you're trying to build, and the version of the system Lapack distribution that you're using.
>>>
>>> Thanks,
>>>
>>> Jeremy
>>>
>>> On 12 Oct 2018, at 01:05, Amitvikram Dutta <amitvdutta23@gmail.com> wrote:
>>>
>>> Hi all,
>>>
>>> I keep having the same problem while trying to install nektar++ with regards to the Lapack libraries.
>>>
>>> When I try to use the system Lapack installation I get the following message
>>>
>>> */scinet/bgq/Libraries/lapack/lib/liblapack.so: undefined reference to `_xlfEndIO@XLF_1.0'*
>>>
>>> while when I try to install using the ThirdParty Lapack supplied with the nektar++ source directory I get the following error
>>>
>>> *../../library/LibUtilities/libLibUtilities.so.4.4.1: undefined reference to `dtpmv_'*
>>>
>>> I have a feeling these errors have been encountered by the community at large before. Could someone point out where I'm going wrong?
>>>
>>> Sincerely,
>>> --
>>> *Amitvikram Dutta*
>>> Graduate Research Assistant
>>> Fluid Mechanics Research Lab
>>> Multi-Physics Interaction Lab
>>> University of Waterloo
>>> _______________________________________________
>>> Nektar-users mailing list
>>> Nektar-users@imperial.ac.uk
>>> https://mailman.ic.ac.uk/mailman/listinfo/nektar-users
>>
>> --
>> *Amitvikram Dutta*
>> Graduate Research Assistant
>> Fluid Mechanics Research Lab
>> Multi-Physics Interaction Lab
>> University of Waterloo
>
> --
> Chris Cantwell
> Imperial College London
> South Kensington Campus
> London SW7 2AZ
> Email: c.cantwell@imperial.ac.uk
> www.imperial.ac.uk/people/c.cantwell
>
--
*Amitvikram Dutta*
Graduate Research Assistant
Fluid Mechanics Research Lab
Multi-Physics Interaction Lab
University of Waterloo
-- Chris Cantwell Imperial College London South Kensington Campus London SW7 2AZ Email: c.cantwell@imperial.ac.uk www.imperial.ac.uk/people/c.cantwell
_______________________________________________ Nektar-users mailing list Nektar-users@imperial.ac.uk https://mailman.ic.ac.uk/mailman/listinfo/nektar-users
<CMakeLists.txt>
Hello Jeremy
I think this has been resolved now. I had never seen the issue ticket you sent; however, I'm impressed by how similarly the ticket owner expresses the problem and seeks a workaround. So thank you for sharing it, because it inspired a lot. I didn't find the exact solution there, since mfem's implementation is different from Nektar's. However, at some point it made me realize that the code responsible for the naming transformation is defined (in nektar++-4.4.1) under:
$NEK_PATH/library/LibUtilities/LinearAlgebra/TransF77.hpp
I basically added PowerPC to the macro, and that seems to be the solution. There was one single Lapack function call in "DriverArnoldiModified.cpp" with an underscore that needed an explicit modification, but other than that these two were the only source modifications I had to make. I had to make lots of changes to the CMake files and eventually decided to compile the Third-Party libraries completely separately from scratch.
Thank you for your help.
// Fatih
On Wed, Feb 20, 2019 at 10:14 AM Jeremy Cohen <jeremy.cohen@imperial.ac.uk> wrote:
Hi Fatih,
As you suggest, it looks like this is a name mangling problem and I suspect it must be something specific to the BGQ platform. There are others on the nektar list who are much more experienced with CMake than me and have more knowledge of the history of the codebase so maybe someone else can explain the now missing "NEKTAR_LIB_UTILITIES_LINEAR_ALGEBRA_DGEMM_OVERRIDE_HPP" preprocessor directive that you mention and the missing DgemmOverride.hpp header.
Aside from this, I don't know if you've seen the following: https://github.com/mfem/mfem/issues/397 While this is for a completely unrelated library, it looks like it's describing a similar problem to what you're experiencing. That thread also links to the hypre repository, providing an example of using "configureable macros" for name mangling. Don't know if this is of any help but it might be worth investigating.
Regards,
Jeremy
On 12 Feb 2019, at 22:03, Fatih Ertinaz <fertinaz@gmail.com> wrote:
Hello Jeremy
I appreciate your detailed response and sorry for my late reply. I came back to this issue over the weekend, made certain changes and achieved some progress, so I wanted to share the current status.
I am convinced that the problem is related to linking BLAS and LAPACK, and that it is probably specific to this platform -- Blue Gene Q -- because I have managed to run nektar successfully on many different platforms and never encountered this issue. I am using the latest git repo for nektar by the way.
I compiled some of the third-party libraries (boost, scotch and blas & lapack) separately under the directory I created "nektar/ThirdParty_compiled". This resolved earlier problems with boost. Also, when I check blas & lapack functions that are used by nektar, I can find their references in their libraries:
bgqdev-fen1-$ nm libblas.a | grep -i dgemm
dgemm.f.o:
0000000000000000 D dgemm
bgqdev-fen1-$ nm liblapack.a | grep -i dgeev
dgeev.f.o:
0000000000000000 D dgeev
dgeevx.f.o:
0000000000000000 D dgeevx
As you can see, those are static libs, because when shared objects are used cmake doesn't detect BLAS (I don't know why -- maybe BGQ) even though full paths are provided. With static libs, BLAS is at least detected by cmake (it prints "A library with BLAS API found" -- see below).
On the other hand, I compiled boost with shared libs, and cmake recognizes them correctly.
I also want to emphasize that, mangled names don't appear in the BLAS-LAPACK libs. So for instance "dgemm_" doesn't exist. And this is why the installation fails.
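A quick way to confirm which naming convention a BLAS/LAPACK build exports is to look at the nm output for a single routine. The sketch below is illustrative only: fortran_symbol_style is a hypothetical helper name, it assumes the three-column nm format seen above (address, type, name), and it treats any symbol type other than U as defined.

```shell
# fortran_symbol_style NAME: read `nm` output on stdin and report how the
# Fortran routine NAME is exported:
#   underscore  a defined symbol NAME_ exists (what the Nektar++ link step asks for)
#   plain       only a defined symbol NAME exists, with no trailing underscore
#   missing     neither form is defined in the input
# Hypothetical helper for illustration only.
fortran_symbol_style() {
    awk -v s="$1" '
        NF == 3 && $2 != "U" && $3 == s         { plain = 1 }
        NF == 3 && $2 != "U" && $3 == (s "_")   { underscore = 1 }
        END {
            if (underscore)  print "underscore"
            else if (plain)  print "plain"
            else             print "missing"
        }'
}

# Example use:
#   nm libblas.a | fortran_symbol_style dgemm
```

For the libraries shown above the expected answer is "plain", which matches the undefined references to dgemm_ and dgeev_ at link time.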
This is from cmake -- I hacked cmake to seek "dgemm" as well, but it is also not found:
-- Looking for dgemm_
-- Looking for dgemm_ - not found
-- Looking for Fortran sgemm
-- Looking for Fortran sgemm - found
-- A library with BLAS API found.
-- Looking for Fortran cheev
-- Looking for Fortran cheev - found
Before going forward, I'd like to make a suggestion for these types of checks. CMake documentation says, "Prefer using CHECK_SYMBOL_EXISTS instead of this module..." referring to CHECK_FUNCTION_EXISTS which is used by Nektar at the moment. There are certain types of implementations which cannot be detected by CHECK_FUNCTION_EXISTS. I didn't test it though. For further details: https://cmake.org/cmake/help/v3.8/module/CheckFunctionExists.html
Anyway this is where installation fails:
[ 26%] Built target LocalRegions
Linking CXX executable NekMesh
/ess01/homebgq/scinet/bgq/compilers/gcc/4.8.1/bin/../lib/gcc/powerpc64-bgq-linux/4.8.1/../../../../powerpc64-bgq-linux/bin/ld: warning: libboost_atomic.so.1.57.0, needed by /scinet/bgq/Applications/nektar/ThirdParty_compiled/boost_1_57_0/install/lib/libboost_thread.so, not found (try using -rpath or -rpath-link)
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*) [clone .part.27]':
NodeOpti.cpp:(.text+0x1784): undefined reference to `dgeev_'
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `std::ctype<char>::widen(char) const [clone .part.33]':
NodeOpti.cpp:(.text+0x1e6c): undefined reference to `dgeev_'
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `boost::exception_detail::error_info_injector<std::domain_error>::~error_info_injector()':
NodeOpti.cpp:(.text+0x25b4): undefined reference to `dgemm_'
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `boost::exception_detail::error_info_injector<std::overflow_error>::~error_info_injector()':
NodeOpti.cpp:(.text+0x44a4): undefined reference to `dgemm_'
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `boost::system::system_error::what() const':
NodeOpti.cpp:(.text._ZN6Nektar9Utilities8NodeOpti13GetFunctionalILi2EEEdRdb[_ZN6Nektar9Utilities8NodeOpti13GetFunctionalILi2EEEdRdb]+0x2e8): undefined reference to `dgemm_'
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `boost::io::basic_altstringbuf<char, std::char_traits<char>, std::allocator<char> >::~basic_altstringbuf()':
NodeOpti.cpp:(.text._ZN6Nektar9Utilities8NodeOpti13GetFunctionalILi3EEEdRdb[_ZN6Nektar9Utilities8NodeOpti13GetFunctionalILi3EEEdRdb]+0x318): undefined reference to `dgemm_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dtptrs_'
../../library/LocalRegions/libLocalRegions.so.4.5.0: undefined reference to `daxpy_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpbtrf_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dsptrf_'
../../library/LocalRegions/libLocalRegions.so.4.5.0: undefined reference to `dgemv_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgbmv_'
../../library/LocalRegions/libLocalRegions.so.4.5.0: undefined reference to `dscal_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dtpmv_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgetrf_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgbtrf_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpbtrs_'
../../library/StdRegions/libStdRegions.so.4.5.0: undefined reference to `dcopy_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dsptrs_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgetrs_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpptrf_'
../../library/LocalRegions/libLocalRegions.so.4.5.0: undefined reference to `ddot_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dspmv_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dsptri_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgetri_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpptrs_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgbtrs_'
collect2: error: ld returned 1 exit status
make[2]: *** [utilities/NekMesh/NekMesh] Error 1
make[1]: *** [utilities/NekMesh/CMakeFiles/NekMesh.dir/all] Error 2
make: *** [all] Error 2
However, here is what a search for dgemm in the source returns:
grep -irn "dgemm" *
nektar/utilities/NekMesh/ProcessModules/ProcessVarOpti/Evaluator.hxx: *Blas::Dgemm*('N', 'N', pts, DIM * nElmt, ptsStd, 1.0,
By the way, as suggested in earlier messages in this thread, I enabled both -DTHIRDPARTY_BUILD_BLAS_LAPACK and -DNEKTAR_USE_SYSTEM_BLAS_LAPACK. Just in case, I've attached the CMakeLists.txt file as well for the other settings.
So the question is, why is it trying to reference the mangled names? Is it cmake causing the mess, or the compiler? And how can this be tackled? When I grep for "dgemm_", I get nothing except the object files.
Moreover, in earlier versions of Nektar (4.0.0 and 4.3.5, as far as I checked), there is a preprocessor definition called "NEKTAR_LIB_UTILITIES_LINEAR_ALGEBRA_DGEMM_OVERRIDE_HPP". This definition doesn't exist in the most recent git repo. However, I can still find Blas.hpp with the wrapper functions under "LibUtilities/LinearAlgebra". "DgemmOverride.hpp" no longer exists either. So could portability have been broken at some point?
Thanks very much
// Fatih
On Mon, Nov 19, 2018 at 8:12 AM Jeremy Cohen <jeremy.cohen@imperial.ac.uk> wrote:
Hi Fatih,
Sorry for the delay in getting back to you on this. I'm afraid I don't have an immediate answer to the problem you're experiencing but I've done some investigations into the issues you're having so hopefully some feedback on these may provide some helpful information:
On 15 Nov 2018, at 06:35, Fatih Ertinaz <fertinaz@gmail.com> wrote:
Hello everyone,
I am working on the same task as Amitvikram, on the same cluster and currently having the exact same problem: *undefined reference to lapack libs even though they are compiled successfully*.
I've read through this thread as well as some others, so here is a brief summary about what I've done so far before asking some questions.
-- System info: Using cmake-2.8.12, cross-compiled gcc-4.8 and mpich-2 on a Blue Gene Q cluster.
-- Nektar version: Decided to use the git repo.
While I can't replicate the environment you're working in directly, I've set up a clean Ubuntu linux container with only a very basic initial set of packages installed. (All of my build attempts using the details provided below complete successfully.)
Within my base container, I've installed gcc-4.8 from packages (gcc-5 is the default version for the Ubuntu version I'm using - 16.04) built and installed cmake 2.8.12 from source and built and installed MPICH2 (1.5) from source.
I'm working with Nektar++ from source, using the master branch.
-- Added "-dynamic" flag to the "CMakeLists.txt" as it was suggested here: https://www.nektar.info/nektar-on-mira-cluster/
-- Boost: I initially used the system-installed boost but then decided to stick to the third-party version shipped with nektar, because some of the required libs (for instance boost_iostreams) weren't part of the central installation. To deal with that, I first set up a partial build by referencing each individual library file explicitly in the cmake command. It does seem to build the required libs successfully, but it later fails during the nektar compilation. I think it messes up the environment and basically links to the wrong files. So anyway, I am using "ThirdParty/boost_1_57_0".
-- Lapack: The reason that I am not using the system lapack is simply that cmake says "dgemm_" is not found in the system blas version. Therefore, I am compiling "ThirdParty/Lapack-3.7.0", which I downloaded from "http://www.netlib.org/lapack/lapack-3.7.0.tgz".
*Note that compilation fails with the same error even when I use ThirdParty/lapack.*
I initially tried without using the dynamic flag but have subsequently tried with the -dynamic flag too.
I'm using ThirdParty boost 1.57 and ThirdParty lapack 3.7.0.
I'm also using ThirdParty Scotch. TinyXML and GSMPI are also built from source as ThirdParty dependencies during the Nektar++ build.
-- FFTW: Using system installed version.
I'm using a system installed FFTW from packages.
-- Download process: I cancelled MD5 checks and downloading with "wget" due to the similar ssl error mentioned before. This is an easy workaround though, and probably has nothing to do with the error. I download all packages to "nektar/ThirdParty" and copy them to "nektar/build/ThirdParty" as well. The reason for this copy operation is that when nektar extracts the downloaded packages, I see that the uncompressed folders are somehow empty. I don't know if that's a cmake bug or a problem on my side. So that's why I download, extract and copy third party sources to "nektar/build/ThirdParty" manually.
Following the discussion on the list with Amitvikram about building when there is no external Internet access to download third party dependencies, I wrote up some of the points that I made about downloading dependencies manually - if you hadn't already seen this, it's on the Nektar++ website at https://www.nektar.info/building-nektar-offline-deps/
-- CMake command:
cmake $src \
    -DCMAKE_INSTALL_PREFIX=$prf \
    -DNEKTAR_USE_MPI=ON \
    -DNEKTAR_USE_SYSTEM_BLAS_LAPACK=ON \
    -DNEKTAR_USE_FFTW=ON \
    -DFFTW_INCLUDE_DIR=${SCINET_FFTW_INC} \
    -DFFTW_LIBRARY=${SCINET_FFTW_LIB}/libfftw3.a \
    -DTHIRDPARTY_BUILD_BLAS_LAPACK=ON
As you can see, I enabled both "-DNEKTAR_USE_SYSTEM_BLAS_LAPACK" and "-DTHIRDPARTY_BUILD_BLAS_LAPACK" as suggested; however, this didn't seem to make a difference for me. Compilation fails at the same step whether or not both are enabled.
I tried configuring using a similar cmake command to that which you've shown here - the only difference for me was that I didn't need to set the FFTW include directory since my FFTW install is in the system include path. I explicitly specified the path to FFTW_LIBRARY although this shouldn't be necessary since the library is, again, in the standard system library path.
-- Build process for Third-Party: In general they are compiled without any errors. In particular, I checked cmake files for each package and Lapack is configured with "-DBUILD_SHARED_LIBS:STRING=ON". I can see that objects are compiled with "-fPIC" option, it is in the cmake. However, "lapack/CMakeLists.txt" contains this line: "option(BUILD_SHARED_LIBS "Build shared libraries" OFF)" which I set to "ON" in my build script.
This is how libraries look in the "nektar/build/ThirdParty/dist/lib" directory after compiling ThirdParty libraries:
bgqdev-fen1-$ ls nektar/build/ThirdParty/dist/lib/
cmake
*libblas.so*
libblas.so.3
libblas.so.3.7.0
libboost_filesystem.so
libboost_filesystem.so.1.57.0
libboost_iostreams.so
libboost_iostreams.so.1.57.0
libboost_program_options.so
libboost_program_options.so.1.57.0
libboost_regex.so
libboost_regex.so.1.57.0
libboost_system.so
libboost_system.so.1.57.0
libboost_thread.so
libboost_thread.so.1.57.0
libgsmpi.a
*liblapack.so*
liblapack.so.3
liblapack.so.3.7.0
libscotch.a
libscotcherr.a
libscotcherrexit.a
libscotchmetis.a
libtinyxml.a
libxxt.a
libz.a
libz.so
libz.so.1
libz.so.1.2.7
pkgconfig
This folder is about 1.5GB by the way.
I have exactly the same contents in my ThirdParty/dist/lib directory after building of the third party dependencies. The resulting files are nowhere near as large as yours, I assume the very large size of the folder is something to do with the static libraries being very large although I'm not sure why they would be so big.
I think the point you make about building of shared libraries being set to OFF in the CMakeLists.txt file for lapack shouldn't be an issue. If you look in $src/cmake/ThirdPartyBlasLapack.cmake, you should see in the EXTERNAL_PROJECT_ADD command that it is configuring lapack using CMake and specifying -DBUILD_SHARED_LIBS:STRING=ON. You should also be able to verify that lapack was, indeed, configured with this parameter by looking in the CMakeCache.txt file in $build/ThirdParty/lapack-3.7.0/.
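The CMakeCache.txt check described above can be sketched as a small shell helper. This is only an illustration: the function name cache_value is made up for the example, and it relies on the standard KEY:TYPE=VALUE layout of CMake cache entries.

```shell
# Print the value of one entry from a CMakeCache.txt file.
# Cache entries have the form KEY:TYPE=VALUE, e.g. BUILD_SHARED_LIBS:STRING=ON.
# cache_value is a hypothetical helper name, not part of CMake or Nektar++.
cache_value() {
    local cache_file="$1" key="$2"
    # Match the key at the start of a line followed by ":TYPE=",
    # then keep everything after the first "=".
    grep -E "^${key}:[A-Z]+=" "$cache_file" | head -n 1 | cut -d= -f2-
}

# Example call (the path depends on your own build directory layout):
#   cache_value "$build/ThirdParty/lapack-3.7.0/CMakeCache.txt" BUILD_SHARED_LIBS
```

If the helper prints ON for BUILD_SHARED_LIBS, lapack was configured to build shared libraries regardless of the default in its own CMakeLists.txt; an empty result means the key is not in that cache file.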
However, "nektar/build/ThirdParty/dist/include" folder doesn't have lapack related headers:
bgqdev-fen1-$ ls
boost scotchf.h scotch.h tinystr.h tinyxml.h zconf.h zlib.h
bgqdev-fen1-$ pwd
/scinet/bgq/Applications/nektar/nektar/build/ThirdParty/dist/include
I also see exactly the same in my include folder, there are no lapack related headers.
Also, I can share the initial parts of the lapack build - in this version I tried to reference to the system blas for lapack installation:
[ 6%] Performing configure step for 'lapack-3.7.0'
cd /scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/lapack-3.7.0 && /gpfs/home/scinet/bgq/tools/cmake/2.8.12.1/bin/cmake -G "Unix Makefiles" -DCMAKE_Fortran_COMPILER:FILEPATH=/scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -DCMAKE_INSTALL_PREFIX:PATH=/scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/dist -DCMAKE_INSTALL_LIBDIR:PATH=/scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/dist/lib -DBUILD_SHARED_LIBS:STRING=ON -DBUILD_TESTING:STRING=OFF /scinet/bgq/Applications/nektar/nektar/ThirdParty/lapack-3.7.0
Re-run cmake no build system arguments
-- Setting build type to 'Release' as none was specified.
-- The Fortran compiler identification is GNU
-- Check for working Fortran compiler: /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90
-- Check for working Fortran compiler: /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -- works
-- Detecting Fortran compiler ABI info
-- Detecting Fortran compiler ABI info - done
-- Checking whether /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 supports Fortran 90
-- Checking whether /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 supports Fortran 90 -- yes
-- Looking for Python greater than 2.6 -
-- Could NOT find PythonInterp: Found unsuitable version "2.6.6", but required is at least "2.7" (found /usr/bin/python2)
-- No suitable Python version found, so skipping summary tests.
-- Reducing RELEASE optimization level to O2
-- Looking for Fortran NONE - found
-- Looking for Fortran INT_CPU_TIME - found
-- Looking for Fortran EXT_ETIME - not found
-- Looking for Fortran EXT_ETIME_ - not found
-- Looking for Fortran INT_ETIME - found
-- --> Will use second_INT_ETIME.f and dsecnd_INT_ETIME.f as timing function.
*-- Using supplied NETLIB BLAS implementation*
*-- Using supplied NETLIB LAPACK implementation*
-- Building Single Precision
-- Building Double Precision
-- Building Complex Precision
-- Building Double Complex Precision
-- BUILD TESTING : OFF
-- Configuring done
-- Generating done
-- Build files have been written to: /scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/lapack-3.7.0
Again, I see exactly the same output for configuration of lapack. However, when I initially ran this, the build system was picking up my standard C/C++/Fortran compilers so it was using gfortran rather than the MPI version. I reconfigured/rebuilt from scratch specifically telling the build system to use mpicc and mpic++ and setting -DCMAKE_Fortran_COMPILER to point to mpif90, after this I see the same as you have shown above and build again completes successfully.
Additionally, I can see "dgemm" in the log.make:
bgqdev-fen1-$ grep -rn "dgemm" nektar/build-gcc/log.make.2
13756:[ 3%] Building Fortran object *BLAS/SRC/CMakeFiles/blas.dir/dgemm.f.o*
13757:cd /scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/lapack-3.7.0/BLAS/SRC && /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -Dblas_EXPORTS -O2 -fPIC -c /scinet/bgq/Applications/nektar/*nektar/ThirdParty/lapack-3.7.0/BLAS/SRC/dgemm.f* -o CMakeFiles/blas.dir/*dgemm.f.o*
14018:/scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -fPIC -O2 -Wl,-rpath=/bgsys/drivers/ppcfloor/comm/lib/libmpichf90-gcc.so.8 -shared -Wl,-soname,libblas.so.3 -o ../../lib/libblas.so.3.7.0 CMakeFiles/blas.dir/isamax.f.o CMakeFiles/blas.dir/sasum.f.o CMakeFiles/blas.dir/saxpy.f.o CMakeFiles/blas.dir/scopy.f.o
...............
*CMakeFiles/blas.dir/dgemm.f.o *
This is the part that compilation fails:
[ 34%] Building CXX object utilities/NekMesh/CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/ElUtil.cpp.o
.........
.........
/........./bgq/compilers/gcc/4.8.1/bin/../lib/gcc/powerpc64-bgq-linux/4.8.1/../../../../powerpc64-bgq-linux/bin/ld: warning: libmpichf90-gcc.so.8, needed by /scinet/bgq/Applications/nektar/nektar/build/ThirdParty/dist/lib/libblas.so, not found (try using -rpath or -rpath-link)
.........
NodeOpti.cpp:(.text+0x1784): undefined reference to `dgeev_'
.........
NodeOpti.cpp:(.text+0x25b4): undefined reference to `dgemm_'
.........
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgetri_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpptrs_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgbtrs_'
collect2: error: ld returned 1 exit status
make[2]: *** [utilities/NekMesh/NekMesh] Error 1
I've tried many things:
xx enabled configure option "build_shared_libs" in CMakeLists.txt in "ThirdParty/lapack"
xx made a copy of "make.inc.example" in ThirdParty/lapack and reduced optimization levels
xx since this is a Blue Gene environment, *made reference to ESSL instead of BLAS*
*But none of it seems to make a difference. It always fails at the exact same step.*
This "*libmpichf90-gcc.so.8*" warning seems a bit odd to me and I am not sure if it has anything to do with the undefined reference error. I created a symlink to this library and added it to "LD_LIBRARY_PATH" as well, but then it failed with the message "undefined symbol: _cnkspi_MemoryRegionCacheLastAccessedElementNumber" from "*libpami-gcc.so*", where PAMI is a lower level messaging api by IBM. Also, "cnkspi" sounds far too low level, because "CNK" is the kernel on the compute nodes and "SPI" is the implementation that allows communication with that kernel. I added a linker flag "-Wl,-rpath" but I guess it only makes things uglier.
bgqdev-fen1-$ readelf -d nektar/build-gcc/ThirdParty/dist/lib/libblas.so | grep NEEDED
0x0000000000000001 (NEEDED) Shared library: [*libmpichf90-gcc.so.8*]
0x0000000000000001 (NEEDED) Shared library: [libmpich-gcc.so.8]
0x0000000000000001 (NEEDED) Shared library: [libopa-gcc.so.0]
0x0000000000000001 (NEEDED) Shared library: [libmpl-gcc.so.1]
0x0000000000000001 (NEEDED) Shared library: [*libpami-gcc.so*]
0x0000000000000001 (NEEDED) Shared library: [librt.so.1]
0x0000000000000001 (NEEDED) Shared library: [libpthread.so.0]
0x0000000000000001 (NEEDED) Shared library: [libstdc++.so.6]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x0000000000000001 (NEEDED) Shared library: [libnss_files.so.2]
0x0000000000000001 (NEEDED) Shared library: [libnss_dns.so.2]
0x0000000000000001 (NEEDED) Shared library: [libresolv.so.2]
0x0000000000000001 (NEEDED) Shared library: [libgfortran.so.3]
0x0000000000000001 (NEEDED) Shared library: [libm.so.6]
xx *As an alternative, I switched to static linking*. I initially changed "NEKTAR_LIBRARY_TYPE" to "STATIC" in the "CMakeLists.txt".
This is the one area where I have a number of differences to you. I'm not sure that switching to static linking is likely to make much difference (although I could be wrong) however I note that your blas library seems to require various mpich libraries. My libblas.so library only lists libm.so.6 and libgfortran.so.3 as "NEEDED". Can you also provide the RPATH value that you get from readelf -d for this library...
If you could also provide your output of readelf -d for library/LibUtilities/libLibUtilities.so, that would be useful. My libLibUtilities.so needs a few boost libraries as well as libz, libblas, liblapack, libmpich, libpthread, libgcc_s libc, libstdc++ and libm.
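To make the readelf -d comparison above easier to repeat on both systems, here is a rough sketch of two filters that read readelf -d output on stdin. The function names needed_libs and rpath_entries are invented for this example; the only assumption is the usual "(NEEDED) Shared library: [...]" and "(RPATH)"/"(RUNPATH)" line layout that readelf prints.

```shell
# Both helpers read the text printed by `readelf -d <library>` on stdin.
# needed_libs and rpath_entries are hypothetical names used for illustration.

# Print one library name per (NEEDED) entry.
needed_libs() {
    sed -n 's/.*(NEEDED).*\[ *\([^]]*\)\].*/\1/p'
}

# Print the search path from any (RPATH) or (RUNPATH) entry.
rpath_entries() {
    sed -n -e 's/.*(RPATH).*\[\(.*\)\].*/\1/p' \
           -e 's/.*(RUNPATH).*\[\(.*\)\].*/\1/p'
}

# Example calls (library path taken from the listing in this thread):
#   readelf -d nektar/build-gcc/ThirdParty/dist/lib/libblas.so | needed_libs
#   readelf -d nektar/build-gcc/ThirdParty/dist/lib/libblas.so | rpath_entries
```

Piping the first call through grep -i mpich would show straight away whether the BLAS library has picked up MPI dependencies, which is the difference between the two builds noted above.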
xx It seems that some of the ThirdParty libraries are configured with the assumption of shared objects so I changed them as well. For instance, boost is configured with options "link=shared" and "runtime-link=shared" which I set to static.* I can see all required boost libs are successfully compiled and written to "build/ThirdParty/dist/lib".*
Now this is the cmake command:
cmake $src \
    -DCMAKE_INSTALL_PREFIX=$prf \
    -DNEKTAR_USE_MPI=ON \
    -DNEKTAR_USE_SYSTEM_BLAS_LAPACK=ON \
    -DNATIVE_BLAS:FILEPATH=${SCINET_LAPACK_LIB}/libblas.a \
    -DNATIVE_BLAS_LIB_DIR:FILEPATH=${SCINET_LAPACK_LIB} \
    -DNATIVE_LAPACK:FILEPATH=${SCINET_LAPACK_LIB}/liblapack.a \
    -DNATIVE_LAPACK_LIB_DIR:FILEPATH=${SCINET_LAPACK_LIB} \
    -DNEKTAR_USE_FFTW=ON \
    -DFFTW_INCLUDE_DIR=${SCINET_FFTW_INC} \
    -DFFTW_LIBRARY=${SCINET_FFTW_LIB}/libfftw3.a \
    -DBoost_NO_SYSTEM_PATHS:BOOL=TRUE \
    -DZLIB_INCLUDE_DIR:PATH=${SCINET_ZLIB_INC} \
    -DZLIB_LIBRARY:FILEPATH=${SCINET_ZLIB_LIB}/libz.a
The issue now is that the installer seems to ignore "-DBoost_NO_SYSTEM_PATHS:BOOL=TRUE" and searches locations other than "BOOST_ROOT", which I set to "nektar/build-gcc/dist".
See for instance:
[ 5%] Building CXX object library/LibUtilities/CMakeFiles/LibUtilities.dir/BasicUtils/ArrayEqualityComparison.cpp.o
cd /scinet/bgq/Applications/nektar/nektar++-4.4.1/build-gcc-static/library/LibUtilities && /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpicxx -DLIB_UTILITIES_EXPORTS -DNEKTAR_MEMORY_POOL_ENABLED -DNEKTAR_USE_MPI -DNEKTAR_USING_BLAS -DNEKTAR_USING_LAPACK -DNEKTAR_VERSION=\"4.4.1\" -DTIXML_USE_STL -O3 -DNDEBUG -Wall -Wno-deprecated -Wno-sign-compare -DNEKTAR_RELEASE -isystem /scinet/bgq/Applications/nektar/nektar++-4.4.1/build-gcc-static/ThirdParty/dist/include -isystem /scinet/bgq/Libraries/zlib-1.2.7-gcc4.8.1/include -isystem /scinet/bgq/Applications/nektar/nektar++-4.4.1/ThirdParty/dist/include -isystem /scinet/bgq/Libraries/fftw-3.3.5-gcc/include -I/scinet/bgq/Applications/nektar/nektar++-4.4.1 -I/scinet/bgq/Applications/nektar/nektar++-4.4.1/library -o CMakeFiles/LibUtilities.dir/BasicUtils/ArrayEqualityComparison.cpp.o -c /scinet/bgq/Applications/nektar/nektar++-4.4.1/library/LibUtilities/BasicUtils/ArrayEqualityComparison.cpp
In file included from */bgsys/linux/ionfloor/usr/include/boost/config.hpp:57:0*,
from */bgsys/linux/ionfloor/usr/include/boost/cstdint.hpp:26*,
from /scinet/bgq/Applications/nektar/nektar++-4.4.1/library/LibUtilities/BasicConst/NektarUnivTypeDefs.hpp:40,
So the main question is: *Why does it check "/usr/include/boost" when "cstdint.hpp" already exists in the "build/dist/include/boost/"?*
bgqdev-fen1-$ ls build-gcc/dist/include/boost/cstdint.hpp -l
-rw-r--r-- 1 fertinaz scinet 18017 Nov 14 19:00 build-gcc/dist/include/boost/cstdint.hpp
This is how it finally fails:
*/bgsys/linux/ionfloor/usr/include/boost/archive/iterators/binary_from_base64.hpp*:52:9: warning: narrowing conversion of ‘-1’ from ‘int’ to ‘const char’ inside { } is ill-formed in C++11 [-Wnarrowing]
make[2]: *** [library/LibUtilities/CMakeFiles/LibUtilities.dir/BasicUtils/CompressData.cpp.o] Error 1
It doesn't help to change the boost code from "const char lookup_table" to "signed char lookup_table" because then "switch-case" statement that returns the endianness information fails in the following file: " nektar/library/LibUtilities/BasicUtils/CompressData.cpp"
As you can guess, I disabled the switch-case block, and returned the value, but it fails anyway...
Sorry for the long message, hope you could follow. I've run out of ideas and any suggestion is highly appreciated....
// Fatih
I'd be inclined to stick with the third party boost and lapack and see if we can find a solution to that. I can't see what C/C++ compilers you're using but have you tried forcing the use of mpicc and mpic++ as shown when running cmake in the instructions at https://www.nektar.info/nektar-on-mira-cluster/? (in fact, I see above that in your log output for the build command for ArrayEqualityComparison.cpp, it looks like it's using mpicxx)
I'm not sure why libblas.so is linking in libmpichf90-gcc.so.8 but I'm assuming this is the core of the problem. Can you confirm what settings you're using to get the additional logging output that you're showing (which includes the build commands) - is it just -vv? I can then try and run the same and see if I can provide any further suggestions. I'm not clear at the moment but I'm assuming the undefined reference errors are a result of trying to link in libblas.so and that library itself having an unresolved dependency on libmpichf90. It might be that the rpath settings can be modified to take account of this.
Can you provide the output of running ldd on libblas.so, liblapack.so and libLibUtilities.so?
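For the ldd output requested above, the entries of most interest are the dependencies the loader cannot resolve. A minimal sketch of a filter for those, assuming the usual "libname => not found" form that ldd prints; missing_libs is a made-up name for this example:

```shell
# Read `ldd <library>` output on stdin and print only the dependencies
# that the dynamic loader could not resolve.
# missing_libs is a hypothetical helper name used for illustration.
missing_libs() {
    # Unresolved entries look like: "libfoo.so.1 => not found"
    awk '/=> not found/ { print $1 }'
}

# Example calls (paths are placeholders for your own build tree):
#   ldd build/ThirdParty/dist/lib/libblas.so | missing_libs
#   ldd build/library/LibUtilities/libLibUtilities.so | missing_libs
```

An empty result means every dependency was resolved in the environment where ldd was run, which may differ from the compute-node environment on a cross-compiled system.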
Thanks,
Jeremy
On Sun, Oct 14, 2018 at 7:42 AM Jeremy Cohen <jeremy.cohen@imperial.ac.uk> wrote:
Hi Amitvikram,
I would certainly try Chris's suggestion. However, something else to
check is where you're getting the third party downloads from.
If you take a clean Nektar++ source tree and place the standard netlib
lapack-3.7.0.tgz source file that build system downloads into $NEKTAR_HOME/ThirdParty (i.e. the download from http://www.netlib.org/lapack/lapack-3.7.0.tgz), the build should proceed successfully.
It looks like the lapack tar file that you're using may already have
some build artefacts in it - did you tar the content from $NEKTAR_HOME/build/ThirdParty/lapack-3.7.0 into a lapack-3.7.0.tgz file or you're working with the standard .tgz file from the netlib.org site?
Cheers, Jeremy
On 13 Oct 2018, at 21:19, Chris Cantwell <c.cantwell@imperial.ac.uk> wrote:
Hi Amitvikram,
Some sites block non-SSL enabled HTTP traffic, returning a webpage
reporting the error rather than the actual file (hence the hash mismatch).
You could try turning on the THIRDPARTY_USE_SSL option to see if that
is allowed.
Cheers, Chris
On Fri, 12 Oct 2018 15:15:10 -0400, Amitvikram Dutta <amitvdutta23@gmail.com> wrote:
Hi all,
I had to use a slight workaround because the platform I was compiling on could not download the thirdparty files correctly and would always give a hash mismatch error. I downloaded the third party files on my home machine and re-uploaded them into the $NEKTAR_HOME folder. As a result, when I begin the build process the ThirdParty folder is populated with the zipped versions of the third party software. Is this what might be causing the problem? Is there any way to edit the cmake file and get the compilation process working correctly?
On Fri, Oct 12, 2018 at 2:57 PM Amitvikram Dutta < amitvikram.dutta@uwaterloo.ca> wrote:
*From:* nektar-users-bounces@imperial.ac.uk On Behalf Of Jeremy Cohen
*Sent:* October 12, 2018 2:56:43 PM (UTC-05:00) Eastern Time (US & Canada)
*To:* Amitvikram Dutta
*Cc:* nektar-users
*Subject:* Re: [Nektar-users] Problem while installing nektar++ with lapack
Hi Amitvikram,
Have you attempted to build lapack separately at any point? It's probably worth clearing out your build directory and also all the contents of the ThirdParty directory in the base nektar++ source directory, which I'll call $NEKTAR_HOME, so $NEKTAR_HOME/ThirdParty and then re-running cmake in an empty $NEKTAR_HOME/build directory and trying the build again.
It looks like the build step is encountering a previous source tree in the location where it's trying to build which seems strange.
I've just had a look at the log from my clean build and I see exactly the same messages as you in relation to lapack-3.7.0 in the same order as far as "Checking whether /usr/bin/f95 supports Fortran 90 -- yes", however, I then see "-- Looking for Python greater than 2.6 - " and the build of lapack completes successfully.
Just to confirm, I am running cmake and make in a separate build directory under the main nektar++ source tree directory, so I'm building in $NEKTAR_HOME/build - I assume you're doing something similar? You should see a ThirdParty directory in $NEKTAR_HOME and another ThirdParty directory in $NEKTAR_HOME/build/
I believe that the initial download of the lapack-3.7.0.tar.gz should be placed in $NEKTAR_HOME/ThirdParty and unpacked there. Then, when the build succeeds or stops, in $NEKTAR_HOME/build/ThirdParty, you should see lapack-3.7.0/ where I think the build actually takes place, and then a separate $NEKTAR_HOME/build/ThirdParty/lapack-3.7.0-tmp which should contain a couple of lapack-3.7.0-cfgcmd.txt files that I think contain the build command that is used - you could perhaps paste the contents of the lapack-3.7.0-cfgcmd.txt into an email if you're still having issues and we can see if that looks correct.
It is, of course, possible that this is something related to the specific configuration of the platform that you're building on, but I think the third party lapack build should be straightforward and it sounds like for some reason, it's attempting to build in the wrong location, or a location where an existing source tree has ended up for some reason.
I'm afraid I don't have a very detailed knowledge of the build system beyond this so if none of the suggestions so far help you to resolve the problem, maybe someone with more knowledge of the build system can provide some advice.
Cheers, Jeremy
On 12 Oct 2018, at 19:34, Amitvikram Dutta <amitvdutta23@gmail.com> wrote:
Hi all,
I tried to compile nektar using Jeremy's latest suggestions, with both THIRDPARTY_BUILD_BLAS_LAPACK and NEKTAR_USE_SYSTEM_BLAS_LAPACK turned on. The following error occurred. It seems that I might have to compile lapack separately. Is this unusual?
<image.png>
On Fri, Oct 12, 2018 at 1:16 PM Amitvikram Dutta < amitvikram.dutta@uwaterloo.ca> wrote:
> *From:* nektar-users-bounces@imperial.ac.uk On Behalf Of Jeremy Cohen
> *Sent:* October 12, 2018 1:16:14 PM (UTC-05:00) Eastern Time (US & Canada)
> *To:* Amitvikram Dutta
> *Cc:* nektar-users
> *Subject:* Re: [Nektar-users] Problem while installing nektar++ with lapack
>
> Hi Amitvikram,
>
> As Chris suggests, it's probably better to use vendor supplied libraries if you can get those working.
>
> In addition to the further information Chris has asked to take a look at, one thing you could check is whether there are any files in your nektar++/build/ThirdParty/lapack-3.7.0 directory (if that directory exists at all).
>
> I've been trying to see if I can recreate the problem and I was able to see something similar when setting THIRDPARTY_BUILD_BLAS_LAPACK=ON and NEKTAR_USE_SYSTEM_BLAS_LAPACK=OFF. In this case, I can configure successfully and start the build but it fails with a large number of undefined references that are similar to, and include, the dtpmv_ symbol that you mentioned. When I look in nektar++/build/ThirdParty/lapack-3.7.0, the directory is empty so it looks like the build system has configured on the basis of building its own blas/lapack but the build hasn't been carried out and therefore LibUtilities can't be linked against it.
>
> As a test, you could try running the build with both THIRDPARTY_BUILD_BLAS_LAPACK=ON and NEKTAR_USE_SYSTEM_BLAS_LAPACK=ON, if this isn't the setting you've been using already. When I tried this, the build of blas/lapack is carried out successfully and the linking is fine with the full build of Nektar++ completing successfully. I removed the system blas/lapack on my test system to be sure it was linking against the correct instance.
>
> Cheers,
> Jeremy
>
> On 12 Oct 2018, at 17:50, Chris Cantwell <c.cantwell@imperial.ac.uk> wrote:
>
> Hi Amitvikram,
>
> Could you send us your CMakeCache.txt file from your build directory and the output from running:
> make VERBOSE=1
> for both cases.
>
> In the case of using ThirdParty LAPACK, it seems to not be linking to it. Probably you should be using vendor-supplied libraries if possible though so better if we can get those working.
>
> Thanks,
> Chris
>
> On 12 October 2018 14:08:55 BST, Amitvikram Dutta <amitvdutta23@gmail.com> wrote:
>>
>> Hi Jeremy,
>>
>> I'm actually trying to build nektar++ on a BGQ cluster similar to Mira.
>>
>> I'm trying to build nektar++-4.4.1 and the system lapack version is 3.4.2
>>
>> Sincerely,
>> On Fri, Oct 12, 2018 at 4:24 AM Amitvikram Dutta <amitvikram.dutta@uwaterloo.ca> wrote:
>>
>>> ------------------------------
>>> *From:* nektar-users-bounces@imperial.ac.uk On Behalf Of Jeremy Cohen
>>> *Sent:* October 12, 2018 4:24:33 AM (UTC-05:00) Eastern Time (US & Canada)
>>> *To:* Amitvikram Dutta
>>> *Cc:* nektar-users
>>> *Subject:* Re: [Nektar-users] Problem while installing nektar++ with lapack
>>>
>>> Hi Amitvikram,
>>>
>>> Can you provide some further details of the problem you're encountering.
>>>
>>> Specifically, can you confirm what platform (including version) you're building on, and if Linux, which I assume is the platform you're using, which distribution.
>>>
>>> Can you also confirm what version of Nektar++ you're trying to build, and the version of the system Lapack distribution that you're using.
>>>
>>> Thanks,
>>>
>>> Jeremy
>>>
>>> On 12 Oct 2018, at 01:05, Amitvikram Dutta <amitvdutta23@gmail.com> wrote:
>>>
>>> Hi all,
>>>
>>> I keep having the same problem while trying to install nektar++ with regards to the Lapack libraries.
>>>
>>> When I try to use the system Lapack installation I get the following message
>>>
>>> */scinet/bgq/Libraries/lapack/lib/liblapack.so: undefined reference to `_xlfEndIO@XLF_1.0'*
>>>
>>> while when I try to install using the ThirdParty Lapack supplied with the nektar++ source directory I get the following error
>>>
>>> *../../library/LibUtilities/libLibUtilities.so.4.4.1: undefined reference to `dtpmv_'*
>>>
>>> I have a feeling these errors have been encountered by the community at large before. Could someone point out where I'm going wrong?
>>>
>>> Sincerely,
>>> --
>>>
>>> *Amitvikram Dutta*
>>>
>>> Graduate Research Assistant
>>>
>>> Fluid Mechanics Research Lab
>>>
>>> Multi-Physics Interaction Lab
>>>
>>> University of Waterloo
>>> _______________________________________________
>>> Nektar-users mailing list
>>> Nektar-users@imperial.ac.uk
>>> https://mailman.ic.ac.uk/mailman/listinfo/nektar-users
>>>
>>> --
>>
>> *Amitvikram Dutta*
>>
>> Graduate Research Assistant
>>
>> Fluid Mechanics Research Lab
>>
>> Multi-Physics Interaction Lab
>>
>> University of Waterloo
>>
>
> --
> Chris Cantwell
> Imperial College London
> South Kensington Campus
> London SW7 2AZ
> Email: c.cantwell@imperial.ac.uk
> www.imperial.ac.uk/people/c.cantwell
>
> --
*Amitvikram Dutta*
Graduate Research Assistant
Fluid Mechanics Research Lab
Multi-Physics Interaction Lab
University of Waterloo
--
Chris Cantwell
Imperial College London
South Kensington Campus
London SW7 2AZ
Email: c.cantwell@imperial.ac.uk
www.imperial.ac.uk/people/c.cantwell
_______________________________________________ Nektar-users mailing list Nektar-users@imperial.ac.uk https://mailman.ic.ac.uk/mailman/listinfo/nektar-users
<CMakeLists.txt>
Hi Fatih,
Many thanks for the feedback on how you solved this issue - it's good to hear that it's now resolved.
Kind regards,
Jeremy
On 21 Feb 2019, at 17:58, Fatih Ertinaz <fertinaz@gmail.com> wrote:
Hello Jeremy
I think this has been resolved now.
I had never seen the issue ticket you sent; however, I'm impressed by how similarly the ticket owner describes the problem and looks for a workaround.
So thank you for sharing it, because it inspired me a lot. I didn't find the exact solution there, since mfem's implementation is different from Nektar's. However, at some point it made me realize that the code responsible for the name transformation is defined in the following file (nektar++-4.4.1): $NEK_PATH/library/LibUtilities/LinearAlgebra/TransF77.hpp
I basically added PowerPC to the macro, and that seems to be the solution. There was one single Lapack function call in "DriverArnoldiModified.cpp" with an underscore that needed an explicit modification, but other than that these two things were the only source modifications I had to make.
I had to make lots of changes to the CMake and eventually decided to compile Third-Party libraries completely separately from scratch.
Thank you for your help.
// Fatih
On Wed, Feb 20, 2019 at 10:14 AM Jeremy Cohen <jeremy.cohen@imperial.ac.uk> wrote:
Hi Fatih,
As you suggest, it looks like this is a name mangling problem and I suspect it must be something specific to the BGQ platform. There are others on the nektar list who are much more experienced with CMake than me and have more knowledge of the history of the codebase so maybe someone else can explain the now missing "NEKTAR_LIB_UTILITIES_LINEAR_ALGEBRA_DGEMM_OVERRIDE_HPP" preprocessor directive that you mention and the missing DgemmOverride.hpp header.
Aside from this, I don't know if you've seen the following: https://github.com/mfem/mfem/issues/397 While this is for a completely unrelated library, it looks like it's describing a similar problem to what you're experiencing. That thread also links to the hypre repository, providing an example of using "configureable macros" for name mangling. Don't know if this is of any help but it might be worth investigating.
Regards,
Jeremy
On 12 Feb 2019, at 22:03, Fatih Ertinaz <fertinaz@gmail.com> wrote:
Hello Jeremy
I appreciate your detailed response, and sorry for my late reply. I came back to this issue over the weekend, made certain changes and achieved some progress, so I wanted to share the current status.
I am convinced that the problem is related to linking BLAS and LAPACK, and that it is probably specific to this platform -- Blue Gene Q -- because I have managed to run nektar successfully on many different platforms and never encountered this issue. I am using the latest git repo for nektar, by the way.
I compiled some of the third-party libraries (boost, scotch and blas & lapack) separately under the directory I created "nektar/ThirdParty_compiled". This resolved earlier problems with boost. Also, when I check blas & lapack functions that are used by nektar, I can find their references in their libraries:
bgqdev-fen1-$ nm libblas.a | grep -i dgemm
dgemm.f.o:
0000000000000000 D dgemm
bgqdev-fen1-$ nm liblapack.a | grep -i dgeev
dgeev.f.o:
0000000000000000 D dgeev
dgeevx.f.o:
0000000000000000 D dgeevx
As you can see, those are static libs, because when shared objects are used cmake doesn't detect BLAS (I don't know why -- maybe BGQ) even though full paths are provided. With static libs, BLAS is at least detected by cmake (it prints "A library with BLAS API found" -- see below).
On the other hand, I compiled boost with shared libs, and cmake recognizes them correctly.
I also want to emphasize that mangled names don't appear in the BLAS-LAPACK libs. So, for instance, "dgemm_" doesn't exist, and this is why the installation fails.
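One way to check which naming convention a library actually exports is to classify the nm output for a given routine. The following is only a minimal sketch, not part of Nektar++ or its build system: the helper name classify_symbol is hypothetical, and the sample input is the nm output quoted earlier in this message.

```shell
#!/bin/sh
# classify_symbol NAME
# Reads `nm` output on stdin and reports how the routine NAME is exported:
#   underscore -> NAME_ is defined (the name the linker errors ask for)
#   plain      -> only NAME (no trailing underscore) is defined
#   missing    -> neither form is defined in the input
classify_symbol() {
    awk -v name="$1" '
        # Defined symbols print as "address type symbol"; undefined ones as "U symbol".
        # Object-file header lines such as "dgemm.f.o:" have a single field and are skipped.
        NF >= 2 && $(NF-1) != "U" {
            if ($NF == (name "_")) underscore = 1
            if ($NF == name)       plain = 1
        }
        END {
            if (underscore)  print "underscore"
            else if (plain)  print "plain"
            else             print "missing"
        }'
}

# Sample taken from the nm output shown in this thread.
printf '%s\n' 'dgemm.f.o:' '0000000000000000 D dgemm' | classify_symbol dgemm
# prints: plain
```

On a real system the function would be fed from nm directly, e.g. nm libblas.a | classify_symbol dgemm. A result of "plain" together with linker errors for dgemm_ would be consistent with a mismatch between the name the C++ side expects and the name the Fortran compiler emitted.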
This is from cmake -- I hacked cmake to seek "dgemm" as well, but it is also not found:
-- Looking for dgemm_
-- Looking for dgemm_ - not found
-- Looking for Fortran sgemm
-- Looking for Fortran sgemm - found
-- A library with BLAS API found.
-- Looking for Fortran cheev
-- Looking for Fortran cheev - found
Before going forward, I'd like to make a suggestion for these types of checks. CMake documentation says, "Prefer using CHECK_SYMBOL_EXISTS instead of this module..." referring to CHECK_FUNCTION_EXISTS which is used by Nektar at the moment. There are certain types of implementations which cannot be detected by CHECK_FUNCTION_EXISTS. I didn't test it though. For further details: https://cmake.org/cmake/help/v3.8/module/CheckFunctionExists.html
Anyway this is where installation fails:
[ 26%] Built target LocalRegions
Linking CXX executable NekMesh
/ess01/homebgq/scinet/bgq/compilers/gcc/4.8.1/bin/../lib/gcc/powerpc64-bgq-linux/4.8.1/../../../../powerpc64-bgq-linux/bin/ld: warning: libboost_atomic.so.1.57.0, needed by /scinet/bgq/Applications/nektar/ThirdParty_compiled/boost_1_57_0/install/lib/libboost_thread.so, not found (try using -rpath or -rpath-link)
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*) [clone .part.27]':
NodeOpti.cpp:(.text+0x1784): undefined reference to `dgeev_'
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `std::ctype<char>::widen(char) const [clone .part.33]':
NodeOpti.cpp:(.text+0x1e6c): undefined reference to `dgeev_'
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `boost::exception_detail::error_info_injector<std::domain_error>::~error_info_injector()':
NodeOpti.cpp:(.text+0x25b4): undefined reference to `dgemm_'
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `boost::exception_detail::error_info_injector<std::overflow_error>::~error_info_injector()':
NodeOpti.cpp:(.text+0x44a4): undefined reference to `dgemm_'
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `boost::system::system_error::what() const':
NodeOpti.cpp:(.text._ZN6Nektar9Utilities8NodeOpti13GetFunctionalILi2EEEdRdb[_ZN6Nektar9Utilities8NodeOpti13GetFunctionalILi2EEEdRdb]+0x2e8): undefined reference to `dgemm_'
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `boost::io::basic_altstringbuf<char, std::char_traits<char>, std::allocator<char> >::~basic_altstringbuf()':
NodeOpti.cpp:(.text._ZN6Nektar9Utilities8NodeOpti13GetFunctionalILi3EEEdRdb[_ZN6Nektar9Utilities8NodeOpti13GetFunctionalILi3EEEdRdb]+0x318): undefined reference to `dgemm_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dtptrs_'
../../library/LocalRegions/libLocalRegions.so.4.5.0: undefined reference to `daxpy_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpbtrf_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dsptrf_'
../../library/LocalRegions/libLocalRegions.so.4.5.0: undefined reference to `dgemv_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgbmv_'
../../library/LocalRegions/libLocalRegions.so.4.5.0: undefined reference to `dscal_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dtpmv_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgetrf_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgbtrf_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpbtrs_'
../../library/StdRegions/libStdRegions.so.4.5.0: undefined reference to `dcopy_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dsptrs_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgetrs_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpptrf_'
../../library/LocalRegions/libLocalRegions.so.4.5.0: undefined reference to `ddot_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dspmv_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dsptri_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgetri_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpptrs_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgbtrs_'
collect2: error: ld returned 1 exit status
make[2]: *** [utilities/NekMesh/NekMesh] Error 1
make[1]: *** [utilities/NekMesh/CMakeFiles/NekMesh.dir/all] Error 2
make: *** [all] Error 2
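To see at a glance which routines are missing, linker output like the above can be reduced to a sorted list of unique symbols. This is a minimal sketch, assuming the GNU ld message format shown in the log; the function name undefined_symbols is made up for illustration.

```shell
#!/bin/sh
# undefined_symbols
# Reads linker output on stdin and prints each symbol named in an
# "undefined reference to `sym'" message once, sorted.
undefined_symbols() {
    grep -o "undefined reference to \`[^']*'" |
        sed "s/.*\`//; s/'\$//" |
        sort -u
}

# Sample messages taken from the log in this thread.
printf '%s\n' \
    "NodeOpti.cpp:(.text+0x1784): undefined reference to \`dgeev_'" \
    "NodeOpti.cpp:(.text+0x25b4): undefined reference to \`dgemm_'" \
    "NodeOpti.cpp:(.text+0x44a4): undefined reference to \`dgemm_'" |
    undefined_symbols
# prints:
# dgeev_
# dgemm_
```

In practice the input would come from the build itself, e.g. make 2>&1 | undefined_symbols. If every symbol in the list ends in an underscore, that points at a naming-convention mismatch rather than at individual missing routines.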
However implementation for dgemm: grep -irn "dgemm" * nektar/utilities/NekMesh/ProcessModules/ProcessVarOpti/Evaluator.hxx: Blas::Dgemm('N', 'N', pts, DIM * nElmt, ptsStd, 1.0,
By the way, as suggested in earlier messages in this thread, I enabled both -DTHIRDPARTY_BUILD_BLAS_LAPACK and -DNEKTAR_USE_SYSTEM_BLAS_LAPACK. Just in case, I've attached the CMakeLists.txt file as well for other settings.
So the question is, why is it trying to reference the mangled names? Is it cmake causing the mess or the compiler? And how can this be tackled? When I grep "dgemm_", I get nothing except the object files.
Moreover, in earlier versions of Nektar (4.0.0 and 4.3.5 as far as I checked), there is a preprocessor definition called "NEKTAR_LIB_UTILITIES_LINEAR_ALGEBRA_DGEMM_OVERRIDE_HPP". This definition doesn't exist in the most recent git repo. However, I can still find Blas.hpp for the wrapper functions under "LibUtilities/LinearAlgebra". Also, "DgemmOverride.hpp" no longer exists either. So could it be that portability was broken at some point?
Thanks very much
// Fatih
On Mon, Nov 19, 2018 at 8:12 AM Jeremy Cohen <jeremy.cohen@imperial.ac.uk> wrote: Hi Fatih,
Sorry for the delay in getting back to you on this. I'm afraid I don't have an immediate answer to the problem you're experiencing but I've done some investigations into the issues you're having so hopefully some feedback on these may provide some helpful information:
On 15 Nov 2018, at 06:35, Fatih Ertinaz <fertinaz@gmail.com> wrote:
Hello everyone,
I am working on the same task as Amitvikram, on the same cluster and currently having the exact same problem: undefined reference to lapack libs even though they are compiled successfully.
I've read through this thread as well as some others, so here is a brief summary about what I've done so far before asking some questions.
-- System info: Using cmake-2.8.12, cross-compiled gcc-4.8 and mpich-2 on a Blue Gene Q cluster.
-- Nektar version: Decided to use the git repo.
While I can't replicate the environment you're working in directly, I've set up a clean Ubuntu linux container with only a very basic initial set of packages installed. (All of my build attempts using the details provided below complete successfully.)
Within my base container, I've installed gcc-4.8 from packages (gcc-5 is the default version for the Ubuntu version I'm using - 16.04) built and installed cmake 2.8.12 from source and built and installed MPICH2 (1.5) from source.
I'm working with Nektar++ from source, using the master branch.
-- Added "-dynamic" flag to the "CMakeLists.txt" as it was suggested here: https://www.nektar.info/nektar-on-mira-cluster/
-- Boost: I initially used the system-installed boost but then decided to stick to the third-party version shipped with nektar, because some of the required libs (for instance boost_iostreams) weren't part of the central installation. To deal with that, I first set up a partial build by referencing each individual library file explicitly in the cmake command. It does seem to build the required libs successfully, but it later fails during the nektar compilation. I think it messes up the environment and basically links to the wrong files. So anyway, I am using "ThirdParty/boost_1_57_0".
-- Lapack: The reason that I am not using system lapack is simply because cmake says "dgemm_" is not found in the system blas version. Therefore, I am compiling the "ThirdParty/Lapack-3.7.0" which I downloaded from "http://www.netlib.org/lapack/lapack-3.7.0.tgz".
Note that compilation fails with the same error even when I use ThirdParty/lapack.
I initially tried without using the dynamic flag but have subsequently tried with the -dynamic flag too.
I'm using ThirdParty boost 1.57 and ThirdParty lapack 3.7.0.
I'm also using ThirdParty Scotch. TinyXML and GSMPI are also built from source as ThirdParty dependencies during the Nektar++ build.
-- FFTW: Using system installed version.
I'm using a system installed FFTW from packages.
-- Download process: I cancelled MD5 checks and downloading with "wget" due to the similar ssl error mentioned before. This is an easy workaround though, and probably has nothing to do with the error. I download all packages to the "nektar/ThirdParty" and copy them to "nektar/build/ThirdParty" as well. The reason of this copy operation is that when nektar extracts the downloaded packages, I see that uncompressed folders are somehow empty. I don't know if that's a cmake bug, or a problem from my side. So that's why I download, extract and copy third party sources to "nektar/build/ThirdParty" manually.
Following the discussion on the list with Amitvikram about building when there is no external Internet access to download third party dependencies, I wrote up some of the points that I made about downloading dependencies manually - if you hadn't already seen this, it's on the Nektar++ website at https://www.nektar.info/building-nektar-offline-deps/
-- CMake command:
cmake $src \
-DCMAKE_INSTALL_PREFIX=$prf \
-DNEKTAR_USE_MPI=ON \
-DNEKTAR_USE_SYSTEM_BLAS_LAPACK=ON \
-DNEKTAR_USE_FFTW=ON \
-DFFTW_INCLUDE_DIR=${SCINET_FFTW_INC} \
-DFFTW_LIBRARY=${SCINET_FFTW_LIB}/libfftw3.a \
-DTHIRDPARTY_BUILD_BLAS_LAPACK=ON
As you can see, I enabled both "-DNEKTAR_USE_SYSTEM_BLAS_LAPACK" and "-DTHIRDPARTY_BUILD_BLAS_LAPACK" as suggested; however, this didn't seem to make a difference for me. Compilation fails at the same step whether both are enabled or not.
I tried configuring using a similar cmake command to that which you've shown here - the only difference for me was that I didn't need to set the FFTW include directory since my FFTW install is in the system include path. I explicitly specified the path to FFTW_LIBRARY although this shouldn't be necessary since the library is, again, in the standard system library path.
-- Build process for Third-Party: In general they are compiled without any errors. In particular, I checked cmake files for each package and Lapack is configured with "-DBUILD_SHARED_LIBS:STRING=ON". I can see that objects are compiled with "-fPIC" option, it is in the cmake. However, "lapack/CMakeLists.txt" contains this line: "option(BUILD_SHARED_LIBS "Build shared libraries" OFF)" which I set to "ON" in my build script.
This is how libraries look in the "nektar/build/ThirdParty/dist/lib" directory after compiling ThirdParty libraries: bgqdev-fen1-$ ls nektar/build/ThirdParty/dist/lib/ cmake libboost_program_options.so libgsmpi.a libtinyxml.a libblas.so libboost_program_options.so.1.57.0 liblapack.so libxxt.a libblas.so.3 libboost_regex.so liblapack.so.3 libz.a libblas.so.3.7.0 libboost_regex.so.1.57.0 liblapack.so.3.7.0 libz.so libboost_filesystem.so libboost_system.so libscotch.a libz.so.1 libboost_filesystem.so.1.57.0 libboost_system.so.1.57.0 libscotcherr.a libz.so.1.2.7 libboost_iostreams.so libboost_thread.so libscotcherrexit.a pkgconfig libboost_iostreams.so.1.57.0 libboost_thread.so.1.57.0 libscotchmetis.a
This folder is about 1.5GB by the way.
I have exactly the same contents in my ThirdParty/dist/lib directory after building of the third party dependencies. The resulting files are nowhere near as large as yours, I assume the very large size of the folder is something to do with the static libraries being very large although I'm not sure why they would be so big.
I think the point you make about building of shared libraries being set to OFF in the CMakeLists.txt file for lapack shouldn't be an issue. If you look in $src/cmake/ThirdPartyBlasLapack.cmake, you should see in the EXTERNAL_PROJECT_ADD command that it is configuring lapack using CMake and specifying -DBUILD_SHARED_LIBS:STRING=ON. You should also be able to verify that lapack was, indeed, configured with this parameter by looking in the CMakeCache.txt file in $build/ThirdParty/lapack-3.7.0/.
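As a rough illustration of that check, a cache entry can be read out with a small helper. This is only a sketch that assumes the usual NAME:TYPE=VALUE layout of CMakeCache.txt; cache_value is a hypothetical helper, not something provided by CMake or Nektar++.

```shell
#!/bin/sh
# cache_value FILE NAME
# Prints the value stored for the variable NAME in a CMakeCache.txt file.
# Entries are expected to look like "NAME:TYPE=VALUE"; comment lines and
# other variables do not match the pattern and are ignored.
cache_value() {
    sed -n "s/^$2:[^=]*=//p" "$1"
}

# Intended use (paths as in this thread, so they will differ elsewhere):
#   cache_value "$build/ThirdParty/lapack-3.7.0/CMakeCache.txt" BUILD_SHARED_LIBS
```

If lapack was configured through the external project as described, the call in the comment would be expected to print ON; an empty result would mean the variable is not in the cache at all.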
However, "nektar/build/ThirdParty/dist/include" folder doesn't have lapack related headers: bgqdev-fen1-$ ls boost scotchf.h scotch.h tinystr.h tinyxml.h zconf.h zlib.h bgqdev-fen1-$ pwd /scinet/bgq/Applications/nektar/nektar/build/ThirdParty/dist/include
I also see exactly the same in my include folder, there are no lapack related headers.
Also, I can share the initial parts of the lapack build - in this version I tried to reference to the system blas for lapack installation:
[ 6%] Performing configure step for 'lapack-3.7.0'
cd /scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/lapack-3.7.0 && /gpfs/home/scinet/bgq/tools/cmake/2.8.12.1/bin/cmake -G "Unix Makefiles" -DCMAKE_Fortran_COMPILER:FILEPATH=/scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -DCMAKE_INSTALL_PREFIX:PATH=/scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/dist -DCMAKE_INSTALL_LIBDIR:PATH=/scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/dist/lib -DBUILD_SHARED_LIBS:STRING=ON -DBUILD_TESTING:STRING=OFF /scinet/bgq/Applications/nektar/nektar/ThirdParty/lapack-3.7.0
Re-run cmake no build system arguments
-- Setting build type to 'Release' as none was specified.
-- The Fortran compiler identification is GNU
-- Check for working Fortran compiler: /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90
-- Check for working Fortran compiler: /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -- works
-- Detecting Fortran compiler ABI info
-- Detecting Fortran compiler ABI info - done
-- Checking whether /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 supports Fortran 90
-- Checking whether /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 supports Fortran 90 -- yes
-- Looking for Python greater than 2.6 -
-- Could NOT find PythonInterp: Found unsuitable version "2.6.6", but required is at least "2.7" (found /usr/bin/python2)
-- No suitable Python version found, so skipping summary tests.
-- Reducing RELEASE optimization level to O2
-- Looking for Fortran NONE - found
-- Looking for Fortran INT_CPU_TIME - found
-- Looking for Fortran EXT_ETIME - not found
-- Looking for Fortran EXT_ETIME_ - not found
-- Looking for Fortran INT_ETIME - found
-- --> Will use second_INT_ETIME.f and dsecnd_INT_ETIME.f as timing function.
-- Using supplied NETLIB BLAS implementation
-- Using supplied NETLIB LAPACK implementation
-- Building Single Precision
-- Building Double Precision
-- Building Complex Precision
-- Building Double Complex Precision
-- BUILD TESTING : OFF
-- Configuring done
-- Generating done
-- Build files have been written to: /scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/lapack-3.7.0
Again, I see exactly the same output for configuration of lapack. However, when I initially ran this, the build system was picking up my standard C/C++/Fortran compilers so it was using gfortran rather than the MPI version. I reconfigured/rebuilt from scratch specifically telling the build system to use mpicc and mpic++ and setting -DCMAKE_Fortran_COMPILER to point to mpif90, after this I see the same as you have shown above and build again completes successfully.
Additionally, I can see "dgemm" in the log.make: bgqdev-fen1-$ grep -rn "dgemm" nektar/build-gcc/log.make.2 13756:[ 3%] Building Fortran object BLAS/SRC/CMakeFiles/blas.dir/dgemm.f.o 13757:cd /scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/lapack-3.7.0/BLAS/SRC && /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -Dblas_EXPORTS -O2 -fPIC -c /scinet/bgq/Applications/nektar/nektar/ThirdParty/lapack-3.7.0/BLAS/SRC/dgemm.f -o CMakeFiles/blas.dir/dgemm.f.o 14018:/scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -fPIC -O2 -Wl,-rpath=/bgsys/drivers/ppcfloor/comm/lib/libmpichf90-gcc.so.8 -shared -Wl,-soname,libblas.so.3 -o ../../lib/libblas.so.3.7.0 CMakeFiles/blas.dir/isamax.f.o CMakeFiles/blas.dir/sasum.f.o CMakeFiles/blas.dir/saxpy.f.o CMakeFiles/blas.dir/scopy.f.o ............... CMakeFiles/blas.dir/dgemm.f.o
This is the part that compilation fails:
[ 34%] Building CXX object utilities/NekMesh/CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/ElUtil.cpp.o ......... ......... /........./bgq/compilers/gcc/4.8.1/bin/../lib/gcc/powerpc64-bgq-linux/4.8.1/../../../../powerpc64-bgq-linux/bin/ld: warning: libmpichf90-gcc.so.8, needed by /scinet/bgq/Applications/nektar/nektar/build/ThirdParty/dist/lib/libblas.so, not found (try using -rpath or -rpath-link) ......... NodeOpti.cpp:(.text+0x1784): undefined reference to `dgeev_' ......... NodeOpti.cpp:(.text+0x25b4): undefined reference to `dgemm_' ......... ../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgetri_' ../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpptrs_' ../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgbtrs_' collect2: error: ld returned 1 exit status make[2]: *** [utilities/NekMesh/NekMesh] Error 1
I've tried many things:
xx enabled the configure option "build_shared_libs" in CMakeLists.txt in "ThirdParty/lapack"
xx made a copy of "make.inc.example" in ThirdParty/lapack and reduced optimization levels
xx since this is a Blue Gene environment, made reference to ESSL instead of BLAS
But none of it seems to make a difference. It always fails at the exact same step.
This "libmpichf90-gcc.so.8" warning seems a bit odd to me and I am not sure if it has anything to do with the undefined reference error. I created a symlink to this library and added it to "LD_LIBRARY_PATH" as well, but then it failed with the message "undefined symbol: _cnkspi_MemoryRegionCacheLastAccessedElementNumber" from "libpami-gcc.so", where PAMI is a lower-level messaging API by IBM. Also, "cnkspi" sounds far too low level, because "CNK" is the kernel on the compute nodes and "SPI" is the implementation that allows communication with that kernel. I added a linker flag "-Wl,-rpath" but I guess it only makes things uglier.
bgqdev-fen1-$ readelf -d nektar/build-gcc/ThirdParty/dist/lib/libblas.so | grep NEEDED
0x0000000000000001 (NEEDED) Shared library: [libmpichf90-gcc.so.8]
0x0000000000000001 (NEEDED) Shared library: [libmpich-gcc.so.8]
0x0000000000000001 (NEEDED) Shared library: [libopa-gcc.so.0]
0x0000000000000001 (NEEDED) Shared library: [libmpl-gcc.so.1]
0x0000000000000001 (NEEDED) Shared library: [libpami-gcc.so]
0x0000000000000001 (NEEDED) Shared library: [librt.so.1]
0x0000000000000001 (NEEDED) Shared library: [libpthread.so.0]
0x0000000000000001 (NEEDED) Shared library: [libstdc++.so.6]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x0000000000000001 (NEEDED) Shared library: [libnss_files.so.2]
0x0000000000000001 (NEEDED) Shared library: [libnss_dns.so.2]
0x0000000000000001 (NEEDED) Shared library: [libresolv.so.2]
0x0000000000000001 (NEEDED) Shared library: [libgfortran.so.3]
0x0000000000000001 (NEEDED) Shared library: [libm.so.6]
xx As an alternative, I switched to static linking. I initially changed "NEKTAR_LIBRARY_TYPE" to "STATIC" in the "CMakeLists.txt".
This is the one area where I have a number of differences to you. I'm not sure that switching to static linking is likely to make much difference (although I could be wrong) however I note that your blas library seems to require various mpich libraries. My libblas.so library only lists libm.so.6 and libgfortran.so.3 as "NEEDED". Can you also provide the RPATH value that you get from readelf -d for this library...
If you could also provide your output of readelf -d for library/LibUtilities/libLibUtilities.so, that would be useful. My libLibUtilities.so needs a few boost libraries as well as libz, libblas, liblapack, libmpich, libpthread, libgcc_s libc, libstdc++ and libm.
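For comparing the NEEDED entries between two systems, it may help to reduce the readelf -d output to just the library names. This is a minimal sketch, assuming the readelf output format shown above; needed_libs is a hypothetical helper name.

```shell
#!/bin/sh
# needed_libs
# Reads `readelf -d` output on stdin and prints the library name from each
# (NEEDED) entry, one per line, in the order they appear.
needed_libs() {
    sed -n 's/.*(NEEDED).*\[\([^]]*\)].*/\1/p'
}

# Sample entries taken from the libblas.so output in this thread.
printf '%s\n' \
    ' 0x0000000000000001 (NEEDED) Shared library: [libmpichf90-gcc.so.8]' \
    ' 0x0000000000000001 (NEEDED) Shared library: [libgfortran.so.3]' \
    ' 0x0000000000000001 (NEEDED) Shared library: [libm.so.6]' |
    needed_libs
# prints:
# libmpichf90-gcc.so.8
# libgfortran.so.3
# libm.so.6
```

Running readelf -d libblas.so | needed_libs on both machines and comparing the two lists would show directly which dependencies (here the MPI-related ones) appear only on the Blue Gene Q build.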
xx It seems that some of the ThirdParty libraries are configured with the assumption of shared objects so I changed them as well. For instance, boost is configured with options "link=shared" and "runtime-link=shared" which I set to static. I can see all required boost libs are successfully compiled and written to "build/ThirdParty/dist/lib".
Now this is the cmake command:
cmake $src \
-DCMAKE_INSTALL_PREFIX=$prf \
-DNEKTAR_USE_MPI=ON \
-DNEKTAR_USE_SYSTEM_BLAS_LAPACK=ON \
-DNATIVE_BLAS:FILEPATH=${SCINET_LAPACK_LIB}/libblas.a \
-DNATIVE_BLAS_LIB_DIR:FILEPATH=${SCINET_LAPACK_LIB} \
-DNATIVE_LAPACK:FILEPATH=${SCINET_LAPACK_LIB}/liblapack.a \
-DNATIVE_LAPACK_LIB_DIR:FILEPATH=${SCINET_LAPACK_LIB} \
-DNEKTAR_USE_FFTW=ON \
-DFFTW_INCLUDE_DIR=${SCINET_FFTW_INC} \
-DFFTW_LIBRARY=${SCINET_FFTW_LIB}/libfftw3.a \
-DBoost_NO_SYSTEM_PATHS:BOOL=TRUE \
-DZLIB_INCLUDE_DIR:PATH=${SCINET_ZLIB_INC} \
-DZLIB_LIBRARY:FILEPATH=${SCINET_ZLIB_LIB}/libz.a
The issue now is that the installer seems to ignore "-DBoost_NO_SYSTEM_PATHS:BOOL=TRUE" and searches locations other than "BOOST_ROOT", which I set to "nektar/build-gcc/dist".
See for instance:
[ 5%] Building CXX object library/LibUtilities/CMakeFiles/LibUtilities.dir/BasicUtils/ArrayEqualityComparison.cpp.o
cd /scinet/bgq/Applications/nektar/nektar++-4.4.1/build-gcc-static/library/LibUtilities && /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpicxx -DLIB_UTILITIES_EXPORTS -DNEKTAR_MEMORY_POOL_ENABLED -DNEKTAR_USE_MPI -DNEKTAR_USING_BLAS -DNEKTAR_USING_LAPACK -DNEKTAR_VERSION=\"4.4.1\" -DTIXML_USE_STL -O3 -DNDEBUG -Wall -Wno-deprecated -Wno-sign-compare -DNEKTAR_RELEASE -isystem /scinet/bgq/Applications/nektar/nektar++-4.4.1/build-gcc-static/ThirdParty/dist/include -isystem /scinet/bgq/Libraries/zlib-1.2.7-gcc4.8.1/include -isystem /scinet/bgq/Applications/nektar/nektar++-4.4.1/ThirdParty/dist/include -isystem /scinet/bgq/Libraries/fftw-3.3.5-gcc/include -I/scinet/bgq/Applications/nektar/nektar++-4.4.1 -I/scinet/bgq/Applications/nektar/nektar++-4.4.1/library -o CMakeFiles/LibUtilities.dir/BasicUtils/ArrayEqualityComparison.cpp.o -c /scinet/bgq/Applications/nektar/nektar++-4.4.1/library/LibUtilities/BasicUtils/ArrayEqualityComparison.cpp
In file included from /bgsys/linux/ionfloor/usr/include/boost/config.hpp:57:0,
from /bgsys/linux/ionfloor/usr/include/boost/cstdint.hpp:26,
from /scinet/bgq/Applications/nektar/nektar++-4.4.1/library/LibUtilities/BasicConst/NektarUnivTypeDefs.hpp:40,
So the main question is: Why does it check "/usr/include/boost" when "cstdint.hpp" already exists in "build/dist/include/boost/"?
bgqdev-fen1-$ ls build-gcc/dist/include/boost/cstdint.hpp -l
-rw-r--r-- 1 fertinaz scinet 18017 Nov 14 19:00 build-gcc/dist/include/boost/cstdint.hpp
This is how it finally fails: /bgsys/linux/ionfloor/usr/include/boost/archive/iterators/binary_from_base64.hpp:52:9: warning: narrowing conversion of ‘-1’ from ‘int’ to ‘const char’ inside { } is ill-formed in C++11 [-Wnarrowing] make[2]: *** [library/LibUtilities/CMakeFiles/LibUtilities.dir/BasicUtils/CompressData.cpp.o] Error 1
It doesn't help to change the boost code from "const char lookup_table" to "signed char lookup_table" because then "switch-case" statement that returns the endianness information fails in the following file: "nektar/library/LibUtilities/BasicUtils/CompressData.cpp"
As you can guess, I disabled the switch-case block, and returned the value, but it fails anyway...
Sorry for the long message, hope you could follow. I've run out of ideas and any suggestion is highly appreciated....
// Fatih
I'd be inclined to stick with the third party boost and lapack and see if we can find a solution to that. I can't see what C/C++ compilers you're using but have you tried forcing the use of mpicc and mpic++ as shown when running cmake in the instructions at https://www.nektar.info/nektar-on-mira-cluster/? (in fact, I see above that in your log output for the build command for ArrayEqualityComparison.cpp, it looks like it's using mpicxx)
I'm not sure why libblas.so is linking in libmpichf90-gcc.so.8 but I'm assuming this is the core of the problem. Can you confirm what settings you're using to get the additional logging output that you're showing (which shows the build commands) - is it just -vv? I can then try to run the same and see if I can provide any further suggestions. I'm not clear at the moment but I'm assuming the undefined reference errors are a result of trying to link in libblas.so and that library itself having an undefined reference to libmpichf90. It might be that the rpath settings can be modified to take account of this.
Can you provide the output of running ldd on libblas.so, liblapack.so and libLibUtilities.so?
Thanks,
Jeremy
On Sun, Oct 14, 2018 at 7:42 AM Jeremy Cohen <jeremy.cohen@imperial.ac.uk> wrote:
Hi Amitvikram,
I would certainly try Chris's suggestion. However, something else to check is where you're getting the third party downloads from.
If you take a clean Nektar++ source tree and place the standard netlib lapack-3.7.0.tgz source file that the build system downloads into $NEKTAR_HOME/ThirdParty (i.e. the download from http://www.netlib.org/lapack/lapack-3.7.0.tgz), the build should proceed successfully.
It looks like the lapack tar file that you're using may already have some build artefacts in it - did you tar the content from $NEKTAR_HOME/build/ThirdParty/lapack-3.7.0 into a lapack-3.7.0.tgz file, or are you working with the standard .tgz file from the netlib.org site?
Cheers, Jeremy
On 13 Oct 2018, at 21:19, Chris Cantwell <c.cantwell@imperial.ac.uk> wrote:
Hi Amitvikram,
Some sites block non-SSL enabled HTTP traffic, returning a webpage reporting the error rather than the actual file (hence the hash mismatch).
You could try turning on the THIRDPARTY_USE_SSL option to see if that is allowed.
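A quick way to tell whether a download is a real archive or an error page is to look at its first two bytes, since gzip files start with 0x1f 0x8b. The following is a minimal sketch; is_gzip is a hypothetical helper, and the file name in the usage comment is simply the lapack tarball discussed in this thread.

```shell
#!/bin/sh
# is_gzip FILE
# Succeeds (exit status 0) if FILE starts with the gzip magic bytes 1f 8b,
# and fails otherwise, e.g. when the "archive" is really an HTML error page.
is_gzip() {
    [ "$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')" = "1f8b" ]
}

# Intended use:
#   is_gzip ThirdParty/lapack-3.7.0.tgz || echo "not a gzip archive, check the download"
```

A file that fails this check will also fail the hash comparison, so it is worth opening it in a text viewer to see whether it contains a block or proxy message instead of the expected source.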
Cheers, Chris
On Fri, 12 Oct 2018 15:15:10 -0400, Amitvikram Dutta <amitvdutta23@gmail.com> wrote:
Hi all,
I had to use a slight workaround because the platform I was compiling on could not download the third-party files correctly and would always give a hash mismatch error. I downloaded the third-party files on my home machine and re-uploaded them into the $NEKTAR_HOME folder. As a result, when I begin the build process the ThirdParty folder is populated with the zipped versions of the third-party software. Is this what might be causing the problem? Is there any way to edit the cmake file to get the compilation process working correctly?
*Amitvikram Dutta*
Graduate Research Assistant
Fluid Mechanics Research Lab
Multi-Physics Interaction Lab
University of Waterloo
--
Chris Cantwell
Imperial College London
South Kensington Campus
London SW7 2AZ
Email: c.cantwell@imperial.ac.uk
www.imperial.ac.uk/people/c.cantwell
<CMakeLists.txt>
Hi Fatih,
Glad you have been able to solve your problem. Would you be able to provide a patch with your changes so we can include them in the code for future users of the architecture?
Kind regards,
Chris
On Thu, 21 Feb 2019 19:36:22 +0000, Jeremy Cohen <jeremy.cohen@imperial.ac.uk> wrote:
Hi Fatih,
Many thanks for the feedback on how you solved this issue - it's good to hear that it's now resolved.
Kind regards,
Jeremy
On 21 Feb 2019, at 17:58, Fatih Ertinaz <fertinaz@gmail.com> wrote:
Hello Jeremy
I think this has been resolved now.
I had never seen the issue ticket you sent; however, I'm impressed by how similarly the ticket owner describes the problem and looks for a workaround.
So thank you for sharing it, because it inspired me a lot. I didn't find the exact solution there, since mfem's implementation is different from Nektar's. However, at some point it made me realize that the code responsible for the name transformation is defined (in nektar++-4.4.1) in $NEK_PATH/library/LibUtilities/LinearAlgebra/TransF77.hpp
I basically added PowerPC to the macro, and that seems to be the solution. There was one single Lapack function call in "DriverArnoldiModified.cpp" with an underscore that needed an explicit modification, but other than that these two things were the only source modifications I had to make.
I had to make lots of changes to the CMake files and eventually decided to compile the Third-Party libraries completely separately from scratch.
Thank you for your help.
// Fatih
On Wed, Feb 20, 2019 at 10:14 AM Jeremy Cohen <jeremy.cohen@imperial.ac.uk> wrote:
Hi Fatih,
As you suggest, it looks like this is a name mangling problem and I suspect it must be something specific to the BGQ platform. There are others on the nektar list who are much more experienced with CMake than me and have more knowledge of the history of the codebase so maybe someone else can explain the now missing "NEKTAR_LIB_UTILITIES_LINEAR_ALGEBRA_DGEMM_OVERRIDE_HPP" preprocessor directive that you mention and the missing DgemmOverride.hpp header.
Aside from this, I don't know if you've seen the following: https://github.com/mfem/mfem/issues/397 While this is for a completely unrelated library, it looks like it's describing a similar problem to what you're experiencing. That thread also links to the hypre repository, providing an example of using "configureable macros" for name mangling. Don't know if this is of any help but it might be worth investigating.
Regards,
Jeremy
On 12 Feb 2019, at 22:03, Fatih Ertinaz <fertinaz@gmail.com> wrote:
Hello Jeremy
I appreciate your detailed response and sorry for my late reply. I came back to this issue over the weekend, made certain changes and achieved some progress, so I wanted to share the current status.
I am convinced that the problem is related to linking BLAS and LAPACK, and that it is probably specific to this platform -- Blue Gene Q -- because I have managed to run nektar successfully on many different platforms and never encountered this issue. I am using the latest git repo for nektar by the way.
I compiled some of the third-party libraries (boost, scotch and blas & lapack) separately under the directory I created, "nektar/ThirdParty_compiled". This resolved the earlier problems with boost. Also, when I check the blas & lapack functions that are used by nektar, I can find their references in their libraries:
bgqdev-fen1-$ nm libblas.a | grep -i dgemm
dgemm.f.o:
0000000000000000 D dgemm
bgqdev-fen1-$ nm liblapack.a | grep -i dgeev
dgeev.f.o:
0000000000000000 D dgeev
dgeevx.f.o:
0000000000000000 D dgeevx
As you can see, those are static libs, because when shared objects are used, cmake doesn't detect BLAS (don't know why -- maybe BGQ) even though full paths are provided. With static libs, BLAS at least seems to be detected by cmake (it prints "A library with BLAS API found" -- see below).
On the other hand, I compiled boost with shared libs, and cmake recognizes them correctly.
I also want to emphasize that mangled names don't appear in the BLAS-LAPACK libs. So for instance "dgemm_" doesn't exist, and this is why the installation fails.
This is from cmake -- I hacked cmake to seek "dgemm" as well, but it is also not found:
-- Looking for dgemm_
-- Looking for dgemm_ - not found
-- Looking for Fortran sgemm
-- Looking for Fortran sgemm - found
-- A library with BLAS API found.
-- Looking for Fortran cheev
-- Looking for Fortran cheev - found
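The manual nm check described in this message can be wrapped in a small helper that reports how a routine is exported. This is only a sketch: `symbol_style` is a hypothetical name, not part of Nektar++ or binutils, and it assumes the usual three-column `nm` layout (address, type, name) for defined symbols.

```shell
# Classify how a Fortran routine is exported, given `nm` output on stdin.
# Prints "underscore" if NAME_ is defined, "plain" if only NAME is defined,
# and "missing" if neither is defined. (Hypothetical helper for illustration.)
symbol_style() {
    name=$1
    awk -v n="$name" '
        # Defined symbols have three fields: address, type, name.
        # Undefined ("U") symbols and archive member headers have fewer.
        NF == 3 && $3 == (n "_") { us = 1 }
        NF == 3 && $3 == n       { plain = 1 }
        END {
            if (us) print "underscore"
            else if (plain) print "plain"
            else print "missing"
        }'
}

# Example usage (library path is a placeholder):
#   nm libblas.a | symbol_style dgemm
```

If this prints "plain" while the linker asks for `dgemm_`, the library and the calling code disagree about the trailing underscore, which matches the undefined references reported in this thread.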
Before going forward, I'd like to make a suggestion for these types of checks. CMake documentation says, "Prefer using CHECK_SYMBOL_EXISTS instead of this module..." referring to CHECK_FUNCTION_EXISTS which is used by Nektar at the moment. There are certain types of implementations which cannot be detected by CHECK_FUNCTION_EXISTS. I didn't test it though. For further details: https://cmake.org/cmake/help/v3.8/module/CheckFunctionExists.html
Anyway this is where installation fails:
[ 26%] Built target LocalRegions
Linking CXX executable NekMesh
/ess01/homebgq/scinet/bgq/compilers/gcc/4.8.1/bin/../lib/gcc/powerpc64-bgq-linux/4.8.1/../../../../powerpc64-bgq-linux/bin/ld: warning: libboost_atomic.so.1.57.0, needed by /scinet/bgq/Applications/nektar/ThirdParty_compiled/boost_1_57_0/install/lib/libboost_thread.so, not found (try using -rpath or -rpath-link)
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*) [clone .part.27]':
NodeOpti.cpp:(.text+0x1784): undefined reference to `dgeev_'
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `std::ctype<char>::widen(char) const [clone .part.33]':
NodeOpti.cpp:(.text+0x1e6c): undefined reference to `dgeev_'
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `boost::exception_detail::error_info_injector<std::domain_error>::~error_info_injector()':
NodeOpti.cpp:(.text+0x25b4): undefined reference to `dgemm_'
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `boost::exception_detail::error_info_injector<std::overflow_error>::~error_info_injector()':
NodeOpti.cpp:(.text+0x44a4): undefined reference to `dgemm_'
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `boost::system::system_error::what() const':
NodeOpti.cpp:(.text._ZN6Nektar9Utilities8NodeOpti13GetFunctionalILi2EEEdRdb[_ZN6Nektar9Utilities8NodeOpti13GetFunctionalILi2EEEdRdb]+0x2e8): undefined reference to `dgemm_'
CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/NodeOpti.cpp.o: In function `boost::io::basic_altstringbuf<char, std::char_traits<char>, std::allocator<char> >::~basic_altstringbuf()':
NodeOpti.cpp:(.text._ZN6Nektar9Utilities8NodeOpti13GetFunctionalILi3EEEdRdb[_ZN6Nektar9Utilities8NodeOpti13GetFunctionalILi3EEEdRdb]+0x318): undefined reference to `dgemm_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dtptrs_'
../../library/LocalRegions/libLocalRegions.so.4.5.0: undefined reference to `daxpy_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpbtrf_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dsptrf_'
../../library/LocalRegions/libLocalRegions.so.4.5.0: undefined reference to `dgemv_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgbmv_'
../../library/LocalRegions/libLocalRegions.so.4.5.0: undefined reference to `dscal_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dtpmv_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgetrf_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgbtrf_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpbtrs_'
../../library/StdRegions/libStdRegions.so.4.5.0: undefined reference to `dcopy_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dsptrs_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgetrs_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpptrf_'
../../library/LocalRegions/libLocalRegions.so.4.5.0: undefined reference to `ddot_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dspmv_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dsptri_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgetri_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpptrs_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgbtrs_'
collect2: error: ld returned 1 exit status
make[2]: *** [utilities/NekMesh/NekMesh] Error 1
make[1]: *** [utilities/NekMesh/CMakeFiles/NekMesh.dir/all] Error 2
make: *** [all] Error 2
However, this is how dgemm is called in the source:
grep -irn "dgemm" *
nektar/utilities/NekMesh/ProcessModules/ProcessVarOpti/Evaluator.hxx: Blas::Dgemm('N', 'N', pts, DIM * nElmt, ptsStd, 1.0,
By the way, as suggested in earlier messages in this thread, I enabled both -DTHIRDPARTY_BUILD_BLAS_LAPACK and -DNEKTAR_USE_SYSTEM_BLAS_LAPACK. Just in case, I've attached the CMakeLists.txt file as well for the other settings.
So the question is, why is it trying to reference the mangled names? Is it cmake causing the mess or the compiler? And how can this be tackled? When I grep "dgemm_", I get nothing except the object files.
Moreover, in earlier versions of Nektar (4.0.0 and 4.3.5 as far as I checked), there is a preprocessor definition called "NEKTAR_LIB_UTILITIES_LINEAR_ALGEBRA_DGEMM_OVERRIDE_HPP". This definition doesn't exist in the most recent git repo. However, I can still find the Blas.hpp for the wrapper functions under "LibUtilities/LinearAlgebra". Also, "DgemmOverride.hpp" doesn't exist either. So could it be that portability was broken at some point?
Thanks very much
// Fatih
On Mon, Nov 19, 2018 at 8:12 AM Jeremy Cohen <jeremy.cohen@imperial.ac.uk> wrote: Hi Fatih,
Sorry for the delay in getting back to you on this. I'm afraid I don't have an immediate answer to the problem you're experiencing but I've done some investigations into the issues you're having so hopefully some feedback on these may provide some helpful information:
On 15 Nov 2018, at 06:35, Fatih Ertinaz <fertinaz@gmail.com> wrote:
Hello everyone,
I am working on the same task as Amitvikram, on the same cluster and currently having the exact same problem: undefined reference to lapack libs even though they are compiled successfully.
I've read through this thread as well as some others, so here is a brief summary about what I've done so far before asking some questions.
-- System info: Using cmake-2.8.12, cross-compiled gcc-4.8 and mpich-2 on a Blue Gene Q cluster.
-- Nektar version: Decided to use the git repo.
While I can't replicate the environment you're working in directly, I've set up a clean Ubuntu linux container with only a very basic initial set of packages installed. (All of my build attempts using the details provided below complete successfully.)
Within my base container, I've installed gcc-4.8 from packages (gcc-5 is the default version for the Ubuntu version I'm using - 16.04), built and installed cmake 2.8.12 from source, and built and installed MPICH2 (1.5) from source.
I'm working with Nektar++ from source, using the master branch.
-- Added "-dynamic" flag to the "CMakeLists.txt" as it was suggested here: https://www.nektar.info/nektar-on-mira-cluster/
-- Boost: I initially used the system-installed boost but then decided to stick to the third-party version shipped with nektar. This is because some of the required libs (for instance boost_iostreams) weren't part of the central installation. To deal with that, I first set up a partial build by referencing each individual library file explicitly in the cmake command. In fact, it seems to build the required libs successfully but later fails during the nektar compilation. I think it messes up the environment and basically links to the wrong files. So anyway, I am using "ThirdParty/boost_1_57_0".
-- Lapack: The reason that I am not using system lapack is simply because cmake says "dgemm_" is not found in the system blas version. Therefore, I am compiling the "ThirdParty/Lapack-3.7.0" which I downloaded from "http://www.netlib.org/lapack/lapack-3.7.0.tgz".
Note that compilation fails with the same error even when I use ThirdParty/lapack.
I initially tried without using the dynamic flag but have subsequently tried with the -dynamic flag too.
I'm using ThirdParty boost 1.57 and ThirdParty lapack 3.7.0.
I'm also using ThirdParty Scotch. TinyXML and GSMPI are also built from source as ThirdParty dependencies during the Nektar++ build.
-- FFTW: Using system installed version.
I'm using a system installed FFTW from packages.
-- Download process: I cancelled the MD5 checks and am downloading with "wget", due to an ssl error similar to the one mentioned before. This is an easy workaround though, and probably has nothing to do with the error. I download all packages to "nektar/ThirdParty" and copy them to "nektar/build/ThirdParty" as well. The reason for this copy operation is that when nektar extracts the downloaded packages, I see that the uncompressed folders are somehow empty. I don't know if that's a cmake bug or a problem on my side. So that's why I download, extract and copy the third party sources to "nektar/build/ThirdParty" manually.
Following the discussion on the list with Amitvikram about building when there is no external Internet access to download third party dependencies, I wrote up some of the points that I made about downloading dependencies manually - if you hadn't already seen this, it's on the Nektar++ website at https://www.nektar.info/building-nektar-offline-deps/
-- CMake command:
cmake $src \
  -DCMAKE_INSTALL_PREFIX=$prf \
  -DNEKTAR_USE_MPI=ON \
  -DNEKTAR_USE_SYSTEM_BLAS_LAPACK=ON \
  -DNEKTAR_USE_FFTW=ON \
  -DFFTW_INCLUDE_DIR=${SCINET_FFTW_INC} \
  -DFFTW_LIBRARY=${SCINET_FFTW_LIB}/libfftw3.a \
  -DTHIRDPARTY_BUILD_BLAS_LAPACK=ON
As you can see, I enabled both "-DNEKTAR_USE_SYSTEM_BLAS_LAPACK" and "-DTHIRDPARTY_BUILD_BLAS_LAPACK" as suggested; however, this didn't seem to make a difference for me. Compilation fails at the same step whether both are enabled or not.
I tried configuring using a similar cmake command to that which you've shown here - the only difference for me was that I didn't need to set the FFTW include directory since my FFTW install is in the system include path. I explicitly specified the path to FFTW_LIBRARY although this shouldn't be necessary since the library is, again, in the standard system library path.
-- Build process for Third-Party: In general they are compiled without any errors. In particular, I checked cmake files for each package and Lapack is configured with "-DBUILD_SHARED_LIBS:STRING=ON". I can see that objects are compiled with "-fPIC" option, it is in the cmake. However, "lapack/CMakeLists.txt" contains this line: "option(BUILD_SHARED_LIBS "Build shared libraries" OFF)" which I set to "ON" in my build script.
This is how libraries look in the "nektar/build/ThirdParty/dist/lib" directory after compiling ThirdParty libraries:
bgqdev-fen1-$ ls nektar/build/ThirdParty/dist/lib/
cmake                          libboost_program_options.so         libgsmpi.a           libtinyxml.a
libblas.so                     libboost_program_options.so.1.57.0  liblapack.so         libxxt.a
libblas.so.3                   libboost_regex.so                   liblapack.so.3       libz.a
libblas.so.3.7.0               libboost_regex.so.1.57.0            liblapack.so.3.7.0   libz.so
libboost_filesystem.so         libboost_system.so                  libscotch.a          libz.so.1
libboost_filesystem.so.1.57.0  libboost_system.so.1.57.0           libscotcherr.a       libz.so.1.2.7
libboost_iostreams.so          libboost_thread.so                  libscotcherrexit.a   pkgconfig
libboost_iostreams.so.1.57.0   libboost_thread.so.1.57.0           libscotchmetis.a
This folder is about 1.5GB by the way.
I have exactly the same contents in my ThirdParty/dist/lib directory after building the third party dependencies. The resulting files are nowhere near as large as yours. I assume the very large size of the folder has something to do with the static libraries being very large, although I'm not sure why they would be so big.
I think the point you make about building of shared libraries being set to OFF in the CMakeLists.txt file for lapack shouldn't be an issue. If you look in $src/cmake/ThirdPartyBlasLapack.cmake, you should see in the EXTERNAL_PROJECT_ADD command that it is configuring lapack using CMake and specifying -DBUILD_SHARED_LIBS:STRING=ON. You should also be able to verify that lapack was, indeed, configured with this parameter by looking in the CMakeCache.txt file in $build/ThirdParty/lapack-3.7.0/.
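The CMakeCache.txt check suggested above can be scripted. The following is a rough sketch that relies only on the standard cache entry layout `NAME:TYPE=VALUE`; the helper name `cache_value` is made up for illustration and `$build` stands for your own build directory.

```shell
# Print the value of one entry from CMakeCache.txt content read on stdin.
# Cache entries look like NAME:TYPE=VALUE; comment lines start with // or #.
# (Hypothetical helper for illustration.)
cache_value() {
    awk -F= -v key="$1" '
        {
            split($1, parts, ":")
            if (parts[1] == key) {
                sub(/^[^=]*=/, "")   # drop the NAME:TYPE= prefix, keep the value
                print
            }
        }'
}

# Example usage ($build is a placeholder for the build directory):
#   cache_value BUILD_SHARED_LIBS < "$build/ThirdParty/lapack-3.7.0/CMakeCache.txt"
```

An output of `ON` would indicate that the lapack sub-build was configured for shared libraries regardless of the default in its own CMakeLists.txt.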
However, the "nektar/build/ThirdParty/dist/include" folder doesn't have lapack related headers:
bgqdev-fen1-$ ls
boost scotchf.h scotch.h tinystr.h tinyxml.h zconf.h zlib.h
bgqdev-fen1-$ pwd
/scinet/bgq/Applications/nektar/nektar/build/ThirdParty/dist/include
I also see exactly the same in my include folder, there are no lapack related headers.
Also, I can share the initial parts of the lapack build - in this version I tried to reference the system blas for the lapack installation:
[ 6%] Performing configure step for 'lapack-3.7.0'
cd /scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/lapack-3.7.0 && /gpfs/home/scinet/bgq/tools/cmake/2.8.12.1/bin/cmake -G "Unix Makefiles" -DCMAKE_Fortran_COMPILER:FILEPATH=/scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -DCMAKE_INSTALL_PREFIX:PATH=/scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/dist -DCMAKE_INSTALL_LIBDIR:PATH=/scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/dist/lib -DBUILD_SHARED_LIBS:STRING=ON -DBUILD_TESTING:STRING=OFF /scinet/bgq/Applications/nektar/nektar/ThirdParty/lapack-3.7.0
Re-run cmake no build system arguments
-- Setting build type to 'Release' as none was specified.
-- The Fortran compiler identification is GNU
-- Check for working Fortran compiler: /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90
-- Check for working Fortran compiler: /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -- works
-- Detecting Fortran compiler ABI info
-- Detecting Fortran compiler ABI info - done
-- Checking whether /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 supports Fortran 90
-- Checking whether /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 supports Fortran 90 -- yes
-- Looking for Python greater than 2.6 -
-- Could NOT find PythonInterp: Found unsuitable version "2.6.6", but required is at least "2.7" (found /usr/bin/python2)
-- No suitable Python version found, so skipping summary tests.
-- Reducing RELEASE optimization level to O2
-- Looking for Fortran NONE - found
-- Looking for Fortran INT_CPU_TIME - found
-- Looking for Fortran EXT_ETIME - not found
-- Looking for Fortran EXT_ETIME_ - not found
-- Looking for Fortran INT_ETIME - found
-- --> Will use second_INT_ETIME.f and dsecnd_INT_ETIME.f as timing function.
-- Using supplied NETLIB BLAS implementation
-- Using supplied NETLIB LAPACK implementation
-- Building Single Precision
-- Building Double Precision
-- Building Complex Precision
-- Building Double Complex Precision
-- BUILD TESTING : OFF
-- Configuring done
-- Generating done
-- Build files have been written to: /scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/lapack-3.7.0
Again, I see exactly the same output for configuration of lapack. However, when I initially ran this, the build system was picking up my standard C/C++/Fortran compilers so it was using gfortran rather than the MPI version. I reconfigured/rebuilt from scratch specifically telling the build system to use mpicc and mpic++ and setting -DCMAKE_Fortran_COMPILER to point to mpif90, after this I see the same as you have shown above and build again completes successfully.
Additionally, I can see "dgemm" in the log.make:
bgqdev-fen1-$ grep -rn "dgemm" nektar/build-gcc/log.make.2
13756:[ 3%] Building Fortran object BLAS/SRC/CMakeFiles/blas.dir/dgemm.f.o
13757:cd /scinet/bgq/Applications/nektar/nektar/build-gcc/ThirdParty/lapack-3.7.0/BLAS/SRC && /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -Dblas_EXPORTS -O2 -fPIC -c /scinet/bgq/Applications/nektar/nektar/ThirdParty/lapack-3.7.0/BLAS/SRC/dgemm.f -o CMakeFiles/blas.dir/dgemm.f.o
14018:/scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpif90 -fPIC -O2 -Wl,-rpath=/bgsys/drivers/ppcfloor/comm/lib/libmpichf90-gcc.so.8 -shared -Wl,-soname,libblas.so.3 -o ../../lib/libblas.so.3.7.0 CMakeFiles/blas.dir/isamax.f.o CMakeFiles/blas.dir/sasum.f.o CMakeFiles/blas.dir/saxpy.f.o CMakeFiles/blas.dir/scopy.f.o ............... CMakeFiles/blas.dir/dgemm.f.o
This is the part that compilation fails:
[ 34%] Building CXX object utilities/NekMesh/CMakeFiles/NekMesh.dir/ProcessModules/ProcessVarOpti/ElUtil.cpp.o
.........
.........
/........./bgq/compilers/gcc/4.8.1/bin/../lib/gcc/powerpc64-bgq-linux/4.8.1/../../../../powerpc64-bgq-linux/bin/ld: warning: libmpichf90-gcc.so.8, needed by /scinet/bgq/Applications/nektar/nektar/build/ThirdParty/dist/lib/libblas.so, not found (try using -rpath or -rpath-link)
.........
NodeOpti.cpp:(.text+0x1784): undefined reference to `dgeev_'
.........
NodeOpti.cpp:(.text+0x25b4): undefined reference to `dgemm_'
.........
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgetri_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dpptrs_'
../../library/LibUtilities/libLibUtilities.so.4.5.0: undefined reference to `dgbtrs_'
collect2: error: ld returned 1 exit status
make[2]: *** [utilities/NekMesh/NekMesh] Error 1
I've tried many things:
xx enabled the configure option "build_shared_libs" in CMakeLists.txt in "ThirdParty/lapack"
xx made a copy of "make.inc.example" in ThirdParty/lapack and reduced optimization levels
xx since this is a Blue Gene environment, made reference to ESSL instead of BLAS
But none of it seems to make a difference. It always fails at the exact same step.
This "libmpichf90-gcc.so.8" warning seems a bit odd to me and I am not sure if it has anything to do with the undefined reference error. I created a symlink to this library and added it to "LD_LIBRARY_PATH" as well, but then it failed with the following message: "undefined symbol: _cnkspi_MemoryRegionCacheLastAccessedElementNumber" from "libpami-gcc.so", where PAMI is a lower-level messaging API by IBM. Also, "cnkspi" sounds far too low level, because "CNK" is the kernel on the compute nodes and "SPI" is the implementation that allows communication with that kernel. I added a linker flag "-Wl,-rpath" but I guess it only makes things uglier.
bgqdev-fen1-$ readelf -d nektar/build-gcc/ThirdParty/dist/lib/libblas.so | grep NEEDED
0x0000000000000001 (NEEDED) Shared library: [libmpichf90-gcc.so.8]
0x0000000000000001 (NEEDED) Shared library: [libmpich-gcc.so.8]
0x0000000000000001 (NEEDED) Shared library: [libopa-gcc.so.0]
0x0000000000000001 (NEEDED) Shared library: [libmpl-gcc.so.1]
0x0000000000000001 (NEEDED) Shared library: [libpami-gcc.so]
0x0000000000000001 (NEEDED) Shared library: [librt.so.1]
0x0000000000000001 (NEEDED) Shared library: [libpthread.so.0]
0x0000000000000001 (NEEDED) Shared library: [libstdc++.so.6]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x0000000000000001 (NEEDED) Shared library: [libnss_files.so.2]
0x0000000000000001 (NEEDED) Shared library: [libnss_dns.so.2]
0x0000000000000001 (NEEDED) Shared library: [libresolv.so.2]
0x0000000000000001 (NEEDED) Shared library: [libgfortran.so.3]
0x0000000000000001 (NEEDED) Shared library: [libm.so.6]
xx As an alternative, I switched to static linking. I initially changed "NEKTAR_LIBRARY_TYPE" to "STATIC" in the "CMakeLists.txt".
This is the one area where I have a number of differences to you. I'm not sure that switching to static linking is likely to make much difference (although I could be wrong) however I note that your blas library seems to require various mpich libraries. My libblas.so library only lists libm.so.6 and libgfortran.so.3 as "NEEDED". Can you also provide the RPATH value that you get from readelf -d for this library...
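To compare the dependency lists of two builds, it may help to reduce the `readelf -d` output to bare library names first. A minimal sketch, assuming the "(NEEDED) Shared library: [name]" layout shown in this thread; `needed_libs` is a hypothetical helper name.

```shell
# Print only the library names from the NEEDED entries of `readelf -d`
# output read on stdin. Other dynamic-section entries are ignored.
# (Hypothetical helper for illustration.)
needed_libs() {
    sed -n 's/.*(NEEDED).*\[\(.*\)].*/\1/p'
}

# Example usage (library path is a placeholder):
#   readelf -d libblas.so | needed_libs | sort > needed.txt
# Running the same command on a second system and diffing the two files
# shows which dependencies differ.
```

On the reference system described above, the list would be expected to contain only libm.so.6 and libgfortran.so.3, so any MPICH or PAMI entries stand out immediately.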
If you could also provide your output of readelf -d for library/LibUtilities/libLibUtilities.so, that would be useful. My libLibUtilities.so needs a few boost libraries as well as libz, libblas, liblapack, libmpich, libpthread, libgcc_s, libc, libstdc++ and libm.
xx It seems that some of the ThirdParty libraries are configured with the assumption of shared objects so I changed them as well. For instance, boost is configured with options "link=shared" and "runtime-link=shared" which I set to static. I can see all required boost libs are successfully compiled and written to "build/ThirdParty/dist/lib".
Now this is the cmake command:
cmake $src \
  -DCMAKE_INSTALL_PREFIX=$prf \
  -DNEKTAR_USE_MPI=ON \
  -DNEKTAR_USE_SYSTEM_BLAS_LAPACK=ON \
  -DNATIVE_BLAS:FILEPATH=${SCINET_LAPACK_LIB}/libblas.a \
  -DNATIVE_BLAS_LIB_DIR:FILEPATH=${SCINET_LAPACK_LIB} \
  -DNATIVE_LAPACK:FILEPATH=${SCINET_LAPACK_LIB}/liblapack.a \
  -DNATIVE_LAPACK_LIB_DIR:FILEPATH=${SCINET_LAPACK_LIB} \
  -DNEKTAR_USE_FFTW=ON \
  -DFFTW_INCLUDE_DIR=${SCINET_FFTW_INC} \
  -DFFTW_LIBRARY=${SCINET_FFTW_LIB}/libfftw3.a \
  -DBoost_NO_SYSTEM_PATHS:BOOL=TRUE \
  -DZLIB_INCLUDE_DIR:PATH=${SCINET_ZLIB_INC} \
  -DZLIB_LIBRARY:FILEPATH=${SCINET_ZLIB_LIB}/libz.a
The issue now is that the installer seems to ignore "-DBoost_NO_SYSTEM_PATHS:BOOL=TRUE" and searches locations other than "BOOST_ROOT", which I set to "nektar/build-gcc/dist".
See for instance:
[ 5%] Building CXX object library/LibUtilities/CMakeFiles/LibUtilities.dir/BasicUtils/ArrayEqualityComparison.cpp.o
cd /scinet/bgq/Applications/nektar/nektar++-4.4.1/build-gcc-static/library/LibUtilities && /scinet/bgq/compilers/gcc/4.8.1/mpi/bin/mpicxx -DLIB_UTILITIES_EXPORTS -DNEKTAR_MEMORY_POOL_ENABLED -DNEKTAR_USE_MPI -DNEKTAR_USING_BLAS -DNEKTAR_USING_LAPACK -DNEKTAR_VERSION=\"4.4.1\" -DTIXML_USE_STL -O3 -DNDEBUG -Wall -Wno-deprecated -Wno-sign-compare -DNEKTAR_RELEASE -isystem /scinet/bgq/Applications/nektar/nektar++-4.4.1/build-gcc-static/ThirdParty/dist/include -isystem /scinet/bgq/Libraries/zlib-1.2.7-gcc4.8.1/include -isystem /scinet/bgq/Applications/nektar/nektar++-4.4.1/ThirdParty/dist/include -isystem /scinet/bgq/Libraries/fftw-3.3.5-gcc/include -I/scinet/bgq/Applications/nektar/nektar++-4.4.1 -I/scinet/bgq/Applications/nektar/nektar++-4.4.1/library -o CMakeFiles/LibUtilities.dir/BasicUtils/ArrayEqualityComparison.cpp.o -c /scinet/bgq/Applications/nektar/nektar++-4.4.1/library/LibUtilities/BasicUtils/ArrayEqualityComparison.cpp
In file included from /bgsys/linux/ionfloor/usr/include/boost/config.hpp:57:0,
from /bgsys/linux/ionfloor/usr/include/boost/cstdint.hpp:26,
from /scinet/bgq/Applications/nektar/nektar++-4.4.1/library/LibUtilities/BasicConst/NektarUnivTypeDefs.hpp:40,
So the main question is: Why does it check "/usr/include/boost" when "cstdint.hpp" already exists in "build/dist/include/boost/"?
bgqdev-fen1-$ ls build-gcc/dist/include/boost/cstdint.hpp -l
-rw-r--r-- 1 fertinaz scinet 18017 Nov 14 19:00 build-gcc/dist/include/boost/cstdint.hpp
This is how it finally fails:
/bgsys/linux/ionfloor/usr/include/boost/archive/iterators/binary_from_base64.hpp:52:9: warning: narrowing conversion of ‘-1’ from ‘int’ to ‘const char’ inside { } is ill-formed in C++11 [-Wnarrowing]
make[2]: *** [library/LibUtilities/CMakeFiles/LibUtilities.dir/BasicUtils/CompressData.cpp.o] Error 1
It doesn't help to change the boost code from "const char lookup_table" to "signed char lookup_table" because then "switch-case" statement that returns the endianness information fails in the following file: "nektar/library/LibUtilities/BasicUtils/CompressData.cpp"
As you can guess, I disabled the switch-case block, and returned the value, but it fails anyway...
Sorry for the long message, hope you could follow. I've run out of ideas and any suggestion is highly appreciated....
// Fatih
I'd be inclined to stick with the third party boost and lapack and see if we can find a solution to that. I can't see what C/C++ compilers you're using but have you tried forcing the use of mpicc and mpic++ as shown when running cmake in the instructions at https://www.nektar.info/nektar-on-mira-cluster/? (in fact, I see above that in your log output for the build command for ArrayEqualityComparison.cpp, it looks like it's using mpicxx)
I'm not sure why libblas.so is linking in libmpichf90-gcc.so.8 but I'm assuming this is the core of the problem. Can you confirm what settings you're using to get the additional logging output that you're showing (which shows the build commands) - is it just -vv? I can then try and run the same and see if I can provide any further suggestions. I'm not clear at the moment but I'm assuming the undefined reference errors are a result of trying to link in libblas.so and that library itself having an undefined reference to libmpichf90. It might be that the rpath settings can be modified to take account of this.
Can you provide the output of running ldd on libblas.so, liblapack.so and libLibUtilities.so?
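If it helps to compare the three libraries side by side, the ldd output can be reduced to a small table of which dependencies resolve and which do not. This is just a sketch in Python, assuming the usual "name => path (address)" layout that ldd prints on Linux; the helper names are invented for this example.

```python
import subprocess


def parse_ldd(output):
    """Map each dependency in `ldd` output to its resolved path (None if not found)."""
    deps = {}
    for line in output.splitlines():
        if "=>" not in line:
            # Entries without a lookup, such as the dynamic loader itself.
            continue
        name, _, target = line.strip().partition(" => ")
        target = target.strip()
        if target.startswith("not found"):
            deps[name] = None
        else:
            # Drop the trailing load address, e.g. "(0x00007f...)".
            deps[name] = target.split("(")[0].strip()
    return deps


def ldd(path):
    """Run `ldd` on a shared library and return its parsed dependencies."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=True
    )
    return parse_ldd(result.stdout)
```

Something like [name for name, path in ldd("libblas.so").items() if path is None] then lists the libraries that libblas.so needs but the loader cannot find, and a name containing "mpichf90" showing up in ldd("libblas.so") would confirm where that dependency comes from.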
Thanks,
Jeremy
On Sun, Oct 14, 2018 at 7:42 AM Jeremy Cohen <jeremy.cohen@imperial.ac.uk> wrote:
Hi Amitvikram,
I would certainly try Chris's suggestion. However, something else to check is where you're getting the third party downloads from.
If you take a clean Nektar++ source tree and place the standard netlib lapack-3.7.0.tgz source file that the build system downloads into $NEKTAR_HOME/ThirdParty (i.e. the download from http://www.netlib.org/lapack/lapack-3.7.0.tgz), the build should proceed successfully.
It looks like the lapack tar file that you're using may already have some build artefacts in it - did you tar the content from $NEKTAR_HOME/build/ThirdParty/lapack-3.7.0 into a lapack-3.7.0.tgz file, or are you working with the standard .tgz file from the netlib.org site?
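A quick way to check this without unpacking anything is to list the archive and look for files that a build would leave behind. Below is a minimal Python sketch; the function name and the list of patterns are my own guesses at typical CMake/Fortran build leftovers, not something taken from the Nektar++ build system, so adjust them as needed.

```python
import fnmatch
import tarfile

# Guessed patterns for files a previous CMake/Fortran build might leave behind.
ARTEFACT_PATTERNS = (
    "*/CMakeCache.txt",
    "*/CMakeFiles/*",
    "*.o",
    "*.a",
    "*.so",
    "*.mod",
)


def find_build_artefacts(tar_path, patterns=ARTEFACT_PATTERNS):
    """Return the archive members whose names match any build-artefact pattern."""
    # "r:*" lets tarfile detect the compression (gzip for a .tgz) by itself.
    with tarfile.open(tar_path, "r:*") as archive:
        names = archive.getnames()
    return [
        name
        for name in names
        if any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns)
    ]
```

An empty result from find_build_artefacts("lapack-3.7.0.tgz") would suggest the archive holds only sources; any hits such as a CMakeCache.txt would point to a tarball made from an already configured build directory.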
Cheers, Jeremy
On 13 Oct 2018, at 21:19, Chris Cantwell <c.cantwell@imperial.ac.uk> wrote:
Hi Amitvikram,
Some sites block non-SSL enabled HTTP traffic, returning a webpage reporting the error rather than the actual file (hence the hash mismatch).
You could try turning on the THIRDPARTY_USE_SSL option to see if that is allowed.
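To see whether a downloaded file is a real archive or an error page saved under the archive's name, it is enough to look at its first bytes and compute the checksum by hand. A minimal Python sketch follows; the function names are made up for illustration, and MD5 is only an assumption about which digest the build system compares against (pass a different algorithm name if it uses another one).

```python
import hashlib


def file_digest(path, algorithm="md5"):
    """Return the hex digest of the file at `path`, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def looks_like_gzip(path):
    """True if the file starts with the gzip magic bytes 0x1f 0x8b."""
    with open(path, "rb") as handle:
        return handle.read(2) == b"\x1f\x8b"


def looks_like_html(path):
    """True if the file appears to start with an HTML document."""
    with open(path, "rb") as handle:
        head = handle.read(512).lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")
```

If looks_like_html() is true for a file that is supposed to be a .tgz, the site returned a web page instead of the archive, which would explain the mismatch. Otherwise the value of file_digest() can be compared against the hash reported in the CMake error message.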
Cheers, Chris
-- Chris Cantwell Imperial College London South Kensington Campus London SW7 2AZ Email: c.cantwell@imperial.ac.uk www.imperial.ac.uk/people/c.cantwell
_______________________________________________ Nektar-users mailing list Nektar-users@imperial.ac.uk https://mailman.ic.ac.uk/mailman/listinfo/nektar-users
participants (4)
- Amitvikram Dutta
- Chris Cantwell
- Fatih Ertinaz
- Jeremy Cohen