[petsc-users] suppress CUDA warning & choose MCA parameter for mpirun during make PETSC_ARCH=arch-linux-c-debug check

Junchao Zhang junchao.zhang at gmail.com
Sun Oct 9 22:02:00 CDT 2022


OK, let's walk this back and not use -DCMAKE_C_COMPILER=/path/to/mpicc.

libompitrace.so.40.30.0 is not the OpenMP runtime library; it is Open MPI's tracing
library, https://github.com/open-mpi/ompi/issues/10036
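
If you can edit the project's CMakeLists.txt, the more robust route is to let CMake's own FindOpenMP/FindMPI supply these libraries instead of pointing OpenMP_libomp_LIBRARY at a file by hand. A minimal sketch, assuming the target names richdem and wtm.x taken from your build output (everything else here is illustrative, not taken from your project):

find_package(MPI REQUIRED COMPONENTS CXX)
find_package(OpenMP REQUIRED COMPONENTS CXX)

# librichdem calls omp_get_thread_num/GOMP_parallel, so it needs the real OpenMP runtime
target_link_libraries(richdem PUBLIC OpenMP::OpenMP_CXX)

# WTM.cpp.o references ompi_mpi_comm_self, so the executable needs libmpi on its link line
target_link_libraries(wtm.x PRIVATE MPI::MPI_CXX)

With MPI::MPI_CXX on the link line, CMake adds -lmpi (or the full libmpi path) itself, so the "DSO missing from command line" error below should not appear.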

In your previous email, there was

/path/to/cmake/cmake-3.22.1-linux-x86_64/bin/cmake -E cmake_link_script CMakeFiles/wtm.x.dir/link.txt --verbose=1
/cm/local/apps/gcc/10.2.0/bin/c++ -isystem -O3 -g -Wall -Wextra -pedantic -Wshadow CMakeFiles/wtm.x.dir/src/WTM.cpp.o -o wtm.x -Wl,-rpath,/path/to/WTM/build/common/richdem:/path/to/gdal-3.3.0/lib:/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib:/path/to/petsc/arch-linux-cxx-debug/lib libwtm.a common/richdem/librichdem.so /path/to/gdal-3.3.0/lib/libgdal.so /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0 common/fmt/libfmt.a /path/to/petsc/arch-linux-cxx-debug/lib/libpetsc.so
/usr/bin/ld: CMakeFiles/wtm.x.dir/src/WTM.cpp.o: undefined reference to symbol 'ompi_mpi_comm_self'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libmpi.so.40: error adding symbols: DSO missing from command line


Let's try adding -lmpi (or /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libmpi.so) manually to see whether it links:

/cm/local/apps/gcc/10.2.0/bin/c++ -isystem -O3 -g -Wall -Wextra -pedantic -Wshadow CMakeFiles/wtm.x.dir/src/WTM.cpp.o -o wtm.x -Wl,-rpath,/path/to/WTM/build/common/richdem:/path/to/gdal-3.3.0/lib:/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib:/path/to/petsc/arch-linux-cxx-debug/lib libwtm.a common/richdem/librichdem.so /path/to/gdal-3.3.0/lib/libgdal.so /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0 common/fmt/libfmt.a /path/to/petsc/arch-linux-cxx-debug/lib/libpetsc.so -lmpi
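
If that links, a quick sanity check is to run ldd wtm.x | grep libmpi; it should now list /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libmpi.so.40, confirming the MPI library made it onto the executable's link line.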


On Sun, Oct 9, 2022 at 9:28 PM Rob Kudyba <rk3199 at columbia.edu> wrote:

> I did have -DMPI_CXX_COMPILER set, so I added -DCMAKE_C_COMPILER and now
> get these errors:
>
> [ 25%] Linking CXX shared library librichdem.so
> /lib/../lib64/crt1.o: In function `_start':
> (.text+0x24): undefined reference to `main'
> CMakeFiles/richdem.dir/src/random.cpp.o: In function
> `richdem::rand_engine()':
> random.cpp:(.text+0x45): undefined reference to `omp_get_thread_num'
> CMakeFiles/richdem.dir/src/random.cpp.o: In function
> `richdem::seed_rand(unsigned long)':
> random.cpp:(.text+0xb6): undefined reference to `GOMP_parallel'
> CMakeFiles/richdem.dir/src/random.cpp.o: In function
> `richdem::uniform_rand_int(int, int)':
> random.cpp:(.text+0x10c): undefined reference to `omp_get_thread_num'
> CMakeFiles/richdem.dir/src/random.cpp.o: In function
> `richdem::uniform_rand_real(double, double)':
> random.cpp:(.text+0x1cb): undefined reference to `omp_get_thread_num'
> CMakeFiles/richdem.dir/src/random.cpp.o: In function
> `richdem::normal_rand(double, double)':
> random.cpp:(.text+0x29e): undefined reference to `omp_get_thread_num'
> CMakeFiles/richdem.dir/src/random.cpp.o: In function
> `richdem::seed_rand(unsigned long) [clone ._omp_fn.0]':
> random.cpp:(.text+0x4a3): undefined reference to `GOMP_critical_start'
> random.cpp:(.text+0x4b1): undefined reference to `GOMP_critical_end'
> random.cpp:(.text+0x4c3): undefined reference to `omp_get_thread_num'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
> undefined reference to `PMPI_Comm_rank'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
> undefined reference to `PMPI_Get_address'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
> undefined reference to `PMPI_Comm_get_name'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
> undefined reference to `PMPI_Add_error_string'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
> undefined reference to `PMPI_Type_get_name'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
> undefined reference to `PMPI_Abort'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
> undefined reference to `PMPI_Alloc_mem'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
> undefined reference to `PMPI_Isend'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
> undefined reference to `PMPI_Barrier'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
> undefined reference to `PMPI_Allgather'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
> undefined reference to `PMPI_Reduce'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
> undefined reference to `PMPI_Send'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
> undefined reference to `PMPI_Init'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
> undefined reference to `PMPI_Type_size'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
> undefined reference to `PMPI_Accumulate'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
> undefined reference to `PMPI_Add_error_class'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
> undefined reference to `PMPI_Finalize'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
> undefined reference to `PMPI_Allgatherv'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
> undefined reference to `PMPI_Bcast'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
> undefined reference to `PMPI_Recv'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
> undefined reference to `PMPI_Request_free'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
> undefined reference to `PMPI_Allreduce'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
> undefined reference to `ompi_mpi_comm_world'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
> undefined reference to `PMPI_Sendrecv'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
> undefined reference to `PMPI_Add_error_code'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
> undefined reference to `PMPI_Win_get_name'
> collect2: error: ld returned 1 exit status
> make[2]: *** [common/richdem/CMakeFiles/richdem.dir/build.make:163:
> common/richdem/librichdem.so] Error 1
> make[1]: *** [CMakeFiles/Makefile2:306:
> common/richdem/CMakeFiles/richdem.dir/all] Error 2
> make: *** [Makefile:136: all] Error 2
>
> I took a guess at using -DOpenMP_libomp_LIBRARY="/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0"
> as otherwise I'd get:
> CMake Error at
> /path/to/cmake/cmake-3.22.1-linux-x86_64/share/cmake-3.22/Modules/FindPackageHandleStandardArgs.cmake:230
> (message):
>   Could NOT find OpenMP_CXX (missing: OpenMP_libomp_LIBRARY
>   OpenMP_libomp_LIBRARY) (found version "4.5")
>
> So perhaps that's the real problem?
>
> On Sun, Oct 9, 2022 at 9:31 PM Junchao Zhang <junchao.zhang at gmail.com>
> wrote:
>
>> In the last link step to generate the executable
>> /cm/local/apps/gcc/10.2.0/bin/c++ -isystem -O3 -g -Wall -Wextra -pedantic -Wshadow CMakeFiles/wtm.x.dir/src/WTM.cpp.o -o wtm.x -Wl,-rpath,/path/to/WTM/build/common/richdem:/path/to/gdal-3.3.0/lib:/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib:/path/to/petsc/arch-linux-cxx-debug/lib libwtm.a common/richdem/librichdem.so /path/to/gdal-3.3.0/lib/libgdal.so /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0 common/fmt/libfmt.a /path/to/petsc/arch-linux-cxx-debug/lib/libpetsc.so
>>
>> I did not find -lmpi on that link line to pull in the MPI library. You can try cmake
>> -DCMAKE_C_COMPILER=/path/to/mpicc -DCMAKE_CXX_COMPILER=/path/to/mpicxx to
>> build your code.
>>
>> On Sat, Oct 8, 2022 at 9:32 PM Rob Kudyba <rk3199 at columbia.edu> wrote:
>>
>>> Perhaps we can go back one step:
>>>> Use your mpicc to build a "hello world" mpi test, then run it on a
>>>> compute node (with GPU) to see if it works.
>>>> If no, then your MPI environment has problems;
>>>> If yes, then use it to build petsc (turn on petsc's gpu support,
>>>>  --with-cuda  --with-cudac=nvcc), and then your code.
>>>> --Junchao Zhang
>>>
>>> OK, I tried this just to rule out the CUDA-capable Open MPI as a factor:
>>> ./configure --with-debugging=0 --with-cmake=true --with-mpi=true --with-mpi-dir=/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support --with-fc=0 --with-cuda=1
>>> [..]
>>> cuda:
>>>   Version:    11.7
>>>   Includes:   -I/path/to/cuda11.7/toolkit/11.7.1/include
>>>   Libraries:  -Wl,-rpath,/path/to/cuda11.7/toolkit/11.7.1/lib64
>>> -L/cm/shared/apps/cuda11.7/toolkit/11.7.1/lib64
>>> -L/path/to/cuda11.7/toolkit/11.7.1/lib64/stubs -lcudart -lnvToolsExt
>>> -lcufft -lcublas -lcusparse -lcusolver -lcurand -lcuda
>>>   CUDA SM 75
>>>   CUDA underlying compiler:
>>> CUDA_CXX="/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/bin"/mpicxx
>>>   CUDA underlying compiler flags: CUDA_CXXFLAGS=
>>>   CUDA underlying linker libraries: CUDA_CXXLIBS=
>>> [...]
>>>  Configure stage complete. Now build PETSc libraries with:
>>>    make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-opt all
>>>
>>> C++ compiler version: g++ (GCC) 10.2.0
>>> Using C++ compiler to compile PETSc
>>> -----------------------------------------
>>> Using C/C++ linker:
>>> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/bin/mpicxx
>>> Using C/C++ flags: -Wall -Wwrite-strings -Wno-strict-aliasing
>>> -Wno-unknown-pragmas -Wno-lto-type-mismatch -fstack-protector
>>> -fvisibility=hidden -g -O0
>>> -----------------------------------------
>>> Using system modules:
>>> shared:slurm/20.02.6:DefaultModules:openmpi/gcc/64/4.1.1_cuda_11.0.3_aware:gdal/3.3.0:cmake/3.22.1:cuda11.7/toolkit/11.7.1:openblas/dynamic/0.3.7:gcc/10.2.0
>>> Using mpi.h: # 1
>>> "/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/include/mpi.h" 1
>>> -----------------------------------------
>>> Using libraries: -Wl,-rpath,/path/to/petsc/arch-linux-cxx-debug/lib
>>> -L/path/to/petsc/arch-linux-cxx-debug/lib -lpetsc -lopenblas -lm -lX11
>>> -lquadmath -lstdc++ -ldl
>>> ------------------------------------------
>>> Using mpiexec: mpiexec -mca orte_base_help_aggregate 0  -mca pml ucx
>>> --mca btl '^openib'
>>> ------------------------------------------
>>> Using MAKE: /path/to/petsc/arch-linux-cxx-debug/bin/make
>>> Using MAKEFLAGS: -j24 -l48.0  --no-print-directory -- MPIEXEC=mpiexec\
>>> -mca\ orte_base_help_aggregate\ 0\ \ -mca\ pml\ ucx\ --mca\ btl\ '^openib'
>>> PETSC_ARCH=arch-linux-cxx-debug PETSC_DIR=/path/to/petsc
>>> ==========================================
>>> make[3]: Nothing to be done for 'libs'.
>>> =========================================
>>> Now to check if the libraries are working do:
>>> make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-cxx-debug check
>>> =========================================
>>> [me at xxx petsc]$ make PETSC_DIR=/path/to/petsc
>>> PETSC_ARCH=arch-linux-cxx-debug MPIEXEC="mpiexec -mca
>>> orte_base_help_aggregate 0  -mca pml ucx --mca btl '^openib'" check
>>> Running check examples to verify correct installation
>>> Using PETSC_DIR=/path/to/petsc and PETSC_ARCH=arch-linux-cxx-debug
>>> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
>>> C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI
>>> processes
>>>
>>> ./bandwidthTest
>>> [CUDA Bandwidth Test] - Starting...
>>> Running on...
>>>
>>>  Device 0: Quadro RTX 8000
>>>  Quick Mode
>>>
>>>  Host to Device Bandwidth, 1 Device(s)
>>>  PINNED Memory Transfers
>>>    Transfer Size (Bytes) Bandwidth(GB/s)
>>>    32000000 12.3
>>>
>>>  Device to Host Bandwidth, 1 Device(s)
>>>  PINNED Memory Transfers
>>>    Transfer Size (Bytes) Bandwidth(GB/s)
>>>    32000000 13.2
>>>
>>>  Device to Device Bandwidth, 1 Device(s)
>>>  PINNED Memory Transfers
>>>    Transfer Size (Bytes) Bandwidth(GB/s)
>>>    32000000 466.2
>>>
>>> Result = PASS
>>>
>>> On Sat, Oct 8, 2022 at 7:56 PM Barry Smith <bsmith at petsc.dev> wrote:
>>>
>>>>
>>>>   True, but when users send reports back to us they will never have
>>>> used the VERBOSE=1 option, so it requires one more round trip of email to
>>>> get this additional information.
>>>>
>>>> > On Oct 8, 2022, at 6:48 PM, Jed Brown <jed at jedbrown.org> wrote:
>>>> >
>>>> > Barry Smith <bsmith at petsc.dev> writes:
>>>> >
>>>> >>   I hate these kinds of make rules that hide what the compiler is
>>>> >> doing (in the name of having less output, I guess); it makes it difficult
>>>> >> to figure out what is going wrong.
>>>> >
>>>> > You can make VERBOSE=1 with CMake-generated makefiles.
>>>>
>>>
>>>
>>>> Anyway, either some of the MPI libraries are missing from the link
>>>> line or they are in the wrong order, so the linker cannot resolve the
>>>> symbols properly. Here are a number of discussions of why that error
>>>> message can appear:
>>>> https://stackoverflow.com/questions/19901934/libpthread-so-0-error-adding-symbols-dso-missing-from-command-line
>>>>
>>>
>>>
>>> Still the same, but with more noise; I have been using the suggested
>>> LDFLAGS="-Wl,--copy-dt-needed-entries" along with make:
>>> make[2]: Entering directory '/path/to/WTM/build'
>>> cd /path/to/WTM/build &&
>>> /path/to/cmake/cmake-3.22.1-linux-x86_64/bin/cmake -E cmake_depends "Unix
>>> Makefiles" /path/to/WTM /path/to/WTM /path/to/WTM/build /path/to/WTM/build
>>> /path/to/WTM/build/CMakeFiles/wtm.x.dir/DependInfo.cmake --color=
>>> make[2]: Leaving directory '/path/to/WTM/build'
>>> make  -f CMakeFiles/wtm.x.dir/build.make CMakeFiles/wtm.x.dir/build
>>> make[2]: Entering directory '/path/to/WTM/build'
>>> [ 66%] Building CXX object CMakeFiles/wtm.x.dir/src/WTM.cpp.o
>>> /cm/local/apps/gcc/10.2.0/bin/c++  -I/path/to/WTM/common/richdem/include
>>> -I/path/to/gdal-3.3.0/include -I/path/to/WTM/common/fmt/include -isystem
>>> /path/to/petsc/arch-linux-cxx-debug/include -isystem /path/to/petsc/include
>>> -isystem -O3 -g -Wall -Wextra -pedantic -Wshadow -Wfloat-conversion -Wall
>>> -Wextra -pedantic -Wshadow -DRICHDEM_GIT_HASH=\"xxx\"
>>> -DRICHDEM_COMPILE_TIME=\"2022-10-09T02:21:11Z\" -DUSEGDAL -Xpreprocessor
>>> -fopenmp
>>> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libmpi.so.40.30.1
>>> -I/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/include -std=gnu++2a -MD
>>> -MT CMakeFiles/wtm.x.dir/src/WTM.cpp.o -MF
>>> CMakeFiles/wtm.x.dir/src/WTM.cpp.o.d -o CMakeFiles/wtm.x.dir/src/WTM.cpp.o
>>> -c /path/to/WTM/src/WTM.cpp
>>> c++: warning:
>>> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libmpi.so.40.30.1:
>>> linker input file unused because linking not done
>>> [ 70%] Linking CXX executable wtm.x
>>> /path/to/cmake/cmake-3.22.1-linux-x86_64/bin/cmake -E cmake_link_script
>>> CMakeFiles/wtm.x.dir/link.txt --verbose=1
>>> /cm/local/apps/gcc/10.2.0/bin/c++ -isystem -O3 -g -Wall -Wextra
>>> -pedantic -Wshadow CMakeFiles/wtm.x.dir/src/WTM.cpp.o -o wtm.x
>>>  -Wl,-rpath,/path/to/WTM/build/common/richdem:/path/to/gdal-3.3.0/lib:/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib:/path/to/petsc/arch-linux-cxx-debug/lib
>>> libwtm.a common/richdem/librichdem.so /path/to/gdal-3.3.0/lib/libgdal.so
>>> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0
>>> common/fmt/libfmt.a /path/to/petsc/arch-linux-cxx-debug/lib/libpetsc.so
>>> /usr/bin/ld: CMakeFiles/wtm.x.dir/src/WTM.cpp.o: undefined reference to
>>> symbol 'ompi_mpi_comm_self'
>>> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libmpi.so.40: error
>>> adding symbols: DSO missing from command line
>>> collect2: error: ld returned 1 exit status
>>> make[2]: *** [CMakeFiles/wtm.x.dir/build.make:103: wtm.x] Error 1
>>> make[2]: Leaving directory '/path/to/WTM/build'
>>> make[1]: *** [CMakeFiles/Makefile2:225: CMakeFiles/wtm.x.dir/all] Error 2
>>> make[1]: Leaving directory '/path/to/WTM/build'
>>> make: *** [Makefile:136: all] Error 2
>>>
>>> Anything stick out?
>>>
>>

