[petsc-users] suppress CUDA warning & choose MCA parameter for mpirun during make PETSC_ARCH=arch-linux-c-debug check
Rob Kudyba
rk3199 at columbia.edu
Sat Oct 8 21:31:48 CDT 2022
>
> Perhaps we can back one step:
> Use your mpicc to build a "hello world" mpi test, then run it on a compute
> node (with GPU) to see if it works.
> If no, then your MPI environment has problems;
> If yes, then use it to build petsc (turn on petsc's gpu support,
> --with-cuda --with-cudac=nvcc), and then your code.
> --Junchao Zhang
OK, I tried this to rule out the CUDA-capable Open MPI build as a factor:
./configure --with-debugging=0 --with-cmake=true --with-mpi=true --with-mpi-dir=/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support --with-fc=0 --with-cuda=1
[..]
cuda:
Version: 11.7
Includes: -I/path/to/cuda11.7/toolkit/11.7.1/include
Libraries: -Wl,-rpath,/path/to/cuda11.7/toolkit/11.7.1/lib64
-L/cm/shared/apps/cuda11.7/toolkit/11.7.1/lib64
-L/path/to/cuda11.7/toolkit/11.7.1/lib64/stubs -lcudart -lnvToolsExt
-lcufft -lcublas -lcusparse -lcusolver -lcurand -lcuda
CUDA SM 75
CUDA underlying compiler:
CUDA_CXX="/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/bin"/mpicxx
CUDA underlying compiler flags: CUDA_CXXFLAGS=
CUDA underlying linker libraries: CUDA_CXXLIBS=
[...]
Configure stage complete. Now build PETSc libraries with:
make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-opt all
C++ compiler version: g++ (GCC) 10.2.0
Using C++ compiler to compile PETSc
-----------------------------------------
Using C/C++ linker:
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/bin/mpicxx
Using C/C++ flags: -Wall -Wwrite-strings -Wno-strict-aliasing
-Wno-unknown-pragmas -Wno-lto-type-mismatch -fstack-protector
-fvisibility=hidden -g -O0
-----------------------------------------
Using system modules:
shared:slurm/20.02.6:DefaultModules:openmpi/gcc/64/4.1.1_cuda_11.0.3_aware:gdal/3.3.0:cmake/3.22.1:cuda11.7/toolkit/11.7.1:openblas/dynamic/0.3.7:gcc/10.2.0
Using mpi.h: # 1
"/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/include/mpi.h" 1
-----------------------------------------
Using libraries: -Wl,-rpath,/path/to/petsc/arch-linux-cxx-debug/lib
-L/path/to/petsc/arch-linux-cxx-debug/lib -lpetsc -lopenblas -lm -lX11
-lquadmath -lstdc++ -ldl
------------------------------------------
Using mpiexec: mpiexec -mca orte_base_help_aggregate 0 -mca pml ucx --mca
btl '^openib'
------------------------------------------
Using MAKE: /path/to/petsc/arch-linux-cxx-debug/bin/make
Using MAKEFLAGS: -j24 -l48.0 --no-print-directory -- MPIEXEC=mpiexec\
-mca\ orte_base_help_aggregate\ 0\ \ -mca\ pml\ ucx\ --mca\ btl\ '^openib'
PETSC_ARCH=arch-linux-cxx-debug PETSC_DIR=/path/to/petsc
==========================================
make[3]: Nothing to be done for 'libs'.
=========================================
Now to check if the libraries are working do:
make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-cxx-debug check
=========================================
[me at xxx petsc]$ make PETSC_DIR=/path/to/petsc
PETSC_ARCH=arch-linux-cxx-debug MPIEXEC="mpiexec -mca
orte_base_help_aggregate 0 -mca pml ucx --mca btl '^openib'" check
Running check examples to verify correct installation
Using PETSC_DIR=/path/to/petsc and PETSC_ARCH=arch-linux-cxx-debug
C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI processes
./bandwidthTest
[CUDA Bandwidth Test] - Starting...
Running on...
Device 0: Quadro RTX 8000
Quick Mode
Host to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(GB/s)
32000000 12.3
Device to Host Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(GB/s)
32000000 13.2
Device to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(GB/s)
32000000 466.2
Result = PASS
On Sat, Oct 8, 2022 at 7:56 PM Barry Smith <bsmith at petsc.dev> wrote:
>
> True, but when users send reports back to us they will never have used
> the VERBOSE=1 option, so it requires one more round trip of email to get
> this additional information.
>
> > On Oct 8, 2022, at 6:48 PM, Jed Brown <jed at jedbrown.org> wrote:
> >
> > Barry Smith <bsmith at petsc.dev> writes:
> >
> >> I hate these kinds of make rules that hide what the compiler is doing
> >> (in the name of having less output, I guess) it makes it difficult to
> >> figure out what is going wrong.
> >
> > You can make VERBOSE=1 with CMake-generated makefiles.
>
> Anyway, either some of the MPI libraries are missing from the link line
> or they are in the wrong order, so the linker is not able to search them
> properly. Here are several discussions of why that error message can
> appear:
> https://stackoverflow.com/questions/19901934/libpthread-so-0-error-adding-symbols-dso-missing-from-command-line
>
Still the same error, just with more noise. I have been using the suggested
LDFLAGS="-Wl,--copy-dt-needed-entries" along with make:
make[2]: Entering directory '/path/to/WTM/build'
cd /path/to/WTM/build && /path/to/cmake/cmake-3.22.1-linux-x86_64/bin/cmake
-E cmake_depends "Unix Makefiles" /path/to/WTM /path/to/WTM
/path/to/WTM/build /path/to/WTM/build
/path/to/WTM/build/CMakeFiles/wtm.x.dir/DependInfo.cmake --color=
make[2]: Leaving directory '/path/to/WTM/build'
make -f CMakeFiles/wtm.x.dir/build.make CMakeFiles/wtm.x.dir/build
make[2]: Entering directory '/path/to/WTM/build'
[ 66%] Building CXX object CMakeFiles/wtm.x.dir/src/WTM.cpp.o
/cm/local/apps/gcc/10.2.0/bin/c++ -I/path/to/WTM/common/richdem/include
-I/path/to/gdal-3.3.0/include -I/path/to/WTM/common/fmt/include -isystem
/path/to/petsc/arch-linux-cxx-debug/include -isystem /path/to/petsc/include
-isystem -O3 -g -Wall -Wextra -pedantic -Wshadow -Wfloat-conversion -Wall
-Wextra -pedantic -Wshadow -DRICHDEM_GIT_HASH=\"xxx\"
-DRICHDEM_COMPILE_TIME=\"2022-10-09T02:21:11Z\" -DUSEGDAL -Xpreprocessor
-fopenmp
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libmpi.so.40.30.1
-I/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/include -std=gnu++2a -MD
-MT CMakeFiles/wtm.x.dir/src/WTM.cpp.o -MF
CMakeFiles/wtm.x.dir/src/WTM.cpp.o.d -o CMakeFiles/wtm.x.dir/src/WTM.cpp.o
-c /path/to/WTM/src/WTM.cpp
c++: warning:
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libmpi.so.40.30.1:
linker input file unused because linking not done
[ 70%] Linking CXX executable wtm.x
/path/to/cmake/cmake-3.22.1-linux-x86_64/bin/cmake -E cmake_link_script
CMakeFiles/wtm.x.dir/link.txt --verbose=1
/cm/local/apps/gcc/10.2.0/bin/c++ -isystem -O3 -g -Wall -Wextra -pedantic
-Wshadow CMakeFiles/wtm.x.dir/src/WTM.cpp.o -o wtm.x
-Wl,-rpath,/path/to/WTM/build/common/richdem:/path/to/gdal-3.3.0/lib:/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib:/path/to/petsc/arch-linux-cxx-debug/lib
libwtm.a common/richdem/librichdem.so /path/to/gdal-3.3.0/lib/libgdal.so
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0
common/fmt/libfmt.a /path/to/petsc/arch-linux-cxx-debug/lib/libpetsc.so
/usr/bin/ld: CMakeFiles/wtm.x.dir/src/WTM.cpp.o: undefined reference to
symbol 'ompi_mpi_comm_self'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libmpi.so.40: error
adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
make[2]: *** [CMakeFiles/wtm.x.dir/build.make:103: wtm.x] Error 1
make[2]: Leaving directory '/path/to/WTM/build'
make[1]: *** [CMakeFiles/Makefile2:225: CMakeFiles/wtm.x.dir/all] Error 2
make[1]: Leaving directory '/path/to/WTM/build'
make: *** [Makefile:136: all] Error 2
Does anything stick out?
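One thing that can be checked mechanically: in the verbose link command above, the shared objects named are librichdem.so, libgdal.so, libompitrace.so.40.30.0 and libpetsc.so, while the linker error points at libmpi.so.40. The sketch below scans a link command for the MPI library itself. The helper name mpi_on_link_line is hypothetical (it is not part of PETSc, CMake, or Open MPI), and the link line is abbreviated from the log with shortened paths, so treat it as an illustration rather than a diagnosis.

```shell
#!/bin/sh
# Sketch: report whether a link command names the MPI library itself.
# mpi_on_link_line is a hypothetical helper written for this illustration.
mpi_on_link_line() {
    # $1 is the full link command as one string; split it on whitespace.
    for tok in $1; do
        # Compare only the file name, ignoring any leading directory.
        case "${tok##*/}" in
            libmpi.so*|libmpi.a|-lmpi)
                echo "found: $tok"
                return 0
                ;;
        esac
    done
    echo "libmpi not on link line"
    return 1
}

# Abbreviated from the failing link step (paths shortened for illustration).
link_line="c++ WTM.cpp.o -o wtm.x libwtm.a librichdem.so libgdal.so \
/path/to/openmpi/lib/libompitrace.so.40.30.0 libfmt.a libpetsc.so"

if ! mpi_on_link_line "$link_line"; then
    echo "compare against: mpicxx --showme:link"
fi
```

For the abbreviated line this reports that libmpi is absent, which matches the "DSO missing from command line" message. With Open MPI, the wrapper's --showme:link option prints the link flags the wrapper itself would add, which gives something to compare the CMake-generated line against.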