[petsc-users] suppress CUDA warning & choose MCA parameter for mpirun during make PETSC_ARCH=arch-linux-c-debug check

Junchao Zhang junchao.zhang at gmail.com
Fri Oct 7 21:06:06 CDT 2022


On Fri, Oct 7, 2022 at 1:08 PM Rob Kudyba <rk3199 at columbia.edu> wrote:

> Thanks for the quick reply. I added these options to make, but make check
> still produced the warnings, so I used the command like this:
> make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug
>  MPIEXEC="mpiexec -mca orte_base_help_aggregate 0 --mca
> opal_warn_on_missing_libcuda 0 -mca pml ucx --mca btl '^openib'" check
> Running check examples to verify correct installation
> Using PETSC_DIR=/path/to/petsc and PETSC_ARCH=arch-linux-c-debug
> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
> C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI processes
> Completed test examples
>
> Could be useful for the FAQ.
>
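As an alternative to editing MPIEXEC, Open MPI also reads MCA parameters from
environment variables named OMPI_MCA_<parameter>. A minimal sketch, assuming
the mpiexec launched by make check inherits your shell environment (I have not
tested this on your cluster):

```shell
# Open MPI reads MCA parameters from environment variables named
# OMPI_MCA_<parameter>; these mirror the command-line options quoted above.
export OMPI_MCA_orte_base_help_aggregate=0
export OMPI_MCA_opal_warn_on_missing_libcuda=0
export OMPI_MCA_pml=ucx
export OMPI_MCA_btl='^openib'

# List what is set so the values can be confirmed before running make check.
env | grep '^OMPI_MCA_' | sort
```

With these exported, a plain "make PETSC_DIR=/path/to/petsc
PETSC_ARCH=arch-linux-c-debug check" should pick up the same settings.
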
You mentioned you had "OpenMPI 4.1.1 with CUDA aware", so I think a
working mpicc should find the CUDA libraries automatically. Maybe you
unloaded the CUDA libraries?
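
One quick way to check is to see whether any directory in LD_LIBRARY_PATH
actually contains libcuda.so.1. A rough sketch; find_lib is a hypothetical
helper name, and it only looks at LD_LIBRARY_PATH, not the system loader cache:

```shell
# find_lib NAME: print each directory in LD_LIBRARY_PATH that contains NAME.
# Returns non-zero if no directory has it. (find_lib is a hypothetical helper.)
find_lib() {
    name=$1
    found=1
    old_ifs=$IFS
    IFS=:
    for dir in ${LD_LIBRARY_PATH:-}; do
        # Skip empty entries produced by stray colons.
        if [ -n "$dir" ] && [ -e "$dir/$name" ]; then
            echo "$dir"
            found=0
        fi
    done
    IFS=$old_ifs
    return $found
}

find_lib libcuda.so.1 || echo "libcuda.so.1 not found in LD_LIBRARY_PATH"
```

Note that libcuda.so.1 normally comes with the NVIDIA driver rather than the
CUDA toolkit, so it may simply be absent on a node without a GPU driver.
"ldconfig -p | grep libcuda" covers the system-wide locations.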


> I'm now trying to use PETSc to compile, and linking appears to go awry:
> [ 58%] Building CXX object
> CMakeFiles/wtm.dir/src/update_effective_storativity.cpp.o
> [ 62%] Linking CXX static library libwtm.a
> [ 62%] Built target wtm
> [ 66%] Building CXX object CMakeFiles/wtm.x.dir/src/WTM.cpp.o
> [ 70%] Linking CXX executable wtm.x
> /usr/bin/ld: cannot find -lpetsc
> collect2: error: ld returned 1 exit status
> make[2]: *** [CMakeFiles/wtm.x.dir/build.make:103: wtm.x] Error 1
> make[1]: *** [CMakeFiles/Makefile2:269: CMakeFiles/wtm.x.dir/all] Error 2
> make: *** [Makefile:136: all] Error 2
>
It seems CMake could not find PETSc. Look at
$PETSC_DIR/share/petsc/CMakeLists.txt and try to modify your
CMakeLists.txt accordingly.
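
Your directory listing shows a pkgconfig directory under the PETSc lib
directory, so before touching CMake you could check whether pkg-config can see
PETSc at all. A sketch, assuming the default values below are replaced with
your real paths:

```shell
# Assumed install locations; substitute your own values.
PETSC_DIR=${PETSC_DIR:-/path/to/petsc}
PETSC_ARCH=${PETSC_ARCH:-arch-linux-c-debug}

# PETSc writes its pkg-config file under $PETSC_DIR/$PETSC_ARCH/lib/pkgconfig.
petsc_pc_dir="$PETSC_DIR/$PETSC_ARCH/lib/pkgconfig"
export PKG_CONFIG_PATH="$petsc_pc_dir${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}"

# If pkg-config can see PETSc, print the flags a build system would use.
if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists PETSc; then
    pkg-config --cflags --libs PETSc
else
    echo "PETSc not visible to pkg-config; check $petsc_pc_dir"
fi
```

If the flags print correctly, the -L path shown there is what your link line
needs in order to resolve -lpetsc.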


>
>
> Is there an environment variable I'm missing? I've seen the suggestion
> <https://www.mail-archive.com/search?l=petsc-users@mcs.anl.gov&q=subject:%22%5C%5Bpetsc%5C-users%5C%5D+CMake+error+in+PETSc%22&o=newest&f=1>
> to add it to LD_LIBRARY_PATH which I did with export
> LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PETSC_DIR/$PETSC_ARCH/lib and that
> points to:
> ls -l /path/to/petsc/arch-linux-c-debug/lib
> total 83732
> lrwxrwxrwx 1 rk3199 user       18 Oct  7 13:56 libpetsc.so ->
> libpetsc.so.3.18.0
> lrwxrwxrwx 1 rk3199 user       18 Oct  7 13:56 libpetsc.so.3.18 ->
> libpetsc.so.3.18.0
> -rwxr-xr-x 1 rk3199 user 85719200 Oct  7 13:56 libpetsc.so.3.18.0
> drwxr-xr-x 3 rk3199 user     4096 Oct  6 10:22 petsc
> drwxr-xr-x 2 rk3199 user     4096 Oct  6 10:23 pkgconfig
>
> Anything else to check?
>
If modifying CMakeLists.txt does not work, you can try
export LIBRARY_PATH=$LIBRARY_PATH:$PETSC_DIR/$PETSC_ARCH/lib
LD_LIBRARY_PATH is for run time, but the error happened at link time.
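
To make the difference concrete, here is a small sketch that sets both
variables. The default paths are placeholders, and the ${VAR:+...} form is my
own addition to avoid a stray leading colon when a variable starts out unset:

```shell
# Assumed install locations; substitute your own values.
PETSC_DIR=${PETSC_DIR:-/path/to/petsc}
PETSC_ARCH=${PETSC_ARCH:-arch-linux-c-debug}
petsc_lib="$PETSC_DIR/$PETSC_ARCH/lib"

# Link time: GCC searches LIBRARY_PATH when resolving -lpetsc.
export LIBRARY_PATH="${LIBRARY_PATH:+$LIBRARY_PATH:}$petsc_lib"
# Run time: the dynamic loader searches LD_LIBRARY_PATH for libpetsc.so.
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$petsc_lib"

echo "LIBRARY_PATH=$LIBRARY_PATH"
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
```

Re-run the build from the same shell afterwards so the linker sees the updated
LIBRARY_PATH.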


>
> On Fri, Oct 7, 2022 at 1:53 PM Satish Balay <balay at mcs.anl.gov> wrote:
>
>> you can try
>>
>> make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug
>> MPIEXEC="mpiexec -mca orte_base_help_aggregate 0 --mca
>> opal_warn_on_missing_libcuda 0 -mca pml ucx --mca btl '^openib'"
>>
>> With regard to configure: it can be set with the --with-mpiexec option,
>> and it is saved in PETSC_ARCH/lib/petsc/conf/petscvariables
>>
>> Satish
>>
>> On Fri, 7 Oct 2022, Rob Kudyba wrote:
>>
>> > We are on RHEL 8, using modules that let us load/unload various versions
>> > of packages/libraries, and I have OpenMPI 4.1.1 with CUDA aware loaded
>> > along with GDAL 3.3.0, GCC 10.2.0, and cmake 3.22.1
>> >
>> > make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug check
>> > fails with the below errors,
>> > Running check examples to verify correct installation
>> >
>> > Using PETSC_DIR=/path/to/petsc and PETSC_ARCH=arch-linux-c-debug
>> > Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process
>> > See https://petsc.org/release/faq/
>> >
>> --------------------------------------------------------------------------
>> > The library attempted to open the following supporting CUDA libraries,
>> > but each of them failed.  CUDA-aware support is disabled.
>> > libcuda.so.1: cannot open shared object file: No such file or directory
>> > libcuda.dylib: cannot open shared object file: No such file or directory
>> > /usr/lib64/libcuda.so.1: cannot open shared object file: No such file or
>> > directory
>> > /usr/lib64/libcuda.dylib: cannot open shared object file: No such file
>> or
>> > directory
>> > If you are not interested in CUDA-aware support, then run with
>> > --mca opal_warn_on_missing_libcuda 0 to suppress this message.  If you
>> are
>> > interested
>> > in CUDA-aware support, then try setting LD_LIBRARY_PATH to the location
>> > of libcuda.so.1 to get passed this issue.
>> >
>> --------------------------------------------------------------------------
>> >
>> --------------------------------------------------------------------------
>> > WARNING: There was an error initializing an OpenFabrics device.
>> >
>> >   Local host:   g117
>> >   Local device: mlx5_0
>> >
>> --------------------------------------------------------------------------
>> > lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
>> > Number of SNES iterations = 2
>> > Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI
>> processes
>> > See https://petsc.org/release/faq/
>> >
>> > The library attempted to open the following supporting CUDA libraries,
>> > but each of them failed.  CUDA-aware support is disabled.
>> > libcuda.so.1: cannot open shared object file: No such file or directory
>> > libcuda.dylib: cannot open shared object file: No such file or directory
>> > /usr/lib64/libcuda.so.1: cannot open shared object file: No such file or
>> > directory
>> > /usr/lib64/libcuda.dylib: cannot open shared object file: No such file
>> or
>> > directory
>> > If you are not interested in CUDA-aware support, then run with
>> > --mca opal_warn_on_missing_libcuda 0 to suppress this message.  If you are
>> > interested in CUDA-aware support, then try setting LD_LIBRARY_PATH to the
>> > location of libcuda.so.1 to get passed this issue.
>> >
>> > WARNING: There was an error initializing an OpenFabrics device.
>> >
>> >   Local host:   xxx
>> >   Local device: mlx5_0
>> >
>> > lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
>> > Number of SNES iterations = 2
>> > [g117:4162783] 1 more process has sent help message
>> > help-mpi-common-cuda.txt / dlopen failed
>> > [g117:4162783] Set MCA parameter "orte_base_help_aggregate" to 0 to see
>> all
>> > help / error messages
>> > [g117:4162783] 1 more process has sent help message
>> help-mpi-btl-openib.txt
>> > / error in device init
>> > Completed test examples
>> > Error while running make check
>> > gmake[1]: *** [makefile:149: check] Error 1
>> > make: *** [GNUmakefile:17: check] Error 2
>> >
>> > Where is $MPI_RUN set? I'd like to be able to pass options such as --mca
>> > orte_base_help_aggregate 0 --mca opal_warn_on_missing_libcuda 0 -mca pml
>> > ucx --mca btl '^openib' which will help me troubleshoot and hide
>> unneeded
>> > warnings.
>> >
>> > Thanks,
>> > Rob
>> >
>>
>>