[petsc-users] suppress CUDA warning & choose MCA parameter for mpirun during make PETSC_ARCH=arch-linux-c-debug check

Rob Kudyba rk3199 at columbia.edu
Fri Oct 7 13:08:24 CDT 2022


Thanks for the quick reply. I added these options to make, but make check
still produced the warnings, so I ran the command like this:
make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug \
  MPIEXEC="mpiexec -mca orte_base_help_aggregate 0 --mca opal_warn_on_missing_libcuda 0 -mca pml ucx --mca btl '^openib'" check
Running check examples to verify correct installation
Using PETSC_DIR=/path/to/petsc and PETSC_ARCH=arch-linux-c-debug
C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI processes
Completed test examples

Could be useful for the FAQ.
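
For the FAQ, a variant that may be worth noting: Open MPI also reads MCA parameters from environment variables named OMPI_MCA_<parameter>, so the same four settings can be exported once instead of being quoted inside MPIEXEC. A minimal sketch, assuming a POSIX shell and the Open MPI 4.1.1 module described in this thread (the make line is left as a comment and reuses the placeholder path from above):

```shell
# Each OMPI_MCA_<name> variable corresponds to "--mca <name> <value>".
export OMPI_MCA_orte_base_help_aggregate=0
export OMPI_MCA_opal_warn_on_missing_libcuda=0
export OMPI_MCA_pml=ucx
export OMPI_MCA_btl='^openib'

# Show what was set, so the values can be confirmed before running the check.
env | grep '^OMPI_MCA_' | sort

# Then run the check without a custom MPIEXEC, for example:
#   make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug check
```

Settings exported this way apply to every mpiexec started from that shell, which may or may not be wanted on a shared login node.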

I'm now trying to use PETSc to compile, and linking appears to go awry:
[ 58%] Building CXX object CMakeFiles/wtm.dir/src/update_effective_storativity.cpp.o
[ 62%] Linking CXX static library libwtm.a
[ 62%] Built target wtm
[ 66%] Building CXX object CMakeFiles/wtm.x.dir/src/WTM.cpp.o
[ 70%] Linking CXX executable wtm.x
/usr/bin/ld: cannot find -lpetsc
collect2: error: ld returned 1 exit status
make[2]: *** [CMakeFiles/wtm.x.dir/build.make:103: wtm.x] Error 1
make[1]: *** [CMakeFiles/Makefile2:269: CMakeFiles/wtm.x.dir/all] Error 2
make: *** [Makefile:136: all] Error 2

Is there an environment variable I'm missing? I've seen the suggestion
<https://www.mail-archive.com/search?l=petsc-users@mcs.anl.gov&q=subject:%22%5C%5Bpetsc%5C-users%5C%5D+CMake+error+in+PETSc%22&o=newest&f=1>
to add the library directory to LD_LIBRARY_PATH, which I did with
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PETSC_DIR/$PETSC_ARCH/lib
and that directory contains:
ls -l /path/to/petsc/arch-linux-c-debug/lib
total 83732
lrwxrwxrwx 1 rk3199 user       18 Oct  7 13:56 libpetsc.so -> libpetsc.so.3.18.0
lrwxrwxrwx 1 rk3199 user       18 Oct  7 13:56 libpetsc.so.3.18 -> libpetsc.so.3.18.0
-rwxr-xr-x 1 rk3199 user 85719200 Oct  7 13:56 libpetsc.so.3.18.0
drwxr-xr-x 3 rk3199 user     4096 Oct  6 10:22 petsc
drwxr-xr-x 2 rk3199 user     4096 Oct  6 10:23 pkgconfig

Anything else to check?
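
One thing that may be worth checking: LD_LIBRARY_PATH is consulted by the runtime loader when the program starts, whereas "/usr/bin/ld: cannot find -lpetsc" is raised at link time, where the library directory has to reach the linker as a -L option (or through whatever the CMake project uses to locate PETSc). A minimal sketch that prints the flags one would expect to see on the link line; petsc_link_flags is a hypothetical helper, not part of PETSc, and the default paths are the placeholders used in this thread:

```shell
# Hypothetical helper: print link flags for a PETSc build tree.
#   $1 = PETSC_DIR, $2 = PETSC_ARCH
petsc_link_flags() {
    libdir="$1/$2/lib"
    # -L tells ld where to find libpetsc.so at link time;
    # -Wl,-rpath records the same directory for the runtime loader.
    printf '%s\n' "-L${libdir} -Wl,-rpath,${libdir} -lpetsc"
}

# Fall back to the placeholder values from this thread if the variables are unset.
PETSC_DIR="${PETSC_DIR:-/path/to/petsc}"
PETSC_ARCH="${PETSC_ARCH:-arch-linux-c-debug}"

petsc_link_flags "$PETSC_DIR" "$PETSC_ARCH"
```

If these flags are missing from the wtm.x link command (make VERBOSE=1 shows it for CMake-generated makefiles), the project's CMake configuration is probably not being told where PETSc lives. The pkgconfig directory in the listing above suggests pkg-config could supply the same flags, though the exact module name should be checked in that directory.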

On Fri, Oct 7, 2022 at 1:53 PM Satish Balay <balay at mcs.anl.gov> wrote:

> you can try
>
> make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug
> MPIEXEC="mpiexec -mca orte_base_help_aggregate 0 --mca
> opal_warn_on_missing_libcuda 0 -mca pml ucx --mca btl '^openib'"
>
> Wrt configure - it can be set with the --with-mpiexec option - it's saved in
> PETSC_ARCH/lib/petsc/conf/petscvariables
>
> Satish
>
> On Fri, 7 Oct 2022, Rob Kudyba wrote:
>
> > We are on RHEL 8, using modules that let us load/unload various versions of
> > packages/libraries, and I have CUDA-aware OpenMPI 4.1.1 loaded along
> > with GDAL 3.3.0, GCC 10.2.0, and cmake 3.22.1
> >
> > make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug check
> > fails with the below errors,
> > Running check examples to verify correct installation
> >
> > Using PETSC_DIR=/path/to/petsc and PETSC_ARCH=arch-linux-c-debug
> > Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process
> > See https://petsc.org/release/faq/
> >
> > --------------------------------------------------------------------------
> > The library attempted to open the following supporting CUDA libraries,
> > but each of them failed.  CUDA-aware support is disabled.
> > libcuda.so.1: cannot open shared object file: No such file or directory
> > libcuda.dylib: cannot open shared object file: No such file or directory
> > /usr/lib64/libcuda.so.1: cannot open shared object file: No such file or directory
> > /usr/lib64/libcuda.dylib: cannot open shared object file: No such file or directory
> > If you are not interested in CUDA-aware support, then run with
> > --mca opal_warn_on_missing_libcuda 0 to suppress this message.  If you are
> > interested in CUDA-aware support, then try setting LD_LIBRARY_PATH to the
> > location of libcuda.so.1 to get passed this issue.
> >
> > --------------------------------------------------------------------------
> >
> > --------------------------------------------------------------------------
> > WARNING: There was an error initializing an OpenFabrics device.
> >
> >   Local host:   g117
> >   Local device: mlx5_0
> >
> > --------------------------------------------------------------------------
> > lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
> > Number of SNES iterations = 2
> > Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes
> > See https://petsc.org/release/faq/
> >
> > The library attempted to open the following supporting CUDA libraries,
> > but each of them failed.  CUDA-aware support is disabled.
> > libcuda.so.1: cannot open shared object file: No such file or directory
> > libcuda.dylib: cannot open shared object file: No such file or directory
> > /usr/lib64/libcuda.so.1: cannot open shared object file: No such file or directory
> > /usr/lib64/libcuda.dylib: cannot open shared object file: No such file or directory
> > If you are not interested in CUDA-aware support, then run with
> > --mca opal_warn_on_missing_libcuda 0 to suppress this message.  If you are
> > interested in CUDA-aware support, then try setting LD_LIBRARY_PATH to the
> > location of libcuda.so.1 to get passed this issue.
> >
> > WARNING: There was an error initializing an OpenFabrics device.
> >
> >   Local host:   xxx
> >   Local device: mlx5_0
> >
> > lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
> > Number of SNES iterations = 2
> > [g117:4162783] 1 more process has sent help message help-mpi-common-cuda.txt / dlopen failed
> > [g117:4162783] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
> > [g117:4162783] 1 more process has sent help message help-mpi-btl-openib.txt / error in device init
> > Completed test examples
> > Error while running make check
> > gmake[1]: *** [makefile:149: check] Error 1
> > make: *** [GNUmakefile:17: check] Error 2
> >
> > Where is $MPI_RUN set? I'd like to be able to pass options such as --mca
> > orte_base_help_aggregate 0 --mca opal_warn_on_missing_libcuda 0 -mca pml
> > ucx --mca btl '^openib' which will help me troubleshoot and hide unneeded
> > warnings.
> >
> > Thanks,
> > Rob
> >
>
>
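
On the original question of where the launcher is set: per the reply quoted above, the value given to configure via --with-mpiexec ends up in PETSC_ARCH/lib/petsc/conf/petscvariables. A minimal sketch for inspecting it, assuming the entry is stored on a line starting with MPIEXEC (petsc_saved_mpiexec is a hypothetical helper name, and the default paths are the placeholders used in this thread):

```shell
# Hypothetical helper: print every line starting with MPIEXEC from the
# petscvariables file of a given build.
#   $1 = PETSC_DIR, $2 = PETSC_ARCH
petsc_saved_mpiexec() {
    conf="$1/$2/lib/petsc/conf/petscvariables"
    if [ -r "$conf" ]; then
        grep '^MPIEXEC' "$conf"
    else
        echo "not found: $conf" >&2
        return 1
    fi
}

# Fall back to the placeholder values if the variables are unset;
# "|| true" keeps the script going when the file is absent.
petsc_saved_mpiexec "${PETSC_DIR:-/path/to/petsc}" "${PETSC_ARCH:-arch-linux-c-debug}" || true
```

If the printed value lacks the --mca options, passing MPIEXEC on the make command line as shown earlier in the thread, or re-running configure with --with-mpiexec, would be the two routes mentioned here.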