[petsc-users] suppress CUDA warning & choose MCA parameter for mpirun during make PETSC_ARCH=arch-linux-c-debug check
Satish Balay
balay at mcs.anl.gov
Fri Oct 7 12:52:57 CDT 2022
You can try:
make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug MPIEXEC="mpiexec -mca orte_base_help_aggregate 0 --mca opal_warn_on_missing_libcuda 0 -mca pml ucx --mca btl '^openib'" check
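As an alternative to passing the options on the mpiexec command line, the same MCA parameters can probably be supplied through the environment. This is a minimal sketch, assuming Open MPI's convention of reading an MCA parameter `<name>` from the variable `OMPI_MCA_<name>`, and assuming the environment is inherited by the mpiexec that make launches. The path in the comment is a placeholder.

```shell
# Assumption: Open MPI picks up MCA parameters from OMPI_MCA_<name>.
export OMPI_MCA_orte_base_help_aggregate=0      # show every help/error message
export OMPI_MCA_opal_warn_on_missing_libcuda=0  # silence the libcuda warning
export OMPI_MCA_pml=ucx                         # select the UCX PML
export OMPI_MCA_btl='^openib'                   # exclude the openib BTL

# Then run the check as usual (path is a placeholder):
#   make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug check
```

Settings exported this way apply to every Open MPI run started from that shell, not only to the check.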
With respect to configure: it can be set with the --with-mpiexec option. It's saved in PETSC_ARCH/lib/petsc/conf/petscvariables
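To see what is currently recorded, something like the following may help. It is only a sketch: `show_mpiexec` is a hypothetical helper name, and it assumes the saved entry sits on a line beginning with `MPIEXEC` in the petscvariables file. The paths and the configure invocation in the comments are placeholders.

```shell
# Hypothetical helper: print the mpiexec command recorded by configure.
# Assumes the entry is stored on a line that begins with "MPIEXEC".
show_mpiexec() {
    petsc_dir=$1
    petsc_arch=$2
    grep '^MPIEXEC' "$petsc_dir/$petsc_arch/lib/petsc/conf/petscvariables"
}

# Usage (paths are placeholders):
#   show_mpiexec /path/to/petsc arch-linux-c-debug
#
# To record the options at configure time instead (sketch):
#   ./configure PETSC_ARCH=arch-linux-c-debug \
#       --with-mpiexec="mpiexec --mca opal_warn_on_missing_libcuda 0 --mca btl '^openib'"
```

Overriding MPIEXEC on the make command line is the quicker route for a one-off check, while --with-mpiexec keeps the setting with the build.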
Satish
On Fri, 7 Oct 2022, Rob Kudyba wrote:
> We are on RHEL 8, using modules to load/unload various versions of
> packages/libraries, and I have CUDA-aware OpenMPI 4.1.1 loaded along
> with GDAL 3.3.0, GCC 10.2.0, and cmake 3.22.1.
>
> make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug check
> fails with the errors below:
> Running check examples to verify correct installation
>
> Using PETSC_DIR=/path/to/petsc and PETSC_ARCH=arch-linux-c-debug
> Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process
> See https://petsc.org/release/faq/
> --------------------------------------------------------------------------
> The library attempted to open the following supporting CUDA libraries,
> but each of them failed. CUDA-aware support is disabled.
> libcuda.so.1: cannot open shared object file: No such file or directory
> libcuda.dylib: cannot open shared object file: No such file or directory
> /usr/lib64/libcuda.so.1: cannot open shared object file: No such file or directory
> /usr/lib64/libcuda.dylib: cannot open shared object file: No such file or directory
> If you are not interested in CUDA-aware support, then run with
> --mca opal_warn_on_missing_libcuda 0 to suppress this message. If you are
> interested in CUDA-aware support, then try setting LD_LIBRARY_PATH to the
> location of libcuda.so.1 to get passed this issue.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> WARNING: There was an error initializing an OpenFabrics device.
>
> Local host: g117
> Local device: mlx5_0
> --------------------------------------------------------------------------
> lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
> Number of SNES iterations = 2
> Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes
> See https://petsc.org/release/faq/
>
> The library attempted to open the following supporting CUDA libraries,
> but each of them failed. CUDA-aware support is disabled.
> libcuda.so.1: cannot open shared object file: No such file or directory
> libcuda.dylib: cannot open shared object file: No such file or directory
> /usr/lib64/libcuda.so.1: cannot open shared object file: No such file or directory
> /usr/lib64/libcuda.dylib: cannot open shared object file: No such file or directory
> If you are not interested in CUDA-aware support, then run with
> --mca opal_warn_on_missing_libcuda 0 to suppress this message. If you are
> interested in CUDA-aware support, then try setting LD_LIBRARY_PATH to the
> location of libcuda.so.1 to get passed this issue.
>
> WARNING: There was an error initializing an OpenFabrics device.
>
> Local host: xxx
> Local device: mlx5_0
>
> lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
> Number of SNES iterations = 2
> [g117:4162783] 1 more process has sent help message help-mpi-common-cuda.txt / dlopen failed
> [g117:4162783] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
> [g117:4162783] 1 more process has sent help message help-mpi-btl-openib.txt / error in device init
> Completed test examples
> Error while running make check
> gmake[1]: *** [makefile:149: check] Error 1
> make: *** [GNUmakefile:17: check] Error 2
>
> Where is $MPI_RUN set? I'd like to be able to pass options such as --mca
> orte_base_help_aggregate 0 --mca opal_warn_on_missing_libcuda 0 -mca pml
> ucx --mca btl '^openib' which will help me troubleshoot and hide unneeded
> warnings.
>
> Thanks,
> Rob
>