[petsc-dev] Exiting with error when using GPUs and non GPU-aware MPI
Satish Balay
balay at mcs.anl.gov
Mon Mar 23 20:46:17 CDT 2020
The issues related to this were discussed at:
https://gitlab.com/petsc/petsc/-/merge_requests/2506#note_283175046
Satish
On Mon, 23 Mar 2020, Richard Tran Mills via petsc-dev wrote:
> Colleagues,
>
> I had not noticed this, but Junchao's MR, "Directly pass root/leafdata to MPI
> in SF when possible"
>
> https://gitlab.com/petsc/petsc/-/merge_requests/2506
>
> that was merged into master over the weekend causes PETSc to error out if
> PETSc has been configured with GPU support but the MPI implementation is
> not GPU-aware, unless the user has specified "-use_gpu_aware_mpi 0":
>
> > [0]PETSC ERROR: PETSc is configured with GPU support, but your MPI is
> not GPU-aware. For better performance, please use a GPU-aware MPI.
> > [0]PETSC ERROR: For IBM Spectrum MPI on OLCF Summit, you may need
> jsrun --smpiargs=-gpu.
> > [0]PETSC ERROR: For OpenMPI, you need to configure it --with-cuda
> (https://www.open-mpi.org/faq/?category=buildcuda)
> > [0]PETSC ERROR: For MVAPICH2-GDR, you need to set MV2_USE_CUDA=1
> (http://mvapich.cse.ohio-state.edu/userguide/gdr/)
> > [0]PETSC ERROR: For Cray-MPICH, you need to set
> MPICH_RDMA_ENABLED_CUDA=1
> (https://www.olcf.ornl.gov/tutorials/gpudirect-mpich-enabled-cuda/)
> > [0]PETSC ERROR: If you do not care, use option -use_gpu_aware_mpi 0,
> then PETSc will copy data from GPU to CPU for communication.
> > application called MPI_Abort(MPI_COMM_WORLD, 90693076) - process 0
>
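For concreteness, the run-time workarounds listed in the quoted error message can be collected into a short launch sketch. The executable name ./ex1, the use of mpiexec, and the process counts are placeholders, not from the thread; exact launcher syntax varies by site, and OpenMPI requires the configure-time --with-cuda option rather than a run-time setting:

```shell
# Sketch of the run-time workarounds quoted in the error message above.
# "./ex1", mpiexec, and the process counts are placeholders.

# IBM Spectrum MPI on OLCF Summit: ask jsrun for GPU-aware MPI.
echo "jsrun -n 6 --smpiargs=-gpu ./ex1"

# MVAPICH2-GDR: enable CUDA support in the launch environment.
export MV2_USE_CUDA=1

# Cray-MPICH: enable CUDA-aware RDMA in the launch environment.
export MPICH_RDMA_ENABLED_CUDA=1

# If GPU-aware MPI is unavailable, let PETSc stage GPU data through
# the host for communication instead:
echo "mpiexec -n 2 ./ex1 -use_gpu_aware_mpi 0"
```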
> I like that we are warning users about a potential performance problem, but
> this seems like something that should print a warning, rather than exiting
> with an error. So I am wondering
>
> 1) Do people agree that this should be a warning instead of an error?
>
> and
>
> 2) Shouldn't we add a standard mechanism for reporting these sorts of warnings
> at runtime?
>
> --Richard
>
>