[petsc-dev] Exiting with error when using GPUs and non GPU-aware MPI

Satish Balay balay at mcs.anl.gov
Mon Mar 23 20:46:17 CDT 2020


The issues related to this were discussed at:

https://gitlab.com/petsc/petsc/-/merge_requests/2506#note_283175046

Satish

On Mon, 23 Mar 2020, Richard Tran Mills via petsc-dev wrote:

> Colleagues,
> 
> I did not notice this, but Junchao's MR, "Directly pass root/leafdata to MPI
> in SF when possible"
> 
>   https://gitlab.com/petsc/petsc/-/merge_requests/2506
> 
> that was merged into master over the weekend causes PETSc to error out if
> PETSc has been configured with GPU support but the MPI implementation is
> not GPU-aware, unless the user has specified "-use_gpu_aware_mpi 0":
> 
> > [0]PETSC ERROR: PETSc is configured with GPU support, but your MPI is not GPU-aware. For better performance, please use a GPU-aware MPI.
> > [0]PETSC ERROR: For IBM Spectrum MPI on OLCF Summit, you may need jsrun --smpiargs=-gpu.
> > [0]PETSC ERROR: For OpenMPI, you need to configure it --with-cuda (https://www.open-mpi.org/faq/?category=buildcuda)
> > [0]PETSC ERROR: For MVAPICH2-GDR, you need to set MV2_USE_CUDA=1 (http://mvapich.cse.ohio-state.edu/userguide/gdr/)
> > [0]PETSC ERROR: For Cray-MPICH, you need to set MPICH_RDMA_ENABLED_CUDA=1 (https://www.olcf.ornl.gov/tutorials/gpudirect-mpich-enabled-cuda/)
> > [0]PETSC ERROR: If you do not care, use option -use_gpu_aware_mpi 0, then PETSc will copy data from GPU to CPU for communication.
> > application called MPI_Abort(MPI_COMM_WORLD, 90693076) - process 0
> 
> I like that we are warning users about a potential performance problem, but
> this seems like something that should print a warning, rather than exiting
> with an error. So I am wondering
> 
> 1) Do people agree that this should be a warning instead of an error?
> 
> and
> 
> 2) Shouldn't we add a standard mechanism for reporting these sorts of warnings
> at runtime?
> 
> --Richard
> 
> 

