[petsc-users] [MPI][GPU]

Barry Smith bsmith at petsc.dev
Sun Aug 31 17:32:23 CDT 2025


  Can you pinpoint the MPI calls (and the PETSc or hypre routines they are in) that are not using CUDA-aware message passing? That is, calls where the data is copied to host memory inside the MPI call and the needed inter-process communication is done from there? I do not understand the graphics you have sent.

   Barry

   Or is it possible that the buffers passed to MPI are not on the GPU, so the MPI communication naturally happens from host memory? If so, where?
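
   One quick way to check, as a sketch (this is just a CUDA runtime query, not PETSc API, and the helper name below is made up): call cudaPointerGetAttributes() on a suspect buffer right before the MPI call and print where it lives.

      #include <cuda_runtime.h>
      #include <stdio.h>

      /* Print whether a buffer about to be handed to MPI lives in device,
         managed, registered-host, or plain (unregistered) host memory. */
      static void WhereIsBuffer(const void *buf, const char *label)
      {
        struct cudaPointerAttributes attr;
        cudaError_t                  err = cudaPointerGetAttributes(&attr, buf);
        if (err != cudaSuccess) {
          printf("%s: not a CUDA pointer (%s)\n", label, cudaGetErrorString(err));
          return;
        }
        switch (attr.type) {
          case cudaMemoryTypeDevice:  printf("%s: device memory\n", label); break;
          case cudaMemoryTypeManaged: printf("%s: managed memory\n", label); break;
          case cudaMemoryTypeHost:    printf("%s: registered host memory\n", label); break;
          default:                    printf("%s: unregistered host memory\n", label); break;
        }
      }

   If the buffers report host memory, the H2D/D2H copies you see are expected regardless of whether the MPI library is CUDA-aware.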

> On Aug 31, 2025, at 1:30 PM, LEDAC Pierre <Pierre.LEDAC at cea.fr> wrote:
> 
> OK, I just tried passing --enable-gpu-aware-mpi to Hypre.
> 
> HYPRE_config.h now defines HYPRE_USING_GPU_AWARE_MPI 1,
> 
> but there are still no D2D copies near the MPI calls in the ex46.c example.
> 
> It is probably an obvious thing I forgot during the PETSc configure, but I don't see it...
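> 
> (For reference, a minimal compile-time check of that define; just a sketch, assuming the installed HYPRE_config.h is on the include path:
> 
>     #include <HYPRE_config.h>
>     #include <stdio.h>
> 
>     int main(void)
>     {
>     #ifdef HYPRE_USING_GPU_AWARE_MPI
>       printf("hypre was built expecting GPU-aware MPI\n");
>     #else
>       printf("hypre was built WITHOUT GPU-aware MPI\n");
>     #endif
>       return 0;
>     }
> )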
> 
> Pierre LEDAC
> Commissariat à l’énergie atomique et aux énergies alternatives
> Centre de SACLAY
> DES/ISAS/DM2S/SGLS/LCAN
> Bâtiment 451 – point courrier n°41
> F-91191 Gif-sur-Yvette
> +33 1 69 08 04 03
> +33 6 83 42 05 79
> From: LEDAC Pierre
> Sent: Sunday, August 31, 2025 19:13:36
> To: Barry Smith
> Cc: petsc-users at mcs.anl.gov <mailto:petsc-users at mcs.anl.gov>
> Subject: RE: [petsc-users] [MPI][GPU]
>  
> Barry,
> 
> Adding it solved the unrecognized-option warning, but MPI messages are still being exchanged through the host.
> 
> I switched to a simpler test case that does not read a matrix (src/ksp/ksp/tutorials/ex46.c) but got the same behaviour.
> 
> In the Nsys profile for ex46, the MPI synchronizations occur during PCApply, so now I am wondering if the issue is related to the fact that Hypre is not configured/built with GPU-aware MPI in the PETSc build.
> I will give it a try with --enable-gpu-aware-mpi passed to Hypre.
> 
> Do you know of an example in PETSc that specifically benchmarks with and without CUDA-aware MPI enabled?
> 
> <pastedImage.png>
> 
> Pierre LEDAC
> Commissariat à l’énergie atomique et aux énergies alternatives
> Centre de SACLAY
> DES/ISAS/DM2S/SGLS/LCAN
> Bâtiment 451 – point courrier n°41
> F-91191 Gif-sur-Yvette
> +33 1 69 08 04 03
> +33 6 83 42 05 79
> From: Barry Smith <bsmith at petsc.dev <mailto:bsmith at petsc.dev>>
> Sent: Sunday, August 31, 2025 16:33:38
> To: LEDAC Pierre
> Cc: petsc-users at mcs.anl.gov <mailto:petsc-users at mcs.anl.gov>
> Subject: Re: [petsc-users] [MPI][GPU]
>  
> 
>   Ahh, that ex10.c is missing a VecSetFromOptions() call before VecLoad() and friends; in contrast, the matrix has a MatSetFromOptions(). Can you try adding it to ex10.c and see whether that resolves the problem (and it may be a path forward for your code)?
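> 
>   Roughly along these lines, as a sketch that would sit inside ex10.c's main() (which already includes petscksp.h); the exact variable names and the binary viewer setup in ex10.c differ:
> 
>     Vec         b;
>     PetscViewer viewer;
>     PetscCall(PetscViewerBinaryOpen(PETSC_COMM_WORLD, file, FILE_MODE_READ, &viewer));
>     PetscCall(VecCreate(PETSC_COMM_WORLD, &b));
>     PetscCall(VecSetFromOptions(b)); /* so -vec_type cuda is honored before the load */
>     PetscCall(VecLoad(b, viewer));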
> 
>   Barry
> 
> 
>> On Aug 31, 2025, at 4:32 AM, LEDAC Pierre <Pierre.LEDAC at cea.fr <mailto:Pierre.LEDAC at cea.fr>> wrote:
>> 
>> Yes, but I was surprised it was not used, so I removed it (same for -vec_type mpicuda):
>> 
>> mpirun -np 2 ./ex10 2 -f Matrix_3133717_rows_1_cpus.petsc -ksp_view -log_view -ksp_monitor -ksp_type cg -pc_type hypre -pc_hypre_type boomeramg -pc_hypre_boomeramg_strong_threshold 0.7 -mat_type aijcusparse -vec_type cuda
>> ...
>> WARNING! There are options you set that were not used!
>> WARNING! could be spelling mistake, etc!
>> There is one unused database option. It is:
>> Option left: name:-vec_type value: cuda source: command line
>> 
>> Pierre LEDAC
>> Commissariat à l’énergie atomique et aux énergies alternatives
>> Centre de SACLAY
>> DES/ISAS/DM2S/SGLS/LCAN
>> Bâtiment 451 – point courrier n°41
>> F-91191 Gif-sur-Yvette
>> +33 1 69 08 04 03
>> +33 6 83 42 05 79
>> From: Barry Smith <bsmith at petsc.dev <mailto:bsmith at petsc.dev>>
>> Sent: Saturday, August 30, 2025 21:47:07
>> To: LEDAC Pierre
>> Cc: petsc-users at mcs.anl.gov <mailto:petsc-users at mcs.anl.gov>
>> Subject: Re: [petsc-users] [MPI][GPU]
>>  
>> 
>> Did you try the additional option -vec_type cuda with ex10.c?
>> 
>> 
>> 
>>> On Aug 30, 2025, at 1:16 PM, LEDAC Pierre <Pierre.LEDAC at cea.fr <mailto:Pierre.LEDAC at cea.fr>> wrote:
>>> 
>>> Hello,
>>> 
>>> My code is built with PETSc 3.23 + OpenMPI 4.1.6 (with CUDA support enabled), and profiling indicates that MPI communications are done between GPUs everywhere in the code except in the PETSc part, where D2H transfers occur.
>>> 
>>> I reproduced the PETSc issue with the example under src/ksp/ksp/tutorials/ex10 on 2 MPI ranks. See the output in ex10.log.
>>> 
>>> Also below is the Nsys (Nsight Systems) profile of ex10, showing the D2H and H2D copies before/after the MPI calls.
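>>> 
>>> (For context, such a trace can be collected with an Nsight Systems command along these lines; this is a sketch, not necessarily the exact flags used here:
>>> 
>>>     nsys profile --trace=cuda,mpi,nvtx -o ex10_report \
>>>         mpirun -np 2 ./ex10 -f <matrix>.petsc -ksp_type cg -pc_type hypre \
>>>             -mat_type aijcusparse -vec_type cuda
>>> 
>>> The D2H/H2D memcpy rows next to the MPI ranges in the timeline are the copies mentioned above.)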
>>> 
>>> Thanks for your help,
>>> 
>>> <pastedImage.png>
>>> 
>>> 
>>> Pierre LEDAC
>>> Commissariat à l’énergie atomique et aux énergies alternatives
>>> Centre de SACLAY
>>> DES/ISAS/DM2S/SGLS/LCAN
>>> Bâtiment 451 – point courrier n°41
>>> F-91191 Gif-sur-Yvette
>>> +33 1 69 08 04 03
>>> +33 6 83 42 05 79
>>> <ex10.log>
