[petsc-dev] Question on PETSc + CUDA configuration with MPI on cluster
Matthew Knepley
knepley at gmail.com
Tue Sep 23 06:58:12 CDT 2025
Also, the configure.log has
#define PETSC_HAVE_MPI_GPU_AWARE 1
which indicates that PETSc thinks GPU-aware MPI support is there.
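A quick way to double-check a given build (assuming the usual PETSC_DIR/PETSC_ARCH layout; adjust the paths to your install) is to grep for that flag in configure.log or in the generated petscconf.h:

  # where did configure land on GPU-aware MPI?
  grep -i GPU_AWARE $PETSC_DIR/$PETSC_ARCH/lib/petsc/conf/configure.log
  grep -i GPU_AWARE $PETSC_DIR/$PETSC_ARCH/include/petscconf.h

If PETSC_HAVE_MPI_GPU_AWARE does not show up there, configure did not detect a GPU-aware MPI.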
Thanks,
Matt
On Tue, Sep 23, 2025 at 1:20 AM Satish Balay <balay.anl at fastmail.org> wrote:
> The orte-info output does suggest OpenMPI was built with cuda enabled.
>
> Are you able to run PETSc examples? What do you get for:
>
> >>>>
> balay at petsc-gpu-01:/scratch/balay/petsc/src/snes/tutorials$ make ex19
> /scratch/balay/petsc/arch-linux-c-debug/bin/mpicc -fPIC -Wall
> -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch
> -Wno-stringop-overflow -fstack-protector -fvisibility=hidden -g3 -O0
> -I/scratch/balay/petsc/include
> -I/scratch/balay/petsc/arch-linux-c-debug/include
>
> -I/nfs/gce/projects/petsc/soft/u22.04/spack-2024-11-27-cuda/opt/spack/linux-ubuntu22.04-x86_64/gcc-11.4.0/cuda-12.0.1-gy7foq57oi6wzltombtsdy5eqz5gkjgc/include
> -Wl,-export-dynamic ex19.c
> -Wl,-rpath,/scratch/balay/petsc/arch-linux-c-debug/lib
> -L/scratch/balay/petsc/arch-linux-c-debug/lib
>
> -Wl,-rpath,/nfs/gce/projects/petsc/soft/u22.04/spack-2024-11-27-cuda/opt/spack/linux-ubuntu22.04-x86_64/gcc-11.4.0/cuda-12.0.1-gy7foq57oi6wzltombtsdy5eqz5gkjgc/lib64
>
> -L/nfs/gce/projects/petsc/soft/u22.04/spack-2024-11-27-cuda/opt/spack/linux-ubuntu22.04-x86_64/gcc-11.4.0/cuda-12.0.1-gy7foq57oi6wzltombtsdy5eqz5gkjgc/lib64
>
> -L/nfs/gce/projects/petsc/soft/u22.04/spack-2024-11-27-cuda/opt/spack/linux-ubuntu22.04-x86_64/gcc-11.4.0/cuda-12.0.1-gy7foq57oi6wzltombtsdy5eqz5gkjgc/lib64/stubs
> -Wl,-rpath,/scratch/balay/petsc/arch-linux-c-debug/lib
> -L/scratch/balay/petsc/arch-linux-c-debug/lib
> -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/11
> -L/usr/lib/gcc/x86_64-linux-gnu/11 -lpetsc -llapack -lblas -lm -lcudart
> -lnvToolsExt -lcufft -lcublas -lcusparse -lcusolver -lcurand -lcuda
> -lX11 -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi
> -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lstdc++ -o ex19
> balay at petsc-gpu-01:/scratch/balay/petsc/src/snes/tutorials$ ./ex19
> -snes_monitor -dm_mat_type seqaijcusparse -dm_vec_type seqcuda -pc_type
> gamg -pc_gamg_esteig_ksp_max_it 10 -ksp_monitor -mg_levels_ksp_max_it 3
> lid velocity = 0.0625, prandtl # = 1., grashof # = 1.
> 0 SNES Function norm 2.391552133017e-01
> 0 KSP Residual norm 2.013462697105e-01
> 1 KSP Residual norm 5.027022294231e-02
> 2 KSP Residual norm 7.248258907839e-03
> 3 KSP Residual norm 8.590847505363e-04
> 4 KSP Residual norm 1.511762118013e-05
> 5 KSP Residual norm 1.410585959219e-06
> 1 SNES Function norm 6.812362089434e-05
> 0 KSP Residual norm 2.315252918142e-05
> 1 KSP Residual norm 2.351994603807e-06
> 2 KSP Residual norm 3.882072626158e-07
> 3 KSP Residual norm 2.227447016095e-08
> 4 KSP Residual norm 2.200353394658e-09
> 5 KSP Residual norm 1.147903850265e-10
> 2 SNES Function norm 3.411489611752e-10
> Number of SNES iterations = 2
> balay at petsc-gpu-01:/scratch/balay/petsc/src/snes/tutorials$
> <<<<
>
> So what issue are you seeing with your code? And does it go away with the
> option "-use_gpu_aware_mpi 0"? For example:
>
> >>>>
> balay at petsc-gpu-01:/scratch/balay/petsc/src/snes/tutorials$ ./ex19
> -snes_monitor -dm_mat_type seqaijcusparse -dm_vec_type seqcuda -pc_type
> gamg -pc_gamg_esteig_ksp_max_it 10 -ksp_monitor -mg_levels_ksp_max_it 3
> -use_gpu_aware_mpi 0
> lid velocity = 0.0625, prandtl # = 1., grashof # = 1.
> 0 SNES Function norm 2.391552133017e-01
> 0 KSP Residual norm 2.013462697105e-01
> 1 KSP Residual norm 5.027022294231e-02
> 2 KSP Residual norm 7.248258907839e-03
> 3 KSP Residual norm 8.590847505363e-04
> 4 KSP Residual norm 1.511762118013e-05
> 5 KSP Residual norm 1.410585959219e-06
> 1 SNES Function norm 6.812362089434e-05
> 0 KSP Residual norm 2.315252918142e-05
> 1 KSP Residual norm 2.351994603807e-06
> 2 KSP Residual norm 3.882072626158e-07
> 3 KSP Residual norm 2.227447016095e-08
> 4 KSP Residual norm 2.200353394658e-09
> 5 KSP Residual norm 1.147903850265e-10
> 2 SNES Function norm 3.411489611752e-10
> Number of SNES iterations = 2
> balay at petsc-gpu-01:/scratch/balay/petsc/src/snes/tutorials$
> <<<<
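>
> (Note the runs above use the seq types on a single rank, so they do not really
> exercise MPI communication of GPU buffers. A sketch of a run that does -
> assuming at least 2 MPI ranks with GPU(s) visible to them - swaps in the mpi
> types:)
>
>   mpiexec -n 2 ./ex19 -snes_monitor -ksp_monitor -dm_mat_type mpiaijcusparse \
>       -dm_vec_type mpicuda -pc_type gamg -pc_gamg_esteig_ksp_max_it 10 \
>       -mg_levels_ksp_max_it 3
>
> Appending "-use_gpu_aware_mpi 0" to that run is the same fallback test as above.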
>
> Satish
>
> On Tue, 23 Sep 2025, 岳新海 wrote:
>
> > I get:
> > [mae_yuexh at login01 ~]$ orte-info |grep 'MCA btl'
> >                  MCA btl: smcuda (MCA v2.1, API v3.1, Component v4.1.5)
> >                  MCA btl: tcp (MCA v2.1, API v3.1, Component v4.1.5)
> >                  MCA btl: self (MCA v2.1, API v3.1, Component v4.1.5)
> >                  MCA btl: vader (MCA v2.1, API v3.1, Component v4.1.5)
> >
> > Xinhai
> >
> > Yue Xinhai (岳新海)
> > Southern University of Science and Technology / Graduate Student, Class of 2023
> > 1088 Xueyuan Avenue, Nanshan District, Shenzhen, Guangdong
> >
> > ------------------ Original ------------------
> > From: "Satish Balay"<balay.anl at fastmail.org>;
> > Date: Tue, Sep 23, 2025 03:25 AM
> > To: "岳新海 (Xinhai Yue)"<12332508 at mail.sustech.edu.cn>;
> > Cc: "petsc-dev"<petsc-dev at mcs.anl.gov>;
> > Subject: Re: [petsc-dev] Question on PETSc + CUDA configuration with MPI on cluster
> >
> > What do you get for (with your openmpi install): orte-info |grep 'MCA btl'
> >
> > With cuda built openmpi - I get:
> > balay at petsc-gpu-01:/scratch/balay/petsc$ ./arch-linux-c-debug/bin/orte-info |grep 'MCA btl'
> >                  MCA btl: smcuda (MCA v2.1, API v3.1, Component v4.1.6)
> >                  MCA btl: openib (MCA v2.1, API v3.1, Component v4.1.6)
> >                  MCA btl: self (MCA v2.1, API v3.1, Component v4.1.6)
> >                  MCA btl: tcp (MCA v2.1, API v3.1, Component v4.1.6)
> >                  MCA btl: vader (MCA v2.1, API v3.1, Component v4.1.6)
> >
> > And without cuda:
> > balay at petsc-gpu-01:/scratch/balay/petsc.x$ ./arch-test/bin/orte-info | grep 'MCA btl'
> >                  MCA btl: openib (MCA v2.1, API v3.1, Component v4.1.6)
> >                  MCA btl: self (MCA v2.1, API v3.1, Component v4.1.6)
> >                  MCA btl: tcp (MCA v2.1, API v3.1, Component v4.1.6)
> >                  MCA btl: vader (MCA v2.1, API v3.1, Component v4.1.6)
> >
> > i.e. "smcuda" should be listed for a cuda-enabled openmpi.
> >
> > It's not clear if GPU-aware MPI makes a difference for all MPI impls (or
> > versions) - so it is good to verify. [It's a performance issue anyway - so
> > primarily useful when performing timing measurements.]
> >
> > Satish
> >
> > On Mon, 22 Sep 2025, 岳新海 wrote:
> >
> > > Dear PETSc Team,
> > >
> > > I am encountering an issue when running PETSc with CUDA support on a
> > > cluster. When I set the vector type to VECCUDA, PETSc reports that my MPI
> > > is not GPU-aware. However, the MPI library (OpenMPI 4.1.5) I used to
> > > configure PETSc was built with the --with-cuda option enabled.
> > >
> > > Here are some details:
> > > PETSc version: 3.20.6
> > > MPI: OpenMPI 4.1.5, configured with --with-cuda
> > > GPU: RTX3090
> > > CUDA version: 12.1
> > >
> > > I have attached both my PETSc configure command and OpenMPI configure
> > > command for reference.
> > >
> > > My questions are:
> > >
> > > 1. Even though I enabled --with-cuda in OpenMPI, why does PETSc still
> > > report that MPI is not GPU-aware?
> > >
> > > 2. Are there additional steps or specific configuration flags required
> > > (either in OpenMPI or PETSc) to ensure GPU-aware MPI is correctly
> > > detected?
> > >
> > > Any guidance or suggestions would be greatly appreciated.
> > >
> > > Best regards,
> > >
> > > Xinhai Yue
> > >
> > > Yue Xinhai (岳新海)
> > > Southern University of Science and Technology / Graduate Student, Class of 2023
> > > 1088 Xueyuan Avenue, Nanshan District, Shenzhen, Guangdong
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
https://www.cse.buffalo.edu/~knepley/