[petsc-users] gpu cpu parallel

Junchao Zhang junchao.zhang at gmail.com
Tue Nov 11 21:48:47 CST 2025


Hi, Wenbo,
   I think your approach should work.  But before taking this extra step
with gpu_comm, have you tried mapping multiple MPI ranks (CPUs) to one GPU
using NVIDIA's Multi-Process Service (MPS)?  If MPS works well, you can
avoid the extra complexity.
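For reference, a minimal single-node MPS session might look like the
sketch below. It assumes one visible GPU, a CUDA-enabled PETSc build, and
uses bench_kspsolve only as a placeholder executable; the daemon commands
are the standard MPS controls.

```shell
# Start the MPS control daemon (once per node, before the MPI launch).
nvidia-cuda-mps-control -d

# Oversubscribe the GPU: 8 MPI ranks share one device through MPS.
mpiexec -n 8 ./bench_kspsolve -mat_type aijcusparse -vec_type cuda

# Shut the daemon down when done.
echo quit | nvidia-cuda-mps-control
```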

--Junchao Zhang


On Tue, Nov 11, 2025 at 7:50 PM Wenbo Zhao <zhaowenbo.npic at gmail.com> wrote:

> Dear all,
>
> We are trying to solve a KSP system on GPUs.
> We found the example src/ksp/ksp/tutorials/bench_kspsolve.c, in which the
> matrix is created and assembled using the COO interface provided by PETSc.
> In this example, the number of CPUs (MPI ranks) is the same as the number
> of GPUs. In our case, computing the matrix coefficients is done on the
> CPUs, and it is expensive: it can take half of the total time or even
> more.
>
> We want to use more CPUs to compute the coefficients in parallel. A
> smaller communicator (say gpu_comm) is created for the CPUs attached to
> the GPUs. The coefficients are computed by all of the CPUs (in
> MPI_COMM_WORLD) and then sent via MPI to the CPUs in gpu_comm. The matrix
> (of type aijcusparse) is then created and assembled within gpu_comm, and
> finally KSPSolve is performed on the GPUs.
>
> I’m not sure if this approach will work in practice. Are there any
> comparable examples I can look to for guidance?
>
> Best,
> Wenbo
>