[petsc-users] gpu cpu parallel
Junchao Zhang
junchao.zhang at gmail.com
Tue Nov 11 21:48:47 CST 2025
Hi, Wenbo,
I think your approach should work. But before taking this extra step
with gpu_comm, have you tried mapping multiple MPI ranks (CPUs) to one GPU
using NVIDIA's Multi-Process Service (MPS)? If MPS works well, you
can avoid the extra complexity.
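For reference, a minimal sketch of running several MPI ranks against one GPU under MPS might look like the following; the launcher, device index, and solver options are assumptions that depend on your cluster setup, not a prescription:

```shell
# Sketch: let several MPI ranks on a node share one GPU via NVIDIA MPS.
# Device index, rank count, and PETSc options are illustrative.
export CUDA_VISIBLE_DEVICES=0          # all ranks on this node see GPU 0
nvidia-cuda-mps-control -d             # start the MPS control daemon
mpiexec -n 8 ./bench_kspsolve -mat_type aijcusparse -vec_type cuda
echo quit | nvidia-cuda-mps-control    # shut the daemon down afterwards
```

With MPS active, kernels launched from the different ranks are multiplexed onto the single GPU, so the CPU-side parameter computation can still use all ranks without a separate communicator.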
--Junchao Zhang
On Tue, Nov 11, 2025 at 7:50 PM Wenbo Zhao <zhaowenbo.npic at gmail.com> wrote:
> Dear all,
>
> We are trying to solve ksp using GPUs.
> We found the example src/ksp/ksp/tutorials/bench_kspsolve.c, in which the
> matrix is created and assembled using the COO interface provided by PETSc. In
> this example, the number of CPUs is the same as the number of GPUs.
> In our case, the computation of the matrix parameters is performed on the
> CPUs, and it is expensive: it may take half of the total time or even more.
>
> We want to use more CPUs to compute the parameters in parallel. A smaller
> communicator (say, gpu_comm) is created for the CPUs attached to the
> GPUs. The parameters are computed by all of the CPUs (in
> MPI_COMM_WORLD) and then sent to the gpu_comm ranks via
> MPI. The matrix (of type aijcusparse) is then created and assembled within
> gpu_comm. Finally, KSPSolve is performed on the GPUs.
>
> I’m not sure if this approach will work in practice. Are there any
> comparable examples I can look to for guidance?
>
> Best,
> Wenbo
>