[petsc-users] Port existing GMRES+ILU(0) implementation to GPU
Matthew Knepley
knepley at gmail.com
Wed Feb 11 07:42:22 CST 2026
On Wed, Feb 11, 2026 at 5:55 AM feng wang <snailsoar at hotmail.com> wrote:
> Hi Junchao,
>
> Thanks for your reply. Probably I did not phrase it in a clear way.
>
> I am using OpenACC to port the CFD code to the GPU, so the CPU and the GPU
> versions essentially share the same source code. The original CPU version
> uses either a hand-coded Jacobi solver or GMRES+ILU(0) (via PETSc) to
> solve the sparse linear system.
>
> The current GPU version of the code only ports the Jacobi solver to the
> GPU; now I want to port GMRES+ILU(0) as well. What changes do I need to
> make to the existing CPU version of GMRES+ILU(0) to achieve this goal?
>
I think what Junchao is saying is that if you use the GPU Vec and Mat
types, this should already be running on the GPU. Does that not work?
Thanks,
Matt
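
[Editor's note: the option-driven setup Matt describes can be sketched as
follows. This is an editorial illustration, not code from the thread; it
assumes a recent PETSc built with CUDA and assembles a toy 1-D Laplacian so
the snippet is self-contained. Because the types come from the options
database, the same binary runs on the CPU by default and on the GPU when
launched with -mat_type aijcusparse -vec_type cuda.]

```c
#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat      A;
  Vec      x, b;
  KSP      ksp;
  PC       pc;
  PetscInt i, n = 100, Istart, Iend;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));

  /* The Mat type is taken from the options database, so running with
   * -mat_type aijcusparse (or aijkokkos) moves storage and the solve
   * to the GPU without any source changes here. */
  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n));
  PetscCall(MatSetFromOptions(A));
  PetscCall(MatSetUp(A));
  PetscCall(MatGetOwnershipRange(A, &Istart, &Iend));
  for (i = Istart; i < Iend; i++) { /* toy 1-D Laplacian stencil */
    if (i > 0)     PetscCall(MatSetValue(A, i, i - 1, -1.0, INSERT_VALUES));
    if (i < n - 1) PetscCall(MatSetValue(A, i, i + 1, -1.0, INSERT_VALUES));
    PetscCall(MatSetValue(A, i, i, 2.0, INSERT_VALUES));
  }
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));

  PetscCall(MatCreateVecs(A, &x, &b)); /* vectors inherit a compatible type */
  PetscCall(VecSet(b, 1.0));

  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetType(ksp, KSPGMRES));
  PetscCall(KSPGetPC(ksp, &pc));
  PetscCall(PCSetType(pc, PCILU));     /* zero fill, i.e. ILU(0), is the default */
  PetscCall(KSPSetFromOptions(ksp));   /* command-line options take precedence */
  PetscCall(KSPSolve(ksp, b, x));

  PetscCall(KSPDestroy(&ksp));
  PetscCall(VecDestroy(&x));
  PetscCall(VecDestroy(&b));
  PetscCall(MatDestroy(&A));
  PetscCall(PetscFinalize());
  return 0;
}
```

Note that PCILU is a sequential preconditioner, so this sketch should be run
on a single rank; in parallel, PETSc defaults to block Jacobi with ILU on
each block.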
> BTW: for performance, the GPU version of the CFD code keeps communication
> between the CPU and GPU to a minimum, so for Ax=b, the objects A, x and b
> are created directly on the GPU.
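
[Editor's note: a hedged sketch of what "created directly on the GPU" can
look like with PETSc's CUDA back end. VecCreateSeqCUDA, VecCUDAGetArray, and
VecCUDARestoreArray are real PETSc CUDA APIs, but their headers and
availability should be checked against your PETSc version; the OpenACC
interop via deviceptr is an assumption about how the CFD kernels would
consume the pointer, not something stated in the thread.]

```c
#include <petscvec.h>

/* Build the right-hand side b directly in device memory, so the OpenACC CFD
 * kernels and the PETSc solve share the GPU array with no host round trip. */
PetscErrorCode BuildRHSOnDevice(PetscInt n, Vec *b)
{
  PetscScalar *d_b; /* raw device pointer into the Vec's GPU storage */

  PetscFunctionBeginUser;
  PetscCall(VecCreateSeqCUDA(PETSC_COMM_SELF, n, b));
  PetscCall(VecCUDAGetArray(*b, &d_b));
  /* d_b points to GPU memory; an OpenACC kernel could fill it in place, e.g.
   *   #pragma acc parallel loop deviceptr(d_b)
   *   for (PetscInt i = 0; i < n; i++) d_b[i] = ...;
   */
  PetscCall(VecCUDARestoreArray(*b, &d_b)); /* marks the device copy up to date */
  PetscFunctionReturn(PETSC_SUCCESS);
}
```

The matrix can be handled analogously with the aijcusparse Mat type, keeping
assembly and the GMRES+ILU(0) solve entirely on the device.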
>
> Thanks,
> Feng
>
>
> ------------------------------
> *From:* Junchao Zhang <junchao.zhang at gmail.com>
> *Sent:* 11 February 2026 3:00
> *To:* feng wang <snailsoar at hotmail.com>
> *Cc:* petsc-users at mcs.anl.gov <petsc-users at mcs.anl.gov>; Barry Smith <
> bsmith at petsc.dev>
> *Subject:* Re: [petsc-users] Port existing GMRES+ILU(0) implementation to
> GPU
>
> Sorry, I don't understand your question. What blocks you from running
> your GMRES+ILU(0) on GPUs? I Cc'ed Barry, who knows better about
> the algorithms.
>
> --Junchao Zhang
>
>
> On Tue, Feb 10, 2026 at 3:57 PM feng wang <snailsoar at hotmail.com> wrote:
>
> Hi Junchao,
>
> I have managed to configure PETSc for the GPU, and also managed to run
> ksp/ex15 using -mat_type aijcusparse -vec_type cuda. It runs much faster
> than without "-mat_type aijcusparse -vec_type cuda", so I believe it runs
> okay on GPUs.
>
> I have an existing CFD code that runs natively on GPUs, so all the data is
> offloaded to the GPU at the beginning and some data is copied back to the
> CPU at the very end. It has a hand-coded Newton-Jacobi implicit solver
> that runs on GPUs. *My question is: my code also has a GMRES+ILU(0) solver
> implemented with PETSc, but it only runs on CPUs (I implemented it a few
> years ago). How can I replace the existing Newton-Jacobi solver (which
> runs on GPUs) with a GMRES+ILU(0) solver that also runs on GPUs? Could you
> please give some advice?*
>
> Thanks,
> Feng
>
> ------------------------------
> *From:* Junchao Zhang <junchao.zhang at gmail.com>
> *Sent:* 09 February 2026 23:18
> *To:* feng wang <snailsoar at hotmail.com>
> *Cc:* petsc-users at mcs.anl.gov <petsc-users at mcs.anl.gov>
> *Subject:* Re: [petsc-users] Port existing GMRES+ILU(0) implementation to
> GPU
>
> Hi Feng,
> As a first step, you don't need to change your CPU implementation.
> Then profile to see where it is worth putting your effort. You may also
> need to assemble your matrices and vectors on GPUs, but decide that at
> a later stage.
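
[Editor's note: for the profiling step, PETSc's built-in -log_view option is
the usual tool; "./mycfd" below is a placeholder for the actual application
binary, not a name from the thread.]

```shell
# Profile the existing CPU solve first to see where the time goes:
./mycfd -ksp_type gmres -pc_type ilu -log_view
# After switching to GPU types, -log_view in GPU builds also reports counts
# of host<->device copies, which helps verify that data stays on the GPU:
./mycfd -mat_type aijcusparse -vec_type cuda -log_view
```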
>
> Thanks!
> --Junchao Zhang
>
>
> On Mon, Feb 9, 2026 at 4:31 PM feng wang <snailsoar at hotmail.com> wrote:
>
> Hi Junchao,
>
> Many thanks for your reply.
>
> This is great! Do I need to change anything in my current CPU
> implementation? Or do I just link against a version of PETSc configured
> with CUDA, make sure the necessary data are copied to the "device", and
> PETSc will do the rest of the magic for me?
>
> Thanks,
> Feng
> ------------------------------
> *From:* Junchao Zhang <junchao.zhang at gmail.com>
> *Sent:* 09 February 2026 1:55
> *To:* feng wang <snailsoar at hotmail.com>
> *Cc:* petsc-users at mcs.anl.gov <petsc-users at mcs.anl.gov>
> *Subject:* Re: [petsc-users] Port existing GMRES+ILU(0) implementation to
> GPU
>
> Hello Feng,
> It is possible to run GMRES with ILU(0) on GPUs. You need to
> configure PETSc with CUDA (--with-cuda --with-cudac=nvcc) or Kokkos (with
> the extra options --download-kokkos --download-kokkos-kernels), then run
> with -mat_type {aijcusparse or aijkokkos} -vec_type {cuda or kokkos}.
> Note that the sparse triangular solves in ILU(0) are not GPU friendly,
> so the performance might be poor, but I think you should try it.
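
[Editor's note: spelled out as commands, using the configure options quoted
in the email; "./myapp" is a placeholder for the user's application binary.]

```shell
# Configure PETSc with the CUDA back end:
./configure --with-cuda --with-cudac=nvcc
# or with the Kokkos back end (extra download options):
./configure --with-cuda --with-cudac=nvcc --download-kokkos --download-kokkos-kernels
# Then select GPU types at run time, with no solver-code changes:
./myapp -ksp_type gmres -pc_type ilu -mat_type aijcusparse -vec_type cuda
./myapp -ksp_type gmres -pc_type ilu -mat_type aijkokkos -vec_type kokkos
```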
>
> Thanks!
> --Junchao Zhang
>
> On Sun, Feb 8, 2026 at 5:46 PM feng wang <snailsoar at hotmail.com> wrote:
>
> Dear All,
>
> I have an existing implementation of GMRES with ILU(0), and it works well
> on the CPU. Going through the PETSc documentation, it seems PETSc has some
> support for GPUs. Is it possible for me to run GMRES with ILU(0) on GPUs?
>
> Many thanks for your help in advance,
> Feng
>
>
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
https://www.cse.buffalo.edu/~knepley/