[petsc-dev] Subsurface application and Algebraic Multigrid on GPUs

Brian Van Straalen bvstraalen at lbl.gov
Wed Sep 19 13:09:38 CDT 2018

Hi Jed,

  Good to hear from you!  In the first cases, where the geometry is not
moving, the matrix stays fixed for many solves, so shipping it once to
the GPU makes perfect sense.
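
To make the "set up once, solve many times" pattern concrete, here is a
minimal pure-Python sketch.  It is not our code and not PETSc: the 1D
Poisson operator, the JacobiPC class (a cheap stand-in for an expensive AMG
setup), and the pcg function are all hypothetical names chosen for
illustration.

```python
# Sketch: build the preconditioner once, reuse it for many solves with a
# fixed matrix. Everything here is illustrative, not taken from PETSc or Chombo.

def apply_A(x):
    """Apply the 1D Poisson stencil (-1, 2, -1) with Dirichlet boundaries."""
    n = len(x)
    return [2.0 * x[i]
            - (x[i - 1] if i > 0 else 0.0)
            - (x[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

class JacobiPC:
    """Stand-in for an expensive preconditioner setup (assumed, not AMG)."""
    setups = 0  # counts how many times setup has run

    def __init__(self, n):
        JacobiPC.setups += 1
        self.inv_diag = [0.5] * n  # inverse of the constant diagonal 2

    def apply(self, r):
        return [d * ri for d, ri in zip(self.inv_diag, r)]

def pcg(b, pc, rtol=1e-10, maxit=1000):
    """Preconditioned conjugate gradients starting from x = 0."""
    x = [0.0] * len(b)
    r = list(b)
    z = pc.apply(r)
    p = list(z)
    rz = dot(r, z)
    bnorm = dot(b, b) ** 0.5
    for it in range(1, maxit + 1):
        Ap = apply_A(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 <= rtol * bnorm:
            return x, it
        z = pc.apply(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, maxit

n = 32
pc = JacobiPC(n)                 # setup happens exactly once
results = []
for k in range(5):               # five right-hand sides, same matrix
    b = [1.0 + ((i + k) % 3) for i in range(n)]
    x, its = pcg(b, pc)
    results.append((b, x, its))
```

The point of the sketch is only the control flow: the setup cost is paid
once outside the loop, and each solve reuses the same preconditioner object.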

Mark has been helping us with our AMG parameter space.  Plain GAMG
aggregation has worked well for us (which is why I have been toying with
writing it ourselves).  Our own coarsening in Chombo has some bugs in it
somewhere that make our homegrown GMG unreliable for complex embedded
geometry.  GAMG done properly works.  You have to set some parameters
correctly, but it
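
For readers who have not seen plain (unsmoothed) aggregation, here is a
minimal pure-Python sketch of a two-grid cycle built on it.  This is not
GAMG and not our Chombo code: the 1D Poisson model problem, the pairwise
aggregates, the damped-Jacobi smoother, and every function name are
assumptions made for illustration.

```python
# Sketch: two-grid cycle with plain (unsmoothed) aggregation on a small
# dense model problem. All names and choices here are illustrative.

def poisson_1d(n):
    """Dense 1D Poisson matrix tridiag(-1, 2, -1)."""
    return [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
             for j in range(n)] for i in range(n)]

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def norm(v):
    return sum(vi * vi for vi in v) ** 0.5

def galerkin(A, agg, nc):
    """Coarse operator P^T A P for the piecewise-constant P[i][agg[i]] = 1."""
    Ac = [[0.0] * nc for _ in range(nc)]
    for i, row in enumerate(A):
        for j, a in enumerate(row):
            Ac[agg[i]][agg[j]] += a
    return Ac

def solve_dense(A, b):
    """Gaussian elimination without pivoting (fine for this small SPD matrix)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def jacobi(A, x, b, omega=2.0 / 3.0):
    """One sweep of damped Jacobi."""
    r = [bi - ai for bi, ai in zip(b, matvec(A, x))]
    return [xi + omega * ri / A[i][i] for i, (xi, ri) in enumerate(zip(x, r))]

def two_grid(A, Ac, agg, x, b):
    x = jacobi(A, x, b)                                # pre-smooth
    r = [bi - ai for bi, ai in zip(b, matvec(A, x))]
    rc = [0.0] * len(Ac)
    for i, ri in enumerate(r):
        rc[agg[i]] += ri                               # restrict: P^T r
    ec = solve_dense(Ac, rc)                           # exact coarse solve
    x = [xi + ec[agg[i]] for i, xi in enumerate(x)]    # prolong and correct
    return jacobi(A, x, b)                             # post-smooth

n = 16
A = poisson_1d(n)
agg = [i // 2 for i in range(n)]     # aggregate neighbouring pairs
Ac = galerkin(A, agg, n // 2)
b = [1.0] * n
x = [0.0] * n
for _ in range(30):
    x = two_grid(A, Ac, agg, x, b)
rel = norm([bi - ai for bi, ai in zip(b, matvec(A, x))]) / norm(b)
```

With the piecewise-constant prolongator, restriction and prolongation are
just a sum over and a copy to each aggregate, which is a large part of why
the plain variant is tempting to write by hand.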

  So far, OpenMP offload has serialization effects for kernel launch that we
are digging through.  Since we have not been knighted with Summit access,
we are working on summitdev and the other GPU clusters we have lying
around.  On Cori KNL we run with 200-800k unknowns per node.


On Wed, Sep 19, 2018 at 1:16 PM, Jed Brown <jed at jedbrown.org> wrote:

> Brian, how frequently do you need to update the matrix (thus rebuild the
> preconditioner)?
> If it is infrequent, we could (in the near term) provide AMG setup on
> CPU with solves on GPU.
> What is your typical problem size per node to be run on Summit?  What is
> your MPI/OpenMP(?) decomposition?
> Are these heterogeneous Poisson solves or are the equations to be solved
> implicitly more complicated?  Do you have experimental information about
> relative convergence rates/grid complexity/strong scalability for your
> operator solved using classical AMG (e.g., Hypre) versus smoothed (or
> plain) aggregation (ML, GAMG default)?
>
> Brian Van Straalen <bvstraalen at lbl.gov> writes:
>
> > So Baky and I have been at the Brookhaven GPU Hackathon now for three
> > days, talking to everyone.  We have also been emailing with people who
> > will respond to us from the hypre team and the PETSc team, as well as
> > reading every blog post, mail archive, and message board.  From what we
> > can tell, a distributed AMG preconditioner will not be available for us
> > on a Summit platform for the foreseeable future.
> >
> > There is a hypre build for CUDA, but it has a problem with its use of
> > CUSP, and nobody seems to be working on it.
> >
> > PETSc has some .cu CUDA files for the SpMV and vector operations, but the
> > preconditioners are limited to point Jacobi and similar simple operations
> > and a version of ILU.  Neither works for our stiff projection in the
> > embedded boundary algorithms.  We built it and ran it, and PETSc takes
> > several hundred iterations to get the residual down by a factor of 6.  We
> > need to get down to more like 10e-11 for this solve.
> >
> > The AMG being worked on by the NVIDIA team is not targeted for multi-node
> > solving, and I haven't heard back from them in months.
> >
> >   We are left with two options as I see it to meet our ECP Milestones:
> >
> > 1. Build yet another interface, this time to see if there is a
> >    distributed GPU AMG preconditioner in Trilinos
> >
> > 2. Implement our own special-purpose EB-GMG solver written in Chombo.
> >
> > I would love to be wrong about all this.
> >
> > Brian
> >
> > --
> > Brian Van Straalen         Lawrence Berkeley Lab
> > BVStraalen at lbl.gov        Computational Research
> > (510) 486-4976            Division (crd.lbl.gov)

Brian Van Straalen         Lawrence Berkeley Lab
BVStraalen at lbl.gov        Computational Research
(510) 486-4976            Division (crd.lbl.gov)