[petsc-dev] Subsurface application and Algebraic Multigrid on GPUs
Mark Adams
mfadams at lbl.gov
Tue Sep 25 06:47:56 CDT 2018
FYI, I've built Barry's updates on SUMMIT and tested them on SUMMITDEV; I
can't run on SUMMIT right now. The branch has been merged into master.
This is how you run the CUDA tests (from the PETSc root directory):
make -f gmakefile.test test globsearch="snes*tutorials*ex19*cuda*"
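
If you would rather run one of the matching cases by hand, the invocation
looks roughly like the sketch below (with PETSC_DIR/PETSC_ARCH set); the
exact option sets for the ex19 CUDA cases are the ones listed in the test
blocks at the bottom of src/snes/examples/tutorials/ex19.c, so treat these
particular options as illustrative only:

cd src/snes/examples/tutorials && make ex19
./ex19 -da_refine 3 -snes_monitor_short -pc_type gamg \
    -dm_mat_type aijcusparse -dm_vec_type cuda
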
Mark
On Thu, Sep 20, 2018 at 5:06 PM Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
>
> Brian,
>
> I have finished making the (relatively few) changes needed to get
> PETSc's GAMG to run on a combination of the CPU and GPU. Any of the AMG
> kernels that has a CUDA backend is run automatically on the GPU, while the
> kernels without a CUDA backend are run on the CPU. In particular, the
> "solve" portion (Chebyshev/Jacobi smoothing, coarse-grid restriction and
> interpolation) will run on the GPU, as well as part of the AMG "setup".
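>
> (For reference, the GPU pieces correspond to GAMG's default level smoother;
> I believe spelling it out explicitly is roughly
>
>    -pc_type gamg -mg_levels_ksp_type chebyshev -mg_levels_pc_type jacobi
>
> although the defaults should already give you this without extra flags.)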
>
> This is in the branch barry/mpiaijcusparse-better-subclass-mpiaij,
> which will hopefully be in the master branch tomorrow if it passes the full
> test suite tonight. I see Mark is already attempting to build PETSc on
> Summit and can hopefully quickly determine whether the branch works. (Mark,
> since Summit is presumably a batch system, you will need to run the last two
> test cases listed in src/snes/examples/tutorials/ex19.c by setting up the
> appropriate batch file and including the appropriate PETSc command line
> options.)
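>
> A minimal LSF batch script might look something like the untested sketch
> below; the project name, jsrun resource flags, and the particular ex19
> options are placeholders to adapt to the Summit environment and to the
> options listed in ex19.c:
>
>    #!/bin/bash
>    #BSUB -P <project>
>    #BSUB -W 0:10
>    #BSUB -nnodes 1
>    #BSUB -J ex19_gamg_cuda
>    cd $PETSC_DIR/src/snes/examples/tutorials
>    jsrun -n 1 -a 1 -c 1 -g 1 ./ex19 -da_refine 3 -snes_monitor_short \
>        -pc_type gamg -dm_mat_type aijcusparse -dm_vec_type cuda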
>
> We look forward to hearing how it functions and, in particular, would
> love to receive -log_view performance output on Summit comparing the use of
> the GPU with simply running on the CPU for your application. This would
> also tell us what additional kernels, if any, should be ported to a CUDA
> backend.
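>
> For that comparison, the same run with and without the CUDA classes should
> be enough; roughly (option names as in the sketch above, launched however
> you normally launch jobs):
>
>    ./ex19 -da_refine 5 -pc_type gamg -log_view \
>        -dm_mat_type aijcusparse -dm_vec_type cuda     (GPU)
>    ./ex19 -da_refine 5 -pc_type gamg -log_view        (CPU)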
>
>
> Barry
>
>
>
>
> > On Sep 19, 2018, at 4:43 PM, Mills, Richard Tran <rtmills at anl.gov> wrote:
> >
> > Hi Brian,
> >
> > Your message to petsc-dev has prompted some ongoing discussion among the
> > core PETSc developers, and we'll hopefully be able to give you an outline
> > of a coherent plan to help you meet your ECP milestones soon.
> >
> > We have had adding GPU support within PETSc's GAMG preconditioner on our
> > list of goals for some time, but we didn't manage to get this into the
> > recent 3.10 release. We can bump up the priority of this, and, as Jed has
> > said, we should be able to provide AMG setup on the CPU and the solves on
> > the GPU in relatively short order, and we can see how much this can help
> > in the near term. Doing the setup on the GPU is much more involved, but is
> > something that we are interested in doing.
> >
> > Just wanted to let you know that your query has not gone unnoticed.
> > Expect a more detailed reply from us soon.
> >
> > Best regards,
> > Richard
> >
> > On Wed, Sep 19, 2018 at 11:43 AM Jed Brown <jed at jedbrown.org> wrote:
> > Brian, how frequently do you need to update the matrix (thus rebuild the
> > preconditioner)?
> >
> > If it is infrequent, we could (in the near term) provide AMG setup on
> > CPU with solves on GPU.
> >
> > What is your typical problem size per node to be run on Summit? What is
> > your MPI/OpenMP(?) decomposition?
> >
> > Are these heterogeneous Poisson solves or are the equations to be solved
> > implicitly more complicated? Do you have experimental information about
> > relative convergence rates/grid complexity/strong scalability for your
> > operator solved using classical AMG (e.g., Hypre) versus smoothed (or
> > plain) aggregation (ML, GAMG default)?
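> >
> > (If you do not have those numbers yet, the three can be compared from the
> > command line with no code changes, assuming your PETSc build was
> > configured with --download-hypre and --download-ml; "your_app" below is
> > just a stand-in for your solver executable:
> >
> >    ./your_app -pc_type hypre -pc_hypre_type boomeramg \
> >        -ksp_monitor_true_residual -ksp_converged_reason
> >    ./your_app -pc_type ml -ksp_monitor_true_residual -ksp_converged_reason
> >    ./your_app -pc_type gamg -ksp_monitor_true_residual -ksp_converged_reason
> > )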
> >
> > Brian Van Straalen <bvstraalen at lbl.gov> writes:
> >
> > > So Baky and I have been at the Brookhaven GPU Hackathon now for three
> > > days, talking to everyone. We have also been emailing with whoever will
> > > respond to us from the hypre team and the PETSc team, as well as reading
> > > every blog post, mail archive, and message board, and from what we can
> > > tell, a distributed AMG preconditioner will not be available for us on a
> > > Summit platform for the foreseeable future.
> > >
> > > There is a hypre build for CUDA, but it has a problem with its use of
> > > CUSP, and nobody seems to be working on it.
> > >
> > > PETSc has some .cu CUDA files for the SpMV and vector operations, but
> > > the preconditioners are limited to point Jacobi and similar simple
> > > operations and a version of ILU. Neither works for our stiff projection
> > > in the embedded boundary algorithms. We built it and ran it, and PETSc
> > > takes several hundred iterations to get the residual down by a factor
> > > of 6. We need to get down to more like 10e-11 for this solve.
> > >
> > > The AMG being worked on by the NVIDIA team is not targeted for
> > > multi-node solving, and I haven't heard back from them in months.
> > >
> > > As I see it, we are left with two options to meet our ECP milestones:
> > >
> > > 1. Build yet another interface, this time to see if there is a
> > > distributed GPU AMG preconditioner in Trilinos.
> > >
> > > 2. Implement our own special-purpose EB-GMG solver written in Chombo.
> > >
> > > I would love to be wrong about all this.
> > >
> > > Brian
> > >
> > > --
> > > Brian Van Straalen
> > > Lawrence Berkeley Lab, Computational Research Division (crd.lbl.gov)
> > > BVStraalen at lbl.gov
> > > (510) 486-4976
>
>