[petsc-users] Why use MATMPIBAIJ?

Hom Nath Gharti hng.email at gmail.com
Fri Jan 22 16:11:14 CST 2016


Thanks for your suggestions! If it's just 2X, I will not waste my time!
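For the record, the 2x ceiling can be sanity-checked with a bandwidth argument: SpMV is memory-bandwidth bound, so the best-case GPU speedup is roughly the ratio of sustained memory bandwidths, eroded by PCIe transfers. A sketch with assumed, illustrative numbers (not measurements):

```python
# Back-of-envelope model behind the "maybe 2x or less" SpMV estimate.
# All bandwidth and size numbers below are illustrative assumptions.

def spmv_time(nnz, bandwidth_gb_s, bytes_per_nnz=12.0):
    """Rough time (s) for one CSR SpMV, assuming ~12 bytes moved per
    nonzero (8-byte double value + 4-byte column index)."""
    return nnz * bytes_per_nnz / (bandwidth_gb_s * 1e9)

nnz = 50_000_000            # nonzeros in the matrix (assumed)
cpu_bw = 100.0              # GB/s sustained, good multicore CPU socket (assumed)
gpu_bw = 250.0              # GB/s sustained, 2016-era GPU (assumed)
pcie_bw = 10.0              # GB/s, host-to-device over PCIe (assumed)
vec_bytes = 8 * 1_000_000   # one input vector of a million doubles

t_cpu = spmv_time(nnz, cpu_bw)
t_gpu = spmv_time(nnz, gpu_bw)
t_transfer = vec_bytes / (pcie_bw * 1e9)

print(f"kernel-only speedup:           {t_cpu / t_gpu:.2f}x")
print(f"with per-call vector transfer: {t_cpu / (t_gpu + t_transfer):.2f}x")
```

With these assumed numbers the kernel-only win is 2.5x and falls below 2x once the input vector must cross PCIe on every call, consistent with the estimate quoted below.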

Hom Nath

On Fri, Jan 22, 2016 at 5:06 PM, Matthew Knepley <knepley at gmail.com> wrote:
> On Fri, Jan 22, 2016 at 3:47 PM, Hom Nath Gharti <hng.email at gmail.com>
> wrote:
>>
>> Hi Matt,
>>
>> SPECFEM currently has only an explicit time scheme and does not have
>> full gravity implemented. I am adding an implicit time scheme and full
>> gravity so that it can be used for interesting quasistatic problems
>> such as glacial rebound, postseismic relaxation, etc. I am using PETSc
>> as the linear solver, and I would like to see it run on GPUs.
>
>
> Why? It really does not make sense for those operations.
>
> It is an unfortunate fact, but the usefulness of GPUs has been oversold.
> You can certainly get some mileage out of an SpMV on the GPU, but the
> maximum win there is maybe 2x or less compared to a good CPU, and then
> you have to account for transfer time and other latencies. Unless you
> have a really compelling case, I would not waste your time.
>
> To come to this opinion, I used years of my own time looking at GPUs.
>
>   Thanks,
>
>      Matt
>
>>
>> Thanks,
>> Hom Nath
>>
>> On Fri, Jan 22, 2016 at 4:33 PM, Matthew Knepley <knepley at gmail.com>
>> wrote:
>> > On Fri, Jan 22, 2016 at 12:17 PM, Hom Nath Gharti <hng.email at gmail.com>
>> > wrote:
>> >>
>> Thanks, Matt, for the great suggestion. One last question: do you know
>> whether the GPU capability of the current PETSc version is mature enough
>> to try on my problem?
>> >
>> >
>> > The only thing that would really make sense to do on the GPU is the SEM
>> > integration, which
>> > would not be part of PETSc. This is what SPECFEM has optimized.
>> >
>> >   Thanks,
>> >
>> >     Matt
>> >
>> >>
>> >> Thanks again for your help.
>> >> Hom Nath
>> >>
>> >> On Fri, Jan 22, 2016 at 1:07 PM, Matthew Knepley <knepley at gmail.com>
>> >> wrote:
>> >> > On Fri, Jan 22, 2016 at 11:47 AM, Hom Nath Gharti
>> >> > <hng.email at gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> Thanks a lot.
>> >> >>
>> >> >> With AMG it did not converge within the iteration limit of 3000.
>> >> >>
>> >> >> In solid: elastic wave equation with added gravity term \rho
>> >> >> \nabla\phi
>> >> >> In fluid: acoustic wave equation with added gravity term \rho
>> >> >> \nabla\phi
>> >> >> Both solid and fluid: Poisson's equation for gravity
>> >> >> Outer space: Laplace's equation for gravity
>> >> >>
>> >> >> We combine the so-called mapped infinite element method with the
>> >> >> spectral-element method (a higher-order FEM that uses nodal
>> >> >> quadrature) and solve in the frequency domain.
>> >> >
>> >> >
>> >> > 1) The Poisson and Laplace equations should use MG. However, you are
>> >> > using SEM, so you would need a low-order preconditioner for the
>> >> > high-order problem, also called p-MG (Paul Fischer); see
>> >> >
>> >> >       http://epubs.siam.org/doi/abs/10.1137/110834512
>> >> >
>> >> > 2) The acoustic wave equation is Helmholtz to us, and that needs
>> >> > special MG tweaks that are still research material, so I can
>> >> > understand using ASM.
>> >> >
>> >> > 3) Same thing for the elastic wave equations. Some people say they
>> >> > have this solved using hierarchical matrix methods, something like
>> >> >
>> >> >       http://portal.nersc.gov/project/sparse/strumpack/
>> >> >
>> >> >     However, I think the jury is still out.
>> >> >
>> >> > If you can do 100 iterations with plain vanilla solvers, that seems
>> >> > like a win right now. You might improve the time using FieldSplit,
>> >> > but I am not sure about the iteration counts on the smaller problems.
>> >> >
>> >> >   Thanks,
>> >> >
>> >> >     Matt
>> >> >
>> >> >>
>> >> >> Hom Nath
>> >> >>
>> >> >> On Fri, Jan 22, 2016 at 12:16 PM, Matthew Knepley
>> >> >> <knepley at gmail.com>
>> >> >> wrote:
>> >> >> > On Fri, Jan 22, 2016 at 11:10 AM, Hom Nath Gharti
>> >> >> > <hng.email at gmail.com>
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> Thanks Matt.
>> >> >> >>
>> >> >> >> Attached detailed info on ksp of a much smaller test. This is a
>> >> >> >> multiphysics problem.
>> >> >> >
>> >> >> >
>> >> >> > You are using FGMRES/ASM(ILU0). From your description below, this
>> >> >> > sounds like an elliptic system. I would at least try AMG
>> >> >> > (-pc_type gamg) to see how it does. Any other advice would have to
>> >> >> > be based on seeing the equations.
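A minimal way to try that suggestion from the command line, keeping the existing Krylov method and adding the diagnostics requested elsewhere in this thread (the executable name and process count are placeholders):

```
mpiexec -n <np> ./app -ksp_type fgmres -pc_type gamg \
    -ksp_monitor_true_residual -ksp_converged_reason -ksp_view
```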
>> >> >> >
>> >> >> >   Thanks,
>> >> >> >
>> >> >> >     Matt
>> >> >> >
>> >> >> >>
>> >> >> >> Hom Nath
>> >> >> >>
>> >> >> >> On Fri, Jan 22, 2016 at 12:01 PM, Matthew Knepley
>> >> >> >> <knepley at gmail.com>
>> >> >> >> wrote:
>> >> >> >> > On Fri, Jan 22, 2016 at 10:52 AM, Hom Nath Gharti
>> >> >> >> > <hng.email at gmail.com>
>> >> >> >> > wrote:
>> >> >> >> >>
>> >> >> >> >> Dear all,
>> >> >> >> >>
>> >> >> >> >> I take this opportunity to ask for your suggestions.
>> >> >> >> >>
>> >> >> >> >> I am solving an elastic-acoustic-gravity equation on the
>> >> >> >> >> planet. I have a displacement vector (ux,uy,uz) in the solid
>> >> >> >> >> region, a displacement potential (\xi) and pressure (p) in the
>> >> >> >> >> fluid region, and a gravitational potential (\phi) in all of
>> >> >> >> >> space. All of these variables are coupled.
>> >> >> >> >>
>> >> >> >> >> Currently, I am using MATMPIAIJ and form a single global
>> >> >> >> >> matrix. Would using MATMPIBAIJ or MATNEST improve the
>> >> >> >> >> convergence/efficiency in this case? For your information, the
>> >> >> >> >> total number of degrees of freedom is about a billion.
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > 1) For any solver question, we need to see the output of
>> >> >> >> > -ksp_view, and we would also like
>> >> >> >> >
>> >> >> >> >   -ksp_monitor_true_residual -ksp_converged_reason
>> >> >> >> >
>> >> >> >> > 2) MATNEST does not affect convergence, and MATMPIBAIJ affects
>> >> >> >> > it only through the block size, which you could set without
>> >> >> >> > that format
>> >> >> >> >
>> >> >> >> > 3) However, you might see a benefit from using something like
>> >> >> >> > PCFIELDSPLIT if you have multiphysics here
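A first FieldSplit configuration for a coupled system like this might look like the following; the split definition itself (which equations land in which field) is problem-specific, so the field numbering and inner preconditioner choices here are placeholders, not a recommendation:

```
-pc_type fieldsplit -pc_fieldsplit_type multiplicative \
    -fieldsplit_0_pc_type gamg \
    -fieldsplit_1_pc_type asm -fieldsplit_1_sub_pc_type ilu
```

The fields themselves must first be declared, e.g. with PCFieldSplitSetIS() in code, before these options take effect.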
>> >> >> >> >
>> >> >> >> >    Matt
>> >> >> >> >
>> >> >> >> >>
>> >> >> >> >> Any suggestion would be greatly appreciated.
>> >> >> >> >>
>> >> >> >> >> Thanks,
>> >> >> >> >> Hom Nath
>> >> >> >> >>
>> >> >> >> >> On Fri, Jan 22, 2016 at 10:32 AM, Matthew Knepley
>> >> >> >> >> <knepley at gmail.com>
>> >> >> >> >> wrote:
>> >> >> >> >> > On Fri, Jan 22, 2016 at 9:27 AM, Mark Adams
>> >> >> >> >> > <mfadams at lbl.gov>
>> >> >> >> >> > wrote:
>> >> >> >> >> >>>
>> >> >> >> >> >>>
>> >> >> >> >> >>>
>> >> >> >> >> >>> I said the Hypre setup cost is not scalable,
>> >> >> >> >> >>
>> >> >> >> >> >>
>> >> >> >> >> >> I'd be a little careful here. Scaling the matrix triple
>> >> >> >> >> >> product is hard, and hypre does put effort into scaling it.
>> >> >> >> >> >> I don't have any data, however. Do you?
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> > I used it for PyLith and saw this. I did not think any AMG
>> >> >> >> >> > had
>> >> >> >> >> > scalable
>> >> >> >> >> > setup time.
>> >> >> >> >> >
>> >> >> >> >> >    Matt
>> >> >> >> >> >
>> >> >> >> >> >>>
>> >> >> >> >> >>> but it can be amortized over the iterations. You can
>> >> >> >> >> >>> quantify this just by looking at the PCSetUp time as you
>> >> >> >> >> >>> increase the number of processes. I don't think they have
>> >> >> >> >> >>> a good model for the memory usage, and if they do, I do
>> >> >> >> >> >>> not know what it is. However, Hypre generally takes more
>> >> >> >> >> >>> memory than agglomeration MG methods like ML or GAMG.
>> >> >> >> >> >>>
>> >> >> >> >> >>
>> >> >> >> >> >> Agglomeration methods tend to have lower "grid
>> >> >> >> >> >> complexity", that is, smaller coarse grids, than classic
>> >> >> >> >> >> AMG like in hypre. This is more of a constant complexity
>> >> >> >> >> >> and not a scaling issue, though. You can address this with
>> >> >> >> >> >> parameters to some extent. But for elasticity, you want to
>> >> >> >> >> >> at least try, if not start with, GAMG or ML.
>> >> >> >> >> >>
>> >> >> >> >> >>>
>> >> >> >> >> >>>   Thanks,
>> >> >> >> >> >>>
>> >> >> >> >> >>>     Matt
>> >> >> >> >> >>>
>> >> >> >> >> >>>>
>> >> >> >> >> >>>>
>> >> >> >> >> >>>> Giang
>> >> >> >> >> >>>>
>> >> >> >> >> >>>> On Mon, Jan 18, 2016 at 5:25 PM, Jed Brown
>> >> >> >> >> >>>> <jed at jedbrown.org>
>> >> >> >> >> >>>> wrote:
>> >> >> >> >> >>>>>
>> >> >> >> >> >>>>> Hoang Giang Bui <hgbk2008 at gmail.com> writes:
>> >> >> >> >> >>>>>
>> >> >> >> >> >>>>> > Why P2/P2 is not for co-located discretization?
>> >> >> >> >> >>>>>
>> >> >> >> >> >>>>> Matt typed "P2/P2" when he meant "P2/P1".
>> >> >> >> >> >>>>
>> >> >> >> >> >>>>
>> >> >> >> >> >>>
>> >> >> >> >> >>>
>> >> >> >> >> >>>
>> >> >> >> >> >>> --
>> >> >> >> >> >>> What most experimenters take for granted before they begin
>> >> >> >> >> >>> their
>> >> >> >> >> >>> experiments is infinitely more interesting than any
>> >> >> >> >> >>> results
>> >> >> >> >> >>> to
>> >> >> >> >> >>> which
>> >> >> >> >> >>> their
>> >> >> >> >> >>> experiments lead.
>> >> >> >> >> >>> -- Norbert Wiener
>> >> >> >> >> >>
>> >> >> >> >> >>
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >
>> >> >> >> >
>> >> >> >> >
>> >> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >
>> >
>> >
>> >
>
>
>
>

