[petsc-users] preconditioning matrix-free newton-krylov

Mark Lohry mlohry at gmail.com
Tue Oct 31 10:01:45 CDT 2017


Matt, using greedy on that system took 30 seconds wallclock, does that seem
reasonable? It's a bit concerning on scaling since this is a small system.
MATCOLORINGSL (the default) takes <10 seconds; the user manual specifies
it's a sequential method, yet seems to work on MPIAIJ without complaining.
Also, any of these expected to work with BAIJ?

-Mark

On Tue, Oct 31, 2017 at 9:34 AM, Matthew Knepley <knepley at gmail.com> wrote:

> On Mon, Oct 30, 2017 at 3:02 PM, Smith, Barry F. <bsmith at mcs.anl.gov>
> wrote:
>
>>
>> > On Oct 30, 2017, at 1:58 PM, Mark Lohry <mlohry at gmail.com> wrote:
>> >
>> > Hmm, metis doesn't really have anything to do with the sparsity of the
>> Jacobian does it?
>> >
>> > No, I just mean I'm doing initial partitioning and parallel
>> communication for the residual evaluations independently of petsc, and then
>> doing a 1-to-1 mapping to the petsc solution vector. Along with manually
>> setting the non-zero structure of the MPIAIJ system as in the user manual.
>> I don't think there's anything wrong with the system structure as it gives
>> the same correct answer as the un-preconditioned matrix-free approach.
>> >
>> > The exact system those MatColoring times came from has size (100x100)
>> blocks on the diagonals corresponding to the tetrahedral cells, with those
>> having 4 neighbor blocks on the same row (or fewer for elements on
>> boundaries.)
>>
>>   Hmm, are those blocks dense? If so you could benefit enormously from
>> using BAIJ format.
>>
>>   Matt,
>>
>>     Sounds like performance bugs for the parallel coloring apply
>> algorithms with big "diagonal blocks"
>>
>
> Peter wrote the JP code (I think). I tried to look at it last night, but
> abstraction is not present. Its not easy
> to see where a performance problem might lurk. I think if we care, we just
> have to instrument it and run
> this example. Personally I have never used anything but greedy, which
> works great.
>
>    Matt
>
>
>>   Mark,
>>
>>      Could you run with -ksp_view_mat binary and send the resulting file
>> called binaryoutput and we can run the coloring codes local to performance
>> debug.
>>
>>
>>   Barry
>>
>> >
>> > On Mon, Oct 30, 2017 at 1:55 PM, Smith, Barry F. <bsmith at mcs.anl.gov>
>> wrote:
>> >
>> > > On Oct 30, 2017, at 12:39 PM, Mark Lohry <mlohry at gmail.com> wrote:
>> > >
>> > >
>> > > >
>> > > > 3) Are there any hooks analogous to KSPSetPreSolve/PostSolve for
>> the FD computation of the jacobians, or for the computation of the
>> preconditioner? I'd like to get a handle on the relative costs of these.
>> > >
>> > >   No, do you just want the time? You can get that from the logging;
>> for example -log_view
>> > >
>> > > Yes, was just thinking in regards to your suggestion of recomputing
>> when the number of linear iterations gets too high; I assume it's the ratio
>> of preconditioner cost vs linear solver cost at runtime that's the metric
>> of interest, and not the absolute value of either. But I'll cross that
>> bridge when I come to it.
>> > >
>> > > When I had asked, I was looking to see where a long pause was
>> happening thinking it was the FD jacobian; turned out to be before that in
>> MatColoringApply which seems surprisingly expensive. MATCOLORINGJP took ~15
>> minutes on 32 cores on a small 153,000^2 system, with MATCOLORINGGREEDY
>> taking 30 seconds. Any guidance there, or is this expected? I'm not using
>> DM, just manually entering the sparsity resulting from a metis
>> decomposition of a tetrahedral mesh.
>> >
>> >    Hmm, metis doesn't really have anything to do with the sparsity of
>> the Jacobian does it?
>> >
>> >   Matt,
>> >
>> >    These times are huge. What is going on?
>> >
>> >    Barry
>> >
>> > >
>> > >
>> > > Thanks for the info on the lag logic, I'll play with the TS pre/post
>> calls for the time-accurate problems and only use LagJacobian.
>> > >
>> > > On Mon, Oct 30, 2017 at 11:29 AM, Smith, Barry F. <bsmith at mcs.anl.gov>
>> wrote:
>> > >
>> > > > On Oct 29, 2017, at 11:50 AM, Mark Lohry <mlohry at gmail.com> wrote:
>> > > >
>> > > > Thanks again Barry, I've got the preconditioners hooked up with
>> -snes_mf_operator and at least AMG looks to be working great on a high
>> order unstructured DG problem.
>> > > >
>> > > > Couple questions on the SNESSetLagJacobian +
>> SNESSetLagPreconditioner code flow:
>> > > >
>> > > > 1) With -snes_mf_operator, and given SNESSetLagJacobian(snes, 1)
>> (default)  and SNESSetLagPreconditioner(snes, 2), after the first KSP solve
>> in a newton iteration, will it do the finite different jacobian
>> calculation? Or will the Jacobian only be computed when the preconditioner
>> lag setting demands it on the 3rd newton step? I suspect it's the latter
>> based on where I see the code pause.
>> > >
>> > >    SNES with -snes_mf_operator will ALWAYS use the matrix-free finite
>> difference f(x+h) - f(x) to apply the matrix vector product.
>> > >
>> > >    The LagJacobian and LagPreconditioner are not coordinated. The
>> first determines how often the Jacobian used for preconditioning is
>> recomputed and the second determines how often the preconditioner is
>> recomputed.
>> > >
>> > >    If you are using -snes_mf_operator then it never makes sense to
>> have lagJacobian < lagPreconditioner since it would recompute the Jacobian
>> but not actually use it. It also makes no sense for lagPreconditioner <
>> lagJacobian because you'd be recomputing the preconditioner on the same
>> Jacobian.
>> > >
>> > > But actually if you don't change the Jacobian used in building the
>> preconditioner then when it tries to recompute the preconditioner it
>> determines the matrix has not changed so skips rebuilding the
>> preconditioner. So when using -snes_mf_operator there is really no reason
>> generally to set the preconditioner lag.
>> > > >
>> > > > 2) How do implicit TS and SNESSetLagPreconditioner/Persists
>> interact? Does the counter since-last-preconditioner-compute reset with
>> time steps, or does that lag counter just increment with every SNES solve
>> regardless of how many nonlinear solves might have happened in a given
>> timestep? Say lag preconditioner is 2, and a time stepper uses 3, 2, and 3
>> nonlinear solves on 3 steps, is the flow
>> > > >
>> > > > (time step 1)->(update preconditioner)->(snes solve)->(snes
>> solve)->(update preconditioner)->(snes solve)
>> > > > (time step 2)->(snes solve)->(update preconditioner)->(snes solve)
>> > > > (time step 3)->(snes solve)->(update preconditioner)->(snes
>> solve)->(snes solve)
>> > > >
>> > > > or
>> > > >
>> > > > (time step 1)->(update preconditioner)->(snes solve)->(snes
>> solve)->(update preconditioner)->(snes solve)
>> > > > (time step 2)->(update preconditioner)->(snes solve)->(snes solve)
>> > > > (time step 3)->(update preconditioner)->(snes solve)->(snes
>> solve)->(update preconditioner)->(snes solve)
>> > > >
>> > > > ?
>> > > >
>> > > > I think for implicit time stepping I'd probably want the
>> preconditioner to be recomputed just once at the beginning of each time
>> step, or some multiple of that. Does that sound reasonable?
>> > >
>> > >   Yes, what you want to do is completely reasonable.
>> > >
>> > >   You can use SNESSetLagJacobian() and   SNESSetLagJacobianPersists()
>> in combination to have the Jacobian recomputed ever fixed number of times;
>> if you set the persists flag and set LagJacobian to 10 it will recompute
>> the Jacobian used in the preconditioner every 10th time a new Jacobian is
>> needed.
>> > >
>> > >    If you want to compute the new Jacobian used to build the
>> preconditioner once at the beginning of each new TS stage you can set
>> SNESSetLagJacobian() to negative -2 in the TS prestage call. There are
>> possibly other tricks you can do by setting the two flags at different
>> locations.
>> > >
>> > >    An alternative to hardwiring how often the Jacobian used to build
>> the preconditioner is rebuilt is to rebuild based on when the
>> preconditioner starts "working less well". Here you could put an additional
>> KSPMonitor or SNESMonitor that detects if the number of linear iterations
>> is above a certain amount and then sets the recompute Jacobian flag to -2
>> so that for the next solve it recreates the Jacobian used in building the
>> preconditioner.
>> > >
>> > >
>> > > >
>> > > > 3) Are there any hooks analogous to KSPSetPreSolve/PostSolve for
>> the FD computation of the jacobians, or for the computation of the
>> preconditioner? I'd like to get a handle on the relative costs of these.
>> > >
>> > >   No, do you just want the time? You can get that from the logging;
>> for example -log_view
>> > >
>> > > >
>> > > >
>> > > > Best,
>> > > > Mark
>> > > >
>> > > > On Sat, Sep 23, 2017 at 3:28 PM, Mark Lohry <mlohry at gmail.com>
>> wrote:
>> > > > Great, thanks Barry.
>> > > >
>> > > > On Sat, Sep 23, 2017 at 3:12 PM, Barry Smith <bsmith at mcs.anl.gov>
>> wrote:
>> > > >
>> > > > > On Sep 23, 2017, at 12:48 PM, Mark W. Lohry <mlohry at princeton.edu>
>> wrote:
>> > > > >
>> > > > > I'm currently using JFNK in an application where I don't have a
>> hand-coded jacobian, and it's working well enough but as expected the
>> scaling isn't great.
>> > > > >
>> > > > > What is the general process for using PC with
>> MatMFFDComputeJacobian? Does it make sense to occasionally have petsc
>> re-compute the jacobian via finite differences, and then recompute the
>> preconditioner? Any that just need the sparsity structure?
>> > > >
>> > > >  Mark
>> > > >
>> > > >    Yes, this is a common approach. SNESSetLagJacobian
>> -snes_lag_jacobian
>> > > >
>> > > >     The normal approach in SNES to use matrix-free for the operator
>> and use finite differences to compute an approximate Jacobian used to
>> construct preconditioners is to to create a sparse matrix with the sparsity
>> of the approximate Jacobian (yes you need a way to figure out the sparsity,
>> if you use DMDA it will figure out the sparsity for you). Then you use
>> > > >
>> > > >    SNESSetJacobian(snes,J,J, SNESComputeJacobianDefaultColor,
>> NULL);
>> > > >
>> > > > and use the options database option -snes_mf_operator
>> > > >
>> > > >
>> > > > > Are there any PCs that don't work in the matrix-free context?
>> > > >
>> > > >   If you do the above you can use almost all the PC since you are
>> providing an explicit matrix from which to build the preconditioner
>> > > >
>> > > > > Are there any example codes I overlooked?
>> > > > >
>> > > > > Last but not least... can the Boomer-AMG preconditioner work with
>> JFNK? To really show my ignorance of AMG, can it actually be written as a
>> matrix P^-1(Ax-b)=0, , or is it just a linear operator?
>> > > >
>> > > >   Again, if you provide an approximate Jacobian like above you can
>> use it with BoomerAMG, if you provide NO explicit matrix you cannot use
>> BoomerAMG or almost any other preconditioner.
>> > > >
>> > > >    Barry
>> > > >
>> > > > >
>> > > > > Thanks,
>> > > > > Mark
>> > > >
>> > > >
>> > > >
>> > >
>> > >
>> >
>> >
>>
>>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/ <http://www.caam.rice.edu/~mk51/>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20171031/5dbcd1dd/attachment-0001.html>


More information about the petsc-users mailing list