[petsc-dev] SNESMonitorVI fix: maint or master?

Dmitry Karpeyev karpeev at mcs.anl.gov
Wed Oct 7 18:49:32 CDT 2015


Okay, I'll fix this with SNESCheckFunctionNorm() in both VI solvers in
maint.

On Wed, Oct 7, 2015, 17:23 Barry Smith <bsmith at mcs.anl.gov> wrote:

>
> > On Oct 7, 2015, at 6:01 PM, Dmitry Karpeyev <karpeev at mcs.anl.gov> wrote:
> >
> > Well, there is a RELAP7 (MOOSE-based code) that encounters that problem:
> > the equation of state returns a NaN that gets handled correctly (retried
> > with a smaller timestep), but once we turn on -snes_vi_monitor,
> > "Cannot get here" is thrown.
>
>    Yeah that is bad.
>
> > Apparently, the monitor is called before divergence is declared.
> >
> > I don't think it's meaningless crap: it can tell the user how many NaNs
> > there are, which can give them an idea of how many mesh points stray into
> > the nonphysical regime.  If not there, it should be counted somewhere.
> > In any event, the code shouldn't die with a PLIB error in this case.
> > How should we handle it?
>
>   I understand your point about providing useful information but I am
> afraid you are opening up a can of worms by wanting to call the monitor
> function in this failed case. In all the other SNESSolve implementations
> SNESCheckFunctionNorm(snes,fnorm); is called BEFORE anything else is done
> (monitor or converged test etc) so the monitor is not called on the "bad
> last iteration". It is just bad luck that in updating the code we forgot to
> put the SNESCheckFunctionNorm() into the vi solvers; it is missing in both
> SNESSolve_VINEWTONSSLS and SNESSolve_VINEWTONRSLS.
>
>    The correct fix is to add the SNESCheckFunctionNorm(); this will prevent
> the current crash and should go into maint. It should go immediately after
> the call to
>    ierr = SNESLineSearchGetNorms(snes->linesearch, &xnorm, &gnorm, &ynorm);CHKERRQ(ierr);
> in both routines.
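
A minimal standalone sketch of the ordering described above, written in
plain C rather than against PETSc. It only illustrates the idea that the
norm check runs before the monitor, so a NaN norm never reaches the
monitor. Every name in it (norm_is_bad, monitor, after_line_search,
StepStatus) is a hypothetical stand-in, not PETSc API.

```c
#include <math.h>
#include <stdio.h>

/* Hypothetical status codes; PETSc uses its own converged-reason values. */
typedef enum { STEP_CONTINUE, STEP_DIVERGED_NAN } StepStatus;

/* Stand-in for the role the thread gives SNESCheckFunctionNorm():
   detect a residual norm that is NaN or infinite. */
static int norm_is_bad(double fnorm)
{
  return !isfinite(fnorm);
}

/* Counts how often the monitor actually ran. */
static int monitor_calls = 0;

/* Stand-in for a monitor that assumes it is handed valid data. */
static void monitor(int it, double fnorm)
{
  monitor_calls++;
  printf("%3d function norm %g\n", it, fnorm);
}

/* One iteration's bookkeeping after the line search has produced fnorm. */
static StepStatus after_line_search(int it, double fnorm)
{
  /* Check first: on a bad norm, return before the monitor is called. */
  if (norm_is_bad(fnorm)) return STEP_DIVERGED_NAN;
  monitor(it, fnorm);
  return STEP_CONTINUE;
}
```

Calling after_line_search(1, 0.5) runs the monitor once, while
after_line_search(2, NAN) returns STEP_DIVERGED_NAN without touching it,
which leaves the caller free to retry with a smaller timestep.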
>
>   The "counting" of nonphysical points etc should not be handled by the VI
> monitor. It could be handled by SNESComputeFunction() perhaps.
>
>
>   Barry
>
> >
> > On Wed, Oct 7, 2015 at 4:58 PM Barry Smith <bsmith at mcs.anl.gov> wrote:
> >
> > > On Oct 7, 2015, at 5:48 PM, Dmitry Karpeyev <karpeev at mcs.anl.gov> wrote:
> > >
> > > Now that we allow NaNs to bubble up through the solver,
> > > this can trip mysterious-looking errors in SNESVI:
> > > PETSC_ERR_PLIB, "Can never get here"
> > > is thrown from SNESMonitorVI(), because a NaN in
> > > the residual can defeat all of the seemingly-exhaustive
> > > if-then-else branches counting the number of active constraints.
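
To illustrate how a NaN slips past branches that look exhaustive: every
ordered comparison against NaN is false, so a point sitting on a bound
with a NaN residual matches none of the cases. The classifier below is a
simplified, hypothetical stand-in (the names classify, INACTIVE,
ACTIVE_LOWER, ACTIVE_UPPER, UNREACHABLE and the tolerance are assumptions),
not the actual SNESMonitorVI() source.

```c
#include <math.h>

/* Hypothetical labels for where a point sits relative to its bounds. */
enum { INACTIVE, ACTIVE_LOWER, ACTIVE_UPPER, UNREACHABLE };

/* Classify one entry x with residual f against bounds [lo, hi].
   For finite inputs with lo < hi, one of the first three branches
   always matches, so the final return looks unreachable. */
static int classify(double x, double f, double lo, double hi)
{
  const double tol = 1e-8; /* assumed tolerance for "on the bound" */

  if ((x > lo + tol || f < 0.0) && (x < hi - tol || f > 0.0)) return INACTIVE;
  else if (x <= lo + tol && f >= 0.0) return ACTIVE_LOWER;
  else if (x >= hi - tol && f <= 0.0) return ACTIVE_UPPER;

  /* Reached when f is NaN and x is on a bound: f < 0, f > 0, f >= 0 and
     f <= 0 are all false, so none of the branches above fires. */
  return UNREACHABLE;
}
```

With bounds [0, 1], classify(0.0, 1.0, 0.0, 1.0) gives ACTIVE_LOWER, but
classify(0.0, NAN, 0.0, 1.0) falls through to UNREACHABLE. An interior
point with a NaN residual is still classified INACTIVE, so the fall-through
only shows up for points on a bound.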
> >
> >   Shouldn't the SNESSolver have already returned as "failed" before it ever
> > gets to SNESMonitorVI in this case? Since the vector was marked with some
> > NaNs, it means that something has gone wrong already and any data in there
> > is meaningless crap? Why count meaningless crap?
> > >
> > > I think the fix should be to count the number of NaNs separately
> > > and report them alongside the legitimate active bounds to give
> > > the user as much useful information as possible.  Since this entails
> > > a substantial difference to the output format of -snes_vi_monitor,
> > > should the fix go to maint or master?
> >
> >   Not maint.
> >
> > >
> > > Dmitry.
> >
>
>