[petsc-dev] vector inner products

Munson, Todd tmunson at mcs.anl.gov
Wed Apr 18 17:05:51 CDT 2018


In the master branch, setting an H0 matrix and the gradient norm appear to be
handled correctly only in the quasi-Newton code.  Those notions have not been
propagated anywhere else.

The quasi-Newton approximation, for example, can be used as a preconditioner
in the Newton methods.  If the H0 matrix is set and its "inverse" is applied
with a KSP solver, do we need to add flexible variants of NASH, STCG, and
GLTR for correctness?
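
Roughly, the inexact-preconditioner situation looks like the sketch below.  This is
generic KSP/PC code, not the TAO trust-region solvers themselves; the function name
and the inner tolerance are made up, and the point is only that an "inverse" applied
by an inner KSP is a varying preconditioner, which is why PETSc normally pairs it
with a flexible outer method such as FGMRES.

#include <petscksp.h>

PetscErrorCode SolveWithInexactH0(Mat A, Mat H0, Vec b, Vec x)
{
  KSP            outer, inner;
  PC             pc;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = KSPCreate(PetscObjectComm((PetscObject)A), &outer);CHKERRQ(ierr);
  ierr = KSPSetOperators(outer, A, H0);CHKERRQ(ierr);   /* H0 supplies the preconditioner */
  ierr = KSPSetType(outer, KSPFGMRES);CHKERRQ(ierr);    /* flexible: tolerates a varying PC */
  ierr = KSPGetPC(outer, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCKSP);CHKERRQ(ierr);            /* apply inv(H0) with an inner KSP */
  ierr = PCKSPGetKSP(pc, &inner);CHKERRQ(ierr);
  ierr = KSPSetTolerances(inner, 1.0e-2, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(outer);CHKERRQ(ierr);
  ierr = KSPSolve(outer, b, x);CHKERRQ(ierr);
  ierr = KSPDestroy(&outer);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}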

The H0 matrix also needs to be symmetric, or at least act like a symmetric
operator, for the quasi-Newton approximation to be reasonable, and certainly
when it is used as a preconditioner in NASH, STCG, and GLTR.  Does this
restrict the KSP solvers that can be applied when we need an inv(H0)*vector
product?
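
As a quick way to see whether a KSP-applied "inverse" still acts symmetrically,
something like the following check could be used (illustration only; the function
and variable names are made up).  A truncated or nonsymmetric inner solve will
generally fail it, which is what would rule out those KSP choices.

#include <petscksp.h>

PetscErrorCode CheckInverseActsSymmetric(KSP innerH0, Vec u, Vec v)
{
  Vec            yu, yv;
  PetscScalar    uv, vu;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = VecDuplicate(u, &yu);CHKERRQ(ierr);
  ierr = VecDuplicate(v, &yv);CHKERRQ(ierr);
  ierr = KSPSolve(innerH0, u, yu);CHKERRQ(ierr);  /* yu ~ inv(H0) u */
  ierr = KSPSolve(innerH0, v, yv);CHKERRQ(ierr);  /* yv ~ inv(H0) v */
  ierr = VecDot(yv, u, &uv);CHKERRQ(ierr);        /* u . inv(H0) v */
  ierr = VecDot(yu, v, &vu);CHKERRQ(ierr);        /* v . inv(H0) u */
  ierr = PetscPrintf(PETSC_COMM_WORLD, "asymmetry: %g\n", (double)PetscAbsScalar(uv - vu));CHKERRQ(ierr);
  ierr = VecDestroy(&yu);CHKERRQ(ierr);
  ierr = VecDestroy(&yv);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}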

At the end of the day, the question becomes: how much thought and effort do
we want to spend in order to support the H0 matrix?  Or do we want to
leave this as supported only for the TAO quasi-Newton method?

Todd.

> On Apr 13, 2018, at 9:11 PM, Oxberry, Geoffrey Malcolm <oxberry1 at llnl.gov> wrote:
> 
> 
> 
> On 4/12/18, 15:06, "petsc-dev on behalf of Munson, Todd" <petsc-dev-bounces at mcs.anl.gov on behalf of tmunson at mcs.anl.gov> wrote:
> 
> 
>    I am not looking at Geoff's pull request right now.
> 
>    Let me try to be clearer: in the master branch, the TaoGradientNorm() function is only
>    used for termination tests inside the optimization methods.  It does not change anything
>    else that goes on inside the methods.
> 
> Gradients and Hessians also depend on duality pairings, which in turn affects how users write these quantities and how Hessians are approximated in quasi-Newton methods. Related code for the Hilbert space case was included in PR#347 (h/t Patrick Farrell).
> 
>    A user-defined convergence test (presuming we 
>    can get the callbacks right) would suffice.  As all norms are equivalent in finite
>    dimensions, a user could also scale the standard termination tolerance by 
>    the correct constant.
> 
> This decision makes tolerances discretization-dependent, which is a leaky abstraction, and an unnatural way to encode algorithm convergence criteria. Scaling the $\ell_{2}$-norm in place of using the correct primal and dual norms also ignores the metric geometry of the primal and dual spaces, and will affect algorithm convergence negatively. A callback for a user-defined convergence test would be preferable. 
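> 
> As a concrete (hypothetical) illustration of that callback, something along these
> lines should be possible with TaoSetConvergenceTest(); the context fields, the
> tolerance, and the decision to recompute the gradient inside the test are all
> assumptions for the sketch, not existing TAO code.  It tests the dual norm
> sqrt(g^T M^{-1} g) induced by a mass matrix M.
> 
> #include <petsctao.h>
> 
> typedef struct {
>   KSP       massksp;   /* KSP set up with the mass matrix M */
>   Vec       g, Minvg;  /* work vectors with the gradient layout */
>   PetscReal gatol;     /* absolute tolerance on the dual norm */
> } DualNormCtx;
> 
> PetscErrorCode DualNormConvergenceTest(Tao tao, void *ptr)
> {
>   DualNormCtx    *ctx = (DualNormCtx *)ptr;
>   Vec            x;
>   PetscScalar    sq;
>   PetscErrorCode ierr;
> 
>   PetscFunctionBeginUser;
>   ierr = TaoGetSolutionVector(tao, &x);CHKERRQ(ierr);
>   ierr = TaoComputeGradient(tao, x, ctx->g);CHKERRQ(ierr);         /* recomputed here for clarity */
>   ierr = KSPSolve(ctx->massksp, ctx->g, ctx->Minvg);CHKERRQ(ierr); /* M^{-1} g */
>   ierr = VecDot(ctx->g, ctx->Minvg, &sq);CHKERRQ(ierr);            /* g^T M^{-1} g */
>   if (PetscSqrtReal(PetscRealPart(sq)) < ctx->gatol) {
>     ierr = TaoSetConvergedReason(tao, TAO_CONVERGED_USER);CHKERRQ(ierr);
>   } else {
>     ierr = TaoSetConvergedReason(tao, TAO_CONTINUE_ITERATING);CHKERRQ(ierr);
>   }
>   PetscFunctionReturn(0);
> }
> 
> /* registered with: TaoSetConvergenceTest(tao, DualNormConvergenceTest, &ctx); */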
> 
>    If you need to live in function spaces, which seems to be the argument, then it seems
>    that PETSc needs to be changed by more than just a single termination test.
> 
> A similar discussion can be found in PR#347.
> 
>    Thanks,
>    Todd.
> 
>> On Apr 12, 2018, at 3:27 PM, Stefano Zampini <stefano.zampini at gmail.com> wrote:
>> 
>> The gradient norm is the one induced by the mass matrix of the DM associated with the control.
>> In principle, TaoGradientNorm() can be replaced by DMCreateMassMatrix() + solve with the mass matrix.
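>> 
>> Something like this minimal sketch, for instance (names are illustrative, and the
>> exact DMCreateMassMatrix() calling sequence may differ between versions): build the
>> mass matrix from the DM, solve with it once, and take sqrt(g^T M^{-1} g).
>> 
>> #include <petscdm.h>
>> #include <petscksp.h>
>> 
>> PetscErrorCode MassWeightedGradientNorm(DM dm, Vec g, PetscReal *gnorm)
>> {
>>   Mat            M;
>>   KSP            massksp;
>>   Vec            Minvg;
>>   PetscScalar    sq;
>>   PetscErrorCode ierr;
>> 
>>   PetscFunctionBeginUser;
>>   ierr = DMCreateMassMatrix(dm, dm, &M);CHKERRQ(ierr);
>>   ierr = KSPCreate(PetscObjectComm((PetscObject)dm), &massksp);CHKERRQ(ierr);
>>   ierr = KSPSetOperators(massksp, M, M);CHKERRQ(ierr);
>>   ierr = KSPSetFromOptions(massksp);CHKERRQ(ierr);
>>   ierr = VecDuplicate(g, &Minvg);CHKERRQ(ierr);
>>   ierr = KSPSolve(massksp, g, Minvg);CHKERRQ(ierr);   /* Minvg = M^{-1} g */
>>   ierr = VecDot(g, Minvg, &sq);CHKERRQ(ierr);         /* g^T M^{-1} g */
>>   *gnorm = PetscSqrtReal(PetscRealPart(sq));
>>   ierr = VecDestroy(&Minvg);CHKERRQ(ierr);
>>   ierr = KSPDestroy(&massksp);CHKERRQ(ierr);
>>   ierr = MatDestroy(&M);CHKERRQ(ierr);
>>   PetscFunctionReturn(0);
>> }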
>> 
>> For PDE-constrained optimization, the “gradient norm” is crucial, since we consider optimization problems in Banach spaces.
>> We should keep supporting it, perhaps differently than it is now, but keep it.
>> 
>>> On Apr 12, 2018, at 11:21 PM, Jed Brown <jed at jedbrown.org> wrote:
>>> 
>>> Are you thinking about this PR again?
>>> 
>>> https://bitbucket.org/petsc/petsc/pull-requests/506
>>> 
>>> There's an issue here that Krylov methods operate in the discrete inner
>>> product while some higher level operations are of interest in
>>> (approximations of) continuous inner products (or norms).  The object in
>>> PETSc that endows continuous attributes (like a hierarchy, subdomains,
>>> fields) on discrete quantities is DM, so my first inclination is that
>>> any continuous interpretation of vectors, including inner products and
>>> norms, belongs in DM.
>>> 
>>> "Munson, Todd" <tmunson at mcs.anl.gov> writes:
>>> 
>>>> There is a bit of code in TAO that allows the user to change the norm to 
>>>> a matrix norm.  This was introduced to get some mesh independent 
>>>> behavior in one example (tao/examples/tutorials/ex3.c).  That 
>>>> norm, however, does not propagate down into the KSP methods
>>>> and is only used for testing convergence of the nonlinear
>>>> problem.
>>>> 
>>>> A few questions then:  Is similar functionality needed in SNES?  Are 
>>>> TAO and SNES even the right place for this functionality?  Should 
>>>> it belong to the Vector class so that you can change the inner 
>>>> products and have all the KSP methods (hopefully) work 
>>>> correctly?
>>>> 
>>>> Note that this discussion brings us to the brink of supporting an 
>>>> optimize-then-discretize approach.  I am not convinced we should 
>>>> go down that rabbit hole.
>>>> 
>>>> Thanks, Todd.
>> 
> 
> 


