[petsc-dev] coding style

Thu Aug 18 19:20:41 CDT 2016

On Thu, Aug 18, 2016 at 7:10 PM, Oxberry, Geoffrey Malcolm <
oxberry1 at llnl.gov> wrote:

>
> On Aug 18, 2016, at 4:53 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>
> On Aug 18, 2016, at 6:25 PM, Oxberry, Geoffrey Malcolm <oxberry1 at llnl.gov>
> wrote:
>
> cc-ing Patrick Farrell for the function space part of this discussion,
> since he knows more about it than I do, and can speak to his experience re:
> the Riesz map pull request he made for LMVM.
>
> On Aug 18, 2016, at 4:01 PM, Munson, Todd <tmunson at mcs.anl.gov> wrote:
>
>
> For now, I am not proposing interface changes, but rather answering the
> question of what types of problems do we need to support.  We can
> discuss actual interfaces later.
>
> Note: you really only need one multiplier for the each of the
> constraints (maybe interior-point methods are different).
> The sign changes depending on the bound that is active.
>
>
> I don’t believe interior-point methods work this way; they typically
> maintain estimates of duality multipliers, and inequality constraints
> aren’t necessarily active until convergence. When these methods converge,
> complementary slackness holds approximately, and I believe a crossover
> algorithm must be implemented to determine the active constraints, at which
> point, complementary slackness would be satisfied more exactly.
>
>
> No doubt that internally, you have more multipliers.  What we report
> to the user for the multipliers at the solution may be different
> though.
>
> Internally the solvers can do lots of stuff.  Most interior point
> methods, for example, would convert any inequality constraints
> into equality constraints by adding slack variables and
> bounding the slacks.  That way they only need bounds on
> variables.
>
>
> Yes; for instance, KNITRO essentially does this. I’m more concerned about
> the developer API having consistent semantics. I don’t want to have to
> remember that I should be flipping signs or assuming implicit zeroes,
> depending on the method, unless the gains in performance are significant.
>
>
> I believe crossover is only really needed when the problem is
> degenerate at the solution and you need to make decisions to
> find an invertible basis matrix.  If you don't care about
> the invertible basis matrix, then its not needed.  [One
> reason for wanting the invertible basis matrix is for
> parametric sensitivities or for branch and bound
> for combinatorial problems.]
>
>
> Yes, you are correct; those are the cases I was thinking of specifically.
> Customers ask for this information in practice to get sensitivity
> information, or to understand which constraints are really active.
>
>
> In the strict complementarity case, the multiplier for the
> inactive bounds will go to zero and the slack will remain
> bounded below.  Setting the multiplier to zero will only
> decrease the norm of the gradient of the Lagrangian.
>
> I already do this in SQPTR, although only for the Hilbert space case.
> Assuming it eventually gets merged, the interfaces need to be revised for
> norm-reflexive convex Banach spaces so that all inner products become
> duality pairings. I am not sure if ROL does this, but the Heinkenschloss
> and Ridzal paper cited in the SQPTR comments suggests this generalization
> can be made.
>
>
> I think doing things in function spaces is more of a change than I
> want to consider right now.  You would have to start by convincing
> the regular PETSc folks to convey the function spaces for their
> PDEs…
>
>
> As far as I know, the only information needed in the Hilbert space
> approach is matrices denoting inner products and norms.
>
>
>   The norms are not a problem, especially just the norm you use to decide
> if you have "converged".  The trickier issue is "the inner products", which
> ones and what it actually means to switch it from the usual l2 within an
> algorithm.
>
>
> Agreed. For Hilbert spaces, there are natural inner products that act like
> l2; I mainly worked off the TaoGradientNorm function when I built up the
> API in PR #506. CG can be posed in terms of inner products, and generalized
> to an abstract inner product space setting. Given the current
> implementation, the “inner product” could always be switched back to the
> usual l2, possibly at the cost of retarded convergence due to worse
> conditioning.
>
> I don’t know how to generalize beyond that setting to, say, an abstract
> Krylov subspace method and a normed vector space equipped with a duality
> pairing. I have a tiny bit of intuition, but I defer to people who have
> more expertise in the area and welcome the discussion. References would be
> helpful, too.
>

There is a really excellent short book about this that SIAM just published.
Its by Strakos and Marek.

  Matt

> This information is needed for mesh-independent convergence, and
> presumably for FEM discretizations, you already have this information in
> the form of mass matrices. If the information is already there, why not use
> it? As I understand it, this issue motivated Patrick Farrell’s pull request
> to add Riesz maps to LMVM.
>
> The problem with using the Euclidean norm is that refining the PDE mesh
> materially changes the 2-norm of the vector representing the function. If I
> have a constant function on a mesh, and then I increase resolution by
> doubling the number of degrees of freedom, I have changed the discrete norm
> by sqrt(2), but I have not changed the function norm at all. Inflating the
> norm means that more iterations are required to attain convergence for a
> given absolute tolerance, and this phenomenon is what is observed in
> practice, e.g., with LMVM, if you don’t include function space information
> (e.g., via the Riesz map).
>
> To borrow at turn of phrase from Jed, note that if you are solving a
> problem that doesn’t involve a PDE, then these norms just revert to
> 2-norms, and if the space of decision variables is really some subset of
> R^{n}, then norms are equivalent anyway, so you only lose a constant factor
> in changing convergence criteria from infinity-norms (e.g., KNITRO), to
> 2-norms.
>
>
> I prefer the simpler discretize then optimize…
>
>
> This approach is what SQPTR is designed to use; Ridzal’s thesis uses
> examples in a discretize-then-optimize approach. It is compatible with
> using function space information, which essentially gives you a natural
> preconditioner. I am not suggesting deriving optimality conditions and then
> discretizing them, and the method I implemented is not in conflict with
> your approach.
>
>
> Todd.
>
>
>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20160818/04105e81/attachment.html>