[petsc-dev] coding style

Oxberry, Geoffrey Malcolm oxberry1 at llnl.gov
Thu Aug 18 18:28:56 CDT 2016


cc-ing Patrick again, because I screwed up his e-mail address

On Aug 18, 2016, at 4:25 PM, Oxberry, Geoffrey M. <oxberry1 at llnl.gov> wrote:

cc-ing Patrick Farrell for the function space part of this discussion, since he knows more about it than I do, and can speak to his experience re: the Riesz map pull request he made for LMVM.

On Aug 18, 2016, at 4:01 PM, Munson, Todd <tmunson at mcs.anl.gov> wrote:


For now, I am not proposing interface changes, but rather answering the
question of what types of problems we need to support.  We can
discuss actual interfaces later.

Note: you really only need one multiplier for each of the
constraints (maybe interior-point methods are different).
The sign changes depending on the bound that is active.
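
(To pin down the convention I take this to mean, as a minimal sketch: for min f(x) subject to l <= x <= u, stationarity can be written with a single signed multiplier per variable, \nabla f(x) - \lambda = 0, with \lambda_i >= 0 when x_i = l_i, \lambda_i <= 0 when x_i = u_i, and \lambda_i = 0 when neither bound is active; i.e., \lambda = \lambda_l - \lambda_u folds the usual pair of nonnegative multipliers into one.)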

I don’t believe interior-point methods work this way; they typically maintain estimates of the dual multipliers, and inequality constraints aren’t necessarily active until convergence. When these methods converge, complementary slackness holds only approximately, and I believe a crossover algorithm must be implemented to determine the active constraints, at which point complementary slackness would be satisfied exactly.
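
(Sketching what I mean: a primal-dual interior-point method keeps its iterates near the central path, where the perturbed complementarity condition \lambda_i s_i = \mu holds with barrier parameter \mu > 0, so every inequality carries a strictly positive multiplier and a strictly positive slack; exact complementarity \lambda_i s_i = 0 is only reached in the limit \mu -> 0, or after a crossover step.)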

No doubt that internally, you have more multipliers.  What we report
to the user for the multipliers at the solution may be different
though.

Internally the solvers can do lots of stuff.  Most interior point
methods, for example, would convert any inequality constraints
into equality constraints by adding slack variables and
bounding the slacks.  That way they only need bounds on
variables.
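
(For the record, the reformulation being described: min f(x) subject to c(x) >= 0 becomes min f(x) subject to c(x) - s = 0, s >= 0, so the only inequalities left are simple bounds on the slack variables s.)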

Yes; for instance, KNITRO essentially does this. I’m more concerned about the developer API having consistent semantics. I don’t want to have to remember that I should be flipping signs or assuming implicit zeroes, depending on the method, unless the gains in performance are significant.


I believe crossover is only really needed when the problem is
degenerate at the solution and you need to make decisions to
find an invertible basis matrix.  If you don't care about
the invertible basis matrix, then it's not needed.  [One
reason for wanting the invertible basis matrix is for
parametric sensitivities or for branch and bound
for combinatorial problems.]

Yes, you are correct; those are the cases I was thinking of specifically. In practice, customers ask for this to get sensitivity information, or to understand which constraints are really active.


In the strict complementarity case, the multiplier for the
inactive bounds will go to zero and the slack will remain
bounded below.  Setting the multiplier to zero will only
decrease the norm of the gradient of the Lagrangian.
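
(In symbols, as I read it: the Lagrangian gradient is \nabla_x L(x, \lambda) = \nabla f(x) - \sum_i \lambda_i \nabla c_i(x); under strict complementarity, \lambda_i -> 0 for every inactive constraint while its slack stays bounded away from zero, so zeroing those multipliers only removes already-vanishing terms from \nabla_x L.)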

I already do this in SQPTR, although only for the Hilbert space case. Assuming it eventually gets merged, the interfaces need to be revised for norm-reflexive convex Banach spaces so that all inner products become duality pairings. I am not sure if ROL does this, but the Heinkenschloss and Ridzal paper cited in the SQPTR comments suggests this generalization can be made.

I think doing things in function spaces is more of a change than I
want to consider right now.  You would have to start by convincing
the regular PETSc folks to convey the function spaces for their
PDEs…

As far as I know, the only information needed in the Hilbert space approach is the matrices defining the inner products and norms. This information is needed for mesh-independent convergence, and for FEM discretizations you presumably already have it in the form of mass matrices. If the information is already there, why not use it? As I understand it, this issue motivated Patrick Farrell’s pull request to add Riesz maps to LMVM.
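
For concreteness, here is a minimal sketch of the kind of operation I mean; the function name is mine, not an existing TAO/PETSc interface, and M is assumed to be the FEM mass matrix the discretization already provides. It just replaces the Euclidean VecDot with the mass-weighted dot product:

  #include <petscmat.h>

  /* Sketch: compute the M-inner product (x, y)_M = x^T M y; the induced
     norm is ||x||_M = sqrt((x, x)_M).  M is assumed symmetric positive
     definite (e.g., a mass matrix). */
  PetscErrorCode MassInnerProduct(Mat M, Vec x, Vec y, PetscScalar *ip)
  {
    Vec            My;
    PetscErrorCode ierr;

    PetscFunctionBegin;
    ierr = VecDuplicate(y,&My);CHKERRQ(ierr);
    ierr = MatMult(M,y,My);CHKERRQ(ierr);   /* My = M*y */
    ierr = VecDot(x,My,ip);CHKERRQ(ierr);   /* ip = x^T M y */
    ierr = VecDestroy(&My);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }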

The problem with using the Euclidean norm is that refining the PDE mesh materially changes the 2-norm of the vector representing the function. If I have a constant function on a mesh and then increase resolution by doubling the number of degrees of freedom, I have changed the discrete 2-norm by a factor of sqrt(2), but I have not changed the function norm at all. Inflating the norm means that more iterations are required to attain convergence for a given absolute tolerance, and this is exactly what is observed in practice, e.g., with LMVM, if you don’t include function space information (e.g., via the Riesz map).
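
(The arithmetic behind that sqrt(2), as a sketch: a constant function u = c on a mesh with n nodes has coefficient vector c\,\mathbf{1}_n, so ||u_h||_2 = c\sqrt{n}, and doubling to 2n nodes gives c\sqrt{2n} = \sqrt{2}\,c\sqrt{n}. The mass-matrix norm, by contrast, satisfies ||u_h||_M^2 = u_h^T M u_h = c^2 \int_\Omega 1\,dx = c^2 |\Omega| for a standard Lagrange basis, independent of n.)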

To borrow a turn of phrase from Jed: note that if you are solving a problem that doesn’t involve a PDE, then these norms just revert to 2-norms, and if the space of decision variables really is some subset of R^{n}, then the norms are equivalent anyway, so you only lose a constant factor in changing convergence criteria from infinity-norms (e.g., as in KNITRO) to 2-norms.
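
(The constant factor is explicit: for x in R^{n}, ||x||_\infty <= ||x||_2 <= \sqrt{n}\,||x||_\infty, so for fixed n an infinity-norm tolerance and a 2-norm tolerance differ by at most a factor of \sqrt{n}.)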


I prefer the simpler discretize then optimize…

This is the approach SQPTR is designed for; Ridzal’s thesis works its examples in a discretize-then-optimize setting. It is compatible with using function space information, which essentially gives you a natural preconditioner. I am not suggesting deriving optimality conditions and then discretizing them, and the method I implemented is not in conflict with your approach.


Todd.
