[petsc-users] Guidance on using Tao for unconstrained minimization

Dener, Alp adener at anl.gov
Thu Feb 27 15:58:44 CST 2020


The log shows that the LS keeps backtracking on the step length and still fails to find any point that reduces the function value, even though the search direction passes the directional derivative test (i.e., it is a valid descent direction). If you’re confident that the gradient is accurate, then this behavior suggests that you’re stuck at a discontinuity in the function. Are you certain that this objective is at least C1 continuous?
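
If you want to double-check the gradient first, newer PETSc versions can compare it against a finite-difference approximation during the solve. A minimal sketch, assuming a Tao object named tao and that your PETSc is recent enough to have the -tao_test_gradient option:

  /* enable TAO's built-in finite-difference gradient check before solving */
  PetscOptionsSetValue(NULL, "-tao_test_gradient", "1");
  TaoSetFromOptions(tao);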

I also see that the LS is taking a step length of 5.0 in the 109th iteration, right before it fails on the 110th. This might be too aggressive. You could restrict this with “-tao_ls_stepmax 1.0”, or switch to a backtracking LS with “-tao_ls_type armijo” and see if it changes anything at all. Note that the backtracking line search may behave a bit differently from restricting the maximum step to 1.0, because backtracking completely ignores curvature information. Trying both would be helpful.
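
If it is more convenient to hardwire these while experimenting, here is a minimal sketch of the equivalent programmatic settings, assuming a Tao object named tao (options given on the command line still override anything set before TaoSetFromOptions()):

  TaoLineSearch ls;
  TaoGetLineSearch(tao, &ls);                            /* the line search owned by the solver */
  TaoLineSearchSetType(ls, TAOLINESEARCHARMIJO);         /* same as -tao_ls_type armijo */
  PetscOptionsSetValue(NULL, "-tao_ls_stepmax", "1.0");  /* same as -tao_ls_stepmax 1.0 */
  TaoSetFromOptions(tao);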

You could additionally try disabling the line search entirely with “-tao_ls_type unit”, which accepts the 1.0 step length no matter what. This might cause an issue at the very beginning, where I do see the LS doing some useful work. However, if it manages to reach the same point where the solver previously failed, it will keep iterating past that point without LS failures, and watching what happens to the function value and gradient norm there can help diagnose the problem. If there is indeed a discontinuous point like I suspect, then BNCG without a line search might bounce back and forth around that point until it hits the iteration limit.
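
Programmatically, with the same assumptions as the sketch above:

  TaoLineSearchSetType(ls, TAOLINESEARCHUNIT);  /* same as -tao_ls_type unit: always accept the full step */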

TAO does have support for matrix-free Hessians in the Newton-type algorithms. You can switch to them with “-tao_type bnls” for Newton Line Search or “-tao_type bntr” for Newton Trust Region. They both use a Krylov method to iteratively invert the Hessian and, in matrix-free cases, use a BFGS approximation as the preconditioner. They’re going to take more time per iteration, but should at least reach the point of failure in fewer iterations than BNCG. Whether or not that trade-off improves the overall time is problem dependent. The same LS modifications I mentioned above are also applicable to these. Ultimately, though, if there really is a discontinuity, they’re likely to get stuck in the same spot unless they manage to find a different local minimum that BNCG is not finding.
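
In case the hookup is the missing piece, here is a rough sketch of attaching a shell Hessian to TAO. The routine names (MyHessianMult, MyFormHessian) and the user context are made up; only the PETSc calls themselves are real:

  #include <petsctao.h>

  typedef struct {
    Vec x;  /* current point, needed to apply H(x) */
  } UserCtx;

  /* y = H(x)*v, applied matrix-free */
  static PetscErrorCode MyHessianMult(Mat H, Vec v, Vec y)
  {
    UserCtx        *ctx;
    PetscErrorCode ierr;

    PetscFunctionBegin;
    ierr = MatShellGetContext(H, (void**)&ctx);CHKERRQ(ierr);
    /* ... apply the Hessian at ctx->x to v and put the result in y ... */
    PetscFunctionReturn(0);
  }

  /* called once per TAO iteration; here it only records the current point */
  static PetscErrorCode MyFormHessian(Tao tao, Vec x, Mat H, Mat Hpre, void *ptr)
  {
    UserCtx        *ctx = (UserCtx*)ptr;
    PetscErrorCode ierr;

    PetscFunctionBegin;
    ierr = VecCopy(x, ctx->x);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

  /* in your setup code, assuming n local unknowns and a UserCtx user whose vector is already created */
  Mat H;
  ierr = MatCreateShell(PETSC_COMM_WORLD, n, n, PETSC_DETERMINE, PETSC_DETERMINE, &user, &H);CHKERRQ(ierr);
  ierr = MatShellSetOperation(H, MATOP_MULT, (void (*)(void))MyHessianMult);CHKERRQ(ierr);
  ierr = TaoSetHessianRoutine(tao, H, H, MyFormHessian, &user);CHKERRQ(ierr);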

—
Alp Dener
Postdoctoral Researcher
Argonne National Laboratory
https://www.anl.gov/profile/alp-dener


On February 27, 2020 at 1:48:44 PM, Ellen Price (ellen.price at cfa.harvard.edu) wrote:

I tried what you suggested and used the bounded CG method. It gets a lot farther than the unconstrained CG method and finds a lower residual, but it still experiences a line search failure after a while. Any thoughts? I'm attaching the log output.

In case it's helpful, I also spent a few more hours working on the code and can now compute the Hessian times an arbitrary vector (matrix-free, using a MATSHELL); even matrix-free, however, the Hessian is much slower to compute than the gradient and objective. To answer a previous question, I am as sure as I can be that the gradient is correct, since I'm using automatic differentiation and not relying on a hand-coded function.

Thanks for your help,
Ellen

On Thu, Feb 27, 2020 at 11:40 AM Adam Denchfield <adenchfi at hawk.iit.edu> wrote:
Hi Ellen,

It is as Alp said. To emphasize his point: you don't need to worry about using a bounded CG method on an unconstrained problem - the bounded CG methods handle unconstrained problems as well, and they are much better than the old unconstrained CG code.


On Thu, Feb 27, 2020, 9:55 AM Dener, Alp via petsc-users <petsc-users at mcs.anl.gov> wrote:
Hi Ellen,

It looks like you’re using the old unconstrained CG code. This will be deprecated in the near future in favor of the newer bound-constrained CG algorithm (TAOBNCG), which can also solve unconstrained problems when the user does not specify any bounds on the problem.

The newer TAOBNCG algorithm implements a preconditioner that significantly improves the scaling of the search direction and helps the line search accept the unit step length most of the time. I would recommend making sure that you’re on PETSc version 3.11 or newer, and then switching to it with “-tao_type bncg”. You will not need to change any of your code to do this. If it still fails to converge, please send a new log with the new algorithm and we can evaluate the next steps.
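
For completeness, the programmatic equivalent is a one-liner, assuming your Tao object is named tao:

  ierr = TaoSetType(tao, TAOBNCG);CHKERRQ(ierr);  /* same as -tao_type bncg */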

—
Alp Dener
Postdoctoral Researcher
Argonne National Laboratory
https://www.anl.gov/profile/alp-dener


On February 26, 2020 at 6:01:34 PM, Ellen Price (ellen.price at cfa.harvard.edu) wrote:

Hi Jed,

Thanks for getting back to me! Here's the output for my CG config. Sorry it's kind of a lot.

Ellen

On Wed, Feb 26, 2020 at 12:43 PM Jed Brown <jed at jedbrown.org> wrote:
Could you share output for your current configuration with -tao_monitor -tao_ls_monitor -tao_view?

"Ellen M. Price" <ellen.price at cfa.harvard.edu<mailto:ellen.price at cfa.harvard.edu>> writes:

> Hello PETSc users!
>
> I am using Tao for an unconstrained minimization problem. I have found
> that CG works better than the other types for this application. After
> about 85 iterations, I get an error about line search failure. I'm not
> clear on what this means, or how I could mitigate the problem, and
> neither the manual nor FAQ give any guidance. Can anyone suggest things
> I could try to help the method converge? I have function and gradient
> info, but no Hessian.
>
> Thanks,
> Ellen Price