[petsc-users] Report Bug TaoALMM class

Barry Smith bsmith at petsc.dev
Fri Nov 11 13:04:13 CST 2022



> On Nov 4, 2022, at 7:43 AM, Stephan Köhler <stephan.koehler at math.tu-freiberg.de> wrote:
> 
> Barry,
> 
> this is non-artificial code.  The problem is in the ALMM subsolver.  I want to solve a problem with a TaoALMM solver; what then happens is:
> 
> TaoSolve(tao)    /* TaoALMM solver */
>    |
>    |
>    |-------->   This calls the TaoALMM subsolver routine
>                   
>                  TaoSolve(subsolver)
>                        |
>                        |
>                        |----------->   The subsolver does not work correctly, at least with an Armijo line search, since the solution is overwritten within the line search.
>                                        In my case, the subsolver does not make any progress although progress is possible.
> 
> To get to my real problem you can simply change line 268 from if(1) to if(0) and uncomment line 317, i.e., change // ierr = TaoSolve(tao); CHKERRQ(ierr);  ------->  ierr = TaoSolve(tao); CHKERRQ(ierr);
> What you can see is that the solver does not make any progress, but it should make progress.
> 
> To be honest, I do not really know why the option -tao_almm_subsolver_tao_ls_monitor has no effect if the ALMM solver is called and not the subsolver. I also do not know why -tao_almm_subsolver_tao_view prints the following as the termination reason for the subsolver:
> 
>      Solution converged:    ||g(X)|| <= gatol
> 
> This is obviously not the case.  I set the tolerances
> -tao_almm_subsolver_tao_gatol 1e-8 \
> -tao_almm_subsolver_tao_grtol 1e-8 \

  This is because TaoSolve_ALMM() adaptively sets the tolerances for the subsolver:

    /* update subsolver tolerance */
    PetscCall(PetscInfo(tao, "Subsolver tolerance: ||G|| <= %e\n", (double)auglag->gtol));
    PetscCall(TaoSetTolerances(auglag->subsolver, auglag->gtol, 0.0, 0.0));

  So any values one sets initially are ignored, and the reported convergence is presumably with respect to auglag->gtol rather than the 1e-8 you requested. Unfortunately, given the organization of TaoSetFromOptions() as a general tool, there is no way to have ALMM not accept the command line tolerances and produce a message at the end that they have been ignored. Hence the user thinks they have been set and is confused when they seem to have no effect. I don't see any way to prevent this confusion cleanly.
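
  To make the sequence concrete, here is a minimal sketch in plain C (not PETSc code) of how a value set during options processing is later replaced by the outer solver. The type Tolerances and the functions set_from_options() and outer_update() are hypothetical names chosen for illustration only; the only part taken from the source above is the pattern of setting gatol to an adaptive value and the other two tolerances to 0.0.

```c
#include <assert.h>

/* Hypothetical stand-in for the subsolver's convergence tolerances. */
typedef struct {
  double gatol, grtol, gttol;
} Tolerances;

/* Stands in for options processing such as -tao_almm_subsolver_tao_gatol. */
static void set_from_options(Tolerances *t, double gatol, double grtol)
{
  assert(gatol >= 0.0 && grtol >= 0.0);
  t->gatol = gatol;
  t->grtol = grtol;
}

/* Stands in for the update in the outer solve, which mirrors
   TaoSetTolerances(subsolver, gtol, 0.0, 0.0): all three values are replaced. */
static void outer_update(Tolerances *t, double adaptive_gtol)
{
  t->gatol = adaptive_gtol;
  t->grtol = 0.0;
  t->gttol = 0.0;
}
```

  Calling set_from_options(&t, 1e-8, 1e-8) and then outer_update(&t, 1e-2) leaves t.gatol at 1e-2, so a convergence test against gatol can pass long before the gradient norm reaches 1e-8. In the real code, running with -info should print the "Subsolver tolerance" message from the PetscInfo() call, which shows the value actually in use.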


   I am still digging through all the nesting here.


>  
> I encountered this and then I looked into the ALMM class and therefore I tried to call the subsolver (previous example).
> 
> I attach the updated program and also the options.
> 
> Stephan
> 
> 
> 
> 
> 
> On 03.11.22 22:15, Barry Smith wrote:
>> 
>>   Thanks for your response and the code. I understand the potential problem and how your code demonstrates a bug if TaoALMMSubsolverObjective() is used in the manner of your example, where you directly call TaoComputeObjective() multiple times like a line search code might.
>> 
>>   What I don't have or understand is how to reproduce the problem in a real code that uses Tao, that is, where the Tao Armijo line search code has a problem when it is used (somehow) in a Tao solver with ALMM. You suggest "If you have an example for your own, you can switch the Armijo line search by the option -tao_ls_type armijo.  The thing is that it will cause no problems if the line search accepts the steps with step length one."  I don't see how to do this: if I use -tao_type almm I cannot use -tao_ls_type armijo. That is, the option -tao_ls_type doesn't seem to me to be usable in the context of almm (since almm internally uses its own trust region approach for globalization). If we remove the if (1) code from your example, are there some Tao options I can use to get the bug to appear inside the Tao solve?
>> 
>> I'll try to explain again. I agree that the fact that the Tao solution is aliased (within the ALMM solver) is a problem with repeated calls to TaoComputeObjective(), but I cannot see how these repeated calls could ever happen in the use of TaoSolve() with the ALMM solver. That is, when is this "design problem" a true problem as opposed to just a potential problem that can be demonstrated in artificial code?
>> 
>> The reason I need to understand the non-artificial situation in which it breaks things is to come up with an appropriate correction for the current code.
>> 
>>   Barry
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>>> On Nov 3, 2022, at 12:46 PM, Stephan Köhler <stephan.koehler at math.tu-freiberg.de> wrote:
>>> 
>>> Barry,
>>> 
>>> so far, I have not experimented with trust-region methods, but I can imagine that this "design feature" causes no problem for trust-region methods if the old point is saved and, after the trust-region check fails, the old point is copied back to the current point.  But the implementation of the Armijo line search method does not work that way.  Here, the current point will always be overwritten.  Only if the line search fails is the old point restored, but then the TaoSolve method ends with a line search failure.
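
The save-and-restore idea described above, applied to a backtracking search, could look like the following sketch. It is plain C rather than PETSc code, and it is only one possible remedy under stated assumptions, not the fix adopted in Tao. The names N, eval_and_overwrite() and backtrack_with_snapshot() are hypothetical, the objective is simply the sum of squares, and the constants 1e-4 and 0.5 are illustrative choices.

```c
#include <assert.h>
#include <string.h>

#define N 2

/* Hypothetical objective f(p) = sum p_i^2 that, like an aliased callback,
   also writes the evaluation point p into the solver's stored iterate. */
static double eval_and_overwrite(double *stored, const double *p)
{
  double f = 0.0;
  for (int i = 0; i < N; ++i) {
    stored[i] = p[i];
    f += p[i] * p[i];
  }
  return f;
}

/* Backtracking that builds every trial point from a private snapshot x0,
   so overwrites of `stored` cannot accumulate. `slope` is the directional
   derivative along dx. Returns the accepted step length, or 0.0 on failure
   (in which case the starting point is restored). */
static double backtrack_with_snapshot(double *stored, const double *dx,
                                      double slope, int max_trials)
{
  double x0[N], trial[N];
  double alpha = 1.0, f0 = 0.0;

  assert(slope < 0.0 && max_trials > 0);
  memcpy(x0, stored, sizeof x0);               /* save the old point */
  for (int i = 0; i < N; ++i) f0 += x0[i] * x0[i];

  for (int t = 0; t < max_trials; ++t) {
    for (int i = 0; i < N; ++i) trial[i] = x0[i] + alpha * dx[i];
    double f = eval_and_overwrite(stored, trial);
    if (f <= f0 + 1e-4 * alpha * slope) return alpha; /* stored == accepted trial */
    alpha *= 0.5;
  }
  memcpy(stored, x0, sizeof x0);               /* restore on failure */
  return 0.0;
}
```

With the stored point (1, 0), direction (-3, 0) and slope -6, the full step is rejected and the half step is accepted, leaving the stored point at (-0.5, 0).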
>>> 
>>> If you have an example for your own, you can switch the Armijo line search by the option -tao_ls_type armijo.  The thing is that it will cause no problems if the line search accepts the steps with step length one.  
>>> It is also possible that, by luck, it will cause no problems if the "excessive" step brings a reduction of the objective.
>>> 
>>> Otherwise, I attach my example, which is not minimal, but here you can see that it causes problems.  You need to set the paths to the PETSc library in the makefile.  You can find the options for this problem in the run_test_tao_neohooke.sh script.
>>> The important part begins at line 292 in test_tao_neohooke.cpp
>>> 
>>> Stephan
>>> 
>>> On 02.11.22 19:04, Barry Smith wrote:
>>>>   Stephan,
>>>> 
>>>>     I have located the troublesome line. TaoSetUp_ALMM() has the line
>>>> 
>>>>   auglag->Px = tao->solution;
>>>> 
>>>> and almm.h has
>>>> 
>>>> Vec  Px, LgradX, Ce, Ci, G;         /* aliased vectors (do not destroy!) */
>>>> 
>>>> Now auglag->P in some situations aliases auglag->Px, and in other cases auglag->Px serves to hold a portion of auglag->P. So in TaoALMMSubsolverObjective_Private()
>>>> the lines
>>>> 
>>>> PetscCall(VecCopy(P, auglag->P));
>>>>  PetscCall((*auglag->sub_obj)(auglag->parent));
>>>> 
>>>> cause, just as you said, tao->solution to be overwritten by the P at which the objective function is being computed. In other words, the solution of the outer Tao is aliased with the solution of the inner Tao, by design.
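
The aliasing mechanism can be reproduced in a few lines of plain C. This is an illustrative sketch only: AliasDemo, demo_setup() and demo_objective() are hypothetical names, raw arrays stand in for Vec objects, memcpy() stands in for VecCopy(), and the objective is just a sum of squares.

```c
#include <assert.h>
#include <string.h>

#define N 3

typedef struct {
  double  storage[N];
  double *solution; /* the outer solver's solution */
  double *P;        /* the subsolver's primal vector: an alias, not owned */
} AliasDemo;

static void demo_setup(AliasDemo *d, const double *x0)
{
  memcpy(d->storage, x0, sizeof d->storage);
  d->solution = d->storage;
  d->P        = d->solution; /* alias, in the spirit of auglag->Px = tao->solution */
}

/* Evaluates f(p) = sum p_i^2 after copying p into the subsolver vector. */
static double demo_objective(AliasDemo *d, const double *p)
{
  double f = 0.0;
  assert(p != d->P);                  /* memcpy needs distinct buffers */
  memcpy(d->P, p, N * sizeof *p);     /* analogous to VecCopy(P, auglag->P) */
  for (int i = 0; i < N; ++i) f += d->P[i] * d->P[i];
  return f;
}
```

After demo_setup() with (1, 2, 3), evaluating the objective at (0, 0, 1) returns 1 and leaves d.solution holding (0, 0, 1): merely asking for a function value has moved the outer solution.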
>>>> 
>>>> You are definitely correct: the use of TaoALMMSubsolverObjective_Private and TaoALMMSubsolverObjectiveAndGradient_Private in a line search would be problematic.
>>>> 
>>>> I am not an expert at these methods or their implementations. Could you point to an actual use case within Tao that triggers the problem? Is there a set of command line options or code calls to Tao that fail due to this "design feature"? Within the standard use of ALMM I do not see how the objective function would be used within a line search. The TaoSolve_ALMM() code is self-correcting in that if a trust region check fails it automatically rolls back the solution.
>>>> 
>>>>   Barry
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> On Oct 28, 2022, at 4:27 AM, Stephan Köhler <stephan.koehler at math.tu-freiberg.de> wrote:
>>>>> 
>>>>> Dear PETSc/Tao team,
>>>>> 
>>>>> it seems that there is a bug in the TaoALMM class:
>>>>> 
>>>>> In the methods TaoALMMSubsolverObjective_Private and TaoALMMSubsolverObjectiveAndGradient_Private, the vector at which the augmented Lagrangian is evaluated
>>>>> is copied into the current solution; see, e.g., https://petsc.org/release/src/tao/constrained/impls/almm/almm.c.html line 672 or 682.  This causes the subsolver routine to not converge if the line search for the subsolver rejects the step length 1 for some
>>>>> update.  In detail:
>>>>> 
>>>>> Suppose the current iterate is xk and the current update is dxk. The line search first evaluates the augmented Lagrangian at (xk + dxk).  This causes the value (xk + dxk) to be copied into the current solution.  If the point (xk + dxk) is rejected, the line search should
>>>>> try the point (xk + alpha * dxk), where alpha < 1.  But due to the copying, what happens is that the point ((xk + dxk) + alpha * dxk) is evaluated; see, e.g., https://petsc.org/release/src/tao/linesearch/impls/armijo/armijo.c.html line 191.
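
The effect described above can be reproduced with a one-dimensional toy problem. The sketch below is plain C under stated assumptions, not the Tao Armijo implementation: Model, objective() and armijo_backtrack() are hypothetical names, the objective is f(x) = x^2, and the constants sigma = 1e-4 and beta = 0.5 are illustrative. The alias flag switches on the copy of the evaluation point into the stored solution.

```c
#include <assert.h>

typedef struct {
  double solution; /* iterate stored by the solver */
  int    alias;    /* nonzero: evaluating at p also stores p in solution */
} Model;

/* f(p) = p^2; with alias set, mimics the copy of the evaluation point. */
static double objective(Model *m, double p)
{
  if (m->alias) m->solution = p;
  return p * p;
}

/* Backtracking line search that builds each trial point from the stored
   solution. `slope` is the directional derivative along dx. Returns the
   accepted step length, or 0.0 if no trial satisfied the Armijo condition. */
static double armijo_backtrack(Model *m, double dx, double slope, int max_trials)
{
  const double sigma = 1e-4, beta = 0.5;
  const double f0 = m->solution * m->solution; /* value at the starting point */
  double alpha = 1.0;

  assert(slope < 0.0 && max_trials > 0);
  for (int i = 0; i < max_trials; ++i) {
    double trial = m->solution + alpha * dx;   /* uses the stored point */
    double f = objective(m, trial);
    if (f <= f0 + sigma * alpha * slope) return alpha;
    alpha *= beta;
  }
  return 0.0;
}
```

Starting from xk = 1 with dxk = -3 (slope -6), the search without aliasing rejects alpha = 1 and accepts alpha = 0.5 at the point -0.5. With aliasing, the second trial is (-2) + 0.5 * (-3) = -3.5 instead, every later trial drifts further away, and the search fails.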
>>>>> 
>>>>> Best regards
>>>>> Stephan Köhler
>>>>> 
>>>>> <OpenPGP_0xC9BF2C20DFE9F713.asc>
>>> 
>>> <Minimal_example_without_vtk_2.tar.gz><OpenPGP_0xC9BF2C20DFE9F713.asc>
>> 
> 
> -- 
> Stephan Köhler
> TU Bergakademie Freiberg
> Institut für numerische Mathematik und Optimierung
> 
> Akademiestraße 6
> 09599 Freiberg
> Gebäudeteil Mittelbau, Zimmer 2.07
> 
> Telefon: +49 (0)3731 39-3173 (Büro)
> <run_test_tao_neohooke.sh><test_tao_neohooke.cpp><OpenPGP_0xC9BF2C20DFE9F713.asc>
