[petsc-users] TAO: Finite Difference vs Continuous Adjoint gradient issues

Julian Andrej juan at tf.uni-kiel.de
Thu Nov 23 02:29:49 CST 2017


On 2017-11-22 16:27, Emil Constantinescu wrote:
> On 11/22/17 3:48 AM, Julian Andrej wrote:
>> Hello,
>> 
>> We prepared a small example which computes the gradient of a cost 
>> functional for a heating problem via the continuous adjoint method.
>> 
>> We implemented the textbook example and tested the gradient via a 
>> Taylor remainder test (which works fine). When we then tried to solve
>> the optimization problem with TAO and checked our gradient against the 
>> finite difference gradient, we ran into problems.
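(For context, the Taylor remainder test we ran is essentially the
sketch below; J, grad_J, m and h are placeholders for the actual
objective, gradient, control vector and a random perturbation
direction, not our real code.)

    import numpy as np

    def taylor_test(J, grad_J, m, h, steps=4):
        # First-order remainder |J(m + eps*h) - J(m) - eps*<grad_J(m), h>|
        # should shrink at rate ~2 as eps is halved if the gradient is correct.
        g, Jm, eps = grad_J(m), J(m), 1e-2
        remainders = []
        for _ in range(steps):
            remainders.append(abs(J(m + eps * h) - Jm - eps * np.dot(g, h)))
            eps /= 2.0
        # observed convergence rates; these should approach 2.0
        return [np.log2(r0 / r1) for r0, r1 in zip(remainders, remainders[1:])]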
>> 
>> Testing hand-coded gradient (hc) against finite difference gradient
>> (fd), if the ratio ||fd - hc|| / ||hc|| is O(1.e-8), the hand-coded
>> gradient is probably correct. Run with -tao_test_display to show
>> difference between hand-coded and finite difference gradient.
>> ||fd|| 0.000147076, ||hc|| = 0.00988136, angle cosine = (fd'hc)/||fd||||hc|| = 0.99768
>> 2-norm ||fd-hc||/max(||hc||,||fd||) = 0.985151, difference ||fd-hc|| = 0.00973464
>> max-norm ||fd-hc||/max(||hc||,||fd||) = 0.985149, difference ||fd-hc|| = 0.00243363
>> ||fd|| 0.000382547, ||hc|| = 0.0257001, angle cosine = (fd'hc)/||fd||||hc|| = 0.997609
>> 2-norm ||fd-hc||/max(||hc||,||fd||) = 0.985151, difference ||fd-hc|| = 0.0253185
>> max-norm ||fd-hc||/max(||hc||,||fd||) = 0.985117, difference ||fd-hc|| = 0.00624562
>> ||fd|| 8.84429e-05, ||hc|| = 0.00594196, angle cosine = (fd'hc)/||fd||||hc|| = 0.997338
>> 2-norm ||fd-hc||/max(||hc||,||fd||) = 0.985156, difference ||fd-hc|| = 0.00585376
>> max-norm ||fd-hc||/max(||hc||,||fd||) = 0.985006, difference ||fd-hc|| = 0.00137836
>> 
>> Despite these differences we achieve convergence with our hand-coded 
>> gradient, but we have to use -tao_ls_type unit.
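(For reproducibility: the unit line search can also be selected from
the options database before TAO is set up; a petsc4py sketch, using
the same option names as the runs quoted below.)

    from petsc4py import PETSc

    opts = PETSc.Options()
    opts["tao_type"] = "blmvm"
    opts["tao_ls_type"] = "unit"   # always take the full step of length 1

    tao = PETSc.TAO().create()
    tao.setFromOptions()           # picks up the options set above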
> 
> Both give similar (presumably descent) directions, but they seem to be
> scaled differently. It could be a bad scaling by the mass matrix
> somewhere in the continuous adjoint. This could be seen if you plot
> them side by side as a quick diagnostic.
> 
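One way to test the mass matrix hypothesis in code is the sketch below
(hypothetical names: M is the assembled mass matrix as a PETSc Mat,
g_l2 the continuous adjoint gradient and g_fd the finite difference
gradient as PETSc Vecs). If the reported ratio is small, the continuous
adjoint returns the L2 Riesz representative, while finite differences
approximate the coefficient-space gradient M * g_l2.

    def check_mass_scaling(M, g_l2, g_fd):
        # If g_l2 is the L2 representative, then g_coeff = M * g_l2
        # should match the finite difference gradient g_fd.
        g_coeff = M.createVecLeft()
        M.mult(g_l2, g_coeff)      # g_coeff = M * g_l2
        diff = g_coeff.copy()
        diff.axpy(-1.0, g_fd)      # diff = M*g_l2 - g_fd
        return diff.norm() / g_fd.norm()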

I visualized and attached the two gradients. CADJ is the hand-coded
continuous adjoint gradient; DADJ is the one from pyadjoint, which is
the same as the finite difference gradient from TAO.

If the attachment gets lost on the mailing list, here is a direct link 
[1].

[1] https://cloud.tf.uni-kiel.de/index.php/s/nmiNOoI213dx1L1
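(The side-by-side view can be produced roughly along these lines;
cadj and dadj are placeholder arrays holding the coefficients of the
two gradients.)

    import matplotlib.pyplot as plt

    def plot_gradients(cadj, dadj, fname="gradients.png"):
        # cadj, dadj: coefficient arrays of the two gradients (placeholders)
        fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
        ax1.plot(cadj)
        ax1.set_title("CADJ (hand-coded)")
        ax2.plot(dadj)
        ax2.set_title("DADJ (pyadjoint)")
        fig.tight_layout()
        fig.savefig(fname)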

> Emil
> 
>> $ python heat_adj.py -tao_type blmvm -tao_view -tao_monitor -tao_gatol 1e-7 -tao_ls_type unit
>> iter =   0, Function value: 0.000316722,  Residual: 0.00126285
>> iter =   1, Function value: 3.82272e-05,  Residual: 0.000438094
>> iter =   2, Function value: 1.26011e-07,  Residual: 8.4194e-08
>> Tao Object: 1 MPI processes
>>    type: blmvm
>>        Gradient steps: 0
>>    TaoLineSearch Object: 1 MPI processes
>>      type: unit
>>    Active Set subset type: subvec
>>    convergence tolerances: gatol=1e-07,   steptol=0.,   gttol=0.
>>    Residual in Function/Gradient:=8.4194e-08
>>    Objective value=1.26011e-07
>>    total number of iterations=2,                          (max: 2000)
>>    total number of function/gradient evaluations=3,      (max: 4000)
>>    Solution converged:    ||g(X)|| <= gatol
>> 
>> $ python heat_adj.py -tao_type blmvm -tao_view -tao_monitor -tao_fd_gradient
>> iter =   0, Function value: 0.000316722,  Residual: 4.87343e-06
>> iter =   1, Function value: 0.000195676,  Residual: 3.83011e-06
>> iter =   2, Function value: 1.26394e-07,  Residual: 1.60262e-09
>> Tao Object: 1 MPI processes
>>    type: blmvm
>>        Gradient steps: 0
>>    TaoLineSearch Object: 1 MPI processes
>>      type: more-thuente
>>    Active Set subset type: subvec
>>    convergence tolerances: gatol=1e-08,   steptol=0.,   gttol=0.
>>    Residual in Function/Gradient:=1.60262e-09
>>    Objective value=1.26394e-07
>>    total number of iterations=2,                          (max: 2000)
>>    total number of function/gradient evaluations=3474,   (max: 4000)
>>    Solution converged:    ||g(X)|| <= gatol
>> 
>> 
>> We think that the finite difference gradient should be in line with 
>> our hand-coded gradient for such a simple example.
>> 
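(Since the angle cosine above is close to 1 while the norms differ, the
entrywise ratio of the two gradients should show whether the mismatch
is a constant factor or varies in space; a numpy sketch with
placeholder arrays fd and hc.)

    import numpy as np

    def scaling_diagnostic(fd, hc, tol=1e-12):
        # Entrywise ratio of finite difference vs hand-coded gradient;
        # a nearly constant ratio points to a missing global scaling,
        # a varying one to a missing mass matrix application.
        mask = np.abs(hc) > tol    # skip (near-)zero entries
        ratio = fd[mask] / hc[mask]
        print("fd/hc: min %g  max %g  mean %g"
              % (ratio.min(), ratio.max(), ratio.mean()))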
>> We appreciate any hints on debugging this issue. It is implemented in 
>> Python (Firedrake), and I can provide the code if needed.
>> 
>> Regards
>> Julian
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gradients.png
Type: image/png
Size: 121947 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20171123/1ed15994/attachment-0001.png>

