[petsc-users] TAO: Finite Difference vs Continuous Adjoint gradient issues

Stefano Zampini stefano.zampini at gmail.com
Wed Nov 22 09:56:02 CST 2017


Just to add to Emil's answer: since the adjoint ODE is linear, you may either
be scaling the initial condition improperly (if your objective is a
final-value one) or the adjoint forcing (i.e. the gradient of the
objective function with respect to the state, if you have a cost integral).
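To see why a scaling mistake produces exactly the symptoms in the output below (angle cosine near 1, norms off by a constant factor), here is a minimal self-contained numpy sketch. The model problem (scalar explicit-Euler ODE with a time-dependent forcing as control and a final-value objective) is a hypothetical stand-in for the heat problem in the thread, not Julian's actual code:

```python
import numpy as np

# Forward model: explicit Euler for du/dt = -k*u + f_n, with final-value
# objective J = 0.5*(u_N - target)^2.  The control is the forcing f.
k, dt, N, target = 1.0, 0.01, 100, 2.0

def forward(f):
    u = 0.0
    for n in range(N):
        u = u + dt * (-k * u + f[n])
    return u

def objective(f):
    return 0.5 * (forward(f) - target) ** 2

def discrete_adjoint_gradient(f):
    # Terminal condition lam_N = dJ/du_N, then propagate backwards
    # through the adjoint of the Euler update u_{n+1} = (1-k*dt)*u_n + dt*f_n.
    lam = forward(f) - target
    g = np.empty(N)
    for n in reversed(range(N)):
        g[n] = dt * lam             # dJ/df_n: the dt is the quadrature weight
        lam = (1.0 - k * dt) * lam
    return g

def fd_gradient(f, h=1e-6):
    # Central finite differences, component by component.
    g = np.empty(N)
    for n in range(N):
        e = np.zeros(N); e[n] = h
        g[n] = (objective(f + e) - objective(f - e)) / (2.0 * h)
    return g

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

f = np.linspace(0.0, 1.0, N)
g_adj, g_fd = discrete_adjoint_gradient(f), fd_gradient(f)
g_bad = g_adj / dt   # drop the dt weight: same direction, wrong length

print("adjoint vs FD rel. error:", np.linalg.norm(g_adj - g_fd) / np.linalg.norm(g_fd))
print("mis-scaled: cosine =", cos(g_bad, g_fd),
      "norm ratio =", np.linalg.norm(g_bad) / np.linalg.norm(g_fd))
```

The correctly weighted discrete adjoint matches the finite-difference gradient to roundoff, while the mis-scaled one keeps a cosine of essentially 1 against it but has its norm off by the constant factor 1/dt.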

2017-11-22 18:34 GMT+03:00 Smith, Barry F. <bsmith at mcs.anl.gov>:

>
>
> > On Nov 22, 2017, at 3:48 AM, Julian Andrej <juan at tf.uni-kiel.de> wrote:
> >
> > Hello,
> >
> > we prepared a small example which computes the gradient via the
> continuous adjoint method of a heating problem with a cost functional.
>
>    Julian,
>
>      The first thing to note is that the continuous adjoint is not exactly
> the same as the adjoint of the actual algebraic system you are solving.
> (As I understand it, the two only possibly agree in the limit of a very
> fine mesh and time step.) Thus you would not actually expect these to match
> with PETSc's fd. As you refine in space/time, do the numbers get closer to
> each other?
>
>   Note the angle cosine is very close to one which means that they are
> producing the same search direction, just different lengths.
>
>    How is the convergence of the solver if you use -tao_fd_gradient? Do you
> still need -tao_ls_type unit?
>
> > but have to use -tao_ls_type unit.
>
>    This is slightly odd, because this line search always just takes the
> full step; the other ones would normally be better since they are more
> sophisticated in picking the step size. Please run without -tao_ls_type
> unit and send the output.
>
>    Also does your problem have bound constraints? If not use -tao_type
> lmvm  and send the output.
>
>    Just saw Emil's email, yes there could easily be a scaling issue with
> your continuous adjoint.
>
>   Barry
>
>
>
> >
> > We implemented the textbook example and tested the gradient via a
> Taylor remainder test (which works fine). Now we wanted to solve the
> > optimization problem with TAO, checked the gradient against the finite
> difference gradient, and ran into problems.
> >
> > Testing hand-coded gradient (hc) against finite difference gradient
> (fd), if the ratio ||fd - hc|| / ||hc|| is
> > 0 (1.e-8), the hand-coded gradient is probably correct.
> > Run with -tao_test_display to show difference
> > between hand-coded and finite difference gradient.
> > ||fd|| 0.000147076, ||hc|| = 0.00988136, angle cosine =
> (fd'hc)/||fd||||hc|| = 0.99768
> > 2-norm ||fd-hc||/max(||hc||,||fd||) = 0.985151, difference ||fd-hc|| =
> 0.00973464
> > max-norm ||fd-hc||/max(||hc||,||fd||) = 0.985149, difference ||fd-hc|| =
> 0.00243363
> > ||fd|| 0.000382547, ||hc|| = 0.0257001, angle cosine =
> (fd'hc)/||fd||||hc|| = 0.997609
> > 2-norm ||fd-hc||/max(||hc||,||fd||) = 0.985151, difference ||fd-hc|| =
> 0.0253185
> > max-norm ||fd-hc||/max(||hc||,||fd||) = 0.985117, difference ||fd-hc|| =
> 0.00624562
> > ||fd|| 8.84429e-05, ||hc|| = 0.00594196, angle cosine =
> (fd'hc)/||fd||||hc|| = 0.997338
> > 2-norm ||fd-hc||/max(||hc||,||fd||) = 0.985156, difference ||fd-hc|| =
> 0.00585376
> > max-norm ||fd-hc||/max(||hc||,||fd||) = 0.985006, difference ||fd-hc|| =
> 0.00137836
> >
> > Despite these differences we achieve convergence with our hand coded
> gradient, but have to use -tao_ls_type unit.
> >
> > $ python heat_adj.py -tao_type blmvm -tao_view -tao_monitor -tao_gatol
> 1e-7 -tao_ls_type unit
> > iter =   0, Function value: 0.000316722,  Residual: 0.00126285
> > iter =   1, Function value: 3.82272e-05,  Residual: 0.000438094
> > iter =   2, Function value: 1.26011e-07,  Residual: 8.4194e-08
> > Tao Object: 1 MPI processes
> >  type: blmvm
> >      Gradient steps: 0
> >  TaoLineSearch Object: 1 MPI processes
> >    type: unit
> >  Active Set subset type: subvec
> >  convergence tolerances: gatol=1e-07,   steptol=0.,   gttol=0.
> >  Residual in Function/Gradient:=8.4194e-08
> >  Objective value=1.26011e-07
> >  total number of iterations=2,                          (max: 2000)
> >  total number of function/gradient evaluations=3,      (max: 4000)
> >  Solution converged:    ||g(X)|| <= gatol
> >
> > $ python heat_adj.py -tao_type blmvm -tao_view -tao_monitor
> -tao_fd_gradient
> > iter =   0, Function value: 0.000316722,  Residual: 4.87343e-06
> > iter =   1, Function value: 0.000195676,  Residual: 3.83011e-06
> > iter =   2, Function value: 1.26394e-07,  Residual: 1.60262e-09
> > Tao Object: 1 MPI processes
> >  type: blmvm
> >      Gradient steps: 0
> >  TaoLineSearch Object: 1 MPI processes
> >    type: more-thuente
> >  Active Set subset type: subvec
> >  convergence tolerances: gatol=1e-08,   steptol=0.,   gttol=0.
> >  Residual in Function/Gradient:=1.60262e-09
> >  Objective value=1.26394e-07
> >  total number of iterations=2,                          (max: 2000)
> >  total number of function/gradient evaluations=3474,      (max: 4000)
> >  Solution converged:    ||g(X)|| <= gatol
> >
> >
> > We think that the finite difference gradient should agree with our
> hand-coded gradient for such a simple example.
> >
> > We appreciate any hints on debugging this issue. It is implemented in
> Python (Firedrake), and I can provide the code if needed.
> >
> > Regards
> > Julian
>
>

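To make the Taylor-remainder check Julian mentions concrete: if g is the exact gradient of J at m, then |J(m + h*dm) - J(m) - h*g.dm| should shrink like O(h^2), i.e. the observed convergence rate should approach 2. A minimal numpy sketch (the objective and gradient here are hypothetical stand-ins, not the heat problem from the thread):

```python
import numpy as np

def taylor_test(J, grad, m, dm, hs=(1e-2, 5e-3, 2.5e-3)):
    # Remainder |J(m + h*dm) - J(m) - h*g.dm| for a sequence of step sizes h,
    # then the observed convergence rates; rates ~ 2 indicate a correct gradient.
    g = grad(m)
    r = [abs(J(m + h * dm) - J(m) - h * (g @ dm)) for h in hs]
    return [np.log(r[i] / r[i + 1]) / np.log(hs[i] / hs[i + 1])
            for i in range(len(r) - 1)]

# Hypothetical smooth test objective: J(m) = sum sin^2(m_i), grad = sin(2m).
J = lambda m: float(np.sum(np.sin(m) ** 2))
grad = lambda m: np.sin(2.0 * m)

m = np.array([0.3, 1.1, -0.7])
dm = np.array([1.0, -2.0, 0.5])
rates = taylor_test(J, grad, m, dm)
print(rates)
```

A gradient that is merely parallel to the true one (the mis-scaling case) fails this test: the first-order term does not cancel, and the rates degrade to roughly 1.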

-- 
Stefano
