[petsc-users] Calculating adjoint of more than one cost function separately

Salazar De Troya, Miguel salazardetro1 at llnl.gov
Tue Dec 29 11:27:23 CST 2020


Hi Hong,

I wanted to have separate calls to TSAdjointSolve() for each cost functional purely for design purposes (separation of concerns). In pyadjoint, the ReducedFunctional object encapsulates the functionality of a single cost functional and its derivative. Now I understand that there is a very compelling reason to calculate all cost functional gradients together (it saves on checkpoint loading). Thanks for clarifying that. I will work with that in mind from now on.

Best,
Miguel

From: "Zhang, Hong" <hongzhang at anl.gov>
Date: Monday, December 28, 2020 at 8:43 PM
To: "Salazar De Troya, Miguel" <salazardetro1 at llnl.gov>
Cc: "Salazar De Troya, Miguel via petsc-users" <petsc-users at mcs.anl.gov>
Subject: Re: [petsc-users] Calculating adjoint of more than one cost function separately




On Dec 28, 2020, at 9:31 PM, Salazar De Troya, Miguel <salazardetro1 at llnl.gov> wrote:

Hello,

Thanks for your response, Hong. I see that all cost functionals are evaluated in a single backward run.

All gradients are evaluated in a single backward run, not necessarily the cost functionals themselves.


However, I want to do it separately. I want to isolate the evaluation of the gradients for each cost functional.

What is the motivation for doing multiple TSAdjointSolve() calls in your case? Note that evaluating the gradients in one call is more efficient because you do not have to load the same checkpoints multiple times.


Can you please elaborate on how to reuse the trajectory for multiple calls? Specifically, how do I set the trajectory back to the end so that I can call TSAdjointSolve() again?

You do not need to touch the trajectory itself. Before each adjoint run, you can reset TS to the same state it was in when the forward run ended by specifying the final time, the step size, and the step number. You will be limited to disk checkpointing (the default option). Here is an example modified from ex20adj.c:

diff --git a/src/ts/tutorials/ex20adj.c b/src/ts/tutorials/ex20adj.c
index 8ca9e0b7ba..e185bc4721 100644
--- a/src/ts/tutorials/ex20adj.c
+++ b/src/ts/tutorials/ex20adj.c
@@ -277,6 +277,11 @@ int main(int argc,char **argv)
   ierr = TSGetSolveTime(ts,&user.ftime);CHKERRQ(ierr);
   ierr = TSGetStepNumber(ts,&user.steps);CHKERRQ(ierr);

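+  /* Rewind TS to its end-of-forward-run state so TSAdjointSolve() can be called again */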
+  for (PetscInt iter=1; iter<3; iter++) {
+    ierr = TSSetTime(ts,user.ftime);CHKERRQ(ierr);
+    ierr = TSSetTimeStep(ts,0.001);CHKERRQ(ierr);
+    ierr = TSSetStepNumber(ts,user.steps);CHKERRQ(ierr);
   /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
      Adjoint model starts here
      - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
@@ -321,7 +326,7 @@
   ierr = VecRestoreArray(user.mup[1],&x_ptr);CHKERRQ(ierr);
   ierr = VecRestoreArray(user.lambda[1],&y_ptr);CHKERRQ(ierr);
   ierr = PetscPrintf(PETSC_COMM_WORLD,"\n sensitivity wrt parameters: d[z(tf)]/d[mu]\n%g\n",(double)PetscRealPart(derp));CHKERRQ(ierr);
-
+  }
   /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
      Free work space.  All PETSc objects should be destroyed when they
      are no longer needed.

Hong (Mr.)



Miguel
From: "Zhang, Hong" <hongzhang at anl.gov>
Date: Monday, December 28, 2020 at 6:16 PM
To: "Salazar De Troya, Miguel" <salazardetro1 at llnl.gov>
Cc: "Salazar De Troya, Miguel via petsc-users" <petsc-users at mcs.anl.gov>
Subject: Re: [petsc-users] Calculating adjoint of more than one cost function separately





On Dec 27, 2020, at 5:01 PM, Salazar De Troya, Miguel via petsc-users <petsc-users at mcs.anl.gov> wrote:

Hello,

I am interested in calculating the gradients of an optimization problem with one goal function and one constraint function, both of which need TSAdjoint for their derivatives. I’d like to compute each of their adjoints in a separate call, but it does not seem possible without making compromises.

If you are calculating the derivatives with respect to the same set of parameters, the adjoints of all cost functionals can be computed in a single backward run.



For instance, one could set up TSCreateQuadratureTS() and TSSetCostGradients() with a different quadrature (and its gradients) for each adjoint call, one at a time. This would evaluate the cost functions in the backward run, though, whereas one typically computes the cost functions in a routine separate from the adjoint call (as in line-search evaluations).

The second argument of TSCreateQuadratureTS() lets you choose whether the quadrature is evaluated in the forward run or in the backward run. The choice typically depends on the optimization algorithm. Some optimization algorithms expect users to provide the objective function and its gradient as a bundle; in this case, the choice makes no difference. Other algorithms may occasionally evaluate the objective function without evaluating its gradient; in that case, evaluating the quadrature in the forward run is clearly the better choice.
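
For concreteness, a minimal sketch of the forward-run choice (quadts, CostIntegrand, and user are illustrative names, not from this thread):

  TS quadts;
  /* PETSC_TRUE: integrate the quadrature during TSSolve(); PETSC_FALSE defers it to TSAdjointSolve() */
  ierr = TSCreateQuadratureTS(ts,PETSC_TRUE,&quadts);CHKERRQ(ierr);
  /* CostIntegrand evaluates the integrand(s) of the cost functional(s) at (t,U) */
  ierr = TSSetRHSFunction(quadts,NULL,CostIntegrand,&user);CHKERRQ(ierr);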


One could also set up TSCreateQuadratureTS() with both the goal and the constraint functions evaluated in the forward run (as is typically done when computing the cost function). The problem is that the adjoint call then requires two sets of gradients for TSSetCostGradients(), and their adjoints are calculated together, costing twice as much if your routines for the cost and the constraint gradients are separate.

You can put the two sets of gradients in vector arrays and pass them to TSSetCostGradients() together. Only one call to TSAdjointSolve() is needed. See the example src/ts/tutorials/ex20adj.c, where we have two independent cost functionals whose adjoints correspond to lambda[0]/mup[0] and lambda[1]/mup[1], respectively. After performing a TSAdjointSolve(), you will get the gradients for both cost functionals.
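
A minimal sketch of that pattern, following the naming in ex20adj.c (the setup is elided; each pair must be seeded with the terminal conditions of its cost functional):

  Vec lambda[2],mup[2];  /* one lambda/mup pair per cost functional */
  /* ... create the vectors and seed lambda[i] with dPhi_i/dy(tf) and mup[i] with dPhi_i/dp ... */
  ierr = TSSetCostGradients(ts,2,lambda,mup);CHKERRQ(ierr);
  ierr = TSAdjointSolve(ts);CHKERRQ(ierr);  /* one backward run computes both gradients */
  /* lambda[0]/mup[0] now hold the gradient of the first cost functional,
     lambda[1]/mup[1] that of the second */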




The only solution I can think of is to set up TSCreateQuadratureTS() with both the goal and the constraint functions evaluated in the forward run. Then, in each adjoint call, reset TSCreateQuadratureTS() with just the cost function I am interested in (either the goal or the constraint) and set a single gradient via TSSetCostGradients(). Will this work? Are there better alternatives?

TSCreateQuadratureTS() is needed only when you have integral terms in the cost functionals. It has nothing to do with the procedure for computing the adjoints of multiple cost functionals simultaneously. Do you have integrals in both the goal and the constraint? If so, you can create one quadrature TS and evaluate both integrals together. For example, you may have r[0] (the first element of the output vector in your cost integrand) for the goal and r[1] for the constraint. Just be careful that the adjoint variables (the lambda[]/mup[] arrays) are organized in the same order.
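
As a sketch, a combined integrand routine could look like the following (the integrand expressions are placeholders, not from this thread):

static PetscErrorCode CostIntegrand(TS ts,PetscReal t,Vec U,Vec R,void *ctx)
{
  PetscErrorCode    ierr;
  const PetscScalar *u;
  PetscScalar       *r;

  PetscFunctionBeginUser;
  ierr = VecGetArrayRead(U,&u);CHKERRQ(ierr);
  ierr = VecGetArray(R,&r);CHKERRQ(ierr);
  r[0] = u[0]*u[0]; /* integrand of the goal (placeholder) */
  r[1] = u[1];      /* integrand of the constraint (placeholder) */
  ierr = VecRestoreArray(R,&r);CHKERRQ(ierr);
  ierr = VecRestoreArrayRead(U,&u);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}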




Even if that works, there is the problem that the trajectory goes back to the beginning once TSAdjointSolve() is performed. Subsequent calls to TSAdjointSolve() (for instance, for another cost function) are invalid because the trajectory is no longer positioned at the end of the simulation. One needs to run the forward problem again to bring it back to the end. Is there a quick way to set the trajectory state to the last time step without having to run the forward problem? I am attaching an example to illustrate this issue. One can uncomment lines 120-122 to obtain the right value of the derivative.

Most likely you need only one call to TSAdjointSolve(). Reusing the trajectory for multiple calls is also doable, but I doubt you would need it.

Hong (Mr.)




Thanks
Miguel

Miguel A. Salazar de Troya
Postdoctoral Researcher, Lawrence Livermore National Laboratory
B141
Rm: 1085-5
Ph: 1(925) 422-6411
<simple-ode.py>



