[petsc-dev] Discussion about time-dependent optimization moved from PR

Wed Oct 18 13:54:47 CDT 2017

> On Oct 18, 2017, at 7:31 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> 
> 
>> On Oct 18, 2017, at 4:47 AM, Stefano Zampini <stefano.zampini at gmail.com> wrote:
>> 
>> I don't have a real API, just a sketch of a possible general framework (with holes....)
>> 
>> First, I prefer having TSCreateAdjointTS(ts,&ats) (and possibly SNESGetAdjointSNES for optimization with nonlinear time-independent problems) that allows users to replace their Jacobian evaluation routines.
> 
>   So you don't like my idea of pulling the sensitivity API out of TS and putting it into a higher level object (that serves up the needed integrators via PetscSensiGetTS() and PetscSensiGetAdjointTS()?  Do you think that my model adds unneeded complexity to users, or what?

I agree on pulling it out, but instead of having PetscSensi, I proposed to have a DM that has the API with callbacks for parameter depedent evaluation of residuals and Jacobians (and Hessian terms).
It can also be TSGetDMDesign() and SNESGetDMDesign() instead of DMGetDMDesign which does not sound very well.

TSCreateAdjointTS() just gives you back a TS that is able to perform the adjoint run (either continuous or discrete) and shares the DMDesign, that can activate first-order or second order adjoints.
Both approaches have the same forcing term API (that only differ by the sampling time) which samples the state derivative of the objective functions (for gradient computations) and on some hessian terms for second-order adjoints (don’t know the details for discrete adjoints) The quadrature function they need to perform is the same, except for the input. I think DMDesign can also provide support for quadrature callbacks.

> 
>  Barry
> 
> 
> 
>> 
>> Then, there's the need of carrying over the design vector to TS (and SNES), and this can be accomplished by a DM
>> 
>> // User code
>> DMGetDMDesign(dm,&ddm)
>> DMSetApplicationContext(ddm,userctx);                       // attach user context
>> DMDesignSetSetUp(ddm,(PetscErrorCode*)(DM,Vec)) //setup user context
>> DMDesignSetGlobalToLocal //maybe other callbacks ???
>> DMDesignSetDesignVec(ddm,currentdesign)                //current design vector, can be updated by higher level routines
>> 
>> Then, within PETSc, we can do
>> 
>> TSGetDM(ts,&dm)
>> DMGetDMTS(dm,&tdm)
>> DMGetDMDesign(dm,&ddm)
>> DMTSSetDMDesign(tdm,ddm) // without taking reference on ddm, just passing callbacks from DMDesign to private DMTS ops 
>> 
>> and for gradient based optimization and sensitivity for DAEs, we need to implement the equivalents of TSSet{Gradient|Hessian}{DAE|IC} in DMTS. DMDesign should provide an API with callbacks for computing gradients (and possibly hessian terms) of residual evaluations wrt the parameters.
>> 
>> Similarly for a parameter dependent SNES
>> 
>> SNESGetDM(snes,&dm)
>> DMGetDMSNES(dm,&kdm)
>> DMGetDMDesign(dm,&ddm)
>> DMSNESSetDMDesign(kdm,ddm) // without taking reference on ddm, just passing callbacks from DMDesign to private DMSNES ops
>> 
>> Then, we need a consistent way of expressing objective functions (DMTAOObjective???), and embed quadratures inside TSSolve(). Thoughts?
>> 
>> 
>> 
>> 2017-10-17 22:02 GMT+03:00 Jed Brown <jed at jedbrown.org>:
>> Barry Smith <bsmith at mcs.anl.gov> writes:
>> 
>>>> On Oct 17, 2017, at 10:41 AM, Jed Brown <jed at jedbrown.org> wrote:
>>>> 
>>>> Barry Smith <bsmith at mcs.anl.gov> writes:
>>>> 
>>>>>>>> My
>>>>>>>> preference is for TSCreateAdjointTS() which the user can customize to
>>>>>>>> use arbitrary time integration methods (thus becoming a
>>>>>>>> "continuous-in-time" adjoint) and to use different spatial
>>>>>>>> discretizations (for consistency with the adjoint PDE).
>>>>>>> 
>>>>>>> I understand, I think TSCreateAdjointTS() is not the right model. I think that the correct model is, for example
>>>>>>> 
>>>>>>> PetscSensiCreate()
>>>>>>> PetscSensiSetObjective.... or whatever etc etc
>>>>>>> PetscSensiGetIntegrator(,&TS)
>>>>>>> PetscSensiGetAdjointIntegrator(,&TS).
>>>>>> 
>>>>>> How would PetscSensiGetAdjointIntegrator be implemented?  Since it needs
>>>>>> to be able to create a discrete adjoint (which depends on the forward TS
>>>>>> implementation), the control flow *must* go through TS.  What does that
>>>>>> interface look like?
>>>>> 
>>>>>  I am not sure what you mean.
>>>>> 
>>>>>  PetscSensi will contain the information about objective function etc as well as a pointer to the trajectory history and (of course) a pointer to the original TS. All the  auxiliary vectors etc related to sensitivities.
>>>>> 
>>>>>  The adjoint TS that is returned has the calls need to get access to the forward solution that it needs etc from the PetscSensi object (which likely will call the forward TS for missing timestep values etc. But the calls go through the PetscSensi object from the adjoint TS, they don't go directly to the forward TS.)
>>>> 
>>>> To be concrete with TSRK, PetscSensi does not have access to the Butcher
>>>> table for the forward method because that only exists in
>>>> src/ts/impls/explicit/rk/rk.c.  Since the discrete adjoint depends on
>>>> this detail of the forward method, there MUST be some code somewhere
>>>> that depends on the forward Butcher table.  Since that data ONLY EXISTS
>>>> in rk.c, whatever it is that PetscSensi does, there must be some code in
>>>> rk.c to define the adjoint method.
>>> 
>>>   Of course, that would be incorporated in what is an essentially a TSRKADJOINT implementation that does the adjoint integration. Note that (1) this is simpler than the regular TSRK because it only needs to handle linear with given time steps (2) it also computes along sensitivity information with calls to TSAdjointComputeDRDPFunction() etc
>>> 
>>>   Here is what the code looks like now. Just needs minor refactoring to make it match my proposal.
>> 
>> Right, so this function needs to call the dF/dp (RHSJacobian), dr/dy,
>> and dr/dp functions.  You're proposing that those would be set on
>> PetscSensi, but this code below will continue to live in rk.c after
>> refactoring (rather than in PetscSensi).
>> 
>> Since PetscSensi is not inside TS, there needs to be a public TS
>> interface ("developer level" if you like, but still public).  I want to
>> spec out that essential bit of public TS interface.
>> 
>> Note that I proposed putting the sensitivity update stuff below into
>> post-stage functions.  If that is possible, the bulk of the function
>> could actually live in PetscSensi.  Or it could be packaged with TS and
>> reusable across TS implementations.  I still want to start by spec'ing
>> out what the TS public interface needs to able to support this
>> PetscSensi object.
>> 
>>> static PetscErrorCode TSAdjointStep_RK(TS ts)
>>> {
>>>  TS_RK           *rk   = (TS_RK*)ts->data;
>>>  RKTableau        tab  = rk->tableau;
>>>  const PetscInt   s    = tab->s;
>>>  const PetscReal *A = tab->A,*b = tab->b,*c = tab->c;
>>>  PetscScalar     *w    = rk->work;
>>>  Vec             *Y    = rk->Y,*VecDeltaLam = rk->VecDeltaLam,*VecDeltaMu = rk->VecDeltaMu,*VecSensiTemp = rk->VecSensiTemp;
>>>  PetscInt         i,j,nadj;
>>>  PetscReal        t = ts->ptime;
>>>  PetscErrorCode   ierr;
>>>  PetscReal        h = ts->time_step;
>>>  Mat              J,Jp;
>>> 
>>>  PetscFunctionBegin;
>>>  rk->status = TS_STEP_INCOMPLETE;
>>>  for (i=s-1; i>=0; i--) {
>>>    rk->stage_time = t + h*(1.0-c[i]);
>>>    for (nadj=0; nadj<ts->numcost; nadj++) {
>>>      ierr = VecCopy(ts->vecs_sensi[nadj],VecSensiTemp[nadj]);CHKERRQ(ierr);
>>>      ierr = VecScale(VecSensiTemp[nadj],-h*b[i]);CHKERRQ(ierr);
>>>      for (j=i+1; j<s; j++) {
>>>        ierr = VecAXPY(VecSensiTemp[nadj],-h*A[j*s+i],VecDeltaLam[nadj*s+j]);CHKERRQ(ierr);
>>>      }
>>>    }
>>>    /* Stage values of lambda */
>>>    ierr = TSGetRHSJacobian(ts,&J,&Jp,NULL,NULL);CHKERRQ(ierr);
>>>    ierr = TSComputeRHSJacobian(ts,rk->stage_time,Y[i],J,Jp);CHKERRQ(ierr);
>>>    if (ts->vec_costintegral) {
>>>      ierr = TSAdjointComputeDRDYFunction(ts,rk->stage_time,Y[i],ts->vecs_drdy);CHKERRQ(ierr);
>>>    }
>>>    for (nadj=0; nadj<ts->numcost; nadj++) {
>>>      ierr = MatMultTranspose(J,VecSensiTemp[nadj],VecDeltaLam[nadj*s+i]);CHKERRQ(ierr);
>>>      if (ts->vec_costintegral) {
>>>        ierr = VecAXPY(VecDeltaLam[nadj*s+i],-h*b[i],ts->vecs_drdy[nadj]);CHKERRQ(ierr);
>>>      }
>>>    }
>>> 
>>>    /* Stage values of mu */
>>>    if(ts->vecs_sensip) {
>>>      ierr = TSAdjointComputeRHSJacobian(ts,rk->stage_time,Y[i],ts->Jacp);CHKERRQ(ierr);
>>>      if (ts->vec_costintegral) {
>>>        ierr = TSAdjointComputeDRDPFunction(ts,rk->stage_time,Y[i],ts->vecs_drdp);CHKERRQ(ierr);
>>>      }
>>> 
>>>      for (nadj=0; nadj<ts->numcost; nadj++) {
>>>        ierr = MatMultTranspose(ts->Jacp,VecSensiTemp[nadj],VecDeltaMu[nadj*s+i]);CHKERRQ(ierr);
>>>        if (ts->vec_costintegral) {
>>>          ierr = VecAXPY(VecDeltaMu[nadj*s+i],-h*b[i],ts->vecs_drdp[nadj]);CHKERRQ(ierr);
>>>        }
>>>      }
>>>    }
>>>  }
>>> 
>>>  for (j=0; j<s; j++) w[j] = 1.0;
>>>  for (nadj=0; nadj<ts->numcost; nadj++) {
>>>    ierr = VecMAXPY(ts->vecs_sensi[nadj],s,w,&VecDeltaLam[nadj*s]);CHKERRQ(ierr);
>>>    if(ts->vecs_sensip) {
>>>      ierr = VecMAXPY(ts->vecs_sensip[nadj],s,w,&VecDeltaMu[nadj*s]);CHKERRQ(ierr);
>>>    }
>>>  }
>>>  rk->status = TS_STEP_COMPLETE;
>>>  PetscFunctionReturn(0);
>>> }
>>> 
>>>> 
>>>> What is that code and what is the call stack to reach it?
>>> 
>>> Now you ask, how does the PetscSensi know to use that discrete adjoint integrator and return the right TS (instead of an empty one requiring the user to set the type)? This is simple, each TS can return
>>> its discrete adjoint integrator when requested, sure mechanically it
>>> is like a hidden TSGetAdjointTS() but from the public API
>> 
>> It's public TS API: PetscSensi will include petscts.h, not
>> petsc/private/tsimpl.h.  Let's work out what that TS API needs to be and
>> then we can discuss whether it is "clearer" to set that stuff on a
>> helper object.  If all we have are pass-through interfaces from
>> PetscSensi to TS, then it's probably not useful.  If we have a
>> meaningfully higher level interface, then it may be useful.
>> 
>>> point of view it is much clearer to come from the PetscSensi. You tell
>>> the PetscSensi all the information you need about the derivatives, you
>>> don't tell the TS. You are also right that not EVERYTHING related to
>>> sensitivities is only in the PetscSensi code since the TSRKADJOINT
>>> object knows about sensitivities that it carries around (and is in
>>> rk.c) but from the user point of view only the PetscSensi knows about
>>> sensitivities.
>>> 
>>>   Of course I could still be missing stuff
>>> 
>>>   Barry
>>> 
>>> 
>>>> 
>>>>>  Both proposed models shove one heck of a lot of stuff into TS; I think it would be a better design and better user API if we don't do that; we reserve the TS for being able to integrate something (and that is about all). Maybe it is not possible but I don't see why it is not possible.
>>>>> 
>>>>>  Since I don't know the details I asked for the people who do know the details (Hong and Stefano) to try to develop such an API (or clearly articular why it is impossible.)
>> 
>> 
>> 
>> -- 
>> Stefano
>