[petsc-users] TSAdjoint multilevel checkpointing running out of memory

Matthew Knepley knepley at gmail.com
Tue Dec 8 19:37:53 CST 2020

On Tue, Dec 8, 2020 at 6:47 PM Zhang, Hong via petsc-users <
petsc-users at mcs.anl.gov> wrote:

> Anton,
> TSAdjoint should manage checkpointing automatically, and the number of
> checkpoints in RAM and disk should not exceed the user-specified values.
> Can you send us the output for -ts_trajectory_monitor in your case?

One other thing: it is always possible to miscalculate RAM a little. If you
set -ts_trajectory_max_cps_ram to 4, does it complete?
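
To make the "miscalculate RAM" point concrete, here is a rough
back-of-envelope sketch in Python. It assumes each checkpoint holds the
solution vector plus some number of extra vectors (stage values, for
example); what is actually stored per checkpoint depends on the integrator
and options, so that count is an assumption, and the function names and all
sizes are hypothetical placeholders rather than anything from this thread.

```python
# Back-of-envelope RAM estimate for in-memory checkpoints.
# All sizes below are hypothetical placeholders, not values from this thread.

def checkpoint_bytes(n_dofs, n_extra_vectors=0, bytes_per_scalar=8):
    """Bytes for one checkpoint: the solution vector plus any extra vectors."""
    return n_dofs * (1 + n_extra_vectors) * bytes_per_scalar


def ram_checkpoints_that_fit(available_bytes, n_dofs, n_extra_vectors=0,
                             bytes_per_scalar=8, safety_fraction=0.8):
    """Number of checkpoints that fit if only safety_fraction of RAM is usable."""
    per_checkpoint = checkpoint_bytes(n_dofs, n_extra_vectors, bytes_per_scalar)
    return int(available_bytes * safety_fraction) // per_checkpoint


if __name__ == "__main__":
    n_dofs = 50_000_000        # hypothetical per-process problem size
    free_ram = 4 * 1024**3     # hypothetical 4 GiB left over for checkpoints
    for extra in (0, 4):
        n = ram_checkpoints_that_fit(free_ram, n_dofs, n_extra_vectors=extra)
        print(f"extra vectors per checkpoint: {extra} -> checkpoints that fit: {n}")
```

If the per-checkpoint size is underestimated (extra vectors not counted, or
no headroom left for the solver itself), a count that looks safe on paper
can still exhaust memory, which is why trying one fewer checkpoint is a
cheap test.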



> Hong (Mr.)
> On Dec 8, 2020, at 3:37 PM, Anton Glazkov <anton.glazkov at chch.ox.ac.uk>
> wrote:
> Good evening,
> I’m attempting to run a multi-level checkpointing code on a cluster (i.e.
> RAM+disk storage, with --download-revolve as a configure option) with the
> options “-ts_trajectory_type memory -ts_trajectory_max_cps_ram 5
> -ts_trajectory_max_cps_disk 5000”, for example. My question is: if I have,
> say, 100,000 time points that need to be evaluated during the forward and
> adjoint runs, does TSAdjoint automatically optimize the checkpointing so
> that the numbers of checkpoints in RAM and on disk do not exceed these
> values, or is one of the options ignored? I ask because I have a case that
> runs correctly with -ts_trajectory_type basic, but runs out of memory when
> attempting to fill the checkpoints in RAM when running the adjoint (I have
> verified that 5 checkpoints will actually fit into the available memory).
> This makes me think that maybe the -ts_trajectory_max_cps_ram 5 option is
> being ignored?
> Best wishes,
> Anton
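
For a sense of scale on the numbers in the question, the sketch below uses
the classical single-level binomial bound from the revolve literature: with
s checkpoint slots and at most r forward recomputations of any step, up to
C(s+r, s) steps can be reversed. It treats RAM and disk slots as
interchangeable, which a multilevel scheme does not, and it is not a
description of what TSTrajectory does internally; the function names are
hypothetical.

```python
from math import comb


def max_steps(slots, sweeps):
    """Binomial bound: steps reversible with `slots` checkpoints and at most
    `sweeps` forward recomputations of any single step."""
    return comb(slots + sweeps, slots)


def min_sweeps(n_steps, slots):
    """Smallest recomputation count for which n_steps fits within the bound."""
    sweeps = 0
    while max_steps(slots, sweeps) < n_steps:
        sweeps += 1
    return sweeps


if __name__ == "__main__":
    n_steps = 100_000                 # time points from the question
    for slots in (5, 5 + 5000):       # RAM only, then RAM + disk pooled together
        print(f"slots={slots}: recomputation count >= {min_sweeps(n_steps, slots)}")
```

Under these simplifying assumptions, 100,000 steps cannot all be stored in
5 + 5000 slots, so some recomputation is unavoidable, but far less than with
the 5 RAM slots alone.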

What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ <http://www.cse.buffalo.edu/~knepley/>