[Nek5000-users] Restart problem

Thu Dec 22 22:22:55 CST 2011

> Hi
>
> I am simulating a jet-in-crossflow problem with Nek5000 and I've got
> serious problems with restarting the simulation. The restart causes strong
> spurious velocity oscillations and I cannot get rid of them. I've
> implemented the restart procedure described in prepost.f, but it doesn't
> help much. Playing with the file format (parameters p66, p67) and
> projection (p94, p95), I can only decrease the oscillation amplitude; the
> oscillations are still there. Surprisingly, saving output files in double
> precision (p63=8) makes everything worse. Has anybody had similar problems?
> Best regards
>
> Adam

Hi,

I think that the full restart capability should now work with
the current version of the source.

There is a 2D example in the repo, with a README that I also
provide below.

Basically, you will now save 4 files each time you wish to
checkpoint.  The files come in two sets, A and B; the A set
is overwritten by the 3rd checkpoint, the B set by the 4th,
and so on, so that you have at most 8 checkpoint files on hand
at any one time.  The files are 64-bit and therefore cannot be
read by VisIt: they are meant purely as checkpoint/restart files,
not analysis files.  More information is in the README below.
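
To make the A/B alternation concrete, here is a small Python sketch (not part
of Nek5000).  The function name checkpoint_files is hypothetical, and the
numbering (set A = files 1-4, set B = files 5-8) is an assumption inferred
from the README example, which restarts from rs8ca0.f00005 through
rs8ca0.f00008; check the names your own run actually writes.

```python
def checkpoint_files(k, prefix="rs8ca0", per_set=4):
    """Return (set_label, file_names) for the k-th checkpoint, k = 1, 2, 3, ...

    Assumed numbering (not confirmed by the source): set A uses file
    numbers 1..4 and set B uses 5..8, so odd checkpoints overwrite A
    and even checkpoints overwrite B.
    """
    slot = (k - 1) % 2              # 0 -> set A, 1 -> set B
    first = slot * per_set + 1      # first file number of that set
    names = [f"{prefix}.f{first + i:05d}" for i in range(per_set)]
    return "AB"[slot], names


if __name__ == "__main__":
    for k in range(1, 5):
        label, names = checkpoint_files(k)
        print(k, label, names[0], "...", names[-1])
```

Under these assumptions no more than 8 distinct file names ever exist, and a
job that dies while writing one set still leaves the other set intact.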

Please let me know if you have comments or questions.

Best regards,

Paul

-------------------------------------------------------------------------
From the examples/cyl_restart directory:

SET UP:
=======

This directory contains an example of full restart capabilities for
Nek5000.

The model flow is a von Karman street in the wake of a 2D cylinder.
The quantity of interest is taken to be the lift, which is monitored
via "grep agy logfile" in the run_test script.   A MATLAB file, doit.m,
can be used to analyze the output files containing the lift history
of the four cases.  The cases are:

ca   -   initial run (no projection)
cb   -   restart run for ca case

pa   -   initial run (with projection)
pb   -   restart run for pa case

BACKGROUND:
===========

Timestepping in Nek5000 is based on BDFk/EXTk (k=3, typ.), which uses kth-order
backward-difference formulae (BDFk) to evaluate velocity time derivatives and
kth-order extrapolation (EXTk) for explicit evaluation of the nonlinear and
pressure boundary terms.  Under normal conditions, the velocity and pressure
for preceding timesteps are required to advance the solution at each step.

At startup, the timestepper is typically bootstrapped using a lower-order
BDF/EXT formula, which is usually adequate given the artificiality of most
initial conditions.  The velocity field often has enough inertia and
sufficient signature that the same bootstrap procedure also works when
restarting from an existing solution (i.e., a .fnnnnn or .fldnn file, stored
in 32-bit precision).
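
For reference, the standard BDFk and EXTk weights for k = 1, 2, 3 on a uniform
timestep can be written down as in the Python sketch below.  This is an
illustration, not code from Nek5000; the helper order_at_step is hypothetical
and assumes the bootstrap simply uses the highest order for which enough
previous levels exist.

```python
from fractions import Fraction as F

# BDFk weights for (u^{n+1}, u^n, u^{n-1}, ...); divide the sum by dt
# to approximate du/dt at t^{n+1} on a uniform timestep.
BDF = {
    1: [F(1), F(-1)],
    2: [F(3, 2), F(-2), F(1, 2)],
    3: [F(11, 6), F(-3), F(3, 2), F(-1, 3)],
}

# EXTk weights for (f^n, f^{n-1}, ...), extrapolating f to t^{n+1}.
EXT = {
    1: [F(1)],
    2: [F(2), F(-1)],
    3: [F(3), F(-3), F(1)],
}


def order_at_step(step, k=3):
    """Order usable at 1-based `step` of a fresh start (assumed bootstrap rule)."""
    return min(step, k)


if __name__ == "__main__":
    for step in range(1, 5):
        q = order_at_step(step)
        print(step, q, BDF[q], EXT[q])
```

The point for restarts is that a k = 3 step needs three earlier solution
levels; a restart from a single field file has only one, so the first steps
run at reduced order.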

For some cases, it is important to have reproducibility of the time history
to the working precision (14 digits, typ.) of the code.  The full restart
feature is designed to provide this capability.   The main features of 
full restart are:

.Preserve alternating sets of snapshots (4 per set) in 64-bit precision.
  (Alternating sets are saved in case the job fails in the middle of
   saving a set.)

.Use the most recent set to restart the computation by overwriting
  the solution for the first steps, 0 through 3, with the preserved
  snapshots.
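
The effect of restoring the history can be seen on a toy problem.  The Python
sketch below is not Nek5000 code: it applies BDF3/EXT3 to the scalar ODE
du/dt = -u - u**3, treating -u implicitly and -u**3 explicitly, and all names
(advance, g) are hypothetical.  This scalar model needs three saved levels;
the four snapshots per set used by Nek5000 are taken as given from the README.

```python
# Floating-point BDFk / EXTk weights on a uniform timestep.
BDF = {1: [1.0, -1.0], 2: [1.5, -2.0, 0.5], 3: [11 / 6, -3.0, 1.5, -1 / 3]}
EXT = {1: [1.0], 2: [2.0, -1.0], 3: [3.0, -3.0, 1.0]}


def g(u):
    """Explicitly treated nonlinear term."""
    return -u**3


def advance(history, nsteps, dt=0.01):
    """Advance du/dt = -u + g(u) by nsteps.

    `history` lists known solution levels, most recent first.  The order
    used at each step is limited by the number of levels available, which
    mimics a low-order bootstrap when only one level is supplied.
    """
    hist = list(history)
    out = []
    for _ in range(nsteps):
        k = min(len(hist), 3)
        b, a = BDF[k], EXT[k]
        rhs = dt * sum(a[j] * g(hist[j]) for j in range(k)) \
            - sum(b[j + 1] * hist[j] for j in range(k))
        u_new = rhs / (b[0] + dt)      # implicit treatment of the -u term
        hist = [u_new] + hist[:2]      # keep at most three levels
        out.append(u_new)
    return out


full = advance([1.0], 40)                              # uninterrupted run
exact = advance([full[19], full[18], full[17]], 20)    # restart with history
boot = advance([full[19]], 20)                         # restart from one level

if __name__ == "__main__":
    print("history restart identical:", exact == full[20:])
    print("bootstrap restart max diff:",
          max(abs(x - y) for x, y in zip(boot, full[20:])))
```

In this model the restart that reloads the saved levels repeats exactly the
same arithmetic as the uninterrupted run, while the single-level restart
deviates by a small but nonzero amount.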

Full restart is triggered through the .usr file.    In the given example
cases, "ca" and "cb", the restart-save is illustrated in ca.usr and the
actual restart, plus the save, is illustrated in cb.usr.   For these cases,
the restart is encapsulated in the user-provided routine "my_full_restart",
shown below along with the calling format in userchk:

c-----------------------------------------------------------------------
       subroutine userchk
       include 'SIZE'
       include 'TOTAL'

       logical if_drag_out,if_torq_out

       call my_full_restart

       scale = 1.
       if_drag_out = .true.
       if_torq_out = .false.
       call torque_calc(scale,x0,if_drag_out,if_torq_out)

       return
       end
c-----------------------------------------------------------------------
       subroutine my_full_restart
       include 'SIZE'
       include 'TOTAL'           ! needed so that iostep is visible here

       character*80 s80(4)

       call blank(s80,4*80)
       s80(1) ='rs8ca0.f00005'
       s80(2) ='rs8ca0.f00006'
       s80(3) ='rs8ca0.f00007'
       s80(4) ='rs8ca0.f00008'

       call full_restart(s80,4)  ! Will overload 5-8 onto steps 0-3

       iosave = iostep           ! Trigger save based on iostep
       call full_restart_save(iosave)

       return
       end
c-----------------------------------------------------------------------

Note that in the example above, the set enumerated 5--8 is used to restart 
the computation.   This set is generated by first running the "ca" case.

Note that the frequency of the restart output is coincident with the
standard output frequency of field files (snapshots).  This might be too 
frequent if one is, say, making a movie where snapshots are typically 
dumped every 10 steps.   It would make more sense in this case to set 
iosave=1000, say.

Note also that if one is initiating a computation from something other
than the full restart mode then the full_restart() call should be commented
out.

COMMENTS:
=========

Full reproducibility of the solution is predicated on having sufficient
history information to replicate the state of "a" when running "b".
While such replication is possible, it does preclude acceleration of the
iterative solvers by projection onto prior solution spaces [1,2], since
these projections typically retain relatively long sequences of information
(e.g., spanning tens of steps) to maximally extract all the regularity in the
solution history.   Consequently, _full_ reproducibility is not retained with
projection turned on.  In this case, the solution is reproduced only to the
tolerance of the iterative solvers, which is in any case the maximum level
of accuracy attainable in the solution.   To illustrate the difference,
we provide a test case pairing, "pa" and "pb", which is essentially the
same as the ca/cb pair except that projection is turned on for pa/pb.