[Nek5000-users] Restart problem

Thu Jan 5 02:46:37 CST 2012

On Thu, 2011-12-22 at 22:22 -0600, nek5000-users at lists.mcs.anl.gov
wrote:

Hi

Thank you, Paul. I ran a short test with the jet in crossflow and it seems
to work, but I sometimes get the following warning:

WARNING: restart file has a NSPCAL > LDIMT
read only part of the fld-data!
WARNING: NPSCAL read from restart file differs from 
currently used NPSCAL!

What's strange is that this warning doesn't show up at every restart. I'm
going to run some longer tests now.
Regards
Adam 

> > Hi
> >
> > I am simulating a jet in crossflow with Nek5000 and have serious
> > problems restarting the simulation. The restart causes strong spurious
> > velocity oscillations that I cannot get rid of. I've implemented the
> > restart procedure described in prepost.f, but it doesn't help much.
> > By changing the file format (parameters p66, p67) and projection (p94, p95)
> > I can only decrease the oscillation amplitude; the oscillations remain.
> > Surprisingly, saving output files in double precision (p63=8) makes
> > everything worse. Has anybody had similar problems?
> > Best regards
> >
> > Adam
> 
> Hi,
> 
> I think that the full restart capability should now work with
> the current version of the source.
> 
> There is a 2D example in the repo, with a README that I also
> provide below.
> 
> Basically, you will now save 4 files each time you wish to 
> checkpoint.  The files come in two sets, A and B, and the A
> set is then overwritten by the 3rd checkpoint, etc. so that
> you have at most 8 checkpoint files on hand at any one time.
> The files are 64 bit and thus cannot be used by VisIt --- thus,
> they are truly designated as checkpoint/restart files and not
> analysis files.   More information in the README below.
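
The A/B rotation described above can be sketched in a few lines of Python.
This is only an illustration of the bookkeeping: the function name and the
set labels are hypothetical and do not reflect Nek5000's actual file naming.

```python
def checkpoint_slot(n_saved, files_per_set=4):
    """Map the running count of restart files written so far to (set, slot).

    Sets alternate A, B, A, B, ...; each set holds files_per_set files.
    """
    set_index, slot = divmod(n_saved, files_per_set)
    return "AB"[set_index % 2], slot


# Files 0-3 fill set A, 4-7 fill set B, 8-11 overwrite set A, and so on,
# so no more than 2 * files_per_set = 8 files exist at any one time.
for n in (0, 3, 4, 7, 8):
    print(n, checkpoint_slot(n))
```

If a job dies while one set is being written, the other set is still complete.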
> 
> Please let me know if you have comments or questions.
> 
> Best regards,
> 
> Paul
> 
> -------------------------------------------------------------------------
> From the examples/cyl_restart directory:
> 
> SET UP:
> =======
> 
> This directory contains an example of full restart capabilities for
> Nek5000.
> 
> The model flow is a von Karman street in the wake of a 2D cylinder.
> The quantity of interest is taken to be the lift, which is monitored
> via "grep agy logfile" in the run_test script.   A matlab file, doit.m,
> can be used to analyze the output files containing the lift history
> of the four cases.  The cases are:
> 
> ca   -   initial run (no projection)
> cb   -   restart run for ca case
> 
> pa   -   initial run (with projection)
> pb   -   restart run for pa case
> 
> BACKGROUND:
> ===========
> 
> Timestepping in Nek5000 is based on BDFk/EXTk (k=3, typ.), which uses kth-order
> backward-difference formulae (BDFk) to evaluate velocity time derivatives and
> kth-order extrapolation (EXTk) for explicit evaluation of the nonlinear and
> pressure boundary terms.  Under normal conditions, the velocity and pressure
> for preceding timesteps are required to advance the solution at each step.
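
For reference, the standard uniform-timestep BDFk and EXTk coefficients can
be written out explicitly. The following is a minimal Python sketch,
independent of Nek5000: the names BDF, EXT and bdf_ext_step are illustrative,
and a scalar ODE du/dt = f(u), treated fully explicitly, stands in for the
Navier-Stokes terms.

```python
# Uniform-timestep BDFk / EXTk coefficients for k = 1, 2, 3.
# BDF[k] multiplies (u^{n+1}, u^n, u^{n-1}, ...);
# EXT[k] multiplies (f^n, f^{n-1}, ...).
BDF = {
    1: [1.0, -1.0],
    2: [3.0 / 2.0, -2.0, 1.0 / 2.0],
    3: [11.0 / 6.0, -3.0, 3.0 / 2.0, -1.0 / 3.0],
}
EXT = {
    1: [1.0],
    2: [2.0, -1.0],
    3: [3.0, -3.0, 1.0],
}


def bdf_ext_step(f, hist, dt, k):
    """Advance du/dt = f(u) by one step of BDFk/EXTk.

    hist holds the k most recent values, newest first: [u^n, u^{n-1}, ...].
    """
    b, a = BDF[k], EXT[k]
    rhs = sum(a[j] * f(hist[j]) for j in range(k))    # EXTk estimate of f at t^{n+1}
    old = sum(b[j + 1] * hist[j] for j in range(k))   # known part of the BDFk stencil
    return (dt * rhs - old) / b[0]
```

With k=3 the step needs three previous values, which is why a single
snapshot is not enough to continue at full order.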
> 
> At startup, the timestepper is typically bootstrapped using a lower-order
> BDF/EXT formula that, given the artificiality of most initial conditions,
> is typically adequate.    The velocity field often has enough inertia and
> sufficient signature such that the same bootstrap procedure also works when
> restarting from an existing solution (i.e., a .fnnnnn or .fldnn file, stored
> in 32-bit precision).
> 
> For some cases, it is important to have reproducibility of the time history
> to the working precision (14 digits, typ.) of the code.  The full restart
> feature is designed to provide this capability.   The main features of 
> full restart are:
> 
> .Preserve alternating sets of snapshots (4 per set) in 64-bit precision.
>   (Alternating sets are saved in case the job fails in the middle of
>    saving a set.)
> 
> .Use the most recent set to restart the computation by overwriting
>   the solution for the first steps, 0 through 3, with the preserved
>   snapshots.
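
The difference between a single-snapshot restart and a full restart can be
seen on a toy problem. The Python sketch below is an illustration only, not
Nek5000 code: it integrates a scalar ODE with a bootstrapped BDF3/EXT3 scheme
and restarts it two ways. The toy carries a single variable and therefore
needs only three stored states, whereas Nek5000 saves four files per set as
described above. All names are hypothetical.

```python
# BDFk/EXTk coefficients for a uniform timestep (k = 1, 2, 3).
BDF = {1: [1.0, -1.0], 2: [1.5, -2.0, 0.5], 3: [11.0 / 6.0, -3.0, 1.5, -1.0 / 3.0]}
EXT = {1: [1.0], 2: [2.0, -1.0], 3: [3.0, -3.0, 1.0]}


def f(u):
    return -u * u  # toy nonlinear right-hand side


def run(hist, nsteps, dt=0.01, kmax=3):
    """Advance du/dt = f(u); hist is the known history, newest first."""
    hist = list(hist)
    for _ in range(nsteps):
        k = min(len(hist), kmax)          # lower order until enough history exists
        b, a = BDF[k], EXT[k]
        rhs = sum(a[j] * f(hist[j]) for j in range(k))
        old = sum(b[j + 1] * hist[j] for j in range(k))
        hist.insert(0, (dt * rhs - old) / b[0])
    return hist


reference = run([1.0], 20)               # uninterrupted run; reference[0] is step 20
snap = run([1.0], 10)                    # "checkpoint" after 10 steps

one_file = run(snap[:1], 10)             # restart from a single snapshot (bootstrapped)
full = run(snap[:3], 10)                 # restart from the last 3 stored states

print(abs(one_file[0] - reference[0]))   # small but nonzero
print(abs(full[0] - reference[0]))       # 0.0
```

The full-history restart repeats exactly the same arithmetic as the
uninterrupted run, so the results agree bit for bit. The single-snapshot
restart takes its first steps at lower order and lands on a slightly
different trajectory.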
> 
> 
> Full restart is triggered through the .usr file.    In the given example
> cases, "ca" and "cb", the restart-save is illustrated in ca.usr and the
> actual restart, plus the save, is illustrated in cb.usr.   For these cases,
> the restart is encapsulated in the user-provided routine "my_full_restart"
> shown below, along with the calling format in userchk:
> 
> 
> c-----------------------------------------------------------------------
>        subroutine userchk
>        include 'SIZE'
>        include 'TOTAL'
> 
>        logical if_drag_out,if_torq_out
>        real x0(3)                ! torque reference point (origin assumed here)
>        save x0
>        data x0 /3*0.0/
> 
>        call my_full_restart
> 
>        scale = 1.
>        if_drag_out = .true.
>        if_torq_out = .false.
>        call torque_calc(scale,x0,if_drag_out,if_torq_out)
> 
>        return
>        end
> c-----------------------------------------------------------------------
>        subroutine my_full_restart
>        include 'SIZE'
>        include 'TOTAL'           ! provides iostep
> 
>        character*80 s80(4)
> 
>        call blank(s80,4*80)
>        s80(1) ='rs8ca0.f00005'
>        s80(2) ='rs8ca0.f00006'
>        s80(3) ='rs8ca0.f00007'
>        s80(4) ='rs8ca0.f00008'
> 
>        call full_restart(s80,4)  ! Will overload 5-8 onto steps 0-3
> 
> 
>        iosave = iostep           ! Trigger save based on iostep
>        call full_restart_save(iosave)
> 
>        return
>        end
> c-----------------------------------------------------------------------
> 
> 
> Note that in the example above, the set enumerated 5--8 is used to restart 
> the computation.   This set is generated by first running the "ca" case.
> 
> Note that the frequency of the restart output is coincident with the
> standard output frequency of field files (snapshots).  This might be too 
> frequent if one is, say, making a movie where snapshots are typically 
> dumped every 10 steps.   It would make more sense in this case to set 
> iosave=1000, say.
> 
> Note also that if one is initiating a computation from something other
> than the full restart mode then the full_restart() call should be commented
> out.
> 
> 
> COMMENTS:
> =========
> 
> Full reproducibility of the solution is predicated on having sufficient
> history information to replicate the state of "a" when running "b".
> While such replication is possible, it does preclude acceleration of the
> iterative solvers by projection onto prior solution spaces [1,2], since
> these projections typically retain relatively long sequences of information
> (e.g., spanning tens of steps) to maximally extract all the regularity in the
> solution history.   Consequently, _full_ reproducibility is not retained with
> projection turned on.  In this case, the solution is reproduced only to the
> tolerance of the iterative solvers, which is in any case the maximum level
> of accuracy attainable in the solution.   To illustrate the difference,
> we provide a test case pairing, "pa" and "pb", which is essentially the 
> same as the ca/cb pair save that projection is turned on for pa/pb.
> 
> 
> 
> 