[petsc-users] "-ts_exact_final_time matchstep" leads to DIVERGED_STEP_REJECTED

Barry Smith bsmith at petsc.dev
Tue Dec 17 17:33:33 CST 2024


  This output is odd and seems wrong:

> 0 TS dt 0. time 0.


This prints the initial timestep and time before any timestepping is done. We expect dt to be 1e-12, as your other case prints:

0 TS dt 1e-12 time 0.

The value of the exact final time flag shouldn't affect the timestep this early in the computation.

You could trace the variable ts->time_step in the debugger to see when it changes from 1.e-12 to 0.




> On Dec 13, 2024, at 4:40 PM, Blondel, Sophie <sblondel at utk.edu> wrote:
> 
> Barry,
> 
> The short output is "SNESSolve has not converged due to Nan or Inf norm", the full one is attached.
> 
> Cheers,
> 
> Sophie
> From: Barry Smith <bsmith at petsc.dev>
> Sent: Friday, December 13, 2024 14:56
> To: Blondel, Sophie <sblondel at utk.edu>
> Cc: Jed Brown <jed at jedbrown.org>; Zhang, Hong <hongzhang at anl.gov>; Emil Constantinescu <emconsta at anl.gov>; petsc-users at mcs.anl.gov; xolotl-psi-development at lists.sourceforge.net
> Subject: Re: [petsc-users] "-ts_exact_final_time matchstep" leads to DIVERGED_STEP_REJECTED
>  
> 
>     There is a bit of complicated logic to determine the "adjusted" timestep in TSAdaptChoose(), inside the branch if (*accept && ts->exact_final_time == TS_EXACTFINALTIME_MATCHSTEP) {
> 
>     Is it possible that hmax = tmax - t; is exactly zero, and the logic below does not correctly handle that case?
> 
> 0 TS dt 0. time 0.
> 0 TS dt 0. time 0.
> 0 TS dt 0. time 0.
> 0 TS dt 0. time 0.
>       TSAdapt basic step   0 stage rejected (SNES reason DIVERGED_FNORM_NAN) t=0          + 0.000e+00 retrying with dt=0.000e+00 
>       TSAdapt basic step   0 stage rejected (SNES reason DIVERGED_FNORM_NAN) t=0          + 0.000e+00 retrying with dt=0.000e+00 
>       TSAdapt basic step   0 stage rejected (SNES reason DIVERGED_FNORM_NAN) t=0          + 0.000e+00 retrying with dt=0.000e+00 
>       TSAdapt basic step   0 stage rejected (SNES reason DIVERGED_FNORM_NAN) t=0          + 0.000e+00 retrying with dt=0.000e+00 
>       TSAdapt basic step   0 stage rejected (SNES reason DIVERGED_FNORM_NAN) t=0          + 0.000e+00 retrying with dt=0.000e+00 
>       TSAdapt basic step   0 stage rejected (SNES reason DIVERGED_FNORM_NAN) t=0          + 0.000e+00 retrying with dt=0.000e+00 
>       TSAdapt basic step   0 stage rejected (SNES reason DIVERGED_FNORM_NAN) t=0          + 0.000e+00 retrying with dt=0.000e+00 
>       TSAdapt basic step   0 stage rejected (SNES reason DIVERGED_FNORM_NAN) t=0          + 0.000e+00 retrying with dt=0.000e+00 
>       TSAdapt basic step   0 stage rejected (SNES reason DIVERGED_FNORM_NAN) t=0          + 0.000e+00 retrying with dt=0.000e+00 
>       TSAdapt basic step   0 stage rejected (SNES reason DIVERGED_FNORM_NAN) t=0          + 0.000e+00 retrying with dt=0.000e+00 
>       TSAdapt basic step   0 stage rejected (SNES reason DIVERGED_FNORM_NAN) t=0          + 0.000e+00 retrying with dt=0.000e+00 
> 
>   Sophie,
> 
>      Any idea why SNES reason DIVERGED_FNORM_NAN?   Could you run with -snes_error_if_not_converged? 
> 
>> On Dec 13, 2024, at 2:34 PM, Blondel, Sophie <sblondel at utk.edu> wrote:
>> 
>> Hi everyone,
>> 
>> The first max time it is trying to reach is 1.0e-12 s, and the initial dt is set to 1.0e-12 s from the command-line options. I believe it's not a formatting issue and that the dt is actually being set to 0 s somewhere, because that's why the step is rejected.
>> 
>> Best,
>> 
>> Sophie
>> From: Barry Smith <bsmith at petsc.dev>
>> Sent: Friday, December 13, 2024 14:21
>> To: Blondel, Sophie <sblondel at utk.edu>; Jed Brown <jed at jedbrown.org>; Zhang, Hong <hongzhang at anl.gov>; Emil Constantinescu <emconsta at anl.gov>
>> Cc: petsc-users at mcs.anl.gov; xolotl-psi-development at lists.sourceforge.net
>> Subject: Re: [petsc-users] "-ts_exact_final_time matchstep" leads to DIVERGED_STEP_REJECTED
>>  
>> 
>>    Hm, what is the final time you are stepping towards in this run?
>> 
>>    There seems to be something wrong with the adapt code, since it starts with a dt of 0 and then tries "adapting" several times. It could also be that the
>> monitor function does not correctly format numbers smaller than 1.e-12 and the run is simply using a truly small dt.
>> 
>>    Jed, Hong, Emil?
>> 
>>    Barry
>> 
>> 
>>> On Dec 10, 2024, at 11:08 AM, Blondel, Sophie <sblondel at utk.edu> wrote:
>>> 
>>> Good morning Barry,
>>> 
>>> Attached are the updated files; there is more useful information in them.
>>> 
>>> Cheers,
>>> 
>>> Sophie
>>>   
>>> From: Blondel, Sophie via Xolotl-psi-development <xolotl-psi-development at lists.sourceforge.net>
>>> Sent: Monday, December 9, 2024 17:29
>>> To: Barry Smith <bsmith at petsc.dev>
>>> Cc: petsc-users at mcs.anl.gov; xolotl-psi-development at lists.sourceforge.net
>>> Subject: Re: [Xolotl-psi-development] [petsc-users] "-ts_exact_final_time matchstep" leads to DIVERGED_STEP_REJECTED
>>>  
>>> Hi Barry,
>>> 
>>> I hope you are doing well.
>>> 
>>> Attached is the output. To give a little more context, this is a "new" way of running the code where multiple instances are created and communicate with each other every few time steps (like coupling the code with itself in memory). Here there are 3 instances that each have a separate TS object, plus one "main" instance that doesn't solve anything but computes rates to exchange between the other instances.
>>> 
>>> Cheers,
>>> 
>>> Sophie
>>>   
>>> From: Barry Smith <bsmith at petsc.dev>
>>> Sent: Monday, December 9, 2024 15:12
>>> To: Blondel, Sophie <sblondel at utk.edu>
>>> Cc: petsc-users at mcs.anl.gov; xolotl-psi-development at lists.sourceforge.net
>>> Subject: Re: [petsc-users] "-ts_exact_final_time matchstep" leads to DIVERGED_STEP_REJECTED
>>>  
>>> 
>>> 
>>>> On Dec 9, 2024, at 2:56 PM, Blondel, Sophie via petsc-users <petsc-users at mcs.anl.gov> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> I am trying to understand a strange behavior I'm encountering: when running my application with "-ts_exact_final_time stepover" everything goes well, but when I switch to "matchstep" I get DIVERGED_STEP_REJECTED before the first time step is finished.
>>> 
>>>    This is in the very first time-step in TSSolve? 
>>> 
>>>     Please run with -ts_monitor and send all the output (best for a short time interval). Do it twice: once with -ts_exact_final_time stepover and once with -ts_exact_final_time matchstep.
>>> 
>>>    Barry
>>> 
>>> 
>>>> I tried increasing the maximum number of rejections and it just takes longer to diverge, and if I set the value to "unlimited" it is basically an infinite loop.
>>>> 
>>>> Is there a way to check why the step is rejected? Could the "matchstep" option change tolerances somewhere that would cause that behavior?
>>>> 
>>>> Let me know if I should provide more information.
>>>> 
>>>> Best,
>>>> 
>>>> Sophie Blondel
> 
> <matchstep_reason.txt>
