[petsc-users] "-ts_exact_final_time matchstep" leads to DIVERGED_STEP_REJECTED

Barry Smith bsmith at petsc.dev
Tue Dec 17 17:04:42 CST 2024


  Could you please send all the options you are using, including any that are set in the code, such as -ts_type?
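
  To make the hmax question from the quoted thread below concrete, here is a minimal Python sketch. It is not the PETSc source: clamp_to_final_time and clamp_to_final_time_guarded are hypothetical helpers made up for illustration, and the real logic in TSAdaptChoose() is more involved. It only shows how a plain clamp to hmax = tmax - t would hand back dt = 0 if the current time already equals the target time, and how a guard on hmax would avoid that.

```python
def clamp_to_final_time(t, dt, tmax):
    """Naive clamp so that t + dt never passes tmax.

    Hypothetical helper for illustration only; this is NOT PETSc code.
    """
    hmax = tmax - t  # time remaining until the target time
    return min(dt, hmax)


def clamp_to_final_time_guarded(t, dt, tmax):
    """Same clamp, but leave dt untouched when no time remains (assumed fix)."""
    hmax = tmax - t
    if hmax <= 0.0:
        return dt  # already at (or past) tmax: there is nothing left to match
    return min(dt, hmax)


# First step of the run described in this thread: t = 0, dt = tmax = 1.0e-12.
print(clamp_to_final_time(0.0, 1.0e-12, 1.0e-12))

# Same clamp when the current time already equals the target time.
print(clamp_to_final_time(1.0e-12, 1.0e-12, 1.0e-12))

# Guarded variant for the same inputs.
print(clamp_to_final_time_guarded(1.0e-12, 1.0e-12, 1.0e-12))
```

  If something like the second call is what happens here, the zero dt would then be retried unchanged, which would be consistent with the repeated "retrying with dt=0.000e+00" lines in the quoted output. That is only a guess until the options and the code path are confirmed.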

> On Dec 16, 2024, at 9:53 AM, Blondel, Sophie <sblondel at utk.edu> wrote:
> 
> Good morning Barry,
> 
> Changing the initial timestep on the command line didn't change anything. It looks as if something happens afterwards that sets the initial dt to 0 s, and that this does not happen with TS_EXACTFINALTIME_STEPOVER.
> 
> Cheers,
> 
> Sophie
> From: Barry Smith <bsmith at petsc.dev>
> Sent: Friday, December 13, 2024 22:27
> To: Blondel, Sophie <sblondel at utk.edu>
> Cc: Jed Brown <jed at jedbrown.org>; Zhang, Hong <hongzhang at anl.gov>; Emil Constantinescu <emconsta at anl.gov>; petsc-users at mcs.anl.gov; xolotl-psi-development at lists.sourceforge.net
> Subject: Re: [petsc-users] "-ts_exact_final_time matchstep" leads to DIVERGED_STEP_REJECTED
>  
> 
>    OK, the SNES reason DIVERGED_FNORM_NAN is likely due to the TS using a dt of zero, so that the TS function evaluation divides by zero.
>    
>    So the TS adapt behavior needs to be understood better in this situation. My guess is still that TS_EXACTFINALTIME_MATCHSTEP is buggy when one actually does get an exact match.
> 
>   Please try using a different initial timestep like 0.5e-12 s on the command line.
> 
>   Barry
> 
> 
>> On Dec 13, 2024, at 4:40 PM, Blondel, Sophie <sblondel at utk.edu> wrote:
>> 
>> Barry,
>> 
>> The short output is "SNESSolve has not converged due to Nan or Inf norm", the full one is attached.
>> 
>> Cheers,
>> 
>> Sophie
>> From: Barry Smith <bsmith at petsc.dev>
>> Sent: Friday, December 13, 2024 14:56
>> To: Blondel, Sophie <sblondel at utk.edu>
>> Cc: Jed Brown <jed at jedbrown.org>; Zhang, Hong <hongzhang at anl.gov>; Emil Constantinescu <emconsta at anl.gov>; petsc-users at mcs.anl.gov; xolotl-psi-development at lists.sourceforge.net
>> Subject: Re: [petsc-users] "-ts_exact_final_time matchstep" leads to DIVERGED_STEP_REJECTED
>>  
>> 
>>     There is a bit of complicated logic to determine the "adjusted" timestep in TSAdaptChoose(), in the branch guarded by if (*accept && ts->exact_final_time == TS_EXACTFINALTIME_MATCHSTEP) {
>> 
>>     Is it possible that hmax = tmax - t; is exactly zero, and the logic below does not correctly handle that case?
>> 
>> 0 TS dt 0. time 0.
>> 0 TS dt 0. time 0.
>> 0 TS dt 0. time 0.
>> 0 TS dt 0. time 0.
>>       TSAdapt basic step   0 stage rejected (SNES reason DIVERGED_FNORM_NAN) t=0          + 0.000e+00 retrying with dt=0.000e+00 
>>       TSAdapt basic step   0 stage rejected (SNES reason DIVERGED_FNORM_NAN) t=0          + 0.000e+00 retrying with dt=0.000e+00 
>>       TSAdapt basic step   0 stage rejected (SNES reason DIVERGED_FNORM_NAN) t=0          + 0.000e+00 retrying with dt=0.000e+00 
>>       TSAdapt basic step   0 stage rejected (SNES reason DIVERGED_FNORM_NAN) t=0          + 0.000e+00 retrying with dt=0.000e+00 
>>       TSAdapt basic step   0 stage rejected (SNES reason DIVERGED_FNORM_NAN) t=0          + 0.000e+00 retrying with dt=0.000e+00 
>>       TSAdapt basic step   0 stage rejected (SNES reason DIVERGED_FNORM_NAN) t=0          + 0.000e+00 retrying with dt=0.000e+00 
>>       TSAdapt basic step   0 stage rejected (SNES reason DIVERGED_FNORM_NAN) t=0          + 0.000e+00 retrying with dt=0.000e+00 
>>       TSAdapt basic step   0 stage rejected (SNES reason DIVERGED_FNORM_NAN) t=0          + 0.000e+00 retrying with dt=0.000e+00 
>>       TSAdapt basic step   0 stage rejected (SNES reason DIVERGED_FNORM_NAN) t=0          + 0.000e+00 retrying with dt=0.000e+00 
>>       TSAdapt basic step   0 stage rejected (SNES reason DIVERGED_FNORM_NAN) t=0          + 0.000e+00 retrying with dt=0.000e+00 
>>       TSAdapt basic step   0 stage rejected (SNES reason DIVERGED_FNORM_NAN) t=0          + 0.000e+00 retrying with dt=0.000e+00 
>> 
>>   Sophie,
>> 
>>      Any idea why the SNES reason is DIVERGED_FNORM_NAN? Could you run with -snes_error_if_not_converged?
>> 
>>> On Dec 13, 2024, at 2:34 PM, Blondel, Sophie <sblondel at utk.edu> wrote:
>>> 
>>> Hi everyone,
>>> 
>>> The first max time it is trying to reach is 1.0e-12 s, and the initial dt is set to 1.0e-12 s from the command-line options. I believe it's not a formatting issue and that dt is actually being set to 0 s somewhere, which is why the step is rejected.
>>> 
>>> Best,
>>> 
>>> Sophie
>>> From: Barry Smith <bsmith at petsc.dev>
>>> Sent: Friday, December 13, 2024 14:21
>>> To: Blondel, Sophie <sblondel at utk.edu>; Jed Brown <jed at jedbrown.org>; Zhang, Hong <hongzhang at anl.gov>; Emil Constantinescu <emconsta at anl.gov>
>>> Cc: petsc-users at mcs.anl.gov; xolotl-psi-development at lists.sourceforge.net
>>> Subject: Re: [petsc-users] "-ts_exact_final_time matchstep" leads to DIVERGED_STEP_REJECTED
>>>  
>>> 
>>>    Hm, what is the final time you are stepping towards in this run?
>>> 
>>>    There is something wrong with the adapt code, since it seems to start with a dt of 0 but then tries "adapting" several times. However, it could be that the
>>> monitor function does not correctly format numbers smaller than 1.e-12 and the code is actually using a truly small dt.
>>> 
>>>    Jed, Hong, Emil?
>>> 
>>>    Barry
>>> 
>>> 
>>>> On Dec 10, 2024, at 11:08 AM, Blondel, Sophie <sblondel at utk.edu> wrote:
>>>> 
>>>> Good morning Barry,
>>>> 
>>>> Attached are the updated files; they contain more useful information.
>>>> 
>>>> Cheers,
>>>> 
>>>> Sophie
>>>>    
>>>> From: Blondel, Sophie via Xolotl-psi-development <xolotl-psi-development at lists.sourceforge.net>
>>>> Sent: Monday, December 9, 2024 17:29
>>>> To: Barry Smith <bsmith at petsc.dev>
>>>> Cc: petsc-users at mcs.anl.gov; xolotl-psi-development at lists.sourceforge.net
>>>> Subject: Re: [Xolotl-psi-development] [petsc-users] "-ts_exact_final_time matchstep" leads to DIVERGED_STEP_REJECTED
>>>>  
>>>> Hi Barry,
>>>> 
>>>> I hope you are doing well.
>>>> 
>>>> Attached is the output. To give a little more context, this is a "new" way of running the code in which multiple instances are created and communicate with each other every few time steps (like coupling the code with itself in memory). Here there are 3 instances that each have a separate TS object, plus one "main" instance that doesn't solve anything but computes the rates exchanged between the other instances.
>>>> 
>>>> Cheers,
>>>> 
>>>> Sophie
>>>>    
>>>> From: Barry Smith <bsmith at petsc.dev>
>>>> Sent: Monday, December 9, 2024 15:12
>>>> To: Blondel, Sophie <sblondel at utk.edu>
>>>> Cc: petsc-users at mcs.anl.gov; xolotl-psi-development at lists.sourceforge.net
>>>> Subject: Re: [petsc-users] "-ts_exact_final_time matchstep" leads to DIVERGED_STEP_REJECTED
>>>>  
>>>> 
>>>> 
>>>>> On Dec 9, 2024, at 2:56 PM, Blondel, Sophie via petsc-users <petsc-users at mcs.anl.gov> wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> I am trying to understand a strange behavior I'm encountering: when running my application with "-ts_exact_final_time stepover" everything goes well, but when I switch to "matchstep" I get DIVERGED_STEP_REJECTED before the first time step is finished.
>>>> 
>>>>    This is in the very first time-step in TSSolve? 
>>>> 
>>>>     Please run with -ts_monitor and send all the output. It is best to use a short time interval and to do it twice: once with -ts_exact_final_time stepover and once with -ts_exact_final_time matchstep.
>>>> 
>>>>    Barry
>>>> 
>>>> 
>>>>> I tried increasing the maximum number of rejections and it just takes longer to diverge, and if I set the value to "unlimited" it is basically an infinite loop.
>>>>> 
>>>>> Is there a way to check why the step is rejected? Could the "matchstep" option change tolerances somewhere in a way that would cause this behavior?
>>>>> 
>>>>> Let me know if I should provide more information.
>>>>> 
>>>>> Best,
>>>>> 
>>>>> Sophie Blondel
>> 
>> <matchstep_reason.txt>
