[petsc-users] Tough to reproduce petsctablefind error

Junchao Zhang junchao.zhang at gmail.com
Sat Sep 26 17:58:28 CDT 2020


On Sat, Sep 26, 2020 at 5:44 PM Mark Adams <mfadams at lbl.gov> wrote:

>
>
> On Sat, Sep 26, 2020 at 1:07 PM Matthew Knepley <knepley at gmail.com> wrote:
>
>> On Sat, Sep 26, 2020 at 11:17 AM Mark McClure <mark at resfrac.com> wrote:
>>
>>> Thank you all for the explanations.
>>>
>>> Following Matt's suggestion, we'll use -g (and not use
>>> -with-debugging=0) in all future builds for all users, so that in the
>>> future we can provide better information.
>>>
>>> Second, Chris is going to boil our function down to a minimal stub and
>>> share it, in case there is some subtle issue with the way the functions
>>> are being called.
>>>
>>> Third, I have a question/request. PETSc is, in fact, detecting an error.
>>> As far as I can tell, this is not an uncontrolled 'seg fault'. It seems to
>>> me that PETSc could choose to return from the function when it detects
>>> this error, returning an error code, rather than dumping core and
>>> terminating the program. If PETSc simply returned with an error message,
>>> this would resolve the problem for us. After the PETSc call, we check for
>>> PETSc error messages. If PETSc returns an error, that's fine: we use a
>>> direct solver as a backup, and the simulation continues. I am not sure
>>> whether this is feasible, but would this change be possible?
>>>
>>
>> At some level, I think it is currently doing what you want. CHKERRQ()
>> simply returns an error code from that function call, printing an error
>> message. Suppressing the message is harder I think,
>>
>
> He does not need this.
>
>
>> but for now, if you know what function call is causing the error, you can
>> just catch the (ierr != 0) yourself instead of using CHKERRQ.
>>
>
> This is what I suggested earlier but maybe I was not clear enough.
>
> Your code calls something like
>
> ierr = SNESSolve(....); CHKERRQ(ierr);
>
> You can replace this with:
>
>  ierr = SNESSolve(....);
>  if (ierr) {
>
How do you deal with the CHKERRQ(ierr) calls inside SNESSolve()?

>     ....
>  }
>
> I suggested something earlier to do here. Maybe call KSPView. You could
> even destroy the solver and start the solver from scratch and see if that
> works.
>
> Mark
>
>
>> The drawback here is that we might not have cleaned up
>> all the state so that restarting makes sense. It should be possible to
>> just kill the solve, reset the solver, and retry, although it is not clear
>> to me at first glance if MPI will be in an okay state.
>>
>>
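The pattern discussed above (an inner check that returns the error code up
the call stack, and a caller that tests ierr itself and falls back to a
direct solver) can be sketched in plain C. This is only an illustration of
the control flow, not PETSc code: ErrorCode, MOCK_CHKERRQ, table_find,
iterative_solve, direct_solve and solve_with_fallback are all hypothetical
stand-ins, and the error value 63 is arbitrary.

```c
#include <stdio.h>

/* Hypothetical stand-ins, NOT PETSc API: plain C that mimics the pattern. */
typedef int ErrorCode;

/* Plays the role CHKERRQ has in the discussion: on a nonzero code, print a
   message and return the code to the caller instead of terminating. */
#define MOCK_CHKERRQ(ierr)                                            \
  do {                                                                \
    if ((ierr) != 0) {                                                \
      fprintf(stderr, "error %d in %s()\n", (int)(ierr), __func__);   \
      return (ierr);                                                  \
    }                                                                 \
  } while (0)

/* Deepest routine: detects a problem and reports it via its return value. */
static ErrorCode table_find(int key, int max_key)
{
  if (key < 0 || key > max_key) return 63; /* arbitrary nonzero code */
  return 0;
}

/* Stand-in for an iterative solve that checks errors internally. */
static ErrorCode iterative_solve(int key, double *x)
{
  ErrorCode ierr;

  ierr = table_find(key, 100); MOCK_CHKERRQ(ierr); /* propagates upward */
  *x = 1.0;
  return 0;
}

/* Stand-in for the backup direct solver. */
static ErrorCode direct_solve(double *x)
{
  *x = 2.0;
  return 0;
}

/* Caller: inspects ierr itself instead of passing it on, then falls back. */
static ErrorCode solve_with_fallback(int key, double *x, int *used_fallback)
{
  ErrorCode ierr = iterative_solve(key, x);

  *used_fallback = 0;
  if (ierr) {
    *used_fallback = 1;
    ierr = direct_solve(x);
  }
  return ierr;
}
```

In this sketch the inner macro only unwinds one level at a time, so the
outermost caller still sees a nonzero ierr and can decide what to do. Whether
a real solver is left in a reusable state after such an early return, and
whether all MPI ranks agree on the failure, is outside what this sketch shows.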