[petsc-users] Error reported by MUMPS in numerical factorization phase

Hong hzhang at mcs.anl.gov
Wed Dec 2 12:26:56 CST 2015


Danyang:
It is likely a zero pivot. I am adding a feature to PETSc: when matrix
factorization fails, computation continues with the error information stored in
ksp->reason=DIVERGED_PCSETUP_FAILED.
For your timestepping code, you may be able to automatically reduce the timestep
and continue your simulation.
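
A minimal sketch of what such a retry loop could look like, assuming the
time step is simply halved after each failed solve. The names solve_step,
advance, max_retries and dt_fail are hypothetical and used only for
illustration; solve_step is a stand-in that "fails" for large steps so the
example runs without PETSc. In a real code it would wrap KSPSolve followed by
KSPGetConvergedReason, with a negative reason treated as a failure.

```fortran
! Sketch only: solve_step is a hypothetical stand-in for assembling the
! Jacobian and calling KSPSolve; it is not a PETSc routine.
module retry_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: max_retries = 5       ! assumed retry limit
  real(dp), parameter :: dt_fail = 0.5_dp     ! assumed: steps >= this "fail"
contains

  ! Stand-in for one linear solve; returns .true. on success.
  ! In a PETSc code, call KSPSolve here, then KSPGetConvergedReason,
  ! and report failure when the returned reason is negative.
  logical function solve_step(dt)
    real(dp), intent(in) :: dt
    solve_step = (dt < dt_fail)
  end function solve_step

  ! Try to advance one time step, halving dt after each failed solve.
  subroutine advance(dt, ok, retries)
    real(dp), intent(inout) :: dt
    logical, intent(out) :: ok
    integer, intent(out) :: retries
    retries = 0
    ok = solve_step(dt)
    do while (.not. ok .and. retries < max_retries)
      dt = 0.5_dp * dt          ! reduce the time step and try again
      retries = retries + 1
      ok = solve_step(dt)
    end do
  end subroutine advance
end module retry_sketch

program demo
  use retry_sketch
  implicit none
  real(dp) :: dt
  logical :: ok
  integer :: retries
  dt = 1.0_dp
  call advance(dt, ok, retries)
  print *, 'converged =', ok, ' dt =', dt, ' retries =', retries
end program demo
```

With the assumed threshold, the step is halved twice (from 1.0 to 0.25)
before the stand-in solve succeeds. Halving is only one possible policy; the
point is that the solve reports failure instead of aborting the run.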

Do you want to test it? If so, you need to install petsc-dev with my
branch hzhang/matpackage-erroriffpe on your cluster. We may merge this
branch into petsc-master soon.

>
> It's not easy to run in debugging mode, as the cluster does not have PETSc
> installed in debug mode. Restarting the case from the crash time does
> not reproduce the problem. So if I want to detect this error, I need to start
> the simulation from the beginning, which takes hours on the cluster.
>

This is why we are adding this new feature.

>
> Do you mean I need to redo the symbolic factorization? For now, I only set
> up the factorization once, at the first timestep, and then reuse it. Some of
> the code is shown below.
>
>             if (timestep == 1) then
>               call PCFactorSetMatSolverPackage(pc_flow,MATSOLVERMUMPS,ierr)
>               CHKERRQ(ierr)
>
>               call PCFactorSetUpMatSolverPackage(pc_flow,ierr)
>               CHKERRQ(ierr)
>
>               call PCFactorGetMatrix(pc_flow,a_flow_j,ierr)
>               CHKERRQ(ierr)
>             end if
>
>             call KSPSolve(ksp_flow,b_flow,x_flow,ierr)
>             CHKERRQ(ierr)
>

I do not think you need to change this part of the code.
Does your code check convergence at each time step?

Hong

>
>
> On 15-12-02 08:39 AM, Hong wrote:
>
> Danyang :
>>
>> My code fails due to an error in an external library. It works fine for the
>> previous 2000+ timesteps but then crashes.
>>
>> [4]PETSC ERROR: Error in external library
>> [4]PETSC ERROR: Error reported by MUMPS in numerical factorization phase:
>> INFO(1)=-1, INFO(2)=0
>>
>
> This simply says an error occurred on proc[0] during numerical
> factorization, which usually means it either encountered a zero pivot or ran
> out of memory. Since it happens at a later timestep, where I guess you reuse
> the matrix factor, a zero pivot might be the problem.
> Is it possible to run it in debugging mode? That way, MUMPS would dump out
> more information.
>
>>
>> Then I tried the same simulation on another machine using the same number
>> of processors, and it does not fail.
>>
> Does this machine have more memory?
>
> Hong
>
>
>