[petsc-dev] MUMPS silent errors

Smith, Barry F. bsmith at mcs.anl.gov
Wed Oct 10 18:41:20 CDT 2018


  I looked at the code, and this is handled the PETSc way. Users should not expect KSP to raise an error just because it was unable to solve a linear system; they should call KSPGetConvergedReason() after KSPSolve() to check that the solution was computed successfully.

   Barry


> On Oct 10, 2018, at 2:12 PM, Zhang, Junchao <jczhang at mcs.anl.gov> wrote:
> 
> I ran into a case where the MUMPS numeric factorization returned error code -9 in mumps->id.INFOG(1), but A->erroriffailure was false in the following code in mumps.c:
> PetscErrorCode MatFactorNumeric_MUMPS(Mat F,Mat A,const MatFactorInfo *info)
> {
>   ...
>   PetscMUMPS_c(mumps);
>   if (mumps->id.INFOG(1) < 0) {
>     if (A->erroriffailure) {
>       SETERRQ2(PETSC_COMM_SELF,PETSC_ERR_LIB,"Error reported by MUMPS in numerical factorization phase: INFOG(1)=%d, INFO(2)=%d\n",mumps->id.INFOG(1),mumps->id.INFO(2));
>     } else {
>       if (mumps->id.INFOG(1) == -10) { /* numerically singular matrix */
>         PetscInfo2(F,"matrix is numerically singular, INFOG(1)=%d, INFO(2)=%d\n",mumps->id.INFOG(1),mumps->id.INFO(2));
>         F->factorerrortype = MAT_FACTOR_NUMERIC_ZEROPIVOT;
>         ...
> 
> 
> The code continued to KSPSolve and finished successfully (with a wrong answer). The user did not call KSPGetConvergedReason() after KSPSolve. I found I had to either add -ksp_error_if_not_converged or call KSPSetErrorIfNotConverged(ksp,PETSC_TRUE) to make the code fail.
> Is this expected? In my view, it is dangerous. If MUMPS fails in one stage, PETSc should not proceed to the next stage because it may hang there.


