[petsc-dev] MUMPS silent errors

Zhang, Junchao jczhang at mcs.anl.gov
Wed Oct 10 15:42:31 CDT 2018


OK, I see. I will check with the MUMPS developers to find out what behavior is expected after a failure in an earlier phase.

--Junchao Zhang


On Wed, Oct 10, 2018 at 2:54 PM Matthew Knepley <knepley at gmail.com> wrote:
On Wed, Oct 10, 2018 at 3:12 PM Zhang, Junchao <jczhang at mcs.anl.gov> wrote:

I hit a case where the MUMPS numeric factorization returned error code -9 in mumps->id.INFOG(1), but A->erroriffailure was false in the following code in mumps.c:

PetscErrorCode MatFactorNumeric_MUMPS(Mat F,Mat A,const MatFactorInfo *info)
{
...
  PetscMUMPS_c(mumps);
  if (mumps->id.INFOG(1) < 0) {
    if (A->erroriffailure) {
      SETERRQ2(PETSC_COMM_SELF,PETSC_ERR_LIB,"Error reported by MUMPS in numerical factorization phase: INFOG(1)=%d, INFO(2)=%d\n",mumps->id.INFOG(1),mumps->id.INFO(2));
    } else {
      if (mumps->id.INFOG(1) == -10) { /* numerically singular matrix */
        PetscInfo2(F,"matrix is numerically singular, INFOG(1)=%d, INFO(2)=%d\n",mumps->id.INFOG(1),mumps->id.INFO(2));
        F->factorerrortype = MAT_FACTOR_NUMERIC_ZEROPIVOT;


The code continued to KSPSolve and finished successfully (with a wrong answer). The user did not call KSPGetConvergedReason() after KSPSolve. I found I had to either add -ksp_error_if_not_converged or call KSPSetErrorIfNotConverged(ksp,PETSC_TRUE) to make the code fail.
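
To make the control flow above concrete, here is a minimal sketch in plain C. It makes no PETSc or MUMPS calls; ToySolver, toy_factor, toy_solve and the enum values are hypothetical names invented for illustration. It also assumes, as a simplification, that any negative backend code is recorded as a failed factorization. The point it models is that the failure is stored rather than raised, so the solve returns without error unless the caller has opted in.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration only; not PETSc or MUMPS types. */
typedef enum { FACTOR_OK = 0, FACTOR_FAILED = 1 } FactorStatus;
typedef enum { SOLVE_CONVERGED = 1, SOLVE_DIVERGED_FACTOR_FAILED = -1 } SolveReason;

typedef struct {
  bool         error_if_not_converged; /* models the opt-in error flag      */
  FactorStatus factor_status;          /* models the recorded factor error  */
  SolveReason  reason;                 /* what a caller would query afterwards */
} ToySolver;

/* Models the numeric factorization: a negative backend code is
   recorded on the solver object, not raised as an error.
   Assumption: every negative code counts as a failure. */
void toy_factor(ToySolver *s, int backend_info)
{
  s->factor_status = (backend_info < 0) ? FACTOR_FAILED : FACTOR_OK;
}

/* Models the solve: it always sets the reason, and returns nonzero
   only when the factorization failed AND the caller opted in. */
int toy_solve(ToySolver *s)
{
  if (s->factor_status != FACTOR_OK) {
    s->reason = SOLVE_DIVERGED_FACTOR_FAILED;
    return s->error_if_not_converged ? 1 : 0;
  }
  s->reason = SOLVE_CONVERGED;
  return 0;
}
```

In this model, a caller who neither sets error_if_not_converged nor inspects reason after toy_solve() has no way to notice the failed factorization, which is the situation described above.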

Is this expected? In my view, it is dangerous. If MUMPS fails in one stage, PETSc should not proceed to the next stage, because it may hang there.

We made the executive decision to have solves complete by default. This is consistent with that decision. Users who want safety can ask for the error. It would be nice if MUMPS would clean up correctly, but we can report that to them as a bug.

  Matt

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/