[petsc-users] Help needed with MUMPS solver

Pierre Jolivet pierre at joliv.et
Sun May 23 12:04:11 CDT 2021



> On 23 May 2021, at 6:17 PM, Matthew Knepley <knepley at gmail.com> wrote:
> 
> On Sun, May 23, 2021 at 11:17 AM Karl Yang <y.juntao at hotmail.com> wrote:
> Hello,
> 
> I am using the MUMPS direct solver for my project with the options listed below. They work in most cases, but in some cases I get a divergence error, which I suspect actually comes from MUMPS.
> 
> I'm not sure how to debug this error. I would appreciate guidance from anyone familiar with the MUMPS solver.
> 
> This says
> 
> On return from DMUMPS, INFOG(1)=              -9
>  On return from DMUMPS, INFOG(2)=              22
> 
> that the internal work array for MUMPS is too small. I am not sure which option controls that.

-mat_mumps_icntl_14
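
As a sketch of how this option could be added to the options list quoted below: ICNTL(14) is the percentage by which MUMPS enlarges its estimated workspace, and the log in this thread shows it at 20. The value "50" is an illustrative assumption, not a value recommended anywhere in this thread, and the line is a fragment to be placed with the other PetscOptionsSetValue calls, not a complete program.

```c
/* ICNTL(14): percentage increase applied to the estimated MUMPS workspace.
 * The log in this thread reports 20; "50" is an assumed, illustrative value. */
PetscOptionsSetValue(NULL, "-mat_mumps_icntl_14", "50");
```

The same setting can be passed on the command line as `-mat_mumps_icntl_14 50`. Either way, it presumably has to be in the options database before the solver reads its options.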

Juntao, you can usually troubleshoot MUMPS error codes by looking at http://mumps.enseeiht.fr/doc/userguide_5.4.0.pdf#page=92
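
The error-code lookup can also be written down as a small helper. The following is a minimal sketch in plain C, not part of PETSc or MUMPS: both function names are hypothetical, and only the -9 case from this thread is decoded. It relies on the user guide's description of error -9: a positive INFOG(2) is the number of entries missing from the work array S, and a negative INFOG(2) is that number in millions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Entries missing from the MUMPS work array S when INFOG(1) = -9.
 * Positive INFOG(2): the count itself.
 * Negative INFOG(2): absolute value is the count in millions. */
long long mumps_missing_entries(int infog2)
{
    return infog2 >= 0 ? (long long)infog2
                       : -(long long)infog2 * 1000000LL;
}

/* Hypothetical helper: writes a one-line diagnosis into buf and returns
 * the snprintf result. Only error -9 is decoded in detail. */
int describe_mumps_error(int infog1, int infog2, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    if (infog1 == -9)
        return snprintf(buf, len,
                        "work array S too small by %lld entries; raise ICNTL(14)",
                        mumps_missing_entries(infog2));
    if (infog1 < 0)
        return snprintf(buf, len,
                        "MUMPS error %d (INFOG(2)=%d); see the user guide",
                        infog1, infog2);
    if (infog1 > 0)
        return snprintf(buf, len,
                        "MUMPS warning %d (INFOG(2)=%d); see the user guide",
                        infog1, infog2);
    return snprintf(buf, len, "no error");
}
```

With the values from the failed run, INFOG(1) = -9 and INFOG(2) = 22, `mumps_missing_entries(22)` evaluates to 22: the work array was short by 22 entries at the moment the error was raised.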

Thanks,
Pierre

>   Thanks,
> 
>      Matt
>  
> regards
> Juntao
> 
> MUMPS options:
> PetscOptionsSetValue(NULL, "-ksp_type", "preonly");
> PetscOptionsSetValue(NULL, "-pc_type", "cholesky");
> PetscOptionsSetValue(NULL, "-pc_factor_mat_solver_type", "mumps");
> PetscOptionsSetValue(NULL, "-mat_mumps_icntl_1", "1");
> PetscOptionsSetValue(NULL, "-mat_mumps_icntl_2", "1");
> PetscOptionsSetValue(NULL, "-mat_mumps_icntl_3", "1");
> PetscOptionsSetValue(NULL, "-mat_mumps_icntl_4", "3");
> PetscOptionsSetValue(NULL, "-mat_mumps_icntl_28", "1");
> PetscOptionsSetValue(NULL, "-mat_mumps_icntl_7", "2");
> PetscOptionsSetValue(NULL, "-mat_mumps_icntl_24", "1");
>  
> 
> Log output from MUMPS, with the error message from PETSc at the bottom:
> Entering DMUMPS 5.2.1 from C interface with JOB, N, NNZ =   1         240           2448
>       executing #MPI =      1, without OMP
> 
>  =================================================
>  MUMPS compiled with option -Dmetis
>  MUMPS compiled with option -Dptscotch
>  MUMPS compiled with option -Dscotch
>  This MUMPS version includes code for SAVE_RESTORE
>  =================================================
> L D L^T Solver for general symmetric matrices
> Type of parallelism: Working host
> 
>  ****** ANALYSIS STEP ********
> 
>  Scaling will be computed during analysis
> Compute maximum matching (Maximum Transversal):  5
>  ... JOB =  5: MAXIMIZE PRODUCT DIAGONAL AND SCALE
> 
> Entering analysis phase with ...
>                 N        NNZ         LIW       INFO(1)
>                 240       2448        5137             0
> Matrix entries:    IRN()   ICN()
>            1      1           1      2           1      3
>            1      4           1      5           1      6
>            1      7           1      8           1      9
>            1     10
>  Average density of rows/columns =   18
>  Average density of rows/columns =   18
>  Ordering based on AMF 
>  Constrained Ordering based on AMF
>  Average density of rows/columns =   18
>  Average density of rows/columns =   18
> NFSIZ(.)  =     0    38    14     0    33    33     0     0     0     0
> 
> FILS (.)  =     0   148     4   -96   224   163    20   -43     8     1
> 
> FRERE(.)  =   241    -5    -6   241     0    -2   241   241   241   241
> 
> 
> Leaving analysis phase with  ...
>  INFOG(1)                                       =               0
>  INFOG(2)                                       =               0
>  -- (20) Number of entries in factors (estim.)  =            3750
>  --  (3) Real space for factors    (estimated)  =            4641
>  --  (4) Integer space for factors (estimated)  =            2816
>  --  (5) Maximum frontal size      (estimated)  =              38
>  --  (6) Number of nodes in the tree            =              56
>  -- (32) Type of analysis effectively used      =               1
>  --  (7) Ordering option effectively used       =               2
>  ICNTL(6) Maximum transversal option            =               0
>  ICNTL(7) Pivot order option                    =               2
>  ICNTL(14) Percentage of memory relaxation      =              20
>  Number of level 2 nodes                        =               0
>  Number of split nodes                          =               0
>  RINFOG(1) Operations during elimination (estim)= 7.137D+04
>  Ordering compressed/constrained (ICNTL(12))    =               3
> 
>  MEMORY ESTIMATIONS ... 
>  Estimations with standard Full-Rank (FR) factorization:
>     Total space in MBytes, IC factorization      (INFOG(17)):           0
>     Total space in MBytes,  OOC factorization    (INFOG(27)):           0
> 
>  Elapsed time in analysis driver=       0.0016
> 
> Entering DMUMPS 5.2.1 from C interface with JOB, N, NNZ =   2         240           2448
>       executing #MPI =      1, without OMP
> 
> 
> 
> ****** FACTORIZATION STEP ********
> 
>  GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
>  Number of working processes                =               1
>  ICNTL(22) Out-of-core option               =               0
>  ICNTL(35) BLR activation (eff. choice)     =               0
>  ICNTL(14) Memory relaxation                =              20
>  INFOG(3) Real space for factors (estimated)=            4641
>  INFOG(4) Integer space for factors (estim.)=            2816
>  Maximum frontal size (estimated)           =              38
>  Number of nodes in the tree                =              56
>  Memory allowed (MB -- 0: N/A )             =               0
>  Memory provided by user, sum of LWK_USER   =               0
>  Relative threshold for pivoting, CNTL(1)   =      0.1000D-01
>   ZERO PIVOT DETECTION ON, THRESHOLD          =   2.8931920285365730E-020
>  INFINITE FIXATION 
>  Effective size of S     (based on INFO(39))=                 7981
>  Elapsed time to reformat/distribute matrix =      0.0001
>  ** Memory allocated, total in Mbytes           (INFOG(19)):           0
>  ** Memory effectively used, total in Mbytes    (INFOG(22)):           0
>  ** Memory dynamically allocated for CB, total in Mbytes   :           0
> 
>  Elapsed time for factorization             =      0.0006
> 
> Leaving factorization with ...
>  RINFOG(2)  Operations in node assembly     = 5.976D+03
>  ------(3)  Operations in node elimination  = 1.197D+05
>  INFOG (9)  Real space for factors          =            6193
>  INFOG(10)  Integer space for factors       =            3036
>  INFOG(11)  Maximum front size              =              42
>  INFOG(29)  Number of entries in factors    =            4896
>  INFOG(12)  Number of negative pivots       =              79
>  INFOG(13)  Number of delayed pivots        =             110
>  Number of 2x2 pivots in type 1 nodes       =               1
>  Number of 2X2 pivots in type 2 nodes       =               0
>  Nb of null pivots detected by ICNTL(24)    =               0
>  INFOG(28)  Estimated deficiency            =               0
>  INFOG(14)  Number of memory compress       =               0
> 
>  Elapsed time in factorization driver=       0.0009
> 
> Entering DMUMPS 5.2.1 from C interface with JOB, N, NNZ =   3         240           2448
>       executing #MPI =      1, without OMP
> 
> 
> 
>  ****** SOLVE & CHECK STEP ********
> 
>  GLOBAL STATISTICS PRIOR SOLVE PHASE ...........
>  Number of right-hand-sides                    =           1
>  Blocking factor for multiple rhs              =           1
>  ICNTL (9)                                     =           1
>   --- (10)                                     =           0
>   --- (11)                                     =           0
>   --- (20)                                     =           0
>   --- (21)                                     =           0
>   --- (30)                                     =           0
>   --- (35)                                     =           0
> 
> 
>  Vector solution for column            1
>  RHS
>   -7.828363D-02 -3.255337D+00  1.054729D+00  1.379822D-01 -3.892113D-01
>    1.433990D-01  1.089250D+00  2.252611D+00  3.215399D+00 -6.788806D-02
>  ** Space in MBYTES used for solve                        :         0
> 
>  Leaving solve with ...
>  Time to build/scatter RHS        =       0.000003
>  Time in solution step (fwd/bwd)  =       0.000167
>   .. Time in forward (fwd) step   =          0.000053
>   .. Time in backward (bwd) step  =          0.000093
>  Time to gather solution(cent.sol)=       0.000000
>  Time to copy/scale dist. solution=       0.000000
> 
>  Elapsed time in solve driver=       0.0004
> *** Warning: Verbose output for PETScKrylovSolver not implemented, calling PETSc KSPView directly.
> KSP Object: 1 MPI processes
>   type: preonly
>   maximum iterations=10000, initial guess is zero
>   tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
>   left preconditioning
>   using NONE norm type for convergence test
> PC Object: 1 MPI processes
>   type: cholesky
>     out-of-place factorization
>     tolerance for zero pivot 2.22045e-14
>     matrix ordering: natural
>     factor fill ratio given 0., needed 0.
>       Factored matrix follows:
>         Mat Object: 1 MPI processes
>           type: mumps
>           rows=240, cols=240
>           package used to perform factorization: mumps
>           total: nonzeros=3750, allocated nonzeros=3750
>           total number of mallocs used during MatSetValues calls=0
>             MUMPS run parameters:
>               SYM (matrix type):                   2 
>               PAR (host participation):            1 
>               ICNTL(1) (output for error):         1 
>               ICNTL(2) (output of diagnostic msg): 1 
>               ICNTL(3) (output for global info):   6 
>               ICNTL(4) (level of printing):        3 
>               ICNTL(5) (input mat struct):         0 
>               ICNTL(6) (matrix prescaling):        7 
>               ICNTL(7) (sequential matrix ordering):2 
>               ICNTL(8) (scaling strategy):        77 
>               ICNTL(10) (max num of refinements):  0 
>               ICNTL(11) (error analysis):          0 
>               ICNTL(12) (efficiency control):                         0 
>               ICNTL(13) (efficiency control):                         1 
>               ICNTL(14) (percentage of estimated workspace increase): 20 
>               ICNTL(18) (input mat struct):                           0 
>               ICNTL(19) (Schur complement info):                      0 
>               ICNTL(20) (rhs sparse pattern):                         0 
>               ICNTL(21) (solution struct):                            0 
>               ICNTL(22) (in-core/out-of-core facility):               0 
>               ICNTL(23) (max size of memory can be allocated locally):0 
>               ICNTL(24) (detection of null pivot rows):               1 
>               ICNTL(25) (computation of a null space basis):          0 
>               ICNTL(26) (Schur options for rhs or solution):          0 
>               ICNTL(27) (experimental parameter):                     -32 
>               ICNTL(28) (use parallel or sequential ordering):        1 
>               ICNTL(29) (parallel ordering):                          0 
>               ICNTL(30) (user-specified set of entries in inv(A)):    0 
>               ICNTL(31) (factors is discarded in the solve phase):    0 
>               ICNTL(33) (compute determinant):                        0 
>               ICNTL(35) (activate BLR based factorization):           0 
>               ICNTL(36) (choice of BLR factorization variant):        0 
>               ICNTL(38) (estimated compression rate of LU factors):   333 
>               CNTL(1) (relative pivoting threshold):      0.01 
>               CNTL(2) (stopping criterion of refinement): 1.49012e-08 
>               CNTL(3) (absolute pivoting threshold):      0. 
>               CNTL(4) (value of static pivoting):         -1. 
>               CNTL(5) (fixation for null pivots):         0. 
>               CNTL(7) (dropping parameter for BLR):       0. 
>               RINFO(1) (local estimated flops for the elimination after analysis): 
>                 [0] 71368. 
>               RINFO(2) (local estimated flops for the assembly after factorization): 
>                 [0]  5976. 
>               RINFO(3) (local estimated flops for the elimination after factorization): 
>                 [0]  119716. 
>               INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization): 
>               [0] 0 
>               INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): 
>                 [0] 0 
>               INFO(23) (num of pivots eliminated on this processor after factorization): 
>                 [0] 240 
>               RINFOG(1) (global estimated flops for the elimination after analysis): 71368. 
>               RINFOG(2) (global estimated flops for the assembly after factorization): 5976. 
>               RINFOG(3) (global estimated flops for the elimination after factorization): 119716. 
>               (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0.,0.)*(2^0)
>               INFOG(3) (estimated real workspace for factors on all processors after analysis): 4641 
>               INFOG(4) (estimated integer workspace for factors on all processors after analysis): 2816 
>               INFOG(5) (estimated maximum front size in the complete tree): 38 
>               INFOG(6) (number of nodes in the complete tree): 56 
>               INFOG(7) (ordering option effectively use after analysis): 2 
>               INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 
>               INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 6193 
>               INFOG(10) (total integer space store the matrix factors after factorization): 3036 
>               INFOG(11) (order of largest frontal matrix after factorization): 42 
>               INFOG(12) (number of off-diagonal pivots): 79 
>               INFOG(13) (number of delayed pivots after factorization): 110 
>               INFOG(14) (number of memory compress after factorization): 0 
>               INFOG(15) (number of steps of iterative refinement after solution): 0 
>               INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 0 
>               INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 0 
>               INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 0 
>               INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 0 
>               INFOG(20) (estimated number of entries in the factors): 3750 
>               INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 0 
>               INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 0 
>               INFOG(23) (after analysis: value of ICNTL(6) effectively used): 5 
>               INFOG(24) (after analysis: value of ICNTL(12) effectively used): 3 
>               INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 
>               INFOG(28) (after factorization: number of null pivots encountered): 0
>               INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 4896
>               INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 0, 0
>               INFOG(32) (after analysis: type of analysis done): 1
>               INFOG(33) (value used for ICNTL(8)): -2
>               INFOG(34) (exponent of the determinant if determinant is requested): 0
>               INFOG(35) (after factorization: number of entries taking into account BLR factor compression - sum over all processors): 4896
>               INFOG(36) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - value on the most memory consuming processor): 0 
>               INFOG(37) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - sum over all processors): 0 
>               INFOG(38) (after analysis: estimated size of all MUMPS internal data for running BLR out-of-core - value on the most memory consuming processor): 0 
>               INFOG(39) (after analysis: estimated size of all MUMPS internal data for running BLR out-of-core - sum over all processors): 0 
>   linear system matrix = precond matrix:
>   Mat Object: 1 MPI processes
>     type: seqaij
>     rows=240, cols=240
>     total: nonzeros=4656, allocated nonzeros=4656
>     total number of mallocs used during MatSetValues calls=0
>       using I-node routines: found 167 nodes, limit used is 5
> 
> Entering DMUMPS 5.2.1 from C interface with JOB =  -2
>       executing #MPI =      1, without OMP
> rank: 0 coefficient: 0.132368
> 
> Entering DMUMPS 5.2.1 from C interface with JOB, N, NNZ =   1         960           9792
>       executing #MPI =      1, without OMP
> 
>  =================================================
>  MUMPS compiled with option -Dmetis
>  MUMPS compiled with option -Dptscotch
>  MUMPS compiled with option -Dscotch
>  This MUMPS version includes code for SAVE_RESTORE
>  =================================================
> L D L^T Solver for general symmetric matrices
> Type of parallelism: Working host
> 
>  ****** ANALYSIS STEP ********
> 
>  Scaling will be computed during analysis
> Compute maximum matching (Maximum Transversal):  5
>  ... JOB =  5: MAXIMIZE PRODUCT DIAGONAL AND SCALE
> 
> Entering analysis phase with ...
>                 N        NNZ         LIW       INFO(1)
>                 960       9792       20545             0
> Matrix entries:    IRN()   ICN()
>            1      1           1      2           1      3
>            1      4           1      5           1      6
>            1      7           1      8           1      9
>            1     10
>  Average density of rows/columns =   18
>  Average density of rows/columns =   18
>  Ordering based on AMF 
>  Constrained Ordering based on AMF
>  Average density of rows/columns =   18
>  Average density of rows/columns =   18
> NFSIZ(.)  =     0     0     0    58     0     0     0    73    14     0
> 
> FILS (.)  =     0  -747   -80   922   146     5     6   669     3     1
> 
> FRERE(.)  =   961   961   961     0   961   961   961    -4   -69   961
> 
> 
> Leaving analysis phase with  ...
>  INFOG(1)                                       =               0
>  INFOG(2)                                       =               0
>  -- (20) Number of entries in factors (estim.)  =           20336
>  --  (3) Real space for factors    (estimated)  =           24094
>  --  (4) Integer space for factors (estimated)  =           12143
>  --  (5) Maximum frontal size      (estimated)  =              80
>  --  (6) Number of nodes in the tree            =             227
>  -- (32) Type of analysis effectively used      =               1
>  --  (7) Ordering option effectively used       =               2
>  ICNTL(6) Maximum transversal option            =               0
>  ICNTL(7) Pivot order option                    =               2
>  ICNTL(14) Percentage of memory relaxation      =              20
>  Number of level 2 nodes                        =               0
>  Number of split nodes                          =               0
>  RINFOG(1) Operations during elimination (estim)= 6.966D+05
>  Ordering compressed/constrained (ICNTL(12))    =               3
> 
>  MEMORY ESTIMATIONS ... 
>  Estimations with standard Full-Rank (FR) factorization:
>     Total space in MBytes, IC factorization      (INFOG(17)):           1
>     Total space in MBytes,  OOC factorization    (INFOG(27)):           1
> 
>  Elapsed time in analysis driver=       0.0066
> 
> Entering DMUMPS 5.2.1 from C interface with JOB, N, NNZ =   2         960           9792
>       executing #MPI =      1, without OMP
> 
> 
> 
> ****** FACTORIZATION STEP ********
> 
>  GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
>  Number of working processes                =               1
>  ICNTL(22) Out-of-core option               =               0
>  ICNTL(35) BLR activation (eff. choice)     =               0
>  ICNTL(14) Memory relaxation                =              20
>  INFOG(3) Real space for factors (estimated)=           24094
>  INFOG(4) Integer space for factors (estim.)=           12143
>  Maximum frontal size (estimated)           =              80
>  Number of nodes in the tree                =             227
>  Memory allowed (MB -- 0: N/A )             =               0
>  Memory provided by user, sum of LWK_USER   =               0
>  Relative threshold for pivoting, CNTL(1)   =      0.1000D-01
>   ZERO PIVOT DETECTION ON, THRESHOLD          =   2.9434468577175697E-020
>  INFINITE FIXATION 
>  Effective size of S     (based on INFO(39))=                31314
>  Elapsed time to reformat/distribute matrix =      0.0006
>  ** Memory allocated, total in Mbytes           (INFOG(19)):           1
>  ** Memory effectively used, total in Mbytes    (INFOG(22)):           1
>  ** Memory dynamically allocated for CB, total in Mbytes   :           0
> 
>  Elapsed time for (failed) factorization    =      0.0032
> 
> Leaving factorization with ...
>  RINFOG(2)  Operations in node assembly     = 3.366D+04
>  ------(3)  Operations in node elimination  = 9.346D+05
>  INFOG (9)  Real space for factors          =           26980
>  INFOG(10)  Integer space for factors       =           13047
>  INFOG(11)  Maximum front size              =              84
>  INFOG(29)  Number of entries in factors    =           24047
>  INFOG(12)  Number of negative pivots       =             294
>  INFOG(13)  Number of delayed pivots        =             452
>  Number of 2x2 pivots in type 1 nodes       =               0
>  Number of 2X2 pivots in type 2 nodes       =               0
>  Nb of null pivots detected by ICNTL(24)    =               0
>  INFOG(28)  Estimated deficiency            =               0
>  INFOG(14)  Number of memory compress       =               1
> 
>  Elapsed time in factorization driver=       0.0042
>  On return from DMUMPS, INFOG(1)=              -9
>  On return from DMUMPS, INFOG(2)=              22
> terminate called after throwing an instance of 'std::runtime_error'
>   what():  
> 
> *** -------------------------------------------------------------------------
> *** DOLFIN encountered an error. If you are not able to resolve this issue
> *** using the information listed below, you can ask for help at
> ***
> ***     fenics-support at googlegroups.com
> ***
> *** Remember to include the error message listed below and, if possible,
> *** include a *minimal* running example to reproduce the error.
> ***
> *** -------------------------------------------------------------------------
> *** Error:   Unable to solve linear system using PETSc Krylov solver.
> *** Reason:  Solution failed to converge in 0 iterations (PETSc reason DIVERGED_PC_FAILED, residual norm ||r|| = 0.000000e+00).
> *** Where:   This error was encountered inside PETScKrylovSolver.cpp.
> *** Process: 0
> *** 
> *** DOLFIN version: 2019.1.0
> *** Git changeset:  74d7efe1e84d65e9433fd96c50f1d278fa3e3f3f
> *** -------------------------------------------------------------------------
> 
> Aborted (core dumped)
> 
> 
> 
> 
> -- 
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener
> 
> https://www.cse.buffalo.edu/~knepley/