[petsc-users] Help needed with MUMPS solver
Barry Smith
bsmith at petsc.dev
Sun May 23 11:17:58 CDT 2021
Please run with -ksp_error_if_not_converged and send all the output.
Barry
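
A minimal sketch of how that option could be added, assuming the options are set programmatically as in the quoted message below, and assuming the solver reads the options database after this call (for example through KSPSetFromOptions). This is a fragment in the same style as the quoted options, not a complete program:

```c
/* Fragment: ask PETSc to raise an error as soon as the solve fails,
   instead of only recording a diverged reason on the KSP. */
PetscOptionsSetValue(NULL, "-ksp_error_if_not_converged", "true");
```

With this set, the failing solve should stop with a PETSc error message instead of handing DIVERGED_PC_FAILED back to the caller.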
> On May 23, 2021, at 10:16 AM, Karl Yang <y.juntao at hotmail.com> wrote:
>
> Hello,
>
> I am using the MUMPS direct solver for my project. The options below work in most cases, but in some cases I encounter a divergence error. I think it is actually an error from MUMPS.
>
> I'm not sure how to debug the error. I would appreciate guidance from anyone familiar with the MUMPS solver.
>
> regards
> Juntao
>
> MUMPS options:
> PetscOptionsSetValue(NULL, "-ksp_type", "preonly");
> PetscOptionsSetValue(NULL, "-pc_type", "cholesky");
> PetscOptionsSetValue(NULL, "-pc_factor_mat_solver_type", "mumps");
> PetscOptionsSetValue(NULL, "-mat_mumps_icntl_1", "1");
> PetscOptionsSetValue(NULL, "-mat_mumps_icntl_2", "1");
> PetscOptionsSetValue(NULL, "-mat_mumps_icntl_3", "1");
> PetscOptionsSetValue(NULL, "-mat_mumps_icntl_4", "3");
> PetscOptionsSetValue(NULL, "-mat_mumps_icntl_28", "1");
> PetscOptionsSetValue(NULL, "-mat_mumps_icntl_7", "2");
> PetscOptionsSetValue(NULL, "-mat_mumps_icntl_24", "1");
>
>
> Log output from MUMPS, with the error message from PETSc at the bottom:
> Entering DMUMPS 5.2.1 from C interface with JOB, N, NNZ = 1 240 2448
> executing #MPI = 1, without OMP
>
> =================================================
> MUMPS compiled with option -Dmetis
> MUMPS compiled with option -Dptscotch
> MUMPS compiled with option -Dscotch
> This MUMPS version includes code for SAVE_RESTORE
> =================================================
> L D L^T Solver for general symmetric matrices
> Type of parallelism: Working host
>
> ****** ANALYSIS STEP ********
>
> Scaling will be computed during analysis
> Compute maximum matching (Maximum Transversal): 5
> ... JOB = 5: MAXIMIZE PRODUCT DIAGONAL AND SCALE
>
> Entering analysis phase with ...
> N NNZ LIW INFO(1)
> 240 2448 5137 0
> Matrix entries: IRN() ICN()
> 1 1 1 2 1 3
> 1 4 1 5 1 6
> 1 7 1 8 1 9
> 1 10
> Average density of rows/columns = 18
> Average density of rows/columns = 18
> Ordering based on AMF
> Constrained Ordering based on AMF
> Average density of rows/columns = 18
> Average density of rows/columns = 18
> NFSIZ(.) = 0 38 14 0 33 33 0 0 0 0
>
> FILS (.) = 0 148 4 -96 224 163 20 -43 8 1
>
> FRERE(.) = 241 -5 -6 241 0 -2 241 241 241 241
>
>
> Leaving analysis phase with ...
> INFOG(1) = 0
> INFOG(2) = 0
> -- (20) Number of entries in factors (estim.) = 3750
> -- (3) Real space for factors (estimated) = 4641
> -- (4) Integer space for factors (estimated) = 2816
> -- (5) Maximum frontal size (estimated) = 38
> -- (6) Number of nodes in the tree = 56
> -- (32) Type of analysis effectively used = 1
> -- (7) Ordering option effectively used = 2
> ICNTL(6) Maximum transversal option = 0
> ICNTL(7) Pivot order option = 2
> ICNTL(14) Percentage of memory relaxation = 20
> Number of level 2 nodes = 0
> Number of split nodes = 0
> RINFOG(1) Operations during elimination (estim)= 7.137D+04
> Ordering compressed/constrained (ICNTL(12)) = 3
>
> MEMORY ESTIMATIONS ...
> Estimations with standard Full-Rank (FR) factorization:
> Total space in MBytes, IC factorization (INFOG(17)): 0
> Total space in MBytes, OOC factorization (INFOG(27)): 0
>
> Elapsed time in analysis driver= 0.0016
>
> Entering DMUMPS 5.2.1 from C interface with JOB, N, NNZ = 2 240 2448
> executing #MPI = 1, without OMP
>
>
>
> ****** FACTORIZATION STEP ********
>
> GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
> Number of working processes = 1
> ICNTL(22) Out-of-core option = 0
> ICNTL(35) BLR activation (eff. choice) = 0
> ICNTL(14) Memory relaxation = 20
> INFOG(3) Real space for factors (estimated)= 4641
> INFOG(4) Integer space for factors (estim.)= 2816
> Maximum frontal size (estimated) = 38
> Number of nodes in the tree = 56
> Memory allowed (MB -- 0: N/A ) = 0
> Memory provided by user, sum of LWK_USER = 0
> Relative threshold for pivoting, CNTL(1) = 0.1000D-01
> ZERO PIVOT DETECTION ON, THRESHOLD = 2.8931920285365730E-020
> INFINITE FIXATION
> Effective size of S (based on INFO(39))= 7981
> Elapsed time to reformat/distribute matrix = 0.0001
> ** Memory allocated, total in Mbytes (INFOG(19)): 0
> ** Memory effectively used, total in Mbytes (INFOG(22)): 0
> ** Memory dynamically allocated for CB, total in Mbytes : 0
>
> Elapsed time for factorization = 0.0006
>
> Leaving factorization with ...
> RINFOG(2) Operations in node assembly = 5.976D+03
> ------(3) Operations in node elimination = 1.197D+05
> INFOG (9) Real space for factors = 6193
> INFOG(10) Integer space for factors = 3036
> INFOG(11) Maximum front size = 42
> INFOG(29) Number of entries in factors = 4896
> INFOG(12) Number of negative pivots = 79
> INFOG(13) Number of delayed pivots = 110
> Number of 2x2 pivots in type 1 nodes = 1
> Number of 2X2 pivots in type 2 nodes = 0
> Nb of null pivots detected by ICNTL(24) = 0
> INFOG(28) Estimated deficiency = 0
> INFOG(14) Number of memory compress = 0
>
> Elapsed time in factorization driver= 0.0009
>
> Entering DMUMPS 5.2.1 from C interface with JOB, N, NNZ = 3 240 2448
> executing #MPI = 1, without OMP
>
>
>
> ****** SOLVE & CHECK STEP ********
>
> GLOBAL STATISTICS PRIOR SOLVE PHASE ...........
> Number of right-hand-sides = 1
> Blocking factor for multiple rhs = 1
> ICNTL (9) = 1
> --- (10) = 0
> --- (11) = 0
> --- (20) = 0
> --- (21) = 0
> --- (30) = 0
> --- (35) = 0
>
>
> Vector solution for column 1
> RHS
> -7.828363D-02 -3.255337D+00 1.054729D+00 1.379822D-01 -3.892113D-01
> 1.433990D-01 1.089250D+00 2.252611D+00 3.215399D+00 -6.788806D-02
> ** Space in MBYTES used for solve : 0
>
> Leaving solve with ...
> Time to build/scatter RHS = 0.000003
> Time in solution step (fwd/bwd) = 0.000167
> .. Time in forward (fwd) step = 0.000053
> .. Time in backward (bwd) step = 0.000093
> Time to gather solution(cent.sol)= 0.000000
> Time to copy/scale dist. solution= 0.000000
>
> Elapsed time in solve driver= 0.0004
> *** Warning: Verbose output for PETScKrylovSolver not implemented, calling PETSc KSPView directly.
> KSP Object: 1 MPI processes
> type: preonly
> maximum iterations=10000, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
> left preconditioning
> using NONE norm type for convergence test
> PC Object: 1 MPI processes
> type: cholesky
> out-of-place factorization
> tolerance for zero pivot 2.22045e-14
> matrix ordering: natural
> factor fill ratio given 0., needed 0.
> Factored matrix follows:
> Mat Object: 1 MPI processes
> type: mumps
> rows=240, cols=240
> package used to perform factorization: mumps
> total: nonzeros=3750, allocated nonzeros=3750
> total number of mallocs used during MatSetValues calls=0
> MUMPS run parameters:
> SYM (matrix type): 2
> PAR (host participation): 1
> ICNTL(1) (output for error): 1
> ICNTL(2) (output of diagnostic msg): 1
> ICNTL(3) (output for global info): 6
> ICNTL(4) (level of printing): 3
> ICNTL(5) (input mat struct): 0
> ICNTL(6) (matrix prescaling): 7
> ICNTL(7) (sequential matrix ordering):2
> ICNTL(8) (scaling strategy): 77
> ICNTL(10) (max num of refinements): 0
> ICNTL(11) (error analysis): 0
> ICNTL(12) (efficiency control): 0
> ICNTL(13) (efficiency control): 1
> ICNTL(14) (percentage of estimated workspace increase): 20
> ICNTL(18) (input mat struct): 0
> ICNTL(19) (Schur complement info): 0
> ICNTL(20) (rhs sparse pattern): 0
> ICNTL(21) (solution struct): 0
> ICNTL(22) (in-core/out-of-core facility): 0
> ICNTL(23) (max size of memory can be allocated locally):0
> ICNTL(24) (detection of null pivot rows): 1
> ICNTL(25) (computation of a null space basis): 0
> ICNTL(26) (Schur options for rhs or solution): 0
> ICNTL(27) (experimental parameter): -32
> ICNTL(28) (use parallel or sequential ordering): 1
> ICNTL(29) (parallel ordering): 0
> ICNTL(30) (user-specified set of entries in inv(A)): 0
> ICNTL(31) (factors is discarded in the solve phase): 0
> ICNTL(33) (compute determinant): 0
> ICNTL(35) (activate BLR based factorization): 0
> ICNTL(36) (choice of BLR factorization variant): 0
> ICNTL(38) (estimated compression rate of LU factors): 333
> CNTL(1) (relative pivoting threshold): 0.01
> CNTL(2) (stopping criterion of refinement): 1.49012e-08
> CNTL(3) (absolute pivoting threshold): 0.
> CNTL(4) (value of static pivoting): -1.
> CNTL(5) (fixation for null pivots): 0.
> CNTL(7) (dropping parameter for BLR): 0.
> RINFO(1) (local estimated flops for the elimination after analysis):
> [0] 71368.
> RINFO(2) (local estimated flops for the assembly after factorization):
> [0] 5976.
> RINFO(3) (local estimated flops for the elimination after factorization):
> [0] 119716.
> INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization):
> [0] 0
> INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization):
> [0] 0
> INFO(23) (num of pivots eliminated on this processor after factorization):
> [0] 240
> RINFOG(1) (global estimated flops for the elimination after analysis): 71368.
> RINFOG(2) (global estimated flops for the assembly after factorization): 5976.
> RINFOG(3) (global estimated flops for the elimination after factorization): 119716.
> (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0.,0.)*(2^0)
> INFOG(3) (estimated real workspace for factors on all processors after analysis): 4641
> INFOG(4) (estimated integer workspace for factors on all processors after analysis): 2816
> INFOG(5) (estimated maximum front size in the complete tree): 38
> INFOG(6) (number of nodes in the complete tree): 56
> INFOG(7) (ordering option effectively use after analysis): 2
> INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100
> INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 6193
> INFOG(10) (total integer space store the matrix factors after factorization): 3036
> INFOG(11) (order of largest frontal matrix after factorization): 42
> INFOG(12) (number of off-diagonal pivots): 79
> INFOG(13) (number of delayed pivots after factorization): 110
> INFOG(14) (number of memory compress after factorization): 0
> INFOG(15) (number of steps of iterative refinement after solution): 0
> INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 0
> INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 0
> INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 0
> INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 0
> INFOG(20) (estimated number of entries in the factors): 3750
> INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 0
> INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 0
> INFOG(23) (after analysis: value of ICNTL(6) effectively used): 5
> INFOG(24) (after analysis: value of ICNTL(12) effectively used): 3
> INFOG(25) (after factorization: number of pivots modified by static pivoting): 0
> INFOG(28) (after factorization: number of null pivots encountered): 0
> INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 4896
> INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 0, 0
> INFOG(32) (after analysis: type of analysis done): 1
> INFOG(33) (value used for ICNTL(8)): -2
> INFOG(34) (exponent of the determinant if determinant is requested): 0
> INFOG(35) (after factorization: number of entries taking into account BLR factor compression - sum over all processors): 4896
> INFOG(36) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - value on the most memory consuming processor): 0
> INFOG(37) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - sum over all processors): 0
> INFOG(38) (after analysis: estimated size of all MUMPS internal data for running BLR out-of-core - value on the most memory consuming processor): 0
> INFOG(39) (after analysis: estimated size of all MUMPS internal data for running BLR out-of-core - sum over all processors): 0
> linear system matrix = precond matrix:
> Mat Object: 1 MPI processes
> type: seqaij
> rows=240, cols=240
> total: nonzeros=4656, allocated nonzeros=4656
> total number of mallocs used during MatSetValues calls=0
> using I-node routines: found 167 nodes, limit used is 5
>
> Entering DMUMPS 5.2.1 from C interface with JOB = -2
> executing #MPI = 1, without OMP
> rank: 0 coefficient: 0.132368
>
> Entering DMUMPS 5.2.1 from C interface with JOB, N, NNZ = 1 960 9792
> executing #MPI = 1, without OMP
>
> =================================================
> MUMPS compiled with option -Dmetis
> MUMPS compiled with option -Dptscotch
> MUMPS compiled with option -Dscotch
> This MUMPS version includes code for SAVE_RESTORE
> =================================================
> L D L^T Solver for general symmetric matrices
> Type of parallelism: Working host
>
> ****** ANALYSIS STEP ********
>
> Scaling will be computed during analysis
> Compute maximum matching (Maximum Transversal): 5
> ... JOB = 5: MAXIMIZE PRODUCT DIAGONAL AND SCALE
>
> Entering analysis phase with ...
> N NNZ LIW INFO(1)
> 960 9792 20545 0
> Matrix entries: IRN() ICN()
> 1 1 1 2 1 3
> 1 4 1 5 1 6
> 1 7 1 8 1 9
> 1 10
> Average density of rows/columns = 18
> Average density of rows/columns = 18
> Ordering based on AMF
> Constrained Ordering based on AMF
> Average density of rows/columns = 18
> Average density of rows/columns = 18
> NFSIZ(.) = 0 0 0 58 0 0 0 73 14 0
>
> FILS (.) = 0 -747 -80 922 146 5 6 669 3 1
>
> FRERE(.) = 961 961 961 0 961 961 961 -4 -69 961
>
>
> Leaving analysis phase with ...
> INFOG(1) = 0
> INFOG(2) = 0
> -- (20) Number of entries in factors (estim.) = 20336
> -- (3) Real space for factors (estimated) = 24094
> -- (4) Integer space for factors (estimated) = 12143
> -- (5) Maximum frontal size (estimated) = 80
> -- (6) Number of nodes in the tree = 227
> -- (32) Type of analysis effectively used = 1
> -- (7) Ordering option effectively used = 2
> ICNTL(6) Maximum transversal option = 0
> ICNTL(7) Pivot order option = 2
> ICNTL(14) Percentage of memory relaxation = 20
> Number of level 2 nodes = 0
> Number of split nodes = 0
> RINFOG(1) Operations during elimination (estim)= 6.966D+05
> Ordering compressed/constrained (ICNTL(12)) = 3
>
> MEMORY ESTIMATIONS ...
> Estimations with standard Full-Rank (FR) factorization:
> Total space in MBytes, IC factorization (INFOG(17)): 1
> Total space in MBytes, OOC factorization (INFOG(27)): 1
>
> Elapsed time in analysis driver= 0.0066
>
> Entering DMUMPS 5.2.1 from C interface with JOB, N, NNZ = 2 960 9792
> executing #MPI = 1, without OMP
>
>
>
> ****** FACTORIZATION STEP ********
>
> GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
> Number of working processes = 1
> ICNTL(22) Out-of-core option = 0
> ICNTL(35) BLR activation (eff. choice) = 0
> ICNTL(14) Memory relaxation = 20
> INFOG(3) Real space for factors (estimated)= 24094
> INFOG(4) Integer space for factors (estim.)= 12143
> Maximum frontal size (estimated) = 80
> Number of nodes in the tree = 227
> Memory allowed (MB -- 0: N/A ) = 0
> Memory provided by user, sum of LWK_USER = 0
> Relative threshold for pivoting, CNTL(1) = 0.1000D-01
> ZERO PIVOT DETECTION ON, THRESHOLD = 2.9434468577175697E-020
> INFINITE FIXATION
> Effective size of S (based on INFO(39))= 31314
> Elapsed time to reformat/distribute matrix = 0.0006
> ** Memory allocated, total in Mbytes (INFOG(19)): 1
> ** Memory effectively used, total in Mbytes (INFOG(22)): 1
> ** Memory dynamically allocated for CB, total in Mbytes : 0
>
> Elapsed time for (failed) factorization = 0.0032
>
> Leaving factorization with ...
> RINFOG(2) Operations in node assembly = 3.366D+04
> ------(3) Operations in node elimination = 9.346D+05
> INFOG (9) Real space for factors = 26980
> INFOG(10) Integer space for factors = 13047
> INFOG(11) Maximum front size = 84
> INFOG(29) Number of entries in factors = 24047
> INFOG(12) Number of negative pivots = 294
> INFOG(13) Number of delayed pivots = 452
> Number of 2x2 pivots in type 1 nodes = 0
> Number of 2X2 pivots in type 2 nodes = 0
> Nb of null pivots detected by ICNTL(24) = 0
> INFOG(28) Estimated deficiency = 0
> INFOG(14) Number of memory compress = 1
>
> Elapsed time in factorization driver= 0.0042
> On return from DMUMPS, INFOG(1)= -9
> On return from DMUMPS, INFOG(2)= 22
> terminate called after throwing an instance of 'std::runtime_error'
> what():
>
> *** -------------------------------------------------------------------------
> *** DOLFIN encountered an error. If you are not able to resolve this issue
> *** using the information listed below, you can ask for help at
> ***
> *** fenics-support at googlegroups.com
> ***
> *** Remember to include the error message listed below and, if possible,
> *** include a *minimal* running example to reproduce the error.
> ***
> *** -------------------------------------------------------------------------
> *** Error: Unable to solve linear system using PETSc Krylov solver.
> *** Reason: Solution failed to converge in 0 iterations (PETSc reason DIVERGED_PC_FAILED, residual norm ||r|| = 0.000000e+00).
> *** Where: This error was encountered inside PETScKrylovSolver.cpp.
> *** Process: 0
> ***
> *** DOLFIN version: 2019.1.0
> *** Git changeset: 74d7efe1e84d65e9433fd96c50f1d278fa3e3f3f
> *** -------------------------------------------------------------------------
>
> Aborted (core dumped)
>
>
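
For reading the log above: in MUMPS, INFOG(1) is the return status (0 for success, negative for an error, positive for a warning) and INFOG(2) carries extra detail. According to the MUMPS user guide, -9 means the main internal work array S was too small, a positive INFOG(2) is the number of entries that were missing, and ICNTL(14) is the parameter to increase. The sketch below is not part of the original post. The helper names parse_infog_line and describe_infog1 are hypothetical, and only the -9 code is spelled out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Parse a log line of the form "... INFOG(<index>)= <value>".
   Returns 1 and stores index and value on success, 0 otherwise.
   Lines such as "INFOG(10) Integer space ..." do not match. */
int parse_infog_line(const char *line, int *index, int *value)
{
    int idx, val;
    const char *p;

    assert(line != NULL && index != NULL && value != NULL);
    p = strstr(line, "INFOG(");
    if (p == NULL || sscanf(p, "INFOG(%d) = %d", &idx, &val) != 2)
        return 0;
    *index = idx;
    *value = val;
    return 1;
}

/* Describe an INFOG(1) status. Only -9 is decoded in detail here;
   everything else falls back to the sign convention. */
const char *describe_infog1(int code)
{
    if (code == 0)
        return "success";
    if (code == -9)
        return "work array S too small; positive INFOG(2) = missing "
               "entries; consider increasing ICNTL(14)";
    return code < 0 ? "error (see the MUMPS user guide)"
                    : "warning (see the MUMPS user guide)";
}
```

Applied to the two "On return from DMUMPS" lines in the log, this gives index 1 with value -9 and index 2 with value 22, that is, the factorization ran 22 entries short of workspace. In the option style used in the post, ICNTL(14) is set through -mat_mumps_icntl_14. How far to raise it above the current 20 is a judgment call.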