[petsc-users] Setting up MUMPS in PETSc
Jed Brown
jedbrown at mcs.anl.gov
Tue Oct 23 18:38:21 CDT 2012
On Tue, Oct 23, 2012 at 6:34 PM, Jinquan Zhong <jzhong at scsolutions.com> wrote:
> That is true. I used
>
> -pc_type lu -pc_factor_mat_solver_package mumps -ksp_view
> -mat_mumps_icntl_4 2 -mat_mumps_icntl_5 0 -mat_mumps_icntl_18 3
> -mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2 -ksp_type gmres
> -ksp_monitor_singular_value -ksp_gmres_restart 1000
>
Read my reply about using GMRES to estimate the condition number _while_ using
a preconditioner.
> Here is the info I got from the setting I had. I didn’t see the
> condition number appear:
>
> Entering ZMUMPS driver with JOB, N, NZ = 1 894 0
>
> ZMUMPS 4.10.0
> L U Solver for unsymmetric matrices
> Type of parallelism: Working host
>
> ****** ANALYSIS STEP ********
>
> ** Max-trans not allowed because matrix is distributed
> Using ParMETIS for parallel ordering.
> Structual symmetry is:100%
> WARNING: Largest root node of size 173 not selected for parallel execution
>
> Leaving analysis phase with ...
> INFOG(1) = 0
> INFOG(2) = 0
> -- (20) Number of entries in factors (estim.) = 306174
> -- (3) Storage of factors (REAL, estimated) = 306174
> -- (4) Storage of factors (INT , estimated) = 7102
> -- (5) Maximum frontal size (estimated) = 495
> -- (6) Number of nodes in the tree = 66
> -- (32) Type of analysis effectively used = 2
> -- (7) Ordering option effectively used = 2
> ICNTL(6) Maximum transversal option = 0
> ICNTL(7) Pivot order option = 7
> Percentage of memory relaxation (effective) = 25
> Number of level 2 nodes = 0
> Number of split nodes = 0
> RINFOG(1) Operations during elimination (estim)= 8.929D+07
> Distributed matrix entry format (ICNTL(18)) = 3
> ** Rank of proc needing largest memory in IC facto : 1
> ** Estimated corresponding MBYTES for IC facto : 43
> ** Estimated avg. MBYTES per work. proc at facto (IC) : 39
> ** TOTAL space in MBYTES for IC factorization : 156
> ** Rank of proc needing largest memory for OOC facto : 1
> ** Estimated corresponding MBYTES for OOC facto : 46
> ** Estimated avg. MBYTES per work. proc at facto (OOC) : 41
> ** TOTAL space in MBYTES for OOC factorization : 165
>
> Entering ZMUMPS driver with JOB, N, NZ = 2 894 263360
>
> ****** FACTORIZATION STEP ********
>
> GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
> NUMBER OF WORKING PROCESSES = 4
> OUT-OF-CORE OPTION (ICNTL(22)) = 0
> REAL SPACE FOR FACTORS = 306174
> INTEGER SPACE FOR FACTORS = 7102
> MAXIMUM FRONTAL SIZE (ESTIMATED) = 495
> NUMBER OF NODES IN THE TREE = 66
> Convergence error after scaling for ONE-NORM (option 7/8) = 0.31D+00
> Maximum effective relaxed size of S = 306300
> Average effective relaxed size of S = 185353
>
> REDISTRIB: TOTAL DATA LOCAL/SENT = 68276 195084
> GLOBAL TIME FOR MATRIX DISTRIBUTION = 0.0145
> ** Memory relaxation parameter ( ICNTL(14) ) : 25
> ** Rank of processor needing largest memory in facto : 1
> ** Space in MBYTES used by this processor for facto : 43
> ** Avg. Space in MBYTES per working proc during facto : 39
>
> ELAPSED TIME FOR FACTORIZATION = 0.0862
> Maximum effective space used in S (KEEP8(67) = 245025
> Average effective space used in S (KEEP8(67) = 119077
> ** EFF Min: Rank of processor needing largest memory : 1
> ** EFF Min: Space in MBYTES used by this processor : 42
> ** EFF Min: Avg. Space in MBYTES per working proc : 37
>
> GLOBAL STATISTICS
> RINFOG(2) OPERATIONS IN NODE ASSEMBLY = 2.372D+05
> ------(3) OPERATIONS IN NODE ELIMINATION= 8.929D+07
> INFOG (9) REAL SPACE FOR FACTORS = 306184
> INFOG(10) INTEGER SPACE FOR FACTORS = 7104
> INFOG(11) MAXIMUM FRONT SIZE = 495
> INFOG(29) NUMBER OF ENTRIES IN FACTORS = 306184
> INFOG(12) NB OF OFF DIAGONAL PIVOTS = 29
> INFOG(13) NUMBER OF DELAYED PIVOTS = 1
> INFOG(14) NUMBER OF MEMORY COMPRESS = 0
> KEEP8(108) Extra copies IP stacking = 0
>
> Entering ZMUMPS driver with JOB, N, NZ = 3 894 263360
>
> ****** SOLVE & CHECK STEP ********
>
> STATISTICS PRIOR SOLVE PHASE ...........
> NUMBER OF RIGHT-HAND-SIDES = 1
> BLOCKING FACTOR FOR MULTIPLE RHS = 1
> ICNTL (9) = 1
> --- (10) = 0
> --- (11) = 0
> --- (20) = 0
> --- (21) = 1
> --- (30) = 0
> ** Rank of processor needing largest memory in solve : 1
> ** Space in MBYTES used by this processor for solve : 6
> ** Avg. Space in MBYTES per working proc during solve : 3
>
> 0 KSP Residual norm 1.890433086271e+01 % max 1.000000000000e+00 min 1.000000000000e+00 max/min 1.000000000000e+00
>
> Entering ZMUMPS driver with JOB, N, NZ = 3 894 263360
>
> ****** SOLVE & CHECK STEP ********
>
> STATISTICS PRIOR SOLVE PHASE ...........
> NUMBER OF RIGHT-HAND-SIDES = 1
> BLOCKING FACTOR FOR MULTIPLE RHS = 1
> ICNTL (9) = 1
> --- (10) = 0
> --- (11) = 0
> --- (20) = 0
> --- (21) = 1
> --- (30) = 0
> ** Rank of processor needing largest memory in solve : 1
> ** Space in MBYTES used by this processor for solve : 6
> ** Avg. Space in MBYTES per working proc during solve : 3
>
> 1 KSP Residual norm 1.804170434909e-06 % max 1.000000180789e+00 min 1.000000180789e+00 max/min 1.000000000000e+00
>
> Entering ZMUMPS driver with JOB, N, NZ = 3 894 263360
>
> ****** SOLVE & CHECK STEP ********
>
> STATISTICS PRIOR SOLVE PHASE ...........
> NUMBER OF RIGHT-HAND-SIDES = 1
> BLOCKING FACTOR FOR MULTIPLE RHS = 1
> ICNTL (9) = 1
> --- (10) = 0
> --- (11) = 0
> --- (20) = 0
> --- (21) = 1
> --- (30) = 0
> ** Rank of processor needing largest memory in solve : 1
> ** Space in MBYTES used by this processor for solve : 6
> ** Avg. Space in MBYTES per working proc during solve : 3
>
> 2 KSP Residual norm 4.466758995607e-13 % max 1.000000467998e+00 min 9.999992798573e-01 max/min 1.000001188141e+00
>
Here are your estimates of max and min singular values, and their ratio
(the condition number). These are for the *preconditioned operator*. Since
you use a direct solver, it is almost exactly 1.
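
To see why the ratio comes out as 1, here is a minimal sketch in plain Python (not PETSc). The 2x2 matrix `A` and all helper names are made up for illustration; the "preconditioner" is the exact inverse, standing in for the LU factorization that MUMPS provides.

```python
import math

def singular_values_2x2(m):
    """Closed-form (s_max, s_min) of a real 2x2 matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    # s_max^2 + s_min^2 is the squared Frobenius norm; s_max * s_min = |det|.
    frob2 = a * a + b * b + c * c + d * d
    det = a * d - b * c
    # frob2^2 - 4*det^2 factors into two sums of squares (avoids cancellation).
    disc = math.sqrt(((a - d) ** 2 + (b + c) ** 2) * ((a + d) ** 2 + (b - c) ** 2))
    s_max = math.sqrt((frob2 + disc) / 2.0)
    s_min = abs(det) / s_max
    return s_max, s_min

def cond(m):
    """2-norm condition number: ratio of largest to smallest singular value."""
    s_max, s_min = singular_values_2x2(m)
    return s_max / s_min

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul_2x2(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Hypothetical, nearly singular test matrix (an assumption for this sketch).
A = [[1.0, 2.0], [3.0, 6.001]]

# Left preconditioning with an exact factorization: the operator GMRES sees
# is inv(A) * A, which is the identity up to rounding.
preconditioned = matmul_2x2(inverse_2x2(A), A)

cond_A = cond(A)               # large: A itself is ill-conditioned
cond_P = cond(preconditioned)  # essentially 1

print("cond(A)          =", cond_A)
print("cond(inv(A) * A) =", cond_P)
```

The first number is large and the second is 1 up to rounding, which is the same pattern as the max/min column in your monitor output. If what you want is an estimate for A itself, the same GMRES options would have to be run without a preconditioner (-pc_type none), which may need many more iterations to give a useful estimate.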
> KSP Object: 4 MPI processes
> type: gmres
> GMRES: restart=1000, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
> GMRES: happy breakdown tolerance 1e-30
> maximum iterations=10000, initial guess is zero
> tolerances: relative=1e-12, absolute=1e-12, divergence=10000
> left preconditioning
> using PRECONDITIONED norm type for convergence test
> PC Object: 4 MPI processes
> type: lu
> LU: out-of-place factorization
> tolerance for zero pivot 2.22045e-14
> matrix ordering: natural
> factor fill ratio given 0, needed 0
> Factored matrix follows:
> Matrix Object: 4 MPI processes
> type: mpiaij
> rows=894, cols=894
> package used to perform factorization: mumps
> total: nonzeros=306174, allocated nonzeros=306174
> total number of mallocs used during MatSetValues calls =0
>
> MUMPS run parameters:
> SYM (matrix type): 0
> PAR (host participation): 1
> ICNTL(1) (output for error): 6
> ICNTL(2) (output of diagnostic msg): 0
> ICNTL(3) (output for global info): 6
> ICNTL(4) (level of printing): 2
> ICNTL(5) (input mat struct): 0
> ICNTL(6) (matrix prescaling): 7
> ICNTL(7) (sequentia matrix ordering):7
> ICNTL(8) (scalling strategy): 77
> ICNTL(10) (max num of refinements): 0
> ICNTL(11) (error analysis): 0
> ICNTL(12) (efficiency control): 1
> ICNTL(13) (efficiency control): 0
> ICNTL(14) (percentage of estimated workspace increase): 20
> ICNTL(18) (input mat struct): 3
> ICNTL(19) (Shur complement info): 0
> ICNTL(20) (rhs sparse pattern): 0
> ICNTL(21) (solution struct): 1
> ICNTL(22) (in-core/out-of-core facility): 0
> ICNTL(23) (max size of memory can be allocated locally):0
> ICNTL(24) (detection of null pivot rows): 0
> ICNTL(25) (computation of a null space basis): 0
> ICNTL(26) (Schur options for rhs or solution): 0
> ICNTL(27) (experimental parameter): -8
> ICNTL(28) (use parallel or sequential ordering): 2
> ICNTL(29) (parallel ordering): 2
> ICNTL(30) (user-specified set of entries in inv(A)): 0
> ICNTL(31) (factors is discarded in the solve phase): 0
> ICNTL(33) (compute determinant): 0
> CNTL(1) (relative pivoting threshold): 0.01
> CNTL(2) (stopping criterion of refinement): 1.49012e-08
> CNTL(3) (absolute pivoting threshold): 0
> CNTL(4) (value of static pivoting): -1
> CNTL(5) (fixation for null pivots): 0
>
> RINFO(1) (local estimated flops for the elimination after analysis):
> [0] 4.25638e+06
> [1] 7.22129e+07
> [2] 8.01308e+06
> [3] 4.81122e+06
> RINFO(2) (local estimated flops for the assembly after factorization):
> [0] 68060
> [1] 5860
> [2] 83570
> [3] 79676
> RINFO(3) (local estimated flops for the elimination after factorization):
> [0] 4.25638e+06
> [1] 7.2213e+07
> [2] 8.01308e+06
> [3] 4.81122e+06
> INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization):
> [0] 37
> [1] 43
> [2] 38
> [3] 38
> INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization):
> [0] 37
> [1] 43
> [2] 38
> [3] 38
> INFO(23) (num of pivots eliminated on this processor after factorization):
> [0] 350
> [1] 319
> [2] 134
> [3] 91
>
> RINFOG(1) (global estimated flops for the elimination after analysis): 8.92936e+07
> RINFOG(2) (global estimated flops for the assembly after factorization): 237166
> RINFOG(3) (global estimated flops for the elimination after factorization): 8.92937e+07
> (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0,0)*(2^0)
> INFOG(3) (estimated real workspace for factors on all processors after analysis): 306174
> INFOG(4) (estimated integer workspace for factors on all processors after analysis): 7102
> INFOG(5) (estimated maximum front size in the complete tree): 495
> INFOG(6) (number of nodes in the complete tree): 66
> INFOG(7) (ordering option effectively use after analysis): 2
> INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100
> INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 306184
> INFOG(10) (total integer space store the matrix factors after factorization): 7104
> INFOG(11) (order of largest frontal matrix after factorization): 495
> INFOG(12) (number of off-diagonal pivots): 29
> INFOG(13) (number of delayed pivots after factorization): 1
> INFOG(14) (number of memory compress after factorization): 0
> INFOG(15) (number of steps of iterative refinement after solution): 0
> INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 43
> INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 156
> INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 43
> INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 156
> INFOG(20) (estimated number of entries in the factors): 306174
> INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 42
> INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 150
> INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0
> INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1
> INFOG(25) (after factorization: number of pivots modified by static pivoting): 0
>
> linear system matrix = precond matrix:
> Matrix Object: 4 MPI processes
> type: mpiaij
> rows=894, cols=894
> total: nonzeros=263360, allocated nonzeros=263360
> total number of mallocs used during MatSetValues calls =0
> using I-node (on process 0) routines: found 47 nodes, limit used is 5
>
> >> # of iterations: 2
> KSPConvergedReason: 3
> Entering ZMUMPS driver with JOB, N, NZ = -2 894 263360
>
> Jinquan
>
> From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Matthew Knepley
> Sent: Tuesday, October 23, 2012 4:30 PM
> To: PETSc users list
> Subject: Re: [petsc-users] Setting up MUMPS in PETSc
>
> On Tue, Oct 23, 2012 at 7:17 PM, Jinquan Zhong <jzhong at scsolutions.com> wrote:
>
> Thanks, Jed.
>
> Any way to get the condition number? I used
>
> -ksp_type gmres -ksp_monitor_singular_value -ksp_gmres_restart 1000
>
> It didn’t work.
>
> This is not helpful. Would you understand this if someone mailed you "It
> didn't work"? Send the output.
>
> Matt
>
> Jinquan
>
> From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Jed Brown
> Sent: Tuesday, October 23, 2012 3:28 PM
> To: PETSc users list
> Subject: Re: [petsc-users] Setting up MUMPS in PETSc
>
> On Tue, Oct 23, 2012 at 5:21 PM, Jinquan Zhong <jzhong at scsolutions.com> wrote:
>
> That is new for me. What would you suggest, Matt?
>
> Were you using LU or Cholesky before? That is the difference between
> -pc_type lu and -pc_type cholesky. Use -pc_factor_mat_solver_package mumps
> to choose MUMPS. You can access the MUMPS options with
> -mat_mumps_icntl_opaquenumber.
>
> It looks like PETSc's Clique interface does not work in parallel. (A
> student was working on it recently and it seems to work in serial.) When it
> is fixed to work in parallel, that is almost certainly what you should use.
> Alternatively, there may well be a Fast method, depending on the structure
> of the system and that fat dense block.
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>