[petsc-users] Setting up MUMPS in PETSc

Jinquan Zhong jzhong at scsolutions.com
Tue Oct 23 18:42:12 CDT 2012


I am seeing

652 KSP Residual norm 6.568962571627e+02 % max 1.464924876151e+14 min 8.431983181754e+03 max/min 1.737343214015e+10
653 KSP Residual norm 6.562904211497e+02 % max 1.464924876151e+14 min 8.431776971754e+03 max/min 1.737385702988e+10

when I used
-pc_type none

Does that indicate the Cond# is 1.7x10e10?

Jinquan

From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Jed Brown
Sent: Tuesday, October 23, 2012 4:38 PM
To: PETSc users list
Subject: Re: [petsc-users] Setting up MUMPS in PETSc

On Tue, Oct 23, 2012 at 6:34 PM, Jinquan Zhong <jzhong at scsolutions.com<mailto:jzhong at scsolutions.com>> wrote:

That is true.  I used

-pc_type lu -pc_factor_mat_solver_package mumps -ksp_view -mat_mumps_icntl_4 2 -mat_mumps_icntl_5 0 -mat_mumps_icntl_18 3 -mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2 -ksp_type gmres -ksp_monitor_singular_value -ksp_gmres_restart 1000

Read my reply about using GMRES to estimate condition number _while_ using a preconditioner.



Here is the info I got from the setting I had.  I didn’t see the condition number appeared:


Entering ZMUMPS driver with JOB, N, NZ =   1         894              0

ZMUMPS 4.10.0
L U Solver for unsymmetric matrices
Type of parallelism: Working host

****** ANALYSIS STEP ********

** Max-trans not allowed because matrix is distributed
Using ParMETIS for parallel ordering.
Structual symmetry is:100%
  WARNING: Largest root node of size          173  not selected for parallel execution

Leaving analysis phase with  ...
INFOG(1)                                       =               0
INFOG(2)                                       =               0
-- (20) Number of entries in factors (estim.) =          306174
--  (3) Storage of factors  (REAL, estimated) =          306174
--  (4) Storage of factors  (INT , estimated) =            7102
--  (5) Maximum frontal size      (estimated) =             495
--  (6) Number of nodes in the tree           =              66
-- (32) Type of analysis effectively used     =               2
--  (7) Ordering option effectively used      =               2
ICNTL(6) Maximum transversal option            =               0
ICNTL(7) Pivot order option                    =               7
Percentage of memory relaxation (effective)    =              25
Number of level 2 nodes                        =               0
Number of split nodes                          =               0
RINFOG(1) Operations during elimination (estim)=   8.929D+07
Distributed matrix entry format (ICNTL(18))    =               3
** Rank of proc needing largest memory in IC facto        :         1
** Estimated corresponding MBYTES for IC facto            :        43
** Estimated avg. MBYTES per work. proc at facto (IC)     :        39
** TOTAL     space in MBYTES for IC factorization         :       156
** Rank of proc needing largest memory for OOC facto      :         1
** Estimated corresponding MBYTES for OOC facto           :        46
** Estimated avg. MBYTES per work. proc at facto (OOC)    :        41
** TOTAL     space in MBYTES for OOC factorization        :       165
Entering ZMUMPS driver with JOB, N, NZ =   2         894         263360

****** FACTORIZATION STEP ********


GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
NUMBER OF WORKING PROCESSES              =           4
OUT-OF-CORE OPTION (ICNTL(22))           =           0
REAL SPACE FOR FACTORS                   =      306174
INTEGER SPACE FOR FACTORS                =        7102
MAXIMUM FRONTAL SIZE (ESTIMATED)         =         495
NUMBER OF NODES IN THE TREE              =          66
Convergence error after scaling for ONE-NORM (option 7/8)   = 0.31D+00
Maximum effective relaxed size of S              =      306300
Average effective relaxed size of S              =      185353

REDISTRIB: TOTAL DATA LOCAL/SENT         =       68276      195084
GLOBAL TIME FOR MATRIX DISTRIBUTION       =      0.0145
** Memory relaxation parameter ( ICNTL(14)  )            :        25
** Rank of processor needing largest memory in facto     :         1
** Space in MBYTES used by this processor for facto      :        43
** Avg. Space in MBYTES per working proc during facto    :        39

ELAPSED TIME FOR FACTORIZATION           =      0.0862
Maximum effective space used in S   (KEEP8(67)   =      245025
Average effective space used in S   (KEEP8(67)   =      119077
** EFF Min: Rank of processor needing largest memory :         1
** EFF Min: Space in MBYTES used by this processor   :        42
** EFF Min: Avg. Space in MBYTES per working proc    :        37

GLOBAL STATISTICS
 RINFOG(2)  OPERATIONS IN NODE ASSEMBLY   = 2.372D+05
------(3)  OPERATIONS IN NODE ELIMINATION= 8.929D+07
INFOG (9)  REAL SPACE FOR FACTORS        =      306184
INFOG(10)  INTEGER SPACE FOR FACTORS     =        7104
INFOG(11)  MAXIMUM FRONT SIZE            =         495
INFOG(29)  NUMBER OF ENTRIES IN FACTORS  =      306184
INFOG(12) NB OF OFF DIAGONAL PIVOTS      =          29
INFOG(13)  NUMBER OF DELAYED PIVOTS      =           1
INFOG(14)  NUMBER OF MEMORY COMPRESS     =           0
KEEP8(108) Extra copies IP stacking      =           0
Entering ZMUMPS driver with JOB, N, NZ =   3         894         263360


****** SOLVE & CHECK STEP ********


STATISTICS PRIOR SOLVE PHASE     ...........
NUMBER OF RIGHT-HAND-SIDES                    =           1
BLOCKING FACTOR FOR MULTIPLE RHS              =           1
ICNTL (9)                                     =           1
  --- (10)                                     =           0
  --- (11)                                     =           0
  --- (20)                                     =           0
  --- (21)                                     =           1
 --- (30)                                     =           0
** Rank of processor needing largest memory in solve     :         1
** Space in MBYTES used by this processor for solve      :         6
** Avg. Space in MBYTES per working proc during solve    :         3
  0 KSP Residual norm 1.890433086271e+01 % max 1.000000000000e+00 min 1.000000000000e+00 max/min 1.000000000000e+00
Entering ZMUMPS driver with JOB, N, NZ =   3         894         263360


****** SOLVE & CHECK STEP ********


STATISTICS PRIOR SOLVE PHASE     ...........
NUMBER OF RIGHT-HAND-SIDES                    =           1
BLOCKING FACTOR FOR MULTIPLE RHS              =           1
ICNTL (9)                                     =           1
  --- (10)                                     =           0
  --- (11)                                     =           0
  --- (20)                                     =           0
  --- (21)                                     =           1
  --- (30)                                     =           0
** Rank of processor needing largest memory in solve     :         1
** Space in MBYTES used by this processor for solve      :         6
** Avg. Space in MBYTES per working proc during solve    :         3
  1 KSP Residual norm 1.804170434909e-06 % max 1.000000180789e+00 min 1.000000180789e+00 max/min 1.000000000000e+00
Entering ZMUMPS driver with JOB, N, NZ =   3         894         263360


****** SOLVE & CHECK STEP ********


STATISTICS PRIOR SOLVE PHASE     ...........
NUMBER OF RIGHT-HAND-SIDES                    =           1
BLOCKING FACTOR FOR MULTIPLE RHS              =           1
ICNTL (9)                                     =           1
  --- (10)                                     =           0
  --- (11)                                     =           0
  --- (20)                                     =           0
  --- (21)                                     =           1
  --- (30)                                     =           0
** Rank of processor needing largest memory in solve     :         1
** Space in MBYTES used by this processor for solve      :         6
** Avg. Space in MBYTES per working proc during solve    :         3
  2 KSP Residual norm 4.466758995607e-13 % max 1.000000467998e+00 min 9.999992798573e-01 max/min 1.000001188141e+00

Here are your estimates of max and min singular values, and their ratio (the condition number). These are for the preconditioned operator. Since you use a direct solver, it is almost exactly 1.

KSP Object: 4 MPI processes
  type: gmres
    GMRES: restart=1000, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
    GMRES: happy breakdown tolerance 1e-30
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-12, absolute=1e-12, divergence=10000
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: 4 MPI processes
  type: lu
    LU: out-of-place factorization
    tolerance for zero pivot 2.22045e-14
    matrix ordering: natural
    factor fill ratio given 0, needed 0
      Factored matrix follows:
        Matrix Object:         4 MPI processes
          type: mpiaij
          rows=894, cols=894
          package used to perform factorization: mumps
          total: nonzeros=306174, allocated nonzeros=306174
          total number of mallocs used during MatSetValues calls =0
            MUMPS run parameters:
              SYM (matrix type):                   0
              PAR (host participation):            1
              ICNTL(1) (output for error):         6
              ICNTL(2) (output of diagnostic msg): 0
              ICNTL(3) (output for global info):   6
              ICNTL(4) (level of printing):        2
              ICNTL(5) (input mat struct):         0
              ICNTL(6) (matrix prescaling):        7
              ICNTL(7) (sequentia matrix ordering):7
              ICNTL(8) (scalling strategy):        77
              ICNTL(10) (max num of refinements):  0
              ICNTL(11) (error analysis):          0
              ICNTL(12) (efficiency control):                         1
              ICNTL(13) (efficiency control):                         0
              ICNTL(14) (percentage of estimated workspace increase): 20
              ICNTL(18) (input mat struct):                           3
              ICNTL(19) (Shur complement info):                       0
              ICNTL(20) (rhs sparse pattern):                         0
              ICNTL(21) (solution struct):                            1
              ICNTL(22) (in-core/out-of-core facility):               0
              ICNTL(23) (max size of memory can be allocated locally):0
              ICNTL(24) (detection of null pivot rows):               0
              ICNTL(25) (computation of a null space basis):          0
              ICNTL(26) (Schur options for rhs or solution):          0
              ICNTL(27) (experimental parameter):                     -8
              ICNTL(28) (use parallel or sequential ordering):        2
              ICNTL(29) (parallel ordering):                          2
              ICNTL(30) (user-specified set of entries in inv(A)):    0
              ICNTL(31) (factors is discarded in the solve phase):    0
              ICNTL(33) (compute determinant):                        0
              CNTL(1) (relative pivoting threshold):      0.01
              CNTL(2) (stopping criterion of refinement): 1.49012e-08
              CNTL(3) (absolute pivoting threshold):      0
              CNTL(4) (value of static pivoting):         -1
              CNTL(5) (fixation for null pivots):         0
              RINFO(1) (local estimated flops for the elimination after analysis):
                [0] 4.25638e+06
                [1] 7.22129e+07
                [2] 8.01308e+06
                [3] 4.81122e+06
              RINFO(2) (local estimated flops for the assembly after factorization):
                [0]  68060
                [1]  5860
                [2]  83570
                [3]  79676
              RINFO(3) (local estimated flops for the elimination after factorization):
                [0]  4.25638e+06
                [1]  7.2213e+07
                [2]  8.01308e+06
                [3]  4.81122e+06
              INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization):
              [0] 37
              [1] 43
              [2] 38
              [3] 38
              INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization):
                [0] 37
                [1] 43
                [2] 38
                [3] 38
              INFO(23) (num of pivots eliminated on this processor after factorization):
                [0] 350
                [1] 319
                [2] 134
                [3] 91
              RINFOG(1) (global estimated flops for the elimination after analysis): 8.92936e+07
              RINFOG(2) (global estimated flops for the assembly after factorization): 237166
              RINFOG(3) (global estimated flops for the elimination after factorization): 8.92937e+07
              (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0,0)*(2^0)
              INFOG(3) (estimated real workspace for factors on all processors after analysis): 306174
              INFOG(4) (estimated integer workspace for factors on all processors after analysis): 7102
              INFOG(5) (estimated maximum front size in the complete tree): 495
              INFOG(6) (number of nodes in the complete tree): 66
              INFOG(7) (ordering option effectively use after analysis): 2
              INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100
              INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 306184
              INFOG(10) (total integer space store the matrix factors after factorization): 7104
              INFOG(11) (order of largest frontal matrix after factorization): 495
              INFOG(12) (number of off-diagonal pivots): 29
              INFOG(13) (number of delayed pivots after factorization): 1
              INFOG(14) (number of memory compress after factorization): 0
              INFOG(15) (number of steps of iterative refinement after solution): 0
              INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 43
              INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 156
              INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 43
              INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 156
              INFOG(20) (estimated number of entries in the factors): 306174
              INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 42
              INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 150
              INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0
              INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1
              INFOG(25) (after factorization: number of pivots modified by static pivoting): 0
  linear system matrix = precond matrix:
  Matrix Object:   4 MPI processes
    type: mpiaij
    rows=894, cols=894
    total: nonzeros=263360, allocated nonzeros=263360
    total number of mallocs used during MatSetValues calls =0
      using I-node (on process 0) routines: found 47 nodes, limit used is 5
>>  # of iterations: 2
KSPConvergedReason: 3
Entering ZMUMPS driver with JOB, N, NZ =  -2         894         263360


Jinquan

From: petsc-users-bounces at mcs.anl.gov<mailto:petsc-users-bounces at mcs.anl.gov> [mailto:petsc-users-bounces at mcs.anl.gov<mailto:petsc-users-bounces at mcs.anl.gov>] On Behalf Of Matthew Knepley
Sent: Tuesday, October 23, 2012 4:30 PM

To: PETSc users list
Subject: Re: [petsc-users] Setting up MUMPS in PETSc

On Tue, Oct 23, 2012 at 7:17 PM, Jinquan Zhong <jzhong at scsolutions.com<mailto:jzhong at scsolutions.com>> wrote:
Thanks, Jed.

Any way to get the condition number.  I used

-ksp_type gmres -ksp_monitor_singular_value -ksp_gmres_restart 1000

It didn’t work.

This is not helpful. Would you understand this if someone mailed you "It didn't work"? Send the output.

   Matt


Jinquan

From: petsc-users-bounces at mcs.anl.gov<mailto:petsc-users-bounces at mcs.anl.gov> [mailto:petsc-users-bounces at mcs.anl.gov<mailto:petsc-users-bounces at mcs.anl.gov>] On Behalf Of Jed Brown
Sent: Tuesday, October 23, 2012 3:28 PM
To: PETSc users list
Subject: Re: [petsc-users] Setting up MUMPS in PETSc

On Tue, Oct 23, 2012 at 5:21 PM, Jinquan Zhong <jzhong at scsolutions.com<mailto:jzhong at scsolutions.com>> wrote:
That is new for me.  What would you suggest, Matt?

Were you using LU or Cholesky before? That is the difference between -pc_type lu and -pc_type cholesky. Use -pc_factor_mat_solver_package mumps to choose MUMPS. You can access the MUMPS options with -mat_mumps_icntl_opaquenumber.

It looks like PETSc's Clique interface does not work in parallel. (A student was working on it recently and it seems to work in serial.) When it is fixed to work in parallel, that is almost certainly what you should use. Alternatively, there may well be a Fast method, depending on the structure of the system and that fat dense block.



--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20121023/3011f676/attachment-0001.html>


More information about the petsc-users mailing list