[petsc-dev] GAMG broken in master
Barry Smith
bsmith at mcs.anl.gov
Wed Feb 18 23:01:57 CST 2015
WTF? This makes no sense. Can it be reverted?
> On Feb 18, 2015, at 10:57 PM, Tobin Isaac <tisaac at ices.utexas.edu> wrote:
>
> On Wed, Feb 18, 2015 at 10:06:18PM -0600, Barry Smith wrote:
>>
>> Hmm, it seems GAMG is only doing 2 levels in master for all problems?
>>
>> ./ex29 -da_refine 8 -pc_type gamg -ksp_view
>>
>> uses only two levels. Makes no sense.
>>
>> Did someone break it?
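
[A quick way to confirm the behavior Barry is seeing, beyond reading the "levels=2" line that -ksp_view prints: since GAMG is implemented on top of PCMG, PCMGGetLevels() reports how many levels were actually built. This helper is an editorial sketch, not code from the thread; call it after KSPSetUp() or KSPSolve() on a KSP configured with -pc_type gamg.]

#include <petscksp.h>

/* Editorial sketch: report how many multigrid levels GAMG built. */
static PetscErrorCode ReportGAMGLevels(KSP ksp)
{
  PC             pc;
  PetscInt       nlevels;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCMGGetLevels(pc, &nlevels);CHKERRQ(ierr);   /* GAMG sits on top of PCMG */
  ierr = PetscPrintf(PetscObjectComm((PetscObject)ksp), "GAMG built %D levels\n", nlevels);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
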
>
> Here's the culprit:
>
> https://bitbucket.org/petsc/petsc/commits/25a145a7bcab6e5b3c8766679c77bee80f328690#Lsrc/ksp/pc/impls/gamg/gamg.cT664
>
> It now always stops when there is only one active process.
>
> Toby
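
[Toby's reading of that commit is that the GAMG level-building loop gained an extra exit condition. The following is a minimal sketch of that kind of stopping rule, with made-up names rather than the actual gamg.c variables: the loop now also terminates once only one MPI process remains active, so a sequential run stops coarsening after the first coarse grid, no matter how large it still is or what -pc_gamg_coarse_eq_limit requests.]

#include <petscsys.h>

/* Illustrative sketch only -- not the actual gamg.c code. */
static PetscErrorCode GAMGStopSketch(MPI_Comm comm, PetscInt coarse_neq, PetscInt coarse_eq_limit, PetscBool *stop)
{
  PetscMPIInt    nactive;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = MPI_Comm_size(comm, &nactive);CHKERRQ(ierr);
  /* intended rule: stop once the coarse problem is small enough      */
  /* new rule:      also stop whenever a single active process is left */
  *stop = (PetscBool)(coarse_neq <= coarse_eq_limit || nactive <= 1);
  PetscFunctionReturn(0);
}
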
>
>>
>>
>>
>>
>>
>>
>>> On Feb 18, 2015, at 9:48 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>>
>>>
>>> Mark,
>>>
>>> When I run ksp/ksp/examples/tutorials/ex45 I get a VERY large coarse problem. It seems to ignore the -pc_gamg_coarse_eq_limit 200 argument. Any idea what is going on?
>>>
>>> Thanks
>>>
>>> Barry
>>>
>>>
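
[For reference, the -pc_gamg_coarse_eq_limit option Barry passes corresponds to PCGAMGSetCoarseEqLim(). A minimal sketch of requesting the same 200-equation cap from code follows; the report above is that this cap is ignored and the coarse grid ends up with 16587 rows. The operator setup is elided, as ex45 provides it.]

#include <petscksp.h>

int main(int argc, char **argv)
{
  KSP            ksp;
  PC             pc;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCGAMG);CHKERRQ(ierr);
  ierr = PCGAMGSetCoarseEqLim(pc, 200);CHKERRQ(ierr);   /* same as -pc_gamg_coarse_eq_limit 200 */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  /* ... set the operator and call KSPSolve() as ex45 does ... */
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}
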
>>> $ ./ex45 -da_refine 3 -pc_type gamg -ksp_monitor -ksp_view -log_summary -pc_gamg_coarse_eq_limit 200
>>> 0 KSP Residual norm 2.790769524030e+02
>>> 1 KSP Residual norm 4.484052193577e+01
>>> 2 KSP Residual norm 2.409368790441e+00
>>> 3 KSP Residual norm 1.553421589919e-01
>>> 4 KSP Residual norm 9.821441923699e-03
>>> 5 KSP Residual norm 5.610434857134e-04
>>> KSP Object: 1 MPI processes
>>> type: gmres
>>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>> GMRES: happy breakdown tolerance 1e-30
>>> maximum iterations=10000
>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>>> left preconditioning
>>> using nonzero initial guess
>>> using PRECONDITIONED norm type for convergence test
>>> PC Object: 1 MPI processes
>>> type: gamg
>>> MG: type is MULTIPLICATIVE, levels=2 cycles=v
>>> Cycles per PCApply=1
>>> Using Galerkin computed coarse grid matrices
>>> Coarse grid solver -- level -------------------------------
>>> KSP Object: (mg_coarse_) 1 MPI processes
>>> type: gmres
>>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>> GMRES: happy breakdown tolerance 1e-30
>>> maximum iterations=1, initial guess is zero
>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>>> left preconditioning
>>> using NONE norm type for convergence test
>>> PC Object: (mg_coarse_) 1 MPI processes
>>> type: bjacobi
>>> block Jacobi: number of blocks = 1
>>> Local solve is same for all blocks, in the following KSP and PC objects:
>>> KSP Object: (mg_coarse_sub_) 1 MPI processes
>>> type: preonly
>>> maximum iterations=1, initial guess is zero
>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>>> left preconditioning
>>> using NONE norm type for convergence test
>>> PC Object: (mg_coarse_sub_) 1 MPI processes
>>> type: lu
>>> LU: out-of-place factorization
>>> tolerance for zero pivot 2.22045e-14
>>> using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
>>> matrix ordering: nd
>>> factor fill ratio given 5, needed 36.4391
>>> Factored matrix follows:
>>> Mat Object: 1 MPI processes
>>> type: seqaij
>>> rows=16587, cols=16587
>>> package used to perform factorization: petsc
>>> total: nonzeros=1.8231e+07, allocated nonzeros=1.8231e+07
>>> total number of mallocs used during MatSetValues calls =0
>>> not using I-node routines
>>> linear system matrix = precond matrix:
>>> Mat Object: 1 MPI processes
>>> type: seqaij
>>> rows=16587, cols=16587
>>> total: nonzeros=500315, allocated nonzeros=500315
>>> total number of mallocs used during MatSetValues calls =0
>>> not using I-node routines
>>> linear system matrix = precond matrix:
>>> Mat Object: 1 MPI processes
>>> type: seqaij
>>> rows=16587, cols=16587
>>> total: nonzeros=500315, allocated nonzeros=500315
>>> total number of mallocs used during MatSetValues calls =0
>>> not using I-node routines
>>> Down solver (pre-smoother) on level 1 -------------------------------
>>> KSP Object: (mg_levels_1_) 1 MPI processes
>>> type: chebyshev
>>> Chebyshev: eigenvalue estimates: min = 0.0976343, max = 2.05032
>>> maximum iterations=2
>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>>> left preconditioning
>>> using nonzero initial guess
>>> using NONE norm type for convergence test
>>> PC Object: (mg_levels_1_) 1 MPI processes
>>> type: sor
>>> SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
>>> linear system matrix = precond matrix:
>>> Mat Object: 1 MPI processes
>>> type: seqaij
>>> rows=117649, cols=117649
>>> total: nonzeros=809137, allocated nonzeros=809137
>>> total number of mallocs used during MatSetValues calls =0
>>> not using I-node routines
>>> Up solver (post-smoother) same as down solver (pre-smoother)
>>> linear system matrix = precond matrix:
>>> Mat Object: 1 MPI processes
>>> type: seqaij
>>> rows=117649, cols=117649
>>> total: nonzeros=809137, allocated nonzeros=809137
>>> total number of mallocs used during MatSetValues calls =0
>>> not using I-node routines
>>> Residual norm 3.81135e-05
>>> ************************************************************************************************************************
>>> *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
>>> ************************************************************************************************************************
>>>
>>> ---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
>>>
>>> ./ex45 on a arch-opt named Barrys-MacBook-Pro.local with 1 processor, by barrysmith Wed Feb 18 21:38:03 2015
>>> Using Petsc Development GIT revision: v3.5.3-1998-geddef31 GIT Date: 2015-02-18 11:05:09 -0600
>>>
>>> Max Max/Min Avg Total
>>> Time (sec): 1.103e+01 1.00000 1.103e+01
>>> Objects: 9.200e+01 1.00000 9.200e+01
>>> Flops: 1.756e+10 1.00000 1.756e+10 1.756e+10
>>> Flops/sec: 1.592e+09 1.00000 1.592e+09 1.592e+09
>>> MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00
>>> MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00
>>> MPI Reductions: 0.000e+00 0.00000
>>>
>>> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
>>> e.g., VecAXPY() for real vectors of length N --> 2N flops
>>> and VecAXPY() for complex vectors of length N --> 8N flops
>>>
>>> Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
>>> Avg %Total Avg %Total counts %Total Avg %Total counts %Total
>>> 0: Main Stage: 1.1030e+01 100.0% 1.7556e+10 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
>>>
>>> ------------------------------------------------------------------------------------------------------------------------
>>> See the 'Profiling' chapter of the users' manual for details on interpreting output.
>>> Phase summary info:
>>> Count: number of times phase was executed
>>> Time and Flops: Max - maximum over all processors
>>> Ratio - ratio of maximum to minimum over all processors
>>> Mess: number of messages sent
>>> Avg. len: average message length (bytes)
>>> Reduct: number of global reductions
>>> Global: entire computation
>>> Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
>>> %T - percent time in this phase %F - percent flops in this phase
>>> %M - percent messages in this phase %L - percent message lengths in this phase
>>> %R - percent reductions in this phase
>>> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
>>> ------------------------------------------------------------------------------------------------------------------------
>>> Event Count Time (sec) Flops --- Global --- --- Stage --- Total
>>> Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
>>> ------------------------------------------------------------------------------------------------------------------------
>>>
>>> --- Event Stage 0: Main Stage
>>>
>>> KSPGMRESOrthog 21 1.0 8.8868e-03 1.0 3.33e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3752
>>> KSPSetUp 5 1.0 4.3986e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> KSPSolve 1 1.0 1.0995e+01 1.0 1.76e+10 1.0 0.0e+00 0.0e+00 0.0e+00100100 0 0 0 100100 0 0 0 1596
>>> VecMDot 21 1.0 4.7335e-03 1.0 1.67e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3522
>>> VecNorm 30 1.0 9.4804e-04 1.0 4.63e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4887
>>> VecScale 29 1.0 7.8293e-04 1.0 2.20e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2809
>>> VecCopy 14 1.0 7.7058e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> VecSet 102 1.0 1.4530e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> VecAXPY 9 1.0 3.8154e-04 1.0 9.05e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2372
>>> VecAYPX 48 1.0 5.6449e-03 1.0 7.06e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1251
>>> VecAXPBYCZ 24 1.0 4.0700e-03 1.0 1.41e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3469
>>> VecMAXPY 29 1.0 5.1512e-03 1.0 2.04e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3960
>>> VecAssemblyBegin 1 1.0 6.7055e-08 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> VecAssemblyEnd 1 1.0 8.1025e-08 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> VecPointwiseMult 11 1.0 1.8083e-03 1.0 1.29e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 716
>>> VecSetRandom 1 1.0 1.7628e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> VecNormalize 29 1.0 1.7100e-03 1.0 6.60e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3858
>>> MatMult 58 1.0 5.0949e-02 1.0 8.39e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1647
>>> MatMultAdd 6 1.0 5.2584e-03 1.0 5.01e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 952
>>> MatMultTranspose 6 1.0 6.1330e-03 1.0 5.01e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 816
>>> MatSolve 12 1.0 2.0657e-01 1.0 4.37e+08 1.0 0.0e+00 0.0e+00 0.0e+00 2 2 0 0 0 2 2 0 0 0 2117
>>> MatSOR 36 1.0 7.1355e-02 1.0 5.84e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 818
>>> MatLUFactorSym 1 1.0 3.4310e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0
>>> MatLUFactorNum 1 1.0 9.8038e+00 1.0 1.69e+10 1.0 0.0e+00 0.0e+00 0.0e+00 89 96 0 0 0 89 96 0 0 0 1721
>>> MatConvert 1 1.0 5.6955e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> MatScale 3 1.0 2.7223e-03 1.0 2.45e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 901
>>> MatResidual 6 1.0 6.2142e-03 1.0 9.71e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1562
>>> MatAssemblyBegin 12 1.0 2.7413e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> MatAssemblyEnd 12 1.0 2.4857e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> MatGetRow 470596 1.0 2.4337e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> MatGetRowIJ 1 1.0 2.3254e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> MatGetOrdering 1 1.0 1.7668e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> MatCoarsen 1 1.0 8.5790e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> MatView 5 1.0 2.2273e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> MatAXPY 1 1.0 1.8864e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> MatMatMult 1 1.0 2.4513e-02 1.0 2.03e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 83
>>> MatMatMultSym 1 1.0 1.7885e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> MatMatMultNum 1 1.0 6.6144e-03 1.0 2.03e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 307
>>> MatPtAP 1 1.0 1.1460e-01 1.0 1.30e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 114
>>> MatPtAPSymbolic 1 1.0 4.6803e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> MatPtAPNumeric 1 1.0 6.7781e-02 1.0 1.30e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 192
>>> MatTrnMatMult 1 1.0 9.1702e-02 1.0 1.02e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 111
>>> MatTrnMatMultSym 1 1.0 6.0173e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
>>> MatTrnMatMultNum 1 1.0 3.1526e-02 1.0 1.02e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 324
>>> MatGetSymTrans 2 1.0 4.2753e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> PCGAMGgraph_AGG 1 1.0 6.9175e-02 1.0 1.62e+06 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 23
>>> PCGAMGcoarse_AGG 1 1.0 1.1130e-01 1.0 1.02e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 92
>>> PCGAMGProl_AGG 1 1.0 2.9380e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> PCGAMGPOpt_AGG 1 1.0 9.1377e-02 1.0 5.15e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 564
>>> PCSetUp 2 1.0 1.0587e+01 1.0 1.69e+10 1.0 0.0e+00 0.0e+00 0.0e+00 96 97 0 0 0 96 97 0 0 0 1601
>>> PCSetUpOnBlocks 6 1.0 1.0165e+01 1.0 1.69e+10 1.0 0.0e+00 0.0e+00 0.0e+00 92 96 0 0 0 92 96 0 0 0 1660
>>> PCApply 6 1.0 1.0503e+01 1.0 1.75e+10 1.0 0.0e+00 0.0e+00 0.0e+00 95 99 0 0 0 95 99 0 0 0 1662
>>> ------------------------------------------------------------------------------------------------------------------------
>>>
>>>
>>>
>>