[petsc-dev] GAMG broken in master
Mark Adams
mfadams at lbl.gov
Fri Feb 20 15:02:59 CST 2015
I pushed a fix to next.
Mark
On Fri, Feb 20, 2015 at 3:19 PM, Mark Adams <mfadams at lbl.gov> wrote:
> Whoops, I had not configured.
>
> On Fri, Feb 20, 2015 at 3:17 PM, Mark Adams <mfadams at lbl.gov> wrote:
>
>> Hmm, I'm not seeing this...
>>
>> 15:13 1 mark/gamg-serial ~/Codes/petsc/src/ksp/ksp/examples/tutorials$
>> mpirun -n 1 ./ex45 -da_refine 3 -pc_type gamg -ksp_monitor -ksp_view -log_summary -pc_gamg_coarse_eq_limit 200
>> [0]PCSetUp_GAMG level 0 N=117649, n data rows=1, n data cols=1, nnz/row (ave)=6, np=1
>> [0]PCGAMGFilterGraph 91.528% nnz after filtering, with threshold 0, 6.87755 nnz ave. (N=117649)
>> [0]PCGAMGCoarsen_AGG square graph
>> [0]PCGAMGCoarsen_AGG coarsen graph
>> [0]maxIndSetAgg removed 572 of 117649 vertices. (572 local) 16587 selected.
>> [0]PCGAMGProlongator_AGG New grid 16587 nodes
>> PCGAMGOptprol_AGG smooth P0: max eigen=1.952686e+00 min=9.933674e-03 PC=jacobi
>> [0]PCSetUp_GAMG 1) N=16587, n data cols=1, nnz/row (ave)=30, 1 active pes
>> [0]PCGAMGFilterGraph 84.7708% nnz after filtering, with threshold 0, 30.1631 nnz ave. (N=16587)
>> [0]PCGAMGCoarsen_AGG square graph
>> [0]PCGAMGCoarsen_AGG coarsen graph
>> [0]maxIndSetAgg removed 0 of 16587 vertices. (0 local) 353 selected.
>> [0]PCGAMGProlongator_AGG New grid 353 nodes
>> PCGAMGOptprol_AGG smooth P0: max eigen=1.393979e+00 min=2.197135e-02 PC=jacobi
>> [0]PCSetUp_GAMG 2) N=353, n data cols=1, nnz/row (ave)=47, 1 active pes
>> [0]PCGAMGFilterGraph 99.7358% nnz after filtering, with threshold 0, 47.1756 nnz ave. (N=353)
>> [0]PCGAMGCoarsen_AGG square graph
>> [0]PCGAMGCoarsen_AGG coarsen graph
>> [0]maxIndSetAgg removed 0 of 353 vertices. (0 local) 3 selected.
>> [0]PCGAMGProlongator_AGG New grid 3 nodes
>> PCGAMGOptprol_AGG smooth P0: max eigen=1.983212e+00 min=2.830095e-01 PC=jacobi
>> [0]PCSetUp_GAMG 3) N=3, n data cols=1, nnz/row (ave)=3, 1 active pes
>> [0]PCSetUp_GAMG 4 levels, grid complexity = 1.63892
>> 0 KSP Residual norm 2.706652282076e+02
>> 1 KSP Residual norm 4.940773628648e+01
>> 2 KSP Residual norm 3.718259719599e+00
>> 3 KSP Residual norm 2.082059791607e-01
>> 4 KSP Residual norm 1.700581360081e-02
>> 5 KSP Residual norm 9.430563655174e-04
>>
>>
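Editorial note: the "grid complexity = 1.63892" figure in the log above can be cross-checked from the per-level numbers. The sketch below assumes the figure is the total number of nonzeros over all levels divided by the fine-level nonzeros, with each level's nonzeros estimated as N times the reported average per row. That reading is inferred from the arithmetic, not taken from the GAMG source, and the names `levels` and `complexity` are illustrative only.

```python
# (rows N, average nonzeros per row), copied from the log above.
# The last level only reports the rounded value "nnz/row (ave)=3".
levels = [
    (117649, 6.87755),
    (16587, 30.1631),
    (353, 47.1756),
    (3, 3.0),
]


def complexity(levels):
    """Estimated nonzeros summed over all levels, divided by fine-level nonzeros."""
    nnz = [n * ave for n, ave in levels]
    return sum(nnz) / nnz[0]


ratio = complexity(levels)
print(f"estimated complexity = {ratio:.5f}")
```

Under that assumption the estimate agrees with the logged value to the printed precision, which suggests the four-level hierarchy in this run is the intended one.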
>> On Wed, Feb 18, 2015 at 11:06 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>
>>>
>>> Hmm, it seems GAMG is only doing 2 levels in master for all problems?
>>>
>>>
>> ksp/ex54 seems to do three levels (with -ne 149) with one proc?
>>
>> I'll try
>>
>>
>>> ./ex29 -da_refine 8 -pc_type gamg -ksp_view
>>>
>>> uses only two levels. Makes no sense.
>>>
>>> Did someone break it?
>>>
>>>
>>>
>>>
>>>
>>>
>>> > On Feb 18, 2015, at 9:48 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>> >
>>> >
>>> > Mark,
>>> >
>>> > When I run ksp/ksp/examples/tutorials/ex45 I get a VERY large coarse
>>> problem. It seems to ignore the -pc_gamg_coarse_eq_limit 200 argument. Any
>>> idea what is going on?
>>> >
>>> > Thanks
>>> >
>>> > Barry
>>> >
>>> >
>>> > $ ./ex45 -da_refine 3 -pc_type gamg -ksp_monitor -ksp_view -log_summary -pc_gamg_coarse_eq_limit 200
>>> > 0 KSP Residual norm 2.790769524030e+02
>>> > 1 KSP Residual norm 4.484052193577e+01
>>> > 2 KSP Residual norm 2.409368790441e+00
>>> > 3 KSP Residual norm 1.553421589919e-01
>>> > 4 KSP Residual norm 9.821441923699e-03
>>> > 5 KSP Residual norm 5.610434857134e-04
>>> > KSP Object: 1 MPI processes
>>> > type: gmres
>>> > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
>>> Orthogonalization with no iterative refinement
>>> > GMRES: happy breakdown tolerance 1e-30
>>> > maximum iterations=10000
>>> > tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>>> > left preconditioning
>>> > using nonzero initial guess
>>> > using PRECONDITIONED norm type for convergence test
>>> > PC Object: 1 MPI processes
>>> > type: gamg
>>> > MG: type is MULTIPLICATIVE, levels=2 cycles=v
>>> > Cycles per PCApply=1
>>> > Using Galerkin computed coarse grid matrices
>>> > Coarse grid solver -- level -------------------------------
>>> > KSP Object: (mg_coarse_) 1 MPI processes
>>> > type: gmres
>>> > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
>>> Orthogonalization with no iterative refinement
>>> > GMRES: happy breakdown tolerance 1e-30
>>> > maximum iterations=1, initial guess is zero
>>> > tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>>> > left preconditioning
>>> > using NONE norm type for convergence test
>>> > PC Object: (mg_coarse_) 1 MPI processes
>>> > type: bjacobi
>>> > block Jacobi: number of blocks = 1
>>> > Local solve is same for all blocks, in the following KSP and PC
>>> objects:
>>> > KSP Object: (mg_coarse_sub_) 1 MPI processes
>>> > type: preonly
>>> > maximum iterations=1, initial guess is zero
>>> > tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>>> > left preconditioning
>>> > using NONE norm type for convergence test
>>> > PC Object: (mg_coarse_sub_) 1 MPI processes
>>> > type: lu
>>> > LU: out-of-place factorization
>>> > tolerance for zero pivot 2.22045e-14
>>> > using diagonal shift on blocks to prevent zero pivot
>>> [INBLOCKS]
>>> > matrix ordering: nd
>>> > factor fill ratio given 5, needed 36.4391
>>> > Factored matrix follows:
>>> > Mat Object: 1 MPI processes
>>> > type: seqaij
>>> > rows=16587, cols=16587
>>> > package used to perform factorization: petsc
>>> > total: nonzeros=1.8231e+07, allocated
>>> nonzeros=1.8231e+07
>>> > total number of mallocs used during MatSetValues
>>> calls =0
>>> > not using I-node routines
>>> > linear system matrix = precond matrix:
>>> > Mat Object: 1 MPI processes
>>> > type: seqaij
>>> > rows=16587, cols=16587
>>> > total: nonzeros=500315, allocated nonzeros=500315
>>> > total number of mallocs used during MatSetValues calls =0
>>> > not using I-node routines
>>> > linear system matrix = precond matrix:
>>> > Mat Object: 1 MPI processes
>>> > type: seqaij
>>> > rows=16587, cols=16587
>>> > total: nonzeros=500315, allocated nonzeros=500315
>>> > total number of mallocs used during MatSetValues calls =0
>>> > not using I-node routines
>>> > Down solver (pre-smoother) on level 1 -------------------------------
>>> > KSP Object: (mg_levels_1_) 1 MPI processes
>>> > type: chebyshev
>>> > Chebyshev: eigenvalue estimates: min = 0.0976343, max = 2.05032
>>> > maximum iterations=2
>>> > tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>>> > left preconditioning
>>> > using nonzero initial guess
>>> > using NONE norm type for convergence test
>>> > PC Object: (mg_levels_1_) 1 MPI processes
>>> > type: sor
>>> > SOR: type = local_symmetric, iterations = 1, local iterations =
>>> 1, omega = 1
>>> > linear system matrix = precond matrix:
>>> > Mat Object: 1 MPI processes
>>> > type: seqaij
>>> > rows=117649, cols=117649
>>> > total: nonzeros=809137, allocated nonzeros=809137
>>> > total number of mallocs used during MatSetValues calls =0
>>> > not using I-node routines
>>> > Up solver (post-smoother) same as down solver (pre-smoother)
>>> > linear system matrix = precond matrix:
>>> > Mat Object: 1 MPI processes
>>> > type: seqaij
>>> > rows=117649, cols=117649
>>> > total: nonzeros=809137, allocated nonzeros=809137
>>> > total number of mallocs used during MatSetValues calls =0
>>> > not using I-node routines
>>> > Residual norm 3.81135e-05
>>> >
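Editorial note: two numbers in the -ksp_view output above show why this run is suspicious. The sketch below only restates values printed in that output; the variable names are illustrative, and treating -pc_gamg_coarse_eq_limit as an upper bound on the coarse-grid size is an assumption based on how the option is used in this thread.

```python
# Values copied from the -ksp_view output above.
coarse_rows = 16587        # rows of the matrix handed to the mg_coarse_ solver
coarse_eq_limit = 200      # value passed with -pc_gamg_coarse_eq_limit
coarse_nnz = 500315        # nonzeros in the coarse matrix
fill_needed = 36.4391      # "factor fill ratio given 5, needed 36.4391"

# The coarse problem is far larger than the requested limit.
oversize = coarse_rows / coarse_eq_limit
print(f"coarse grid is {oversize:.0f}x the requested limit")

# The LU factor size follows from the fill ratio.
factored_nnz = coarse_nnz * fill_needed
print(f"estimated factor nonzeros = {factored_nnz:.4e}")
```

The estimated factor size is consistent with the reported "nonzeros=1.8231e+07", so the direct solve is working on the 16587-row level-1 operator rather than on a small coarsest grid.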
>>> ************************************************************************************************************************
>>> > *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r
>>> -fCourier9' to print this document ***
>>> >
>>> ************************************************************************************************************************
>>> >
>>> > ---------------------------------------------- PETSc Performance
>>> Summary: ----------------------------------------------
>>> >
>>> > ./ex45 on a arch-opt named Barrys-MacBook-Pro.local with 1 processor,
>>> by barrysmith Wed Feb 18 21:38:03 2015
>>> > Using Petsc Development GIT revision: v3.5.3-1998-geddef31 GIT Date:
>>> 2015-02-18 11:05:09 -0600
>>> >
>>> > Max Max/Min Avg Total
>>> > Time (sec): 1.103e+01 1.00000 1.103e+01
>>> > Objects: 9.200e+01 1.00000 9.200e+01
>>> > Flops: 1.756e+10 1.00000 1.756e+10 1.756e+10
>>> > Flops/sec: 1.592e+09 1.00000 1.592e+09 1.592e+09
>>> > MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00
>>> > MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00
>>> > MPI Reductions: 0.000e+00 0.00000
>>> >
>>> > Flop counting convention: 1 flop = 1 real number operation of type
>>> (multiply/divide/add/subtract)
>>> > e.g., VecAXPY() for real vectors of length
>>> N --> 2N flops
>>> > and VecAXPY() for complex vectors of length
>>> N --> 8N flops
>>> >
>>> > Summary of Stages: ----- Time ------ ----- Flops ----- ---
>>> Messages --- -- Message Lengths -- -- Reductions --
>>> > Avg %Total Avg %Total counts
>>> %Total Avg %Total counts %Total
>>> > 0: Main Stage: 1.1030e+01 100.0% 1.7556e+10 100.0% 0.000e+00
>>> 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
>>> >
>>> >
>>> ------------------------------------------------------------------------------------------------------------------------
>>> > See the 'Profiling' chapter of the users' manual for details on
>>> interpreting output.
>>> > Phase summary info:
>>> > Count: number of times phase was executed
>>> > Time and Flops: Max - maximum over all processors
>>> > Ratio - ratio of maximum to minimum over all
>>> processors
>>> > Mess: number of messages sent
>>> > Avg. len: average message length (bytes)
>>> > Reduct: number of global reductions
>>> > Global: entire computation
>>> > Stage: stages of a computation. Set stages with PetscLogStagePush()
>>> and PetscLogStagePop().
>>> > %T - percent time in this phase %F - percent flops in
>>> this phase
>>> > %M - percent messages in this phase %L - percent message
>>> lengths in this phase
>>> > %R - percent reductions in this phase
>>> > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time
>>> over all processors)
>>> >
>>> ------------------------------------------------------------------------------------------------------------------------
>>> > Event Count Time (sec) Flops
>>> --- Global --- --- Stage --- Total
>>> > Max Ratio Max Ratio Max Ratio Mess Avg
>>> len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
>>> >
>>> ------------------------------------------------------------------------------------------------------------------------
>>> >
>>> > --- Event Stage 0: Main Stage
>>> >
>>> > KSPGMRESOrthog 21 1.0 8.8868e-03 1.0 3.33e+07 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3752
>>> > KSPSetUp 5 1.0 4.3986e-03 1.0 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> > KSPSolve 1 1.0 1.0995e+01 1.0 1.76e+10 1.0 0.0e+00
>>> 0.0e+00 0.0e+00100100 0 0 0 100100 0 0 0 1596
>>> > VecMDot 21 1.0 4.7335e-03 1.0 1.67e+07 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3522
>>> > VecNorm 30 1.0 9.4804e-04 1.0 4.63e+06 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4887
>>> > VecScale 29 1.0 7.8293e-04 1.0 2.20e+06 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2809
>>> > VecCopy 14 1.0 7.7058e-04 1.0 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> > VecSet 102 1.0 1.4530e-02 1.0 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> > VecAXPY 9 1.0 3.8154e-04 1.0 9.05e+05 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2372
>>> > VecAYPX 48 1.0 5.6449e-03 1.0 7.06e+06 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1251
>>> > VecAXPBYCZ 24 1.0 4.0700e-03 1.0 1.41e+07 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3469
>>> > VecMAXPY 29 1.0 5.1512e-03 1.0 2.04e+07 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3960
>>> > VecAssemblyBegin 1 1.0 6.7055e-08 1.0 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> > VecAssemblyEnd 1 1.0 8.1025e-08 1.0 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> > VecPointwiseMult 11 1.0 1.8083e-03 1.0 1.29e+06 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 716
>>> > VecSetRandom 1 1.0 1.7628e-03 1.0 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> > VecNormalize 29 1.0 1.7100e-03 1.0 6.60e+06 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3858
>>> > MatMult 58 1.0 5.0949e-02 1.0 8.39e+07 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1647
>>> > MatMultAdd 6 1.0 5.2584e-03 1.0 5.01e+06 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 952
>>> > MatMultTranspose 6 1.0 6.1330e-03 1.0 5.01e+06 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 816
>>> > MatSolve 12 1.0 2.0657e-01 1.0 4.37e+08 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 2 2 0 0 0 2 2 0 0 0 2117
>>> > MatSOR 36 1.0 7.1355e-02 1.0 5.84e+07 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 818
>>> > MatLUFactorSym 1 1.0 3.4310e-01 1.0 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0
>>> > MatLUFactorNum 1 1.0 9.8038e+00 1.0 1.69e+10 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 89 96 0 0 0 89 96 0 0 0 1721
>>> > MatConvert 1 1.0 5.6955e-03 1.0 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> > MatScale 3 1.0 2.7223e-03 1.0 2.45e+06 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 901
>>> > MatResidual 6 1.0 6.2142e-03 1.0 9.71e+06 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1562
>>> > MatAssemblyBegin 12 1.0 2.7413e-06 1.0 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> > MatAssemblyEnd 12 1.0 2.4857e-02 1.0 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> > MatGetRow 470596 1.0 2.4337e-02 1.0 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> > MatGetRowIJ 1 1.0 2.3254e-03 1.0 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> > MatGetOrdering 1 1.0 1.7668e-02 1.0 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> > MatCoarsen 1 1.0 8.5790e-03 1.0 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> > MatView 5 1.0 2.2273e-04 1.0 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> > MatAXPY 1 1.0 1.8864e-02 1.0 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> > MatMatMult 1 1.0 2.4513e-02 1.0 2.03e+06 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 83
>>> > MatMatMultSym 1 1.0 1.7885e-02 1.0 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> > MatMatMultNum 1 1.0 6.6144e-03 1.0 2.03e+06 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 307
>>> > MatPtAP 1 1.0 1.1460e-01 1.0 1.30e+07 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 114
>>> > MatPtAPSymbolic 1 1.0 4.6803e-02 1.0 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> > MatPtAPNumeric 1 1.0 6.7781e-02 1.0 1.30e+07 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 192
>>> > MatTrnMatMult 1 1.0 9.1702e-02 1.0 1.02e+07 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 111
>>> > MatTrnMatMultSym 1 1.0 6.0173e-02 1.0 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
>>> > MatTrnMatMultNum 1 1.0 3.1526e-02 1.0 1.02e+07 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 324
>>> > MatGetSymTrans 2 1.0 4.2753e-03 1.0 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> > PCGAMGgraph_AGG 1 1.0 6.9175e-02 1.0 1.62e+06 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 23
>>> > PCGAMGcoarse_AGG 1 1.0 1.1130e-01 1.0 1.02e+07 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 92
>>> > PCGAMGProl_AGG 1 1.0 2.9380e-02 1.0 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>>> > PCGAMGPOpt_AGG 1 1.0 9.1377e-02 1.0 5.15e+07 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 564
>>> > PCSetUp 2 1.0 1.0587e+01 1.0 1.69e+10 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 96 97 0 0 0 96 97 0 0 0 1601
>>> > PCSetUpOnBlocks 6 1.0 1.0165e+01 1.0 1.69e+10 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 92 96 0 0 0 92 96 0 0 0 1660
>>> > PCApply 6 1.0 1.0503e+01 1.0 1.75e+10 1.0 0.0e+00
>>> 0.0e+00 0.0e+00 95 99 0 0 0 95 99 0 0 0 1662
>>> >
>>> ------------------------------------------------------------------------------------------------------------------------
>>> >
>>> >
>>> >
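Editorial note: the performance summary above makes the cost of the oversized coarse problem concrete. The sketch below recomputes the share of wall time spent in the coarse LU factorization from the times printed in the table; the variable names are illustrative.

```python
# Times in seconds, copied from the -log_summary table above.
total_time = 1.103e+01       # "Time (sec)"
lu_numeric = 9.8038e+00      # MatLUFactorNum
lu_symbolic = 3.4310e-01     # MatLUFactorSym

numeric_share = lu_numeric / total_time
symbolic_share = lu_symbolic / total_time

print(f"numeric LU:  {100 * numeric_share:.0f}% of run time")
print(f"symbolic LU: {100 * symbolic_share:.0f}% of run time")
```

These shares match the %T column for the two events, so nearly all of the run time goes to factoring the coarse matrix.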
>>>
>>>
>>
>