[petsc-users] GAMG advice
Barry Smith
bsmith at mcs.anl.gov
Fri Oct 20 22:10:02 CDT 2017
David,
GAMG picks the number of levels based on how the coarsening process proceeds; you cannot hardwire it to a particular value. You can run with -info to get more detail on the decisions GAMG is making.
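For example, something along these lines (the executable name and options here are placeholders, and the exact -info messages vary between PETSc versions) should print the grid sizes and the number of levels GAMG settled on during setup:

    mpiexec -n 8 ./your_app <your usual options> -info | grep -i gamg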
Barry
> On Oct 20, 2017, at 2:06 PM, David Nolte <dnolte at dim.uchile.cl> wrote:
>
> PS: I didn't realize it at first, but it looks as if the -pc_mg_levels 3
> option was not taken into account:
> type: gamg
> MG: type is MULTIPLICATIVE, levels=1 cycles=v
>
>
>
> On 10/20/2017 03:32 PM, David Nolte wrote:
>> Dear all,
>>
>> I have some problems using GAMG as a preconditioner for (F)GMRES.
>> Background: I am solving the incompressible, unsteady Navier-Stokes
>> equations with a coupled mixed FEM approach, using P1/P1 elements for
>> velocity and pressure on an unstructured tetrahedral mesh with about
>> 2 million DOFs (and up to 15 million). The method is stabilized with
>> SUPG/PSPG, so there are no zeros on the diagonal of the pressure
>> block. Time discretization is semi-implicit backward Euler. The flow
>> is a convection-dominated flow through a nozzle.
>>
>> So far, for this setup, I have been quite happy with a simple FGMRES/ML
>> solver for the full system (rather brute force, I admit, but much faster
>> than any block/Schur preconditioners I tried):
>>
>> -ksp_converged_reason
>> -ksp_monitor_true_residual
>> -ksp_type fgmres
>> -ksp_rtol 1.0e-6
>> -ksp_initial_guess_nonzero
>>
>> -pc_type ml
>> -pc_ml_Threshold 0.03
>> -pc_ml_maxNlevels 3
>>
>> This setup converges to rtol in ~120 iterations (see the ksp_view
>> output below):
>>
>> 119 KSP unpreconditioned resid norm 4.004030812027e-05 true resid norm 4.004030812037e-05 ||r(i)||/||b|| 1.621791251517e-06
>> 120 KSP unpreconditioned resid norm 3.256863709982e-05 true resid norm 3.256863709982e-05 ||r(i)||/||b|| 1.319158947617e-06
>> 121 KSP unpreconditioned resid norm 2.751959681502e-05 true resid norm 2.751959681503e-05 ||r(i)||/||b|| 1.114652795021e-06
>> 122 KSP unpreconditioned resid norm 2.420611122789e-05 true resid norm 2.420611122788e-05 ||r(i)||/||b|| 9.804434897105e-07
>>
>>
>> Now I'd like to try GAMG instead of ML. However, I don't know how to set
>> it up to get similar performance.
>> The obvious/naive
>>
>> -pc_type gamg
>> -pc_gamg_type agg
>>
>> # with and without
>> -pc_gamg_threshold 0.03
>> -pc_mg_levels 3
>>
>> converges very slowly on 1 process and much worse on 8 (~200k DOFs per
>> process), for instance:
>> np = 1:
>> 980 KSP unpreconditioned resid norm 1.065009356215e-02 true resid norm 1.065009356215e-02 ||r(i)||/||b|| 4.532259705508e-04
>> 981 KSP unpreconditioned resid norm 1.064978578182e-02 true resid norm 1.064978578182e-02 ||r(i)||/||b|| 4.532128726342e-04
>> 982 KSP unpreconditioned resid norm 1.064956706598e-02 true resid norm 1.064956706598e-02 ||r(i)||/||b|| 4.532035649508e-04
>>
>> np = 8:
>> 980 KSP unpreconditioned resid norm 3.179946748495e-02 true resid norm 3.179946748495e-02 ||r(i)||/||b|| 1.353259896710e-03
>> 981 KSP unpreconditioned resid norm 3.179946748317e-02 true resid norm 3.179946748317e-02 ||r(i)||/||b|| 1.353259896634e-03
>> 982 KSP unpreconditioned resid norm 3.179946748317e-02 true resid norm 3.179946748317e-02 ||r(i)||/||b|| 1.353259896634e-03
>>
>> A very high threshold seems to improve the GAMG PC: for instance, with
>> 0.75 I get convergence to rtol=1e-6 after 744 iterations.
>> What else should I try?
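>>
>> One variation I was planning to test, in case it matters: if I
>> understand the docs correctly, GAMG's default Chebyshev/Jacobi
>> smoother and smoothed aggregation can both be fragile for
>> non-symmetric operators like this one, whereas the ML setup above
>> smooths with Richardson/SOR. Something like the following sketch
>> (untested on my side) should make the two setups more comparable:
>>
>> -pc_type gamg
>> -pc_gamg_type agg
>> -pc_gamg_agg_nsmooths 0
>> -mg_levels_ksp_type richardson
>> -mg_levels_pc_type sor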
>>
>> I would very much appreciate any advice on configuring GAMG and on
>> differences w.r.t. ML that should be taken into account (I am not a
>> multigrid expert, though).
>>
>> Thanks, best wishes
>> David
>>
>>
>> ------
>> ksp_view for -pc_type gamg -pc_gamg_threshold 0.75 -pc_mg_levels 3
>>
>> KSP Object: 1 MPI processes
>> type: fgmres
>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>> GMRES: happy breakdown tolerance 1e-30
>> maximum iterations=10000
>> tolerances: relative=1e-06, absolute=1e-50, divergence=10000.
>> right preconditioning
>> using nonzero initial guess
>> using UNPRECONDITIONED norm type for convergence test
>> PC Object: 1 MPI processes
>> type: gamg
>> MG: type is MULTIPLICATIVE, levels=1 cycles=v
>> Cycles per PCApply=1
>> Using Galerkin computed coarse grid matrices
>> GAMG specific options
>> Threshold for dropping small values from graph 0.75
>> AGG specific options
>> Symmetric graph false
>> Coarse grid solver -- level -------------------------------
>> KSP Object: (mg_levels_0_) 1 MPI processes
>> type: preonly
>> maximum iterations=2, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>> left preconditioning
>> using NONE norm type for convergence test
>> PC Object: (mg_levels_0_) 1 MPI processes
>> type: sor
>> SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
>> linear system matrix = precond matrix:
>> Mat Object: 1 MPI processes
>> type: seqaij
>> rows=1745224, cols=1745224
>> total: nonzeros=99452608, allocated nonzeros=99452608
>> total number of mallocs used during MatSetValues calls =0
>> using I-node routines: found 1037847 nodes, limit used is 5
>> linear system matrix = precond matrix:
>> Mat Object: 1 MPI processes
>> type: seqaij
>> rows=1745224, cols=1745224
>> total: nonzeros=99452608, allocated nonzeros=99452608
>> total number of mallocs used during MatSetValues calls =0
>> using I-node routines: found 1037847 nodes, limit used is 5
>>
>>
>> ------
>> ksp_view for -pc_type ml:
>>
>> KSP Object: 8 MPI processes
>> type: fgmres
>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>> GMRES: happy breakdown tolerance 1e-30
>> maximum iterations=10000
>> tolerances: relative=1e-06, absolute=1e-50, divergence=10000.
>> right preconditioning
>> using nonzero initial guess
>> using UNPRECONDITIONED norm type for convergence test
>> PC Object: 8 MPI processes
>> type: ml
>> MG: type is MULTIPLICATIVE, levels=3 cycles=v
>> Cycles per PCApply=1
>> Using Galerkin computed coarse grid matrices
>> Coarse grid solver -- level -------------------------------
>> KSP Object: (mg_coarse_) 8 MPI processes
>> type: preonly
>> maximum iterations=10000, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>> left preconditioning
>> using NONE norm type for convergence test
>> PC Object: (mg_coarse_) 8 MPI processes
>> type: redundant
>> Redundant preconditioner: First (color=0) of 8 PCs follows
>> KSP Object: (mg_coarse_redundant_) 1 MPI processes
>> type: preonly
>> maximum iterations=10000, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>> left preconditioning
>> using NONE norm type for convergence test
>> PC Object: (mg_coarse_redundant_) 1 MPI processes
>> type: lu
>> LU: out-of-place factorization
>> tolerance for zero pivot 2.22045e-14
>> using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
>> matrix ordering: nd
>> factor fill ratio given 5., needed 10.4795
>> Factored matrix follows:
>> Mat Object: 1 MPI processes
>> type: seqaij
>> rows=6822, cols=6822
>> package used to perform factorization: petsc
>> total: nonzeros=9575688, allocated nonzeros=9575688
>> total number of mallocs used during MatSetValues calls =0
>> not using I-node routines
>> linear system matrix = precond matrix:
>> Mat Object: 1 MPI processes
>> type: seqaij
>> rows=6822, cols=6822
>> total: nonzeros=913758, allocated nonzeros=913758
>> total number of mallocs used during MatSetValues calls =0
>> not using I-node routines
>> linear system matrix = precond matrix:
>> Mat Object: 8 MPI processes
>> type: mpiaij
>> rows=6822, cols=6822
>> total: nonzeros=913758, allocated nonzeros=913758
>> total number of mallocs used during MatSetValues calls =0
>> not using I-node (on process 0) routines
>> Down solver (pre-smoother) on level 1 -------------------------------
>> KSP Object: (mg_levels_1_) 8 MPI processes
>> type: richardson
>> Richardson: damping factor=1.
>> maximum iterations=2
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>> left preconditioning
>> using nonzero initial guess
>> using NONE norm type for convergence test
>> PC Object: (mg_levels_1_) 8 MPI processes
>> type: sor
>> SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
>> linear system matrix = precond matrix:
>> Mat Object: 8 MPI processes
>> type: mpiaij
>> rows=67087, cols=67087
>> total: nonzeros=9722749, allocated nonzeros=9722749
>> total number of mallocs used during MatSetValues calls =0
>> not using I-node (on process 0) routines
>> Up solver (post-smoother) same as down solver (pre-smoother)
>> Down solver (pre-smoother) on level 2 -------------------------------
>> KSP Object: (mg_levels_2_) 8 MPI processes
>> type: richardson
>> Richardson: damping factor=1.
>> maximum iterations=2
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>> left preconditioning
>> using nonzero initial guess
>> using NONE norm type for convergence test
>> PC Object: (mg_levels_2_) 8 MPI processes
>> type: sor
>> SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
>> linear system matrix = precond matrix:
>> Mat Object: 8 MPI processes
>> type: mpiaij
>> rows=1745224, cols=1745224
>> total: nonzeros=99452608, allocated nonzeros=99452608
>> total number of mallocs used during MatSetValues calls =0
>> using I-node (on process 0) routines: found 126690 nodes, limit used is 5
>> Up solver (post-smoother) same as down solver (pre-smoother)
>> linear system matrix = precond matrix:
>> Mat Object: 8 MPI processes
>> type: mpiaij
>> rows=1745224, cols=1745224
>> total: nonzeros=99452608, allocated nonzeros=99452608
>> total number of mallocs used during MatSetValues calls =0
>> using I-node (on process 0) routines: found 126690 nodes, limit used is 5
>>
>