[petsc-users] GAMG advice
David Nolte
dnolte at dim.uchile.cl
Fri Oct 20 14:06:34 CDT 2017
PS: I didn't realize it at first, but it looks as if the -pc_mg_levels 3
option was not taken into account:
type: gamg
MG: type is MULTIPLICATIVE, levels=1 cycles=v
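
For reference, a quick programmatic way to double-check this (just a sketch,
not from my actual code; it assumes a KSP "ksp" that has already been
configured from the options database):

#include <petscksp.h>

/* Hypothetical helper: after KSPSetUp(), ask the PC how many MG levels were
   actually built. For the run above this should print 1, matching the
   ksp_view output, even though -pc_mg_levels 3 was given. */
static PetscErrorCode ReportMGLevels(KSP ksp)
{
  PC       pc;
  PetscInt nlevels;

  KSPSetUp(ksp);                 /* forces the GAMG hierarchy to be set up */
  KSPGetPC(ksp, &pc);
  PCMGGetLevels(pc, &nlevels);   /* GAMG is implemented on top of PCMG */
  PetscPrintf(PETSC_COMM_WORLD, "PC built %D level(s)\n", nlevels);
  return 0;
}

Running with -options_left would also show whether the option was consumed
at all.
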
On 10/20/2017 03:32 PM, David Nolte wrote:
> Dear all,
>
> I have some problems using GAMG as a preconditioner for (F)GMRES.
> Background: I am solving the incompressible, unsteady Navier-Stokes
> equations with a coupled mixed FEM approach, using P1/P1 elements for
> velocity and pressure on an unstructured tetrahedral mesh with about
> 2 million DOFs (and up to 15 million). The method is stabilized with
> SUPG/PSPG, hence there are no zeros on the diagonal of the pressure
> block. Time discretization is semi-implicit backward Euler. The flow is
> a convection-dominated flow through a nozzle.
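>
> To be precise, by semi-implicit backward Euler I mean a scheme of roughly
> the following form (sketch only; the convection velocity is lagged to the
> previous time step and \nu denotes the kinematic viscosity):
>
>   \frac{u^{n+1} - u^n}{\Delta t} + (u^n \cdot \nabla) u^{n+1}
>       - \nu \Delta u^{n+1} + \nabla p^{n+1} = f^{n+1},
>   \qquad \nabla \cdot u^{n+1} = 0,
>
> so every time step requires one linear solve with a nonsymmetric,
> linearized Navier-Stokes matrix.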
>
> So far, for this setup, I have been quite happy with a simple FGMRES/ML
> solver for the full system (rather brute force, I admit, but much faster
> than any block/Schur preconditioners I have tried):
>
> -ksp_converged_reason
> -ksp_monitor_true_residual
> -ksp_type fgmres
> -ksp_rtol 1.0e-6
> -ksp_initial_guess_nonzero
>
> -pc_type ml
> -pc_ml_Threshold 0.03
> -pc_ml_maxNlevels 3
>
> This setup converges to rtol in about 120 iterations (see the ksp_view
> output below):
>
> 119 KSP unpreconditioned resid norm 4.004030812027e-05 true resid norm
> 4.004030812037e-05 ||r(i)||/||b|| 1.621791251517e-06
> 120 KSP unpreconditioned resid norm 3.256863709982e-05 true resid norm
> 3.256863709982e-05 ||r(i)||/||b|| 1.319158947617e-06
> 121 KSP unpreconditioned resid norm 2.751959681502e-05 true resid norm
> 2.751959681503e-05 ||r(i)||/||b|| 1.114652795021e-06
> 122 KSP unpreconditioned resid norm 2.420611122789e-05 true resid norm
> 2.420611122788e-05 ||r(i)||/||b|| 9.804434897105e-07
>
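> For context, nothing about the solver is hard-coded; the KSP/PC are set
> up entirely from the options database, essentially like this (minimal
> sketch, with a toy 1D Laplacian standing in for the assembled
> Navier-Stokes matrix; not my actual FEM code):
>
> #include <petscksp.h>
>
> int main(int argc, char **argv)
> {
>   Mat         A;
>   Vec         x, b;
>   KSP         ksp;
>   PetscInt    i, n = 100, Istart, Iend, col[3];
>   PetscScalar v[3];
>
>   PetscInitialize(&argc, &argv, NULL, NULL);
>
>   /* toy operator (tridiagonal 1D Laplacian); the real code assembles
>      the stabilized Navier-Stokes system here instead */
>   MatCreate(PETSC_COMM_WORLD, &A);
>   MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);
>   MatSetFromOptions(A);
>   MatSetUp(A);
>   MatGetOwnershipRange(A, &Istart, &Iend);
>   for (i = Istart; i < Iend; i++) {
>     if (i > 0 && i < n - 1) {
>       col[0] = i - 1; col[1] = i; col[2] = i + 1;
>       v[0] = -1.0; v[1] = 2.0; v[2] = -1.0;
>       MatSetValues(A, 1, &i, 3, col, v, INSERT_VALUES);
>     } else {
>       v[0] = 2.0;
>       MatSetValues(A, 1, &i, 1, &i, v, INSERT_VALUES);
>     }
>   }
>   MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
>   MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
>
>   MatCreateVecs(A, &x, &b);
>   VecSet(b, 1.0);
>
>   /* all solver choices (-ksp_type fgmres, -pc_type ml/gamg, thresholds,
>      number of levels, ...) are picked up here from the command line */
>   KSPCreate(PETSC_COMM_WORLD, &ksp);
>   KSPSetOperators(ksp, A, A);
>   KSPSetFromOptions(ksp);
>   KSPSolve(ksp, b, x);
>
>   KSPDestroy(&ksp);
>   VecDestroy(&x);
>   VecDestroy(&b);
>   MatDestroy(&A);
>   PetscFinalize();
>   return 0;
> }
>
> So switching from ML to GAMG should only be a matter of the run-time
> options.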
>
> Now I'd like to try GAMG instead of ML. However, I don't know how to set
> it up to get similar performance.
> The obvious/naive
>
> -pc_type gamg
> -pc_gamg_type agg
>
> # with and without
> -pc_gamg_threshold 0.03
> -pc_mg_levels 3
>
> converges very slowly on 1 proc and much worse on 8 (~200k dofs per
> proc), for instance:
> np = 1:
> 980 KSP unpreconditioned resid norm 1.065009356215e-02 true resid norm
> 1.065009356215e-02 ||r(i)||/||b|| 4.532259705508e-04
> 981 KSP unpreconditioned resid norm 1.064978578182e-02 true resid norm
> 1.064978578182e-02 ||r(i)||/||b|| 4.532128726342e-04
> 982 KSP unpreconditioned resid norm 1.064956706598e-02 true resid norm
> 1.064956706598e-02 ||r(i)||/||b|| 4.532035649508e-04
>
> np = 8:
> 980 KSP unpreconditioned resid norm 3.179946748495e-02 true resid norm
> 3.179946748495e-02 ||r(i)||/||b|| 1.353259896710e-03
> 981 KSP unpreconditioned resid norm 3.179946748317e-02 true resid norm
> 3.179946748317e-02 ||r(i)||/||b|| 1.353259896634e-03
> 982 KSP unpreconditioned resid norm 3.179946748317e-02 true resid norm
> 3.179946748317e-02 ||r(i)||/||b|| 1.353259896634e-03
>
> A very high threshold seems to improve the GAMG PC; for instance, with
> 0.75 I get convergence to rtol=1e-6 after 744 iterations.
> What else should I try?
>
> I would very much appreciate any advice on configuring GAMG, and on the
> differences w.r.t. ML that should be taken into account (I am not a
> multigrid expert, though).
>
> Thanks, best wishes
> David
>
>
> ------
> ksp_view for -pc_type gamg -pc_gamg_threshold 0.75 -pc_mg_levels 3
>
> KSP Object: 1 MPI processes
> type: fgmres
> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
> Orthogonalization with no iterative refinement
> GMRES: happy breakdown tolerance 1e-30
> maximum iterations=10000
> tolerances: relative=1e-06, absolute=1e-50, divergence=10000.
> right preconditioning
> using nonzero initial guess
> using UNPRECONDITIONED norm type for convergence test
> PC Object: 1 MPI processes
> type: gamg
> MG: type is MULTIPLICATIVE, levels=1 cycles=v
> Cycles per PCApply=1
> Using Galerkin computed coarse grid matrices
> GAMG specific options
> Threshold for dropping small values from graph 0.75
> AGG specific options
> Symmetric graph false
> Coarse grid solver -- level -------------------------------
> KSP Object: (mg_levels_0_) 1 MPI processes
> type: preonly
> maximum iterations=2, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
> left preconditioning
> using NONE norm type for convergence test
> PC Object: (mg_levels_0_) 1 MPI processes
> type: sor
> SOR: type = local_symmetric, iterations = 1, local iterations =
> 1, omega = 1.
> linear system matrix = precond matrix:
> Mat Object: 1 MPI processes
> type: seqaij
> rows=1745224, cols=1745224
> total: nonzeros=99452608, allocated nonzeros=99452608
> total number of mallocs used during MatSetValues calls =0
> using I-node routines: found 1037847 nodes, limit used is 5
> linear system matrix = precond matrix:
> Mat Object: 1 MPI processes
> type: seqaij
> rows=1745224, cols=1745224
> total: nonzeros=99452608, allocated nonzeros=99452608
> total number of mallocs used during MatSetValues calls =0
> using I-node routines: found 1037847 nodes, limit used is 5
>
>
> ------
> ksp_view for -pc_type ml:
>
> KSP Object: 8 MPI processes
> type: fgmres
> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
> Orthogonalization with no iterative refinement
> GMRES: happy breakdown tolerance 1e-30
> maximum iterations=10000
> tolerances: relative=1e-06, absolute=1e-50, divergence=10000.
> right preconditioning
> using nonzero initial guess
> using UNPRECONDITIONED norm type for convergence test
> PC Object: 8 MPI processes
> type: ml
> MG: type is MULTIPLICATIVE, levels=3 cycles=v
> Cycles per PCApply=1
> Using Galerkin computed coarse grid matrices
> Coarse grid solver -- level -------------------------------
> KSP Object: (mg_coarse_) 8 MPI processes
> type: preonly
> maximum iterations=10000, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
> left preconditioning
> using NONE norm type for convergence test
> PC Object: (mg_coarse_) 8 MPI processes
> type: redundant
> Redundant preconditioner: First (color=0) of 8 PCs follows
> KSP Object: (mg_coarse_redundant_) 1 MPI processes
> type: preonly
> maximum iterations=10000, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
> left preconditioning
> using NONE norm type for convergence test
> PC Object: (mg_coarse_redundant_) 1 MPI processes
> type: lu
> LU: out-of-place factorization
> tolerance for zero pivot 2.22045e-14
> using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
> matrix ordering: nd
> factor fill ratio given 5., needed 10.4795
> Factored matrix follows:
> Mat Object: 1 MPI processes
> type: seqaij
> rows=6822, cols=6822
> package used to perform factorization: petsc
> total: nonzeros=9575688, allocated nonzeros=9575688
> total number of mallocs used during MatSetValues calls =0
> not using I-node routines
> linear system matrix = precond matrix:
> Mat Object: 1 MPI processes
> type: seqaij
> rows=6822, cols=6822
> total: nonzeros=913758, allocated nonzeros=913758
> total number of mallocs used during MatSetValues calls =0
> not using I-node routines
> linear system matrix = precond matrix:
> Mat Object: 8 MPI processes
> type: mpiaij
> rows=6822, cols=6822
> total: nonzeros=913758, allocated nonzeros=913758
> total number of mallocs used during MatSetValues calls =0
> not using I-node (on process 0) routines
> Down solver (pre-smoother) on level 1 -------------------------------
> KSP Object: (mg_levels_1_) 8 MPI processes
> type: richardson
> Richardson: damping factor=1.
> maximum iterations=2
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
> left preconditioning
> using nonzero initial guess
> using NONE norm type for convergence test
> PC Object: (mg_levels_1_) 8 MPI processes
> type: sor
> SOR: type = local_symmetric, iterations = 1, local iterations =
> 1, omega = 1.
> linear system matrix = precond matrix:
> Mat Object: 8 MPI processes
> type: mpiaij
> rows=67087, cols=67087
> total: nonzeros=9722749, allocated nonzeros=9722749
> total number of mallocs used during MatSetValues calls =0
> not using I-node (on process 0) routines
> Up solver (post-smoother) same as down solver (pre-smoother)
> Down solver (pre-smoother) on level 2 -------------------------------
> KSP Object: (mg_levels_2_) 8 MPI processes
> type: richardson
> Richardson: damping factor=1.
> maximum iterations=2
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
> left preconditioning
> using nonzero initial guess
> using NONE norm type for convergence test
> PC Object: (mg_levels_2_) 8 MPI processes
> type: sor
> SOR: type = local_symmetric, iterations = 1, local iterations =
> 1, omega = 1.
> linear system matrix = precond matrix:
> Mat Object: 8 MPI processes
> type: mpiaij
> rows=1745224, cols=1745224
> total: nonzeros=99452608, allocated nonzeros=99452608
> total number of mallocs used during MatSetValues calls =0
> using I-node (on process 0) routines: found 126690 nodes,
> limit used is 5
> Up solver (post-smoother) same as down solver (pre-smoother)
> linear system matrix = precond matrix:
> Mat Object: 8 MPI processes
> type: mpiaij
> rows=1745224, cols=1745224
> total: nonzeros=99452608, allocated nonzeros=99452608
> total number of mallocs used during MatSetValues calls =0
> using I-node (on process 0) routines: found 126690 nodes, limit
> used is 5
>