[petsc-users] Advice on improving Stokes Schur preconditioners
David Nolte
dnolte at dim.uchile.cl
Sat Jun 10 20:25:35 CDT 2017
Dear all,
I am solving a Stokes problem on 3D aorta geometries, using a P2/P1
finite element discretization on tetrahedral meshes, resulting in
~1-1.5M DOFs. The viscosity is uniform (it can be adjusted
arbitrarily), and the right-hand side is a function of noisy
measurement data.
In other settings of "standard" Stokes flow problems I have obtained
good convergence with an "upper" Schur complement preconditioner, using
AMG (ML or Hypre) on the velocity block and approximating the Schur
complement by the diagonal of the pressure mass matrix (which, scaled
by the inverse viscosity, is known to be spectrally equivalent to the
Schur complement for Stokes):
-ksp_converged_reason
-ksp_monitor_true_residual
-ksp_initial_guess_nonzero
-ksp_diagonal_scale
-ksp_diagonal_scale_fix
-ksp_type fgmres
-ksp_rtol 1.0e-8
-pc_type fieldsplit
-pc_fieldsplit_type schur
-pc_fieldsplit_detect_saddle_point
-pc_fieldsplit_schur_fact_type upper
-pc_fieldsplit_schur_precondition user # <-- pressure mass matrix
-fieldsplit_0_ksp_type preonly
-fieldsplit_0_pc_type ml
-fieldsplit_1_ksp_type preonly
-fieldsplit_1_pc_type jacobi
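For reference, here is a sketch (not verbatim from my code) of how I
pass the pressure mass matrix to the Schur preconditioner, assuming Mp
is the separately assembled pressure mass matrix and ksp the outer
solver:
  PC pc;
  KSPGetPC(ksp, &pc);
  /* use the user-provided matrix Mp to build the Schur complement
     preconditioner (here: its diagonal, via jacobi) */
  PCFieldSplitSetSchurPre(pc, PC_FIELDSPLIT_SCHUR_PRE_USER, Mp);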
In my present case this setup gives rather slow convergence (the
iteration count varies between 200-500 for different geometries, and
for some reaches several thousand!). I obtain better convergence with
"-pc_fieldsplit_schur_precondition selfp" and multigrid on S, i.e.
"-fieldsplit_1_pc_type ml" (I don't think this is optimal, though).
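As I understand it, "selfp" builds an explicit approximation
Sp = A11 - A10 inv(diag(A00)) A01, which ML can then coarsen, whereas
with the "user" matrix the Jacobi PC only sees the mass matrix
diagonal. Concretely, the selfp variant I tried just replaces the
Schur-side options above with:
-pc_fieldsplit_schur_precondition selfp
-fieldsplit_1_ksp_type preonly
-fieldsplit_1_pc_type ml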
I don't understand why the pressure mass matrix approach performs so
poorly here, and I wonder what I could try to improve the convergence.
Until now I have used ML and Hypre BoomerAMG mostly with default
parameters; surely they could do better with some tuning. What would be
a good starting point? Are there other options I should consider?
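To make the question concrete, these are the kinds of parameters I
would try first (the values are guesses on my part), e.g. for
BoomerAMG on the velocity block:
-fieldsplit_0_pc_type hypre
-fieldsplit_0_pc_hypre_type boomeramg
-fieldsplit_0_pc_hypre_boomeramg_strong_threshold 0.5 # <-- often suggested for 3D
-fieldsplit_0_pc_hypre_boomeramg_coarsen_type HMIS
-fieldsplit_0_pc_hypre_boomeramg_interp_type ext+i
but I have no idea which of these actually matter for a P2 velocity
block.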
With the above setup (Jacobi on the pressure mass matrix), for a case
that converges better than others, the KSP terminates with
467 KSP unpreconditioned resid norm 2.072014323515e-09 true resid norm 2.072014322600e-09 ||r(i)||/||b|| 9.939098100674e-09
You can find the output of -ksp_view below. Let me know if you need more
details.
Thanks in advance for your advice!
Best wishes
David
KSP Object: 1 MPI processes
type: fgmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=10000
tolerances: relative=1e-08, absolute=1e-50, divergence=10000.
right preconditioning
diagonally scaled system
using nonzero initial guess
using UNPRECONDITIONED norm type for convergence test
PC Object: 1 MPI processes
type: fieldsplit
FieldSplit with Schur preconditioner, factorization UPPER
Preconditioner for the Schur complement formed from user provided matrix
Split info:
Split number 0 Defined by IS
Split number 1 Defined by IS
KSP solver for A00 block
KSP Object: (fieldsplit_0_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (fieldsplit_0_) 1 MPI processes
type: ml
MG: type is MULTIPLICATIVE, levels=5 cycles=v
Cycles per PCApply=1
Using Galerkin computed coarse grid matrices
Coarse grid solver -- level -------------------------------
KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes
type: lu
LU: out-of-place factorization
tolerance for zero pivot 2.22045e-14
using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
matrix ordering: nd
factor fill ratio given 5., needed 1.
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=3, cols=3
package used to perform factorization: petsc
total: nonzeros=3, allocated nonzeros=3
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=3, cols=3
total: nonzeros=3, allocated nonzeros=3
total number of mallocs used during MatSetValues calls =0
not using I-node routines
Down solver (pre-smoother) on level 1 -------------------------------
KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes
type: richardson
Richardson: damping factor=1.
maximum iterations=2
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes
type: sor
SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=15, cols=15
total: nonzeros=69, allocated nonzeros=69
total number of mallocs used during MatSetValues calls =0
not using I-node routines
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 2 -------------------------------
KSP Object: (fieldsplit_0_mg_levels_2_) 1 MPI processes
type: richardson
Richardson: damping factor=1.
maximum iterations=2
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (fieldsplit_0_mg_levels_2_) 1 MPI processes
type: sor
SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=304, cols=304
total: nonzeros=7354, allocated nonzeros=7354
total number of mallocs used during MatSetValues calls =0
not using I-node routines
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 3 -------------------------------
KSP Object: (fieldsplit_0_mg_levels_3_) 1 MPI processes
type: richardson
Richardson: damping factor=1.
maximum iterations=2
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (fieldsplit_0_mg_levels_3_) 1 MPI processes
type: sor
SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=30236, cols=30236
total: nonzeros=2730644, allocated nonzeros=2730644
total number of mallocs used during MatSetValues calls =0
not using I-node routines
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 4 -------------------------------
KSP Object: (fieldsplit_0_mg_levels_4_) 1 MPI processes
type: richardson
Richardson: damping factor=1.
maximum iterations=2
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (fieldsplit_0_mg_levels_4_) 1 MPI processes
type: sor
SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix = precond matrix:
Mat Object: (fieldsplit_0_) 1 MPI processes
type: seqaij
rows=894132, cols=894132
total: nonzeros=70684164, allocated nonzeros=70684164
total number of mallocs used during MatSetValues calls =0
not using I-node routines
Up solver (post-smoother) same as down solver (pre-smoother)
linear system matrix = precond matrix:
Mat Object: (fieldsplit_0_) 1 MPI processes
type: seqaij
rows=894132, cols=894132
total: nonzeros=70684164, allocated nonzeros=70684164
total number of mallocs used during MatSetValues calls =0
not using I-node routines
KSP solver for S = A11 - A10 inv(A00) A01
KSP Object: (fieldsplit_1_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (fieldsplit_1_) 1 MPI processes
type: jacobi
linear system matrix followed by preconditioner matrix:
Mat Object: (fieldsplit_1_) 1 MPI processes
type: schurcomplement
rows=42025, cols=42025
Schur complement A11 - A10 inv(A00) A01
A11
Mat Object: (fieldsplit_1_) 1 MPI processes
type: seqaij
rows=42025, cols=42025
total: nonzeros=554063, allocated nonzeros=554063
total number of mallocs used during MatSetValues calls =0
not using I-node routines
A10
Mat Object: 1 MPI processes
type: seqaij
rows=42025, cols=894132
total: nonzeros=6850107, allocated nonzeros=6850107
total number of mallocs used during MatSetValues calls =0
not using I-node routines
KSP of A00
KSP Object: (fieldsplit_0_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (fieldsplit_0_) 1 MPI processes
type: ml
MG: type is MULTIPLICATIVE, levels=5 cycles=v
Cycles per PCApply=1
Using Galerkin computed coarse grid matrices
Coarse grid solver -- level -------------------------------
KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes
type: lu
LU: out-of-place factorization
tolerance for zero pivot 2.22045e-14
using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
matrix ordering: nd
factor fill ratio given 5., needed 1.
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=3, cols=3
package used to perform factorization: petsc
total: nonzeros=3, allocated nonzeros=3
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=3, cols=3
total: nonzeros=3, allocated nonzeros=3
total number of mallocs used during MatSetValues calls =0
not using I-node routines
Down solver (pre-smoother) on level 1 -------------------------------
KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes
type: richardson
Richardson: damping factor=1.
maximum iterations=2
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes
type: sor
SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=15, cols=15
total: nonzeros=69, allocated nonzeros=69
total number of mallocs used during MatSetValues calls =0
not using I-node routines
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 2 -------------------------------
KSP Object: (fieldsplit_0_mg_levels_2_) 1 MPI processes
type: richardson
Richardson: damping factor=1.
maximum iterations=2
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (fieldsplit_0_mg_levels_2_) 1 MPI processes
type: sor
SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=304, cols=304
total: nonzeros=7354, allocated nonzeros=7354
total number of mallocs used during MatSetValues calls =0
not using I-node routines
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 3 -------------------------------
KSP Object: (fieldsplit_0_mg_levels_3_) 1 MPI processes
type: richardson
Richardson: damping factor=1.
maximum iterations=2
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (fieldsplit_0_mg_levels_3_) 1 MPI processes
type: sor
SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=30236, cols=30236
total: nonzeros=2730644, allocated nonzeros=2730644
total number of mallocs used during MatSetValues calls =0
not using I-node routines
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 4 -------------------------------
KSP Object: (fieldsplit_0_mg_levels_4_) 1 MPI processes
type: richardson
Richardson: damping factor=1.
maximum iterations=2
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (fieldsplit_0_mg_levels_4_) 1 MPI processes
type: sor
SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix = precond matrix:
Mat Object: (fieldsplit_0_) 1 MPI processes
type: seqaij
rows=894132, cols=894132
total: nonzeros=70684164, allocated nonzeros=70684164
total number of mallocs used during MatSetValues calls =0
not using I-node routines
Up solver (post-smoother) same as down solver (pre-smoother)
linear system matrix = precond matrix:
Mat Object: (fieldsplit_0_) 1 MPI processes
type: seqaij
rows=894132, cols=894132
total: nonzeros=70684164, allocated nonzeros=70684164
total number of mallocs used during MatSetValues calls =0
not using I-node routines
A01
Mat Object: 1 MPI processes
type: seqaij
rows=894132, cols=42025
total: nonzeros=6850107, allocated nonzeros=6850107
total number of mallocs used during MatSetValues calls =0
not using I-node routines
Mat Object: 1 MPI processes
type: seqaij
rows=42025, cols=42025
total: nonzeros=554063, allocated nonzeros=554063
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=936157, cols=936157
total: nonzeros=84938441, allocated nonzeros=84938441
total number of mallocs used during MatSetValues calls =0
not using I-node routines