<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Sat, Jun 10, 2017 at 8:25 PM, David Nolte <span dir="ltr"><<a href="mailto:dnolte@dim.uchile.cl" target="_blank">dnolte@dim.uchile.cl</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Dear all,<br>
<br>
I am solving a Stokes problem on 3D aorta geometries, using a P2/P1<br>
finite element discretization on tetrahedral meshes, resulting in<br>
~1-1.5M DOFs. The viscosity is uniform (it can be adjusted arbitrarily),<br>
and the right-hand side is a function of noisy measurement data.<br>
<br>
For other, "standard" Stokes flow problems I have obtained good<br>
convergence with an "upper" Schur complement preconditioner, using AMG<br>
(ML or Hypre) on the velocity block and approximating the Schur<br>
complement by the diagonal of the pressure mass matrix:<br>
<br>
-ksp_converged_reason<br>
-ksp_monitor_true_residual<br>
-ksp_initial_guess_nonzero<br>
-ksp_diagonal_scale<br>
-ksp_diagonal_scale_fix<br>
-ksp_type fgmres<br>
-ksp_rtol 1.0e-8<br>
<br>
-pc_type fieldsplit<br>
-pc_fieldsplit_type schur<br>
-pc_fieldsplit_detect_saddle_point<br>
-pc_fieldsplit_schur_fact_type upper<br>
-pc_fieldsplit_schur_precondition user # <-- pressure mass matrix<br>
<br>
-fieldsplit_0_ksp_type preonly<br>
-fieldsplit_0_pc_type ml<br>
<br>
-fieldsplit_1_ksp_type preonly<br>
-fieldsplit_1_pc_type jacobi<br></blockquote><div><br></div><div>1) I always recommend starting from an exact solver and backing off in small steps for optimization. Thus</div><div>    I would start with LU on the upper (velocity) block and GMRES/LU with tolerance 1e-10 on the Schur block</div><div>    (see the options sketch below). This should converge in one iteration.</div><div><br></div><div>2) I don't think you want preonly on the Schur system. You might want GMRES/Jacobi to invert the mass matrix.</div><div><br></div><div>3) You probably want to tighten the tolerance on the Schur solve, at least to start, and then slowly relax it. The</div><div>    tight tolerance will show you how effective the preconditioner is using that Schur operator. Then you can start</div><div>    to evaluate how effective the Schur linear solver is.</div><div><br></div><div>Does this make sense?</div><div><br></div><div>  Thanks,</div><div><br></div><div>     Matt</div><div> </div>
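<div><br></div><div>Concretely, step 1 might look like the following (a sketch reusing your fieldsplit prefixes; since you give -pc_fieldsplit_schur_precondition user, the LU in the Schur split factors your pressure mass matrix):</div><div><br></div><div>-fieldsplit_0_ksp_type preonly<br>-fieldsplit_0_pc_type lu<br><br>-fieldsplit_1_ksp_type gmres<br>-fieldsplit_1_ksp_rtol 1.0e-10<br>-fieldsplit_1_pc_type lu<br></div><div><br></div><div>and for steps 2 and 3, an iterative mass-matrix solve whose tolerance you can later relax:</div><div><br></div><div>-fieldsplit_1_ksp_type gmres<br>-fieldsplit_1_ksp_rtol 1.0e-10<br>-fieldsplit_1_pc_type jacobi<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">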
In my present case this setup converges rather slowly (the iteration<br>
count varies with the geometry, between 200-500 for some cases and<br>
several thousands for others!). I obtain better convergence with<br>
"-pc_fieldsplit_schur_precondition selfp" and multigrid on S, via<br>
"-fieldsplit_1_pc_type ml" (I don't think this is optimal, though).<br>
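<br>
For reference, I attach the pressure mass matrix as the user-provided<br>
Schur preconditioner matrix roughly as in the sketch below ("Mp" and<br>
"SetupSchurPC" are illustrative names, not my exact code):<br>
<br>
#include <petscksp.h><br>
<br>
/* Sketch: Mp is the assembled pressure mass matrix. */<br>
static PetscErrorCode SetupSchurPC(KSP ksp, Mat Mp)<br>
{<br>
  PC             pc;<br>
  PetscErrorCode ierr;<br>
<br>
  PetscFunctionBeginUser;<br>
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);<br>
  ierr = PCSetType(pc, PCFIELDSPLIT);CHKERRQ(ierr);<br>
  ierr = PCFieldSplitSetType(pc, PC_COMPOSITE_SCHUR);CHKERRQ(ierr);<br>
  ierr = PCFieldSplitSetSchurFactType(pc, PC_FIELDSPLIT_SCHUR_FACT_UPPER);CHKERRQ(ierr);<br>
  /* Build the Schur preconditioner from the user matrix Mp;<br>
     -fieldsplit_1_pc_type jacobi then applies diag(Mp)^{-1}. */<br>
  ierr = PCFieldSplitSetSchurPre(pc, PC_FIELDSPLIT_SCHUR_PRE_USER, Mp);CHKERRQ(ierr);<br>
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);<br>
  PetscFunctionReturn(0);<br>
}<br>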
<br>
I don't understand why the pressure mass matrix approach performs so<br>
poorly here, and I wonder what I could try to improve the convergence.<br>
Until now I have used ML and Hypre BoomerAMG mostly with default<br>
parameters; surely they can be improved by tuning. What would be a good<br>
starting point? Are there other options I should consider?<br>
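<br>
For instance, would something along these lines be a sensible starting<br>
point for BoomerAMG on the velocity block? (These are untested guesses<br>
on my part; I have read that a strong threshold around 0.5-0.7 is often<br>
recommended for 3D problems.)<br>
<br>
-fieldsplit_0_pc_type hypre<br>
-fieldsplit_0_pc_hypre_type boomeramg<br>
-fieldsplit_0_pc_hypre_boomeramg_strong_threshold 0.7<br>
-fieldsplit_0_pc_hypre_boomeramg_coarsen_type HMIS<br>
-fieldsplit_0_pc_hypre_boomeramg_interp_type ext+i<br>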
<br>
With the above setup (Jacobi on the Schur block), for a case that<br>
converges better than others, the KSP terminates with<br>
467 KSP unpreconditioned resid norm 2.072014323515e-09 true resid norm 2.072014322600e-09 ||r(i)||/||b|| 9.939098100674e-09<br>
<br>
You can find the output of -ksp_view below. Let me know if you need more<br>
details.<br>
<br>
Thanks in advance for your advice!<br>
Best wishes<br>
David<br>
<br>
<br>
KSP Object: 1 MPI processes<br>
type: fgmres<br>
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt<br>
Orthogonalization with no iterative refinement<br>
GMRES: happy breakdown tolerance 1e-30<br>
maximum iterations=10000<br>
tolerances: relative=1e-08, absolute=1e-50, divergence=10000.<br>
right preconditioning<br>
diagonally scaled system<br>
using nonzero initial guess<br>
using UNPRECONDITIONED norm type for convergence test<br>
PC Object: 1 MPI processes<br>
type: fieldsplit<br>
FieldSplit with Schur preconditioner, factorization UPPER<br>
Preconditioner for the Schur complement formed from user provided matrix<br>
Split info:<br>
Split number 0 Defined by IS<br>
Split number 1 Defined by IS<br>
KSP solver for A00 block<br>
KSP Object: (fieldsplit_0_) 1 MPI processes<br>
type: preonly<br>
maximum iterations=10000, initial guess is zero<br>
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.<br>
left preconditioning<br>
using NONE norm type for convergence test<br>
PC Object: (fieldsplit_0_) 1 MPI processes<br>
type: ml<br>
MG: type is MULTIPLICATIVE, levels=5 cycles=v<br>
Cycles per PCApply=1<br>
Using Galerkin computed coarse grid matrices<br>
Coarse grid solver -- level -------------------------------<br>
KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI<br>
processes<br>
type: preonly<br>
maximum iterations=10000, initial guess is zero<br>
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.<br>
left preconditioning<br>
using NONE norm type for convergence test<br>
PC Object: (fieldsplit_0_mg_coarse_) 1 MPI<br>
processes<br>
type: lu<br>
LU: out-of-place factorization<br>
tolerance for zero pivot 2.22045e-14<br>
using diagonal shift on blocks to prevent zero pivot<br>
[INBLOCKS]<br>
matrix ordering: nd<br>
factor fill ratio given 5., needed 1.<br>
Factored matrix follows:<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
rows=3, cols=3<br>
package used to perform factorization: petsc<br>
total: nonzeros=3, allocated nonzeros=3<br>
total number of mallocs used during MatSetValues<br>
calls =0<br>
not using I-node routines<br>
linear system matrix = precond matrix:<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
rows=3, cols=3<br>
total: nonzeros=3, allocated nonzeros=3<br>
total number of mallocs used during MatSetValues calls =0<br>
not using I-node routines<br>
Down solver (pre-smoother) on level 1<br>
-------------------------------<br>
KSP Object: (fieldsplit_0_mg_levels_1_) 1<br>
MPI processes<br>
type: richardson<br>
Richardson: damping factor=1.<br>
maximum iterations=2<br>
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.<br>
left preconditioning<br>
using nonzero initial guess<br>
using NONE norm type for convergence test<br>
PC Object: (fieldsplit_0_mg_levels_1_) 1<br>
MPI processes<br>
type: sor<br>
SOR: type = local_symmetric, iterations = 1, local<br>
iterations = 1, omega = 1.<br>
linear system matrix = precond matrix:<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
rows=15, cols=15<br>
total: nonzeros=69, allocated nonzeros=69<br>
total number of mallocs used during MatSetValues calls =0<br>
not using I-node routines<br>
Up solver (post-smoother) same as down solver (pre-smoother)<br>
Down solver (pre-smoother) on level 2<br>
-------------------------------<br>
KSP Object: (fieldsplit_0_mg_levels_2_) 1<br>
MPI processes<br>
type: richardson<br>
Richardson: damping factor=1.<br>
maximum iterations=2<br>
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.<br>
left preconditioning<br>
using nonzero initial guess<br>
using NONE norm type for convergence test<br>
PC Object: (fieldsplit_0_mg_levels_2_) 1<br>
MPI processes<br>
type: sor<br>
SOR: type = local_symmetric, iterations = 1, local<br>
iterations = 1, omega = 1.<br>
linear system matrix = precond matrix:<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
rows=304, cols=304<br>
total: nonzeros=7354, allocated nonzeros=7354<br>
total number of mallocs used during MatSetValues calls =0<br>
not using I-node routines<br>
Up solver (post-smoother) same as down solver (pre-smoother)<br>
Down solver (pre-smoother) on level 3<br>
-------------------------------<br>
KSP Object: (fieldsplit_0_mg_levels_3_) 1<br>
MPI processes<br>
type: richardson<br>
Richardson: damping factor=1.<br>
maximum iterations=2<br>
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.<br>
left preconditioning<br>
using nonzero initial guess<br>
using NONE norm type for convergence test<br>
PC Object: (fieldsplit_0_mg_levels_3_) 1<br>
MPI processes<br>
type: sor<br>
SOR: type = local_symmetric, iterations = 1, local<br>
iterations = 1, omega = 1.<br>
linear system matrix = precond matrix:<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
rows=30236, cols=30236<br>
total: nonzeros=2730644, allocated nonzeros=2730644<br>
total number of mallocs used during MatSetValues calls =0<br>
not using I-node routines<br>
Up solver (post-smoother) same as down solver (pre-smoother)<br>
Down solver (pre-smoother) on level 4<br>
-------------------------------<br>
KSP Object: (fieldsplit_0_mg_levels_4_) 1<br>
MPI processes<br>
type: richardson<br>
Richardson: damping factor=1.<br>
maximum iterations=2<br>
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.<br>
left preconditioning<br>
using nonzero initial guess<br>
using NONE norm type for convergence test<br>
PC Object: (fieldsplit_0_mg_levels_4_) 1<br>
MPI processes<br>
type: sor<br>
SOR: type = local_symmetric, iterations = 1, local<br>
iterations = 1, omega = 1.<br>
linear system matrix = precond matrix:<br>
Mat Object: (fieldsplit_0_) 1 MPI<br>
processes<br>
type: seqaij<br>
rows=894132, cols=894132<br>
total: nonzeros=70684164, allocated nonzeros=70684164<br>
total number of mallocs used during MatSetValues calls =0<br>
not using I-node routines<br>
Up solver (post-smoother) same as down solver (pre-smoother)<br>
linear system matrix = precond matrix:<br>
Mat Object: (fieldsplit_0_) 1 MPI processes<br>
type: seqaij<br>
rows=894132, cols=894132<br>
total: nonzeros=70684164, allocated nonzeros=70684164<br>
total number of mallocs used during MatSetValues calls =0<br>
not using I-node routines<br>
KSP solver for S = A11 - A10 inv(A00) A01<br>
KSP Object: (fieldsplit_1_) 1 MPI processes<br>
type: preonly<br>
maximum iterations=10000, initial guess is zero<br>
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.<br>
left preconditioning<br>
using NONE norm type for convergence test<br>
PC Object: (fieldsplit_1_) 1 MPI processes<br>
type: jacobi<br>
linear system matrix followed by preconditioner matrix:<br>
Mat Object: (fieldsplit_1_) 1 MPI processes<br>
type: schurcomplement<br>
rows=42025, cols=42025<br>
Schur complement A11 - A10 inv(A00) A01<br>
A11<br>
Mat Object: (fieldsplit_1_) 1<br>
MPI processes<br>
type: seqaij<br>
rows=42025, cols=42025<br>
total: nonzeros=554063, allocated nonzeros=554063<br>
total number of mallocs used during MatSetValues calls =0<br>
not using I-node routines<br>
A10<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
rows=42025, cols=894132<br>
total: nonzeros=6850107, allocated nonzeros=6850107<br>
total number of mallocs used during MatSetValues calls =0<br>
not using I-node routines<br>
KSP of A00<br>
KSP Object: (fieldsplit_0_) 1<br>
MPI processes<br>
type: preonly<br>
maximum iterations=10000, initial guess is zero<br>
tolerances: relative=1e-05, absolute=1e-50,<br>
divergence=10000.<br>
left preconditioning<br>
using NONE norm type for convergence test<br>
PC Object: (fieldsplit_0_) 1<br>
MPI processes<br>
type: ml<br>
MG: type is MULTIPLICATIVE, levels=5 cycles=v<br>
Cycles per PCApply=1<br>
Using Galerkin computed coarse grid matrices<br>
Coarse grid solver -- level -------------------------------<br>
KSP Object:<br>
(fieldsplit_0_mg_coarse_) 1 MPI processes<br>
type: preonly<br>
maximum iterations=10000, initial guess is zero<br>
tolerances: relative=1e-05, absolute=1e-50,<br>
divergence=10000.<br>
left preconditioning<br>
using NONE norm type for convergence test<br>
PC Object:<br>
(fieldsplit_0_mg_coarse_) 1 MPI processes<br>
type: lu<br>
LU: out-of-place factorization<br>
tolerance for zero pivot 2.22045e-14<br>
using diagonal shift on blocks to prevent zero<br>
pivot [INBLOCKS]<br>
matrix ordering: nd<br>
factor fill ratio given 5., needed 1.<br>
Factored matrix follows:<br>
Mat Object: 1 MPI<br>
processes<br>
type: seqaij<br>
rows=3, cols=3<br>
package used to perform factorization: petsc<br>
total: nonzeros=3, allocated nonzeros=3<br>
total number of mallocs used during<br>
MatSetValues calls =0<br>
not using I-node routines<br>
linear system matrix = precond matrix:<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
rows=3, cols=3<br>
total: nonzeros=3, allocated nonzeros=3<br>
total number of mallocs used during MatSetValues<br>
calls =0<br>
not using I-node routines<br>
Down solver (pre-smoother) on level 1<br>
-------------------------------<br>
KSP Object:<br>
(fieldsplit_0_mg_levels_1_) 1 MPI processes<br>
type: richardson<br>
Richardson: damping factor=1.<br>
maximum iterations=2<br>
tolerances: relative=1e-05, absolute=1e-50,<br>
divergence=10000.<br>
left preconditioning<br>
using nonzero initial guess<br>
using NONE norm type for convergence test<br>
PC Object:<br>
(fieldsplit_0_mg_levels_1_) 1 MPI processes<br>
type: sor<br>
SOR: type = local_symmetric, iterations = 1, local<br>
iterations = 1, omega = 1.<br>
linear system matrix = precond matrix:<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
rows=15, cols=15<br>
total: nonzeros=69, allocated nonzeros=69<br>
total number of mallocs used during MatSetValues<br>
calls =0<br>
not using I-node routines<br>
Up solver (post-smoother) same as down solver (pre-smoother)<br>
Down solver (pre-smoother) on level 2<br>
-------------------------------<br>
KSP Object:<br>
(fieldsplit_0_mg_levels_2_) 1 MPI processes<br>
type: richardson<br>
Richardson: damping factor=1.<br>
maximum iterations=2<br>
tolerances: relative=1e-05, absolute=1e-50,<br>
divergence=10000.<br>
left preconditioning<br>
using nonzero initial guess<br>
using NONE norm type for convergence test<br>
PC Object:<br>
(fieldsplit_0_mg_levels_2_) 1 MPI processes<br>
type: sor<br>
SOR: type = local_symmetric, iterations = 1, local<br>
iterations = 1, omega = 1.<br>
linear system matrix = precond matrix:<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
rows=304, cols=304<br>
total: nonzeros=7354, allocated nonzeros=7354<br>
total number of mallocs used during MatSetValues<br>
calls =0<br>
not using I-node routines<br>
Up solver (post-smoother) same as down solver (pre-smoother)<br>
Down solver (pre-smoother) on level 3<br>
-------------------------------<br>
KSP Object:<br>
(fieldsplit_0_mg_levels_3_) 1 MPI processes<br>
type: richardson<br>
Richardson: damping factor=1.<br>
maximum iterations=2<br>
tolerances: relative=1e-05, absolute=1e-50,<br>
divergence=10000.<br>
left preconditioning<br>
using nonzero initial guess<br>
using NONE norm type for convergence test<br>
PC Object:<br>
(fieldsplit_0_mg_levels_3_) 1 MPI processes<br>
type: sor<br>
SOR: type = local_symmetric, iterations = 1, local<br>
iterations = 1, omega = 1.<br>
linear system matrix = precond matrix:<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
rows=30236, cols=30236<br>
total: nonzeros=2730644, allocated nonzeros=2730644<br>
total number of mallocs used during MatSetValues<br>
calls =0<br>
not using I-node routines<br>
Up solver (post-smoother) same as down solver (pre-smoother)<br>
Down solver (pre-smoother) on level 4<br>
-------------------------------<br>
KSP Object:<br>
(fieldsplit_0_mg_levels_4_) 1 MPI processes<br>
type: richardson<br>
Richardson: damping factor=1.<br>
maximum iterations=2<br>
tolerances: relative=1e-05, absolute=1e-50,<br>
divergence=10000.<br>
left preconditioning<br>
using nonzero initial guess<br>
using NONE norm type for convergence test<br>
PC Object:<br>
(fieldsplit_0_mg_levels_4_) 1 MPI processes<br>
type: sor<br>
SOR: type = local_symmetric, iterations = 1, local<br>
iterations = 1, omega = 1.<br>
linear system matrix = precond matrix:<br>
Mat Object:<br>
(fieldsplit_0_) 1 MPI processes<br>
type: seqaij<br>
rows=894132, cols=894132<br>
total: nonzeros=70684164, allocated nonzeros=70684164<br>
total number of mallocs used during MatSetValues<br>
calls =0<br>
not using I-node routines<br>
Up solver (post-smoother) same as down solver (pre-smoother)<br>
linear system matrix = precond matrix:<br>
Mat Object:<br>
(fieldsplit_0_) 1 MPI processes<br>
type: seqaij<br>
rows=894132, cols=894132<br>
total: nonzeros=70684164, allocated nonzeros=70684164<br>
total number of mallocs used during MatSetValues calls =0<br>
not using I-node routines<br>
A01<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
rows=894132, cols=42025<br>
total: nonzeros=6850107, allocated nonzeros=6850107<br>
total number of mallocs used during MatSetValues calls =0<br>
not using I-node routines<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
rows=42025, cols=42025<br>
total: nonzeros=554063, allocated nonzeros=554063<br>
total number of mallocs used during MatSetValues calls =0<br>
not using I-node routines<br>
linear system matrix = precond matrix:<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
rows=936157, cols=936157<br>
total: nonzeros=84938441, allocated nonzeros=84938441<br>
total number of mallocs used during MatSetValues calls =0<br>
not using I-node routines<br>
<br>
<br>
</blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>-- Norbert Wiener</div><div><br></div><div><a href="http://www.caam.rice.edu/~mk51/" target="_blank">http://www.caam.rice.edu/~mk51/</a><br></div></div></div>
</div></div>