[petsc-users] Advice on improving Stokes Schur preconditioners

Matthew Knepley knepley at gmail.com
Sun Jun 11 07:53:38 CDT 2017


On Sat, Jun 10, 2017 at 8:25 PM, David Nolte <dnolte at dim.uchile.cl> wrote:

> Dear all,
>
> I am solving a Stokes problem in 3D aorta geometries, using a P2/P1
> finite elements discretization on tetrahedral meshes resulting in
> ~1-1.5M DOFs. Viscosity is uniform (can be adjusted arbitrarily), and
> the right hand side is a function of noisy measurement data.
>
> In other settings of "standard" Stokes flow problems I have obtained
> good convergence with an "upper" Schur complement preconditioner, using
> AMG (ML or Hypre) on the velocity block and approximating the Schur
> complement matrix by the diagonal of the pressure mass matrix:
>
>     -ksp_converged_reason
>     -ksp_monitor_true_residual
>     -ksp_initial_guess_nonzero
>     -ksp_diagonal_scale
>     -ksp_diagonal_scale_fix
>     -ksp_type fgmres
>     -ksp_rtol 1.0e-8
>
>     -pc_type fieldsplit
>     -pc_fieldsplit_type schur
>     -pc_fieldsplit_detect_saddle_point
>     -pc_fieldsplit_schur_fact_type upper
>     -pc_fieldsplit_schur_precondition user    # <-- pressure mass matrix
>
>     -fieldsplit_0_ksp_type preonly
>     -fieldsplit_0_pc_type ml
>
>     -fieldsplit_1_ksp_type preonly
>     -fieldsplit_1_pc_type jacobi
>

1) I always recommend starting from an exact solver and backing off in small
steps for optimization. Thus I would start with LU on the upper (A00) block
and GMRES/LU with tolerance 1e-10 on the Schur block. This should converge
in one iteration.
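
Something like this, as a rough sketch (keeping your option prefixes, with LU
on the fieldsplit_0 block; it assumes the user-provided Schur preconditioning
matrix -- your pressure mass matrix -- is an assembled matrix, so LU can
factor it):

    -pc_fieldsplit_schur_fact_type upper
    -fieldsplit_0_ksp_type preonly
    -fieldsplit_0_pc_type lu
    -fieldsplit_1_ksp_type gmres
    -fieldsplit_1_ksp_rtol 1.0e-10
    -fieldsplit_1_pc_type lu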

2) I don't think you want preonly on the Schur system. You might want
GMRES/Jacobi to invert the mass matrix (see the sketch after 3).

3) You probably want to tighten the tolerance on the Schur solve, at least
to start, and then slowly let it out. The tight tolerance will show you how
effective the preconditioner is using that Schur operator. Then you can
start to evaluate how effective the Schur linear solver is.
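
For 2) and 3) together, something along these lines (again only a sketch;
-fieldsplit_1_ksp_monitor lets you watch the inner Schur solve, and you can
loosen the tolerance once you see how well the preconditioner works):

    -fieldsplit_1_ksp_type gmres
    -fieldsplit_1_ksp_rtol 1.0e-10
    -fieldsplit_1_pc_type jacobi
    -fieldsplit_1_ksp_monitor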

Does this make sense?

  Thanks,

     Matt


> In my present case this setup gives rather slow convergence (varying
> between 200-500 iterations, or even several thousand, for different
> geometries!). I obtain better convergence with
> "-pc_fieldsplit_schur_precondition selfp" and using multigrid on S, with
> "-fieldsplit_1_pc_type ml" (I don't think this is optimal, though).
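> That is, roughly these options in place of the Schur-related ones above
> (the rest unchanged):
>
>     -pc_fieldsplit_schur_precondition selfp
>     -fieldsplit_1_pc_type ml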
>
> I don't understand why the pressure mass matrix approach performs so
> poorly and wonder what I could try to improve the convergence. Until now
> I have been using ML and Hypre BoomerAMG mostly with default parameters.
> Surely they can be improved by tuning some parameters. Which could be a
> good starting point? Are there other options I should consider?
>
> With the above setup (jacobi) for a case that works better than others,
> the KSP terminates with
> 467 KSP unpreconditioned resid norm 2.072014323515e-09 true resid norm
> 2.072014322600e-09 ||r(i)||/||b|| 9.939098100674e-09
>
> You can find the output of -ksp_view below. Let me know if you need more
> details.
>
> Thanks in advance for your advice!
> Best wishes
> David
>
>
> KSP Object: 1 MPI processes
>   type: fgmres
>     GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
> Orthogonalization with no iterative refinement
>     GMRES: happy breakdown tolerance 1e-30
>   maximum iterations=10000
>   tolerances:  relative=1e-08, absolute=1e-50, divergence=10000.
>   right preconditioning
>   diagonally scaled system
>   using nonzero initial guess
>   using UNPRECONDITIONED norm type for convergence test
> PC Object: 1 MPI processes
>   type: fieldsplit
>     FieldSplit with Schur preconditioner, factorization UPPER
>     Preconditioner for the Schur complement formed from user provided
> matrix
>     Split info:
>     Split number 0 Defined by IS
>     Split number 1 Defined by IS
>     KSP solver for A00 block
>       KSP Object:      (fieldsplit_0_)       1 MPI processes
>         type: preonly
>         maximum iterations=10000, initial guess is zero
>         tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
>         left preconditioning
>         using NONE norm type for convergence test
>       PC Object:      (fieldsplit_0_)       1 MPI processes
>         type: ml
>           MG: type is MULTIPLICATIVE, levels=5 cycles=v
>             Cycles per PCApply=1
>             Using Galerkin computed coarse grid matrices
>         Coarse grid solver -- level -------------------------------
>           KSP Object:          (fieldsplit_0_mg_coarse_)           1 MPI
> processes
>             type: preonly
>             maximum iterations=10000, initial guess is zero
>             tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
>             left preconditioning
>             using NONE norm type for convergence test
>           PC Object:          (fieldsplit_0_mg_coarse_)           1 MPI
> processes
>             type: lu
>               LU: out-of-place factorization
>               tolerance for zero pivot 2.22045e-14
>               using diagonal shift on blocks to prevent zero pivot
> [INBLOCKS]
>               matrix ordering: nd
>               factor fill ratio given 5., needed 1.
>                 Factored matrix follows:
>                   Mat Object:                   1 MPI processes
>                     type: seqaij
>                     rows=3, cols=3
>                     package used to perform factorization: petsc
>                     total: nonzeros=3, allocated nonzeros=3
>                     total number of mallocs used during MatSetValues
> calls =0
>                       not using I-node routines
>             linear system matrix = precond matrix:
>             Mat Object:             1 MPI processes
>               type: seqaij
>               rows=3, cols=3
>               total: nonzeros=3, allocated nonzeros=3
>               total number of mallocs used during MatSetValues calls =0
>                 not using I-node routines
>         Down solver (pre-smoother) on level 1
> -------------------------------
>           KSP Object:          (fieldsplit_0_mg_levels_1_)           1
> MPI processes
>             type: richardson
>               Richardson: damping factor=1.
>             maximum iterations=2
>             tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
>             left preconditioning
>             using nonzero initial guess
>             using NONE norm type for convergence test
>           PC Object:          (fieldsplit_0_mg_levels_1_)           1
> MPI processes
>             type: sor
>               SOR: type = local_symmetric, iterations = 1, local
> iterations = 1, omega = 1.
>             linear system matrix = precond matrix:
>             Mat Object:             1 MPI processes
>               type: seqaij
>               rows=15, cols=15
>               total: nonzeros=69, allocated nonzeros=69
>               total number of mallocs used during MatSetValues calls =0
>                 not using I-node routines
>         Up solver (post-smoother) same as down solver (pre-smoother)
>         Down solver (pre-smoother) on level 2
> -------------------------------
>           KSP Object:          (fieldsplit_0_mg_levels_2_)           1
> MPI processes
>             type: richardson
>               Richardson: damping factor=1.
>             maximum iterations=2
>             tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
>             left preconditioning
>             using nonzero initial guess
>             using NONE norm type for convergence test
>           PC Object:          (fieldsplit_0_mg_levels_2_)           1
> MPI processes
>             type: sor
>               SOR: type = local_symmetric, iterations = 1, local
> iterations = 1, omega = 1.
>             linear system matrix = precond matrix:
>             Mat Object:             1 MPI processes
>               type: seqaij
>               rows=304, cols=304
>               total: nonzeros=7354, allocated nonzeros=7354
>               total number of mallocs used during MatSetValues calls =0
>                 not using I-node routines
>         Up solver (post-smoother) same as down solver (pre-smoother)
>         Down solver (pre-smoother) on level 3
> -------------------------------
>           KSP Object:          (fieldsplit_0_mg_levels_3_)           1
> MPI processes
>             type: richardson
>               Richardson: damping factor=1.
>             maximum iterations=2
>             tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
>             left preconditioning
>             using nonzero initial guess
>             using NONE norm type for convergence test
>           PC Object:          (fieldsplit_0_mg_levels_3_)           1
> MPI processes
>             type: sor
>               SOR: type = local_symmetric, iterations = 1, local
> iterations = 1, omega = 1.
>             linear system matrix = precond matrix:
>             Mat Object:             1 MPI processes
>               type: seqaij
>               rows=30236, cols=30236
>               total: nonzeros=2730644, allocated nonzeros=2730644
>               total number of mallocs used during MatSetValues calls =0
>                 not using I-node routines
>         Up solver (post-smoother) same as down solver (pre-smoother)
>         Down solver (pre-smoother) on level 4
> -------------------------------
>           KSP Object:          (fieldsplit_0_mg_levels_4_)           1
> MPI processes
>             type: richardson
>               Richardson: damping factor=1.
>             maximum iterations=2
>             tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
>             left preconditioning
>             using nonzero initial guess
>             using NONE norm type for convergence test
>           PC Object:          (fieldsplit_0_mg_levels_4_)           1
> MPI processes
>             type: sor
>               SOR: type = local_symmetric, iterations = 1, local
> iterations = 1, omega = 1.
>             linear system matrix = precond matrix:
>             Mat Object:            (fieldsplit_0_)             1 MPI
> processes
>               type: seqaij
>               rows=894132, cols=894132
>               total: nonzeros=70684164, allocated nonzeros=70684164
>               total number of mallocs used during MatSetValues calls =0
>                 not using I-node routines
>         Up solver (post-smoother) same as down solver (pre-smoother)
>         linear system matrix = precond matrix:
>         Mat Object:        (fieldsplit_0_)         1 MPI processes
>           type: seqaij
>           rows=894132, cols=894132
>           total: nonzeros=70684164, allocated nonzeros=70684164
>           total number of mallocs used during MatSetValues calls =0
>             not using I-node routines
>     KSP solver for S = A11 - A10 inv(A00) A01
>       KSP Object:      (fieldsplit_1_)       1 MPI processes
>         type: preonly
>         maximum iterations=10000, initial guess is zero
>         tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
>         left preconditioning
>         using NONE norm type for convergence test
>       PC Object:      (fieldsplit_1_)       1 MPI processes
>         type: jacobi
>         linear system matrix followed by preconditioner matrix:
>         Mat Object:        (fieldsplit_1_)         1 MPI processes
>           type: schurcomplement
>           rows=42025, cols=42025
>             Schur complement A11 - A10 inv(A00) A01
>             A11
>               Mat Object:              (fieldsplit_1_)               1
> MPI processes
>                 type: seqaij
>                 rows=42025, cols=42025
>                 total: nonzeros=554063, allocated nonzeros=554063
>                 total number of mallocs used during MatSetValues calls =0
>                   not using I-node routines
>             A10
>               Mat Object:               1 MPI processes
>                 type: seqaij
>                 rows=42025, cols=894132
>                 total: nonzeros=6850107, allocated nonzeros=6850107
>                 total number of mallocs used during MatSetValues calls =0
>                   not using I-node routines
>             KSP of A00
>               KSP Object:              (fieldsplit_0_)               1
> MPI processes
>                 type: preonly
>                 maximum iterations=10000, initial guess is zero
>                 tolerances:  relative=1e-05, absolute=1e-50,
> divergence=10000.
>                 left preconditioning
>                 using NONE norm type for convergence test
>               PC Object:              (fieldsplit_0_)               1
> MPI processes
>                 type: ml
>                   MG: type is MULTIPLICATIVE, levels=5 cycles=v
>                     Cycles per PCApply=1
>                     Using Galerkin computed coarse grid matrices
>                 Coarse grid solver -- level ------------------------------
> -
>                   KSP Object:
> (fieldsplit_0_mg_coarse_)                   1 MPI processes
>                     type: preonly
>                     maximum iterations=10000, initial guess is zero
>                     tolerances:  relative=1e-05, absolute=1e-50,
> divergence=10000.
>                     left preconditioning
>                     using NONE norm type for convergence test
>                   PC Object:
> (fieldsplit_0_mg_coarse_)                   1 MPI processes
>                     type: lu
>                       LU: out-of-place factorization
>                       tolerance for zero pivot 2.22045e-14
>                       using diagonal shift on blocks to prevent zero
> pivot [INBLOCKS]
>                       matrix ordering: nd
>                       factor fill ratio given 5., needed 1.
>                         Factored matrix follows:
>                           Mat Object:                           1 MPI
> processes
>                             type: seqaij
>                             rows=3, cols=3
>                             package used to perform factorization: petsc
>                             total: nonzeros=3, allocated nonzeros=3
>                             total number of mallocs used during
> MatSetValues calls =0
>                               not using I-node routines
>                     linear system matrix = precond matrix:
>                     Mat Object:                     1 MPI processes
>                       type: seqaij
>                       rows=3, cols=3
>                       total: nonzeros=3, allocated nonzeros=3
>                       total number of mallocs used during MatSetValues
> calls =0
>                         not using I-node routines
>                 Down solver (pre-smoother) on level 1
> -------------------------------
>                   KSP Object:
> (fieldsplit_0_mg_levels_1_)                   1 MPI processes
>                     type: richardson
>                       Richardson: damping factor=1.
>                     maximum iterations=2
>                     tolerances:  relative=1e-05, absolute=1e-50,
> divergence=10000.
>                     left preconditioning
>                     using nonzero initial guess
>                     using NONE norm type for convergence test
>                   PC Object:
> (fieldsplit_0_mg_levels_1_)                   1 MPI processes
>                     type: sor
>                       SOR: type = local_symmetric, iterations = 1, local
> iterations = 1, omega = 1.
>                     linear system matrix = precond matrix:
>                     Mat Object:                     1 MPI processes
>                       type: seqaij
>                       rows=15, cols=15
>                       total: nonzeros=69, allocated nonzeros=69
>                       total number of mallocs used during MatSetValues
> calls =0
>                         not using I-node routines
>                 Up solver (post-smoother) same as down solver
> (pre-smoother)
>                 Down solver (pre-smoother) on level 2
> -------------------------------
>                   KSP Object:
> (fieldsplit_0_mg_levels_2_)                   1 MPI processes
>                     type: richardson
>                       Richardson: damping factor=1.
>                     maximum iterations=2
>                     tolerances:  relative=1e-05, absolute=1e-50,
> divergence=10000.
>                     left preconditioning
>                     using nonzero initial guess
>                     using NONE norm type for convergence test
>                   PC Object:
> (fieldsplit_0_mg_levels_2_)                   1 MPI processes
>                     type: sor
>                       SOR: type = local_symmetric, iterations = 1, local
> iterations = 1, omega = 1.
>                     linear system matrix = precond matrix:
>                     Mat Object:                     1 MPI processes
>                       type: seqaij
>                       rows=304, cols=304
>                       total: nonzeros=7354, allocated nonzeros=7354
>                       total number of mallocs used during MatSetValues
> calls =0
>                         not using I-node routines
>                 Up solver (post-smoother) same as down solver
> (pre-smoother)
>                 Down solver (pre-smoother) on level 3
> -------------------------------
>                   KSP Object:
> (fieldsplit_0_mg_levels_3_)                   1 MPI processes
>                     type: richardson
>                       Richardson: damping factor=1.
>                     maximum iterations=2
>                     tolerances:  relative=1e-05, absolute=1e-50,
> divergence=10000.
>                     left preconditioning
>                     using nonzero initial guess
>                     using NONE norm type for convergence test
>                   PC Object:
> (fieldsplit_0_mg_levels_3_)                   1 MPI processes
>                     type: sor
>                       SOR: type = local_symmetric, iterations = 1, local
> iterations = 1, omega = 1.
>                     linear system matrix = precond matrix:
>                     Mat Object:                     1 MPI processes
>                       type: seqaij
>                       rows=30236, cols=30236
>                       total: nonzeros=2730644, allocated nonzeros=2730644
>                       total number of mallocs used during MatSetValues
> calls =0
>                         not using I-node routines
>                 Up solver (post-smoother) same as down solver
> (pre-smoother)
>                 Down solver (pre-smoother) on level 4
> -------------------------------
>                   KSP Object:
> (fieldsplit_0_mg_levels_4_)                   1 MPI processes
>                     type: richardson
>                       Richardson: damping factor=1.
>                     maximum iterations=2
>                     tolerances:  relative=1e-05, absolute=1e-50,
> divergence=10000.
>                     left preconditioning
>                     using nonzero initial guess
>                     using NONE norm type for convergence test
>                   PC Object:
> (fieldsplit_0_mg_levels_4_)                   1 MPI processes
>                     type: sor
>                       SOR: type = local_symmetric, iterations = 1, local
> iterations = 1, omega = 1.
>                     linear system matrix = precond matrix:
>                     Mat Object:
> (fieldsplit_0_)                     1 MPI processes
>                       type: seqaij
>                       rows=894132, cols=894132
>                       total: nonzeros=70684164, allocated nonzeros=70684164
>                       total number of mallocs used during MatSetValues
> calls =0
>                         not using I-node routines
>                 Up solver (post-smoother) same as down solver
> (pre-smoother)
>                 linear system matrix = precond matrix:
>                 Mat Object:
> (fieldsplit_0_)                 1 MPI processes
>                   type: seqaij
>                   rows=894132, cols=894132
>                   total: nonzeros=70684164, allocated nonzeros=70684164
>                   total number of mallocs used during MatSetValues calls =0
>                     not using I-node routines
>             A01
>               Mat Object:               1 MPI processes
>                 type: seqaij
>                 rows=894132, cols=42025
>                 total: nonzeros=6850107, allocated nonzeros=6850107
>                 total number of mallocs used during MatSetValues calls =0
>                   not using I-node routines
>         Mat Object:         1 MPI processes
>           type: seqaij
>           rows=42025, cols=42025
>           total: nonzeros=554063, allocated nonzeros=554063
>           total number of mallocs used during MatSetValues calls =0
>             not using I-node routines
>   linear system matrix = precond matrix:
>   Mat Object:   1 MPI processes
>     type: seqaij
>     rows=936157, cols=936157
>     total: nonzeros=84938441, allocated nonzeros=84938441
>     total number of mallocs used during MatSetValues calls =0
>       not using I-node routines
>
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

http://www.caam.rice.edu/~mk51/