<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Sat, Jun 10, 2017 at 8:25 PM, David Nolte <span dir="ltr"><<a href="mailto:dnolte@dim.uchile.cl" target="_blank">dnolte@dim.uchile.cl</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Dear all,<br>
<br>
I am solving a Stokes problem on 3D aorta geometries, using a P2/P1<br>
finite element discretization on tetrahedral meshes, resulting in<br>
~1-1.5M DOFs. The viscosity is uniform (it can be adjusted arbitrarily),<br>
and the right-hand side is a function of noisy measurement data.<br>
<br>
For other, "standard" Stokes flow problems I have obtained good<br>
convergence with an "upper" Schur complement preconditioner, using AMG<br>
(ML or Hypre) on the velocity block and approximating the Schur<br>
complement by the diagonal of the pressure mass matrix:<br>
<br>
-ksp_converged_reason<br>
-ksp_monitor_true_residual<br>
-ksp_initial_guess_nonzero<br>
-ksp_diagonal_scale<br>
-ksp_diagonal_scale_fix<br>
-ksp_type fgmres<br>
-ksp_rtol 1.0e-8<br>
<br>
-pc_type fieldsplit<br>
-pc_fieldsplit_type schur<br>
-pc_fieldsplit_detect_saddle_point<br>
-pc_fieldsplit_schur_fact_type upper<br>
-pc_fieldsplit_schur_precondition user # <-- pressure mass matrix<br>
<br>
-fieldsplit_0_ksp_type preonly<br>
-fieldsplit_0_pc_type ml<br>
<br>
-fieldsplit_1_ksp_type preonly<br>
-fieldsplit_1_pc_type jacobi<br></blockquote><div><br></div><div>1) I always recommend starting from an exact solver and backing off in small steps for optimization. Thus</div><div>    I would start with LU on the upper (velocity) block and GMRES/LU with tolerance 1e-10 on the Schur block</div><div>    (see the options sketch below). This should converge in one iteration.</div><div><br></div><div>2) I don't think you want preonly on the Schur system. You might want GMRES/Jacobi to invert the mass matrix.</div><div><br></div><div>3) You probably want to tighten the tolerance on the Schur solve, at least to start, and then slowly relax it. The</div><div>    tight tolerance will show you how effective the preconditioner is using that Schur operator. Then you can start</div><div>    to evaluate how effective the Schur linear solver is.</div><div><br></div><div>Does this make sense?</div><div><br></div><div>  Thanks,</div><div><br></div><div>     Matt</div><div> </div>
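<div><br></div><div>Concretely, step 1 might look like the following (a sketch reusing your fieldsplit prefixes; since you give -pc_fieldsplit_schur_precondition user, the LU in the Schur split factors your pressure mass matrix):</div><div><br></div><div>-fieldsplit_0_ksp_type preonly<br>-fieldsplit_0_pc_type lu<br><br>-fieldsplit_1_ksp_type gmres<br>-fieldsplit_1_ksp_rtol 1.0e-10<br>-fieldsplit_1_pc_type lu<br></div><div><br></div><div>and for steps 2 and 3, an iterative mass-matrix solve whose tolerance you can later relax:</div><div><br></div><div>-fieldsplit_1_ksp_type gmres<br>-fieldsplit_1_ksp_rtol 1.0e-10<br>-fieldsplit_1_pc_type jacobi<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">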
In my present case this setup converges rather slowly (the iteration<br>
count varies with the geometry, between 200-500 for some cases and<br>
several thousands for others!). I obtain better convergence with<br>
"-pc_fieldsplit_schur_precondition selfp" and multigrid on S, via<br>
"-fieldsplit_1_pc_type ml" (I don't think this is optimal, though).<br>
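<br>
For reference, I attach the pressure mass matrix as the user-provided<br>
Schur preconditioner matrix roughly as in the sketch below ("Mp" and<br>
"SetupSchurPC" are illustrative names, not my exact code):<br>
<br>
#include <petscksp.h><br>
<br>
/* Sketch: Mp is the assembled pressure mass matrix. */<br>
static PetscErrorCode SetupSchurPC(KSP ksp, Mat Mp)<br>
{<br>
  PC             pc;<br>
  PetscErrorCode ierr;<br>
<br>
  PetscFunctionBeginUser;<br>
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);<br>
  ierr = PCSetType(pc, PCFIELDSPLIT);CHKERRQ(ierr);<br>
  ierr = PCFieldSplitSetType(pc, PC_COMPOSITE_SCHUR);CHKERRQ(ierr);<br>
  ierr = PCFieldSplitSetSchurFactType(pc, PC_FIELDSPLIT_SCHUR_FACT_UPPER);CHKERRQ(ierr);<br>
  /* Build the Schur preconditioner from the user matrix Mp;<br>
     -fieldsplit_1_pc_type jacobi then applies diag(Mp)^{-1}. */<br>
  ierr = PCFieldSplitSetSchurPre(pc, PC_FIELDSPLIT_SCHUR_PRE_USER, Mp);CHKERRQ(ierr);<br>
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);<br>
  PetscFunctionReturn(0);<br>
}<br>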
<br>
I don't understand why the pressure mass matrix approach performs so<br>
poorly here, and I wonder what I could try to improve the convergence.<br>
Until now I have used ML and Hypre BoomerAMG mostly with default<br>
parameters; surely they can be improved by tuning. What would be a good<br>
starting point? Are there other options I should consider?<br>
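<br>
For instance, would something along these lines be a sensible starting<br>
point for BoomerAMG on the velocity block? (These are untested guesses<br>
on my part; I have read that a strong threshold around 0.5-0.7 is often<br>
recommended for 3D problems.)<br>
<br>
-fieldsplit_0_pc_type hypre<br>
-fieldsplit_0_pc_hypre_type boomeramg<br>
-fieldsplit_0_pc_hypre_boomeramg_strong_threshold 0.7<br>
-fieldsplit_0_pc_hypre_boomeramg_coarsen_type HMIS<br>
-fieldsplit_0_pc_hypre_boomeramg_interp_type ext+i<br>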
<br>
With the above setup (Jacobi on the Schur block), for a case that<br>
converges better than others, the KSP terminates with<br>
467 KSP unpreconditioned resid norm 2.072014323515e-09 true resid norm 2.072014322600e-09 ||r(i)||/||b|| 9.939098100674e-09<br>
<br>
You can find the output of -ksp_view below. Let me know if you need more<br>
details.<br>
<br>
Thanks in advance for your advice!<br>
Best wishes<br>
David<br>
<br>
<br>
KSP Object: 1 MPI processes<br>
type: fgmres<br>
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt<br>
Orthogonalization with no iterative refinement<br>
GMRES: happy breakdown tolerance 1e-30<br>
maximum iterations=10000<br>
tolerances: relative=1e-08, absolute=1e-50, divergence=10000.<br>
right preconditioning<br>
diagonally scaled system<br>
using nonzero initial guess<br>
using UNPRECONDITIONED norm type for convergence test<br>
PC Object: 1 MPI processes<br>
type: fieldsplit<br>
FieldSplit with Schur preconditioner, factorization UPPER<br>
Preconditioner for the Schur complement formed from user provided matrix<br>
Split info:<br>
Split number 0 Defined by IS<br>
Split number 1 Defined by IS<br>
KSP solver for A00 block<br>
KSP Object: (fieldsplit_0_) 1 MPI processes<br>
type: preonly<br>
maximum iterations=10000, initial guess is zero<br>
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.<br>
left preconditioning<br>
using NONE norm type for convergence test<br>
PC Object: (fieldsplit_0_) 1 MPI processes<br>
type: ml<br>
MG: type is MULTIPLICATIVE, levels=5 cycles=v<br>
Cycles per PCApply=1<br>
Using Galerkin computed coarse grid matrices<br>
Coarse grid solver -- level -------------------------------<br>
KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI<br>
processes<br>
type: preonly<br>
maximum iterations=10000, initial guess is zero<br>
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.<br>
left preconditioning<br>
using NONE norm type for convergence test<br>
PC Object: (fieldsplit_0_mg_coarse_) 1 MPI<br>
processes<br>
type: lu<br>
LU: out-of-place factorization<br>
tolerance for zero pivot 2.22045e-14<br>
using diagonal shift on blocks to prevent zero pivot<br>
[INBLOCKS]<br>
matrix ordering: nd<br>
factor fill ratio given 5., needed 1.<br>
Factored matrix follows:<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
rows=3, cols=3<br>
package used to perform factorization: petsc<br>
total: nonzeros=3, allocated nonzeros=3<br>
total number of mallocs used during MatSetValues<br>
calls =0<br>
not using I-node routines<br>
linear system matrix = precond matrix:<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
rows=3, cols=3<br>
total: nonzeros=3, allocated nonzeros=3<br>
total number of mallocs used during MatSetValues calls =0<br>
not using I-node routines<br>
Down solver (pre-smoother) on level 1<br>
-------------------------------<br>
KSP Object: (fieldsplit_0_mg_levels_1_) 1<br>
MPI processes<br>
type: richardson<br>
Richardson: damping factor=1.<br>
maximum iterations=2<br>
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.<br>
left preconditioning<br>
using nonzero initial guess<br>
using NONE norm type for convergence test<br>
PC Object: (fieldsplit_0_mg_levels_1_) 1<br>
MPI processes<br>
type: sor<br>
SOR: type = local_symmetric, iterations = 1, local<br>
iterations = 1, omega = 1.<br>
linear system matrix = precond matrix:<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
rows=15, cols=15<br>
total: nonzeros=69, allocated nonzeros=69<br>
total number of mallocs used during MatSetValues calls =0<br>
not using I-node routines<br>
Up solver (post-smoother) same as down solver (pre-smoother)<br>
Down solver (pre-smoother) on level 2<br>
-------------------------------<br>
KSP Object: (fieldsplit_0_mg_levels_2_) 1<br>
MPI processes<br>
type: richardson<br>
Richardson: damping factor=1.<br>
maximum iterations=2<br>
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.<br>
left preconditioning<br>
using nonzero initial guess<br>
using NONE norm type for convergence test<br>
PC Object: (fieldsplit_0_mg_levels_2_) 1<br>
MPI processes<br>
type: sor<br>
SOR: type = local_symmetric, iterations = 1, local<br>
iterations = 1, omega = 1.<br>
linear system matrix = precond matrix:<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
rows=304, cols=304<br>
total: nonzeros=7354, allocated nonzeros=7354<br>
total number of mallocs used during MatSetValues calls =0<br>
not using I-node routines<br>
Up solver (post-smoother) same as down solver (pre-smoother)<br>
Down solver (pre-smoother) on level 3<br>
-------------------------------<br>
KSP Object: (fieldsplit_0_mg_levels_3_) 1<br>
MPI processes<br>
type: richardson<br>
Richardson: damping factor=1.<br>
maximum iterations=2<br>
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.<br>
left preconditioning<br>
using nonzero initial guess<br>
using NONE norm type for convergence test<br>
PC Object: (fieldsplit_0_mg_levels_3_) 1<br>
MPI processes<br>
type: sor<br>
SOR: type = local_symmetric, iterations = 1, local<br>
iterations = 1, omega = 1.<br>
linear system matrix = precond matrix:<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
rows=30236, cols=30236<br>
total: nonzeros=2730644, allocated nonzeros=2730644<br>
total number of mallocs used during MatSetValues calls =0<br>
not using I-node routines<br>
Up solver (post-smoother) same as down solver (pre-smoother)<br>
Down solver (pre-smoother) on level 4<br>
-------------------------------<br>
KSP Object: (fieldsplit_0_mg_levels_4_) 1<br>
MPI processes<br>
type: richardson<br>
Richardson: damping factor=1.<br>
maximum iterations=2<br>
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.<br>
left preconditioning<br>
using nonzero initial guess<br>
using NONE norm type for convergence test<br>
PC Object: (fieldsplit_0_mg_levels_4_) 1<br>
MPI processes<br>
type: sor<br>
SOR: type = local_symmetric, iterations = 1, local<br>
iterations = 1, omega = 1.<br>
linear system matrix = precond matrix:<br>
Mat Object: (fieldsplit_0_) 1 MPI<br>
processes<br>
type: seqaij<br>
rows=894132, cols=894132<br>
total: nonzeros=70684164, allocated nonzeros=70684164<br>
total number of mallocs used during MatSetValues calls =0<br>
not using I-node routines<br>
Up solver (post-smoother) same as down solver (pre-smoother)<br>
linear system matrix = precond matrix:<br>
Mat Object: (fieldsplit_0_) 1 MPI processes<br>
type: seqaij<br>
rows=894132, cols=894132<br>
total: nonzeros=70684164, allocated nonzeros=70684164<br>
total number of mallocs used during MatSetValues calls =0<br>
not using I-node routines<br>
KSP solver for S = A11 - A10 inv(A00) A01<br>
KSP Object: (fieldsplit_1_) 1 MPI processes<br>
type: preonly<br>
maximum iterations=10000, initial guess is zero<br>
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.<br>
left preconditioning<br>
using NONE norm type for convergence test<br>
PC Object: (fieldsplit_1_) 1 MPI processes<br>
type: jacobi<br>
linear system matrix followed by preconditioner matrix:<br>
Mat Object: (fieldsplit_1_) 1 MPI processes<br>
type: schurcomplement<br>
rows=42025, cols=42025<br>
Schur complement A11 - A10 inv(A00) A01<br>
A11<br>
Mat Object: (fieldsplit_1_) 1<br>
MPI processes<br>
type: seqaij<br>
rows=42025, cols=42025<br>
total: nonzeros=554063, allocated nonzeros=554063<br>
total number of mallocs used during MatSetValues calls =0<br>
not using I-node routines<br>
A10<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
rows=42025, cols=894132<br>
total: nonzeros=6850107, allocated nonzeros=6850107<br>
total number of mallocs used during MatSetValues calls =0<br>
not using I-node routines<br>
KSP of A00<br>
KSP Object: (fieldsplit_0_) 1<br>
MPI processes<br>
type: preonly<br>
maximum iterations=10000, initial guess is zero<br>
tolerances: relative=1e-05, absolute=1e-50,<br>
divergence=10000.<br>
left preconditioning<br>
using NONE norm type for convergence test<br>
PC Object: (fieldsplit_0_) 1<br>
MPI processes<br>
type: ml<br>
MG: type is MULTIPLICATIVE, levels=5 cycles=v<br>
Cycles per PCApply=1<br>
Using Galerkin computed coarse grid matrices<br>
Coarse grid solver -- level -------------------------------<br>
KSP Object:<br>
(fieldsplit_0_mg_coarse_) 1 MPI processes<br>
type: preonly<br>
maximum iterations=10000, initial guess is zero<br>
tolerances: relative=1e-05, absolute=1e-50,<br>
divergence=10000.<br>
left preconditioning<br>
using NONE norm type for convergence test<br>
PC Object:<br>
(fieldsplit_0_mg_coarse_) 1 MPI processes<br>
type: lu<br>
LU: out-of-place factorization<br>
tolerance for zero pivot 2.22045e-14<br>
using diagonal shift on blocks to prevent zero<br>
pivot [INBLOCKS]<br>
matrix ordering: nd<br>
factor fill ratio given 5., needed 1.<br>
Factored matrix follows:<br>
Mat Object: 1 MPI<br>
processes<br>
type: seqaij<br>
rows=3, cols=3<br>
package used to perform factorization: petsc<br>
total: nonzeros=3, allocated nonzeros=3<br>
total number of mallocs used during<br>
MatSetValues calls =0<br>
not using I-node routines<br>
linear system matrix = precond matrix:<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
rows=3, cols=3<br>
total: nonzeros=3, allocated nonzeros=3<br>
total number of mallocs used during MatSetValues<br>
calls =0<br>
not using I-node routines<br>
Down solver (pre-smoother) on level 1<br>
-------------------------------<br>
KSP Object:<br>
(fieldsplit_0_mg_levels_1_) 1 MPI processes<br>
type: richardson<br>
Richardson: damping factor=1.<br>
maximum iterations=2<br>
tolerances: relative=1e-05, absolute=1e-50,<br>
divergence=10000.<br>
left preconditioning<br>
using nonzero initial guess<br>
using NONE norm type for convergence test<br>
PC Object:<br>
(fieldsplit_0_mg_levels_1_) 1 MPI processes<br>
type: sor<br>
SOR: type = local_symmetric, iterations = 1, local<br>
iterations = 1, omega = 1.<br>
linear system matrix = precond matrix:<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
rows=15, cols=15<br>
total: nonzeros=69, allocated nonzeros=69<br>
total number of mallocs used during MatSetValues<br>
calls =0<br>
not using I-node routines<br>
Up solver (post-smoother) same as down solver (pre-smoother)<br>
Down solver (pre-smoother) on level 2<br>
-------------------------------<br>
KSP Object:<br>
(fieldsplit_0_mg_levels_2_) 1 MPI processes<br>
type: richardson<br>
Richardson: damping factor=1.<br>
maximum iterations=2<br>
tolerances: relative=1e-05, absolute=1e-50,<br>
divergence=10000.<br>
left preconditioning<br>
using nonzero initial guess<br>
using NONE norm type for convergence test<br>
PC Object:<br>
(fieldsplit_0_mg_levels_2_) 1 MPI processes<br>
type: sor<br>
SOR: type = local_symmetric, iterations = 1, local<br>
iterations = 1, omega = 1.<br>
linear system matrix = precond matrix:<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
rows=304, cols=304<br>
total: nonzeros=7354, allocated nonzeros=7354<br>
total number of mallocs used during MatSetValues<br>
calls =0<br>
not using I-node routines<br>
Up solver (post-smoother) same as down solver (pre-smoother)<br>
Down solver (pre-smoother) on level 3<br>
-------------------------------<br>
KSP Object:<br>
(fieldsplit_0_mg_levels_3_) 1 MPI processes<br>
type: richardson<br>
Richardson: damping factor=1.<br>
maximum iterations=2<br>
tolerances: relative=1e-05, absolute=1e-50,<br>
divergence=10000.<br>
left preconditioning<br>
using nonzero initial guess<br>
using NONE norm type for convergence test<br>
PC Object:<br>
(fieldsplit_0_mg_levels_3_) 1 MPI processes<br>
type: sor<br>
SOR: type = local_symmetric, iterations = 1, local<br>
iterations = 1, omega = 1.<br>
linear system matrix = precond matrix:<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
rows=30236, cols=30236<br>
total: nonzeros=2730644, allocated nonzeros=2730644<br>
total number of mallocs used during MatSetValues<br>
calls =0<br>
not using I-node routines<br>
Up solver (post-smoother) same as down solver (pre-smoother)<br>
Down solver (pre-smoother) on level 4<br>
-------------------------------<br>
KSP Object:<br>
(fieldsplit_0_mg_levels_4_) 1 MPI processes<br>
type: richardson<br>
Richardson: damping factor=1.<br>
maximum iterations=2<br>
tolerances: relative=1e-05, absolute=1e-50,<br>
divergence=10000.<br>
left preconditioning<br>
using nonzero initial guess<br>
using NONE norm type for convergence test<br>
PC Object:<br>
(fieldsplit_0_mg_levels_4_) 1 MPI processes<br>
type: sor<br>
SOR: type = local_symmetric, iterations = 1, local<br>
iterations = 1, omega = 1.<br>
linear system matrix = precond matrix:<br>
Mat Object:<br>
(fieldsplit_0_) 1 MPI processes<br>
type: seqaij<br>
rows=894132, cols=894132<br>
total: nonzeros=70684164, allocated nonzeros=70684164<br>
total number of mallocs used during MatSetValues<br>
calls =0<br>
not using I-node routines<br>
Up solver (post-smoother) same as down solver (pre-smoother)<br>
linear system matrix = precond matrix:<br>
Mat Object:<br>
(fieldsplit_0_) 1 MPI processes<br>
type: seqaij<br>
rows=894132, cols=894132<br>
total: nonzeros=70684164, allocated nonzeros=70684164<br>
total number of mallocs used during MatSetValues calls =0<br>
not using I-node routines<br>
A01<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
rows=894132, cols=42025<br>
total: nonzeros=6850107, allocated nonzeros=6850107<br>
total number of mallocs used during MatSetValues calls =0<br>
not using I-node routines<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
rows=42025, cols=42025<br>
total: nonzeros=554063, allocated nonzeros=554063<br>
total number of mallocs used during MatSetValues calls =0<br>
not using I-node routines<br>
linear system matrix = precond matrix:<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
rows=936157, cols=936157<br>
total: nonzeros=84938441, allocated nonzeros=84938441<br>
total number of mallocs used during MatSetValues calls =0<br>
not using I-node routines<br>
<br>
<br>
</blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>-- Norbert Wiener</div><div><br></div><div><a href="http://www.caam.rice.edu/~mk51/" target="_blank">http://www.caam.rice.edu/~mk51/</a><br></div></div></div>
</div></div>