<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <br>
    <br>
    <div class="moz-cite-prefix">On 06/14/2017 02:26 PM, Dave May wrote:<br>
    </div>
    <blockquote
cite="mid:CAJ98EDptgOtc6aSEKdbvDe=mS-9ijpCZ8mMgg=z+wD8--RyGGg@mail.gmail.com"
      type="cite">
      <div><br>
        <div class="gmail_quote">
          <div>On Wed, 14 Jun 2017 at 19:42, David Nolte <<a
              moz-do-not-send="true" href="mailto:dnolte@dim.uchile.cl">dnolte@dim.uchile.cl</a>>
            wrote:<br>
          </div>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div bgcolor="#FFFFFF" text="#000000"> Dave, thanks a lot
              for your great answer and for sharing your experience. I
              have a much clearer picture now. :)<br>
              <br>
The experiments from 3/ give the desired results for the
cavity flow examples. The (1/mu scaled) mass matrix seems OK.<br>
              <br>
I followed your and Matt's recommendations: I used a FULL
Schur factorization with LU in the 0th split, and gradually
relaxed the tolerance of GMRES/Jacobi in split 1 (observing
the gradual increase in outer iterations). Then I replaced
the split_0 LU with AMG (a further increase of outer
iterations and of iterations on the Schur complement). <br>
Doing so, I converged on using hypre boomeramg (smooth_type
Euclid, strong_threshold 0.75) and 3 iterations of
GMRES/Jacobi on the Schur block, which gave the best
time-to-solution in my particular setup and convergence to
rtol=1e-8 within 60 outer iterations.<br>
In my cases, using GMRES in the 0th split (with rtol 1e-1
or 1e-2) instead of "preonly" did not help convergence (on
the contrary).<br>
              <br>
I also repeated the experiments with
"-pc_fieldsplit_schur_precondition selfp", with hypre(ilu)
in split 0 and hypre in split 1, just to check, and
somewhat disappointingly ( ;-) ) the wall time is less
than half of what it is when using GMRES/Jacobi and Sp = mass matrix.<br>
I am aware that this says nothing about scaling and
robustness with respect to h-refinement...</div>
          </blockquote>
          <div><br>
          </div>
<div>- selfp defines the Schur PC as A10 inv(diag(A00)) A01,
            i.e. the inv(A00) appearing in the true Schur complement
            S = A11 - A10 inv(A00) A01 is replaced by inv(diag(A00)).
            This operator is not spectrally equivalent to S.</div>
          <div><br>
          </div>
          <div>- For split 0 did you use preonly-hypre(ilu)?</div>
          <div><br>
          </div>
          <div>- For split 1 did you also use hypre(ilu) (you just wrote
            hypre)?</div>
          <div><br>
          </div>
<div>- What was the iteration count for the saddle point
            problem with hypre and selfp? Iterates will increase as you
            refine the mesh; a crossover will occur at some (unknown)
            resolution, beyond which the mass matrix variant will be
            faster.</div>
        </div>
      </div>
    </blockquote>
    <br>
    Ok, this makes sense.<br>
    <br>
Split 1 uses hypre with the default smoother (Schwarz smoothers); the
setup is:<br>
    <br>
       <tt>  -pc_type fieldsplit</tt><tt><br>
    </tt><tt>    -pc_fieldsplit_type schur</tt><tt><br>
    </tt><tt>    -pc_fieldsplit_detect_saddle_point</tt><tt><br>
    </tt><tt>    -pc_fieldsplit_schur_fact_type full</tt><tt><br>
    </tt><tt>    -pc_fieldsplit_schur_precondition selfp</tt><tt><br>
    </tt><tt><br>
    </tt><tt>    -fieldsplit_0_ksp_type richardson</tt><tt><br>
    </tt><tt>    -fieldsplit_0_ksp_max_it 1</tt><tt><br>
    </tt><tt>    -fieldsplit_0_pc_type hypre</tt><tt><br>
    </tt><tt>    -fieldsplit_0_pc_hypre_type boomeramg</tt><tt><br>
    </tt><tt>    -fieldsplit_0_pc_hypre_boomeramg_strong_threshold 0.75</tt><tt><br>
    </tt><tt>    -fieldsplit_0_pc_hypre_boomeramg_smooth_type Euclid</tt><tt><br>
    </tt><tt>    -fieldsplit_0_pc_hypre_boomeramg_eu_bj</tt><tt><br>
    </tt><tt><br>
    </tt><tt>    -fieldsplit_1_ksp_type richardson</tt><tt><br>
    </tt><tt>    -fieldsplit_1_ksp_max_it 1</tt><tt><br>
    </tt><tt>    -fieldsplit_1_pc_type hypre</tt><tt><br>
    </tt><tt>    -fieldsplit_1_pc_hypre_type boomeramg</tt><tt><br>
    </tt><tt>    -fieldsplit_1_pc_hypre_boomeramg_strong_threshold 0.75</tt><br>
    <br>
In two different cases the iteration counts were 90 and 113, while the
mass matrix variant (GMRES/Jacobi iterations on the Schur
complement) took 56 and 59.<br>
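<br>
For reference, the mass matrix variant corresponds roughly to the
following setup (a sketch reconstructed from this thread; the exact
options used may have differed):<br>
<br>
<tt>    -pc_type fieldsplit</tt><br>
<tt>    -pc_fieldsplit_type schur</tt><br>
<tt>    -pc_fieldsplit_schur_fact_type full</tt><br>
<tt>    -pc_fieldsplit_schur_precondition user  # Sp = (1/mu scaled) pressure mass matrix, set in code</tt><br>
<tt>    -fieldsplit_0_ksp_type preonly</tt><br>
<tt>    -fieldsplit_0_pc_type hypre</tt><br>
<tt>    -fieldsplit_0_pc_hypre_type boomeramg</tt><br>
<tt>    -fieldsplit_1_ksp_type gmres</tt><br>
<tt>    -fieldsplit_1_ksp_max_it 3</tt><br>
<tt>    -fieldsplit_1_pc_type jacobi</tt><br>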
    <br>
    <br>
    <blockquote
cite="mid:CAJ98EDptgOtc6aSEKdbvDe=mS-9ijpCZ8mMgg=z+wD8--RyGGg@mail.gmail.com"
      type="cite">
      <div>
        <div class="gmail_quote">
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div bgcolor="#FFFFFF" text="#000000"> Would you agree that
              these configurations "make sense"?</div>
          </blockquote>
          <div><br>
          </div>
          <div>If you want to weak scale, the configuration with the
            mass matrix makes the most sense.</div>
          <div><br>
          </div>
<div>If you are only interested in solving many problems on
            one mesh, then do whatever you can to make the solve time
            as fast as possible - including using preconditioners
            defined with non-spectrally equivalent operators :D</div>
          <div><br>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
I see. That's exactly my case: many problems on one mesh (the meshes
are generated from medical images at fixed resolution). The
hypre/selfp variant is 2-3x faster, so I'll just stick with that for
the moment and try tuning the hypre parameters.<br>
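<br>
As a starting point for the hypre tuning, commonly adjusted BoomerAMG
options are sketched below (the values are illustrative only, not
results from this thread):<br>
<br>
<tt>    -fieldsplit_0_pc_hypre_boomeramg_strong_threshold 0.75  # default 0.25; larger is often better in 3D</tt><br>
<tt>    -fieldsplit_0_pc_hypre_boomeramg_coarsen_type HMIS</tt><br>
<tt>    -fieldsplit_0_pc_hypre_boomeramg_interp_type ext+i</tt><br>
<tt>    -fieldsplit_0_pc_hypre_boomeramg_agg_nl 1</tt><br>
<tt>    -fieldsplit_0_pc_hypre_boomeramg_P_max 4</tt><br>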
    <br>
    Thanks again!<br>
    <br>
    <br>
    <br>
    <blockquote
cite="mid:CAJ98EDptgOtc6aSEKdbvDe=mS-9ijpCZ8mMgg=z+wD8--RyGGg@mail.gmail.com"
      type="cite">
      <div>
        <div class="gmail_quote">
          <div>Thanks,</div>
          <div>  Dave</div>
          <div><br>
          </div>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div bgcolor="#FFFFFF" text="#000000"><br>
Furthermore, does anyone have a hint on where to start tuning
              multigrid? So far hypre has worked better than ML, but I
              have not experimented much with the parameters.</div>
          </blockquote>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div bgcolor="#FFFFFF" text="#000000"><br>
              <br>
              <br>
              Thanks again for your help!<br>
              <br>
              Best wishes,<br>
              David</div>
            <div bgcolor="#FFFFFF" text="#000000"><br>
              <br>
              <br>
              <br>
              <div class="m_-7814857284138451345moz-cite-prefix">On
                06/12/2017 04:52 PM, Dave May wrote:<br>
              </div>
              <blockquote type="cite">
                <div>I've been following the discussion and have a
                  couple of comments:
                  <div><br>
                  </div>
<div>1/ For the preconditioners that you are using
                    (Schur factorisation LDU, or upper block triangular
                    DU), the convergence properties (e.g. 1 iterate for
                    LDU and 2 iterates for DU) come from analysis
                    involving exact inverses of A_00 and S.</div>
                  <div><br>
                  </div>
                  <div>Once you switch from using exact inverses of A_00
                    and S, you have to rely on spectral equivalence of
                    operators. That is fine, but the spectral
                    equivalence does not tell you how many iterates LDU
or DU will require to converge. What it does tell
                    you is that if you have spectrally
                    equivalent operators for A_00 and S (the Schur
                    complement), then under mesh refinement your
                    iteration count (whatever it was prior to
                    refinement) will not increase.<br>
                  </div>
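<div><br>
                  </div>
                  <div>For reference (standard notation, not from the
                    original message): the factorization in question is
                    A = L D U with L = [I, 0; A10 inv(A00), I],
                    D = [A00, 0; 0, S] and U = [I, inv(A00) A01; 0, I],
                    where S = A11 - A10 inv(A00) A01 is the Schur
                    complement; DU corresponds to dropping the L factor.</div>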
                  <div><br>
                  </div>
<div>2/ Looking at your first set of options, I see
                    you have opted to use -fieldsplit_ksp_type preonly
                    (for both split 0 and 1). That is nice, as it keeps
                    the preconditioner a linear operator, so you don't
                    need something like FGMRES or GCR applied to the
                    saddle point problem.</div>
                  <div><br>
                  </div>
<div>Your choice for Schur is fine in the sense that
                    the diagonal of M is spectrally equivalent to M, and
                    M is spectrally equivalent to S. Whether it is
                    "fine" in terms of the iteration count for Schur
                    systems, we cannot say a priori (since the spectral
                    equivalence doesn't give us direct information about
                    the iterations we should expect).</div>
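<div><br>
                  </div>
                  <div>In symbols, with "~" denoting spectral
                    equivalence (h-independent constants): diag(M) ~ M
                    and, for constant viscosity mu, (1/mu) M ~ S (a
                    standard result, stated here for reference).</div>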
                  <div><br>
                  </div>
<div>Your preconditioner for A_00 relies on AMG
                    producing a spectrally equivalent operator with
                    bounds which are tight enough to ensure convergence
                    of the saddle point problem. I'll try to explain this.</div>
                  <div><br>
                  </div>
                  <div>In my experience, for many problems (unstructured
                    FE with variable coefficients, structured FE meshes
                    with variable coefficients) AMG and preonly is not a
                    robust choice. To control the approximation (the
                    spectral equiv bounds), I typically run a stationary
or Krylov method on split 0 (e.g.
                    -fieldsplit_0_ksp_type xxx -fieldsplit_0_ksp_rtol
                    yyy). Since the AMG preconditioner generated is
                    spectrally equivalent (usually!), these solves will
                    converge to a chosen rtol in a constant number of
                    iterates under h-refinement. In practice, if I don't
                    enforce that I hit something like rtol=1.0e-1 (or
                    1.0e-2) on the 0th split, saddle point iterates will
                    typically increase for "hard" problems under mesh
                    refinement (1e4-1e7 coefficient variation), and may
                    not even converge at all when just using
                    -fieldsplit_0_ksp_type preonly. Failure ultimately
                    depends on how "strong" the preconditioner for A_00
                    block is (consider re-discretized geometric
                    multigrid versus AMG). Running an iterative solve on
                    the 0th split lets you control and recover from
                    weak/poor, but spectrally equivalent preconditioners
                    for A_00. Note that people hate this approach as it
                    invariably nests Krylov methods, and subsequently
                    adds more global reductions. However, it is
                    scalable, optimal, tuneable and converges faster
                    than the case which didn't converge at all :D</div>
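<div><br>
                  </div>
                  <div>A minimal sketch of such a setup (illustrative
                    values; with a Krylov method inside a split, the
                    outer method must be a flexible variant such as
                    FGMRES or GCR):<br>
                    <tt>-ksp_type fgmres</tt><br>
                    <tt>-fieldsplit_0_ksp_type gmres</tt><br>
                    <tt>-fieldsplit_0_ksp_rtol 1.0e-2</tt><br>
                    <tt>-fieldsplit_0_pc_type hypre</tt></div>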
                  <div><br>
                  </div>
                  <div>3/ I agree with Matt's comments, but I'd do a
                    couple of other things first.</div>
                  <div><br>
                  </div>
<div>* I'd first check that the discretisation is
                    implemented correctly. Your P2/P1 element is inf-sup
                    stable, thus the condition number of S
                    (unpreconditioned) should be independent of the mesh
                    resolution (h). An easy way to verify this is to run
                    either LDU (schur_fact_type full) or DU
                    (schur_fact_type upper) and monitor the iterations
                    required for those S solves. Use
                    -fieldsplit_1_pc_type none -fieldsplit_1_ksp_rtol
                    1.0e-8 -fieldsplit_1_ksp_monitor_true_residual
                    -fieldsplit_1_ksp_pc_side right -fieldsplit_1_ksp_type
                    gmres -fieldsplit_0_pc_type lu</div>
                  <div><br>
                  </div>
                  <div>Then refine the mesh (ideally via sub-division)
                    and repeat the experiment.</div>
<div>If the S iterates don't asymptote but instead
                    grow with each refinement, you likely have a
                    problem with the discretisation.</div>
                  <div><br>
                  </div>
<div>* Do the same experiment, but this time use your
                    mass matrix as the preconditioner for S, with
                    -fieldsplit_1_pc_type lu. If the iterates, compared
                    with the previous experiment (without a Schur PC),
                    have gone up, your mass matrix is not defined
                    correctly. If in the previous experiment (without a
                    Schur PC) the iterates on the S solves were bounded,
                    but now, preconditioned with the mass matrix, they
                    go up, then your mass matrix is definitely
                    not correct.</div>
                  <div><br>
                  </div>
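<div>Taken together, the two experiments correspond
                    roughly to the following options (reconstructed from
                    the description above, not from an actual run):<br>
                    <tt>-pc_fieldsplit_schur_fact_type full</tt><br>
                    <tt>-fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type lu</tt><br>
                    <tt>-fieldsplit_1_ksp_type gmres -fieldsplit_1_ksp_pc_side right</tt><br>
                    <tt>-fieldsplit_1_ksp_rtol 1.0e-8 -fieldsplit_1_ksp_monitor_true_residual</tt><br>
                    <tt>-fieldsplit_1_pc_type none  # experiment 1: no Schur PC</tt><br>
                    <tt>-fieldsplit_1_pc_type lu    # experiment 2: replaces the line above; Sp = mass matrix via -pc_fieldsplit_schur_precondition user</tt></div>
                  <div><br>
                  </div>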
<div>4/ To finally get to your question: do 400+
                    iterates for solving the Schur complement seem
                    "reasonable", and what is "normal behaviour"?</div>
                  <div><br>
                  </div>
                  <div>It seems "high" to me. However the specifics of
                    your discretisation, mesh topology, element quality,
                    boundary conditions render it almost impossible to
                    say what should be expected. When I use a Q2-P2*
                    discretisation on a structured mesh with a
                    non-constant viscosity I'd expect something like
                    50-60 for 1.0e-10 with a mass matrix scaled by the
                    inverse (local) viscosity. For constant viscosity
                    maybe 30 iterates. I think this kind of statement is
                    not particularly useful or helpful though.</div>
                  <div><br>
                  </div>
                  <div>
<div>Given that you use an unstructured tet mesh, it is
                      possible that some elements have very bad quality
                      (high aspect ratio (AR), highly skewed). I am
                      certain that P2/P1 has an inf-sup constant which
                      is sensitive to the element aspect ratio (I don't
                      recall the exact scaling wrt AR). From experience
                      I know that using the mass matrix as a
                      preconditioner for Schur is not robust as AR
                      increases (e.g. iterations for the S solve grow).
                      Hence, with a couple of "bad" elements in your
                      mesh, I could imagine that you could end up having
                      to perform 400+ iterations.</div>
                  </div>
                  <div><br>
                  </div>
<div>5/ Lastly, definitely don't impose a Dirichlet
                    BC on the pressure at one point to make the pressure
                    unique. This really screws up all the nice properties
                    of your matrices. Just enforce the constant null
                    space for p. And as you noticed, GMRES magically
                    handles it automatically if the RHS of your original
                    system was consistent.</div>
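<div><br>
                  </div>
                  <div>A minimal sketch (C) of attaching the constant
                    pressure null space when using PCFieldSplit; it
                    assumes <tt>is_p</tt> is the index set of the
                    pressure dofs and that your PETSc version picks up a
                    null space composed on that IS:<br>
                    <tt>MatNullSpace nsp;</tt><br>
                    <tt>/* null space spanned by the constant vector */</tt><br>
                    <tt>MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_TRUE, 0, NULL, &amp;nsp);</tt><br>
                    <tt>/* PCFieldSplit queries "nullspace" on the pressure IS */</tt><br>
                    <tt>PetscObjectCompose((PetscObject)is_p, "nullspace", (PetscObject)nsp);</tt><br>
                    <tt>MatNullSpaceDestroy(&amp;nsp);</tt></div>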
                  <div> </div>
                  <div>Thanks,</div>
                  <div>  Dave</div>
                  <div><br>
                  </div>
                  <div class="gmail_extra"><br>
                    <div class="gmail_quote">On 12 June 2017 at 20:20,
                      David Nolte <span><<a moz-do-not-send="true"
                          href="mailto:dnolte@dim.uchile.cl"
                          target="_blank">dnolte@dim.uchile.cl</a>></span>
                      wrote:<br>
                      <blockquote class="gmail_quote" style="margin:0px
                        0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
                        <div bgcolor="#FFFFFF"> Ok. With <tt>"-pc_fieldsplit_schur_fact_type
                            full" </tt>the outer iteration converges in
1 step. The problem remains the Schur
                          iterations.<br>
                          <br>
I was not sure if the problem was the
                          singular pressure or the pressure Dirichlet
                          BC. I tested the solver with a standard Stokes
                          flow in a pipe with a constriction (zero
                          Neumann BC for the pressure at the outlet) and
                          in a 3D cavity (enclosed flow, with no pressure
                          BC or the pressure fixed at one point). I am
                          not sure if I need to attach the constant
                          pressure nullspace to the matrix for GMRES.
                          Not doing so does not alter the convergence of
                          GMRES in the Schur solver (nor the pressure
                          solution); using a pressure Dirichlet BC,
                          however, slows down convergence (I suppose
                          because of the scaling of the matrix).<br>
                          <br>
I also checked the pressure mass matrix that I
                          give PETSc; it looks correct.<br>
                          <br>
                          In all these cases, the solver behaves just as
                          before. With LU in fieldsplit_0 and GMRES/LU
                          with rtol 1e-10 in fieldsplit_1, it converges
                          after 1 outer iteration, but the inner Schur
                          solver converges slowly. <br>
                          <br>
                          How should the convergence of GMRES/LU of the
                          Schur complement *normally* behave?<br>
                          <br>
                          Thanks again!<span
class="m_-7814857284138451345gmail-m_2691972541491180255gmail-m_1522616294910952114HOEnZb"><font
                              color="#888888"><br>
                              David</font></span>
                          <div>
                            <div
class="m_-7814857284138451345gmail-m_2691972541491180255gmail-m_1522616294910952114h5"><br>
                              <br>
                              <br>
                              <br>
                              <div
class="m_-7814857284138451345gmail-m_2691972541491180255gmail-m_1522616294910952114m_-1125133874872333755moz-cite-prefix">On
                                06/12/2017 12:41 PM, Matthew Knepley
                                wrote:<br>
                              </div>
                              <blockquote type="cite">
                                <div>
                                  <div class="gmail_extra">
                                    <div class="gmail_quote">On Mon, Jun
                                      12, 2017 at 10:36 AM, David Nolte
                                      <span><<a
                                          moz-do-not-send="true"
                                          href="mailto:dnolte@dim.uchile.cl"
                                          target="_blank">dnolte@dim.uchile.cl</a>></span>
                                      wrote:<br>
                                      <blockquote class="gmail_quote"
                                        style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
                                        <div bgcolor="#FFFFFF"> <br>
                                          <div
class="m_-7814857284138451345gmail-m_2691972541491180255gmail-m_1522616294910952114m_-1125133874872333755m_4366232618162032171moz-cite-prefix">On
                                            06/12/2017 07:50 AM, Matthew
                                            Knepley wrote:<br>
                                          </div>
                                          <blockquote type="cite">
                                            <div>
                                              <div class="gmail_extra">
                                                <div class="gmail_quote">On
                                                  Sun, Jun 11, 2017 at
                                                  11:06 PM, David Nolte
                                                  <span><<a
                                                      moz-do-not-send="true"
href="mailto:dnolte@dim.uchile.cl" target="_blank">dnolte@dim.uchile.cl</a>></span>
                                                  wrote:<br>
                                                  <blockquote
                                                    class="gmail_quote"
                                                    style="margin:0px
                                                    0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
                                                    <div
                                                      bgcolor="#FFFFFF">
                                                      Thanks Matt, makes
                                                      sense to me!<br>
                                                      <br>
I skipped direct solvers at first because for
                                      these 'real' configurations LU
                                      (mumps/superlu_dist) usually goes
                                      out of memory (I have 32GB RAM).
                                      It would be reasonable to take one
                                      more step back and play with
                                      synthetic examples.<br>
                                      I managed to run one case though,
                                      with 936k dofs, using: ("user" =
                                      pressure mass matrix)<br>
                                                      <br>
<tt><...><br>
                                        -pc_fieldsplit_schur_fact_type upper<br>
                                        -pc_fieldsplit_schur_precondition user<br>
                                        -fieldsplit_0_ksp_type preonly<br>
                                        -fieldsplit_0_pc_type lu<br>
                                        -fieldsplit_0_pc_factor_mat_solver_package mumps<br>
                                        <br>
                                        -fieldsplit_1_ksp_type gmres<br>
                                        -fieldsplit_1_ksp_monitor_true_residual<br>
                                        -fieldsplit_1_ksp_rtol 1e-10<br>
                                        -fieldsplit_1_pc_type lu<br>
                                        -fieldsplit_1_pc_factor_mat_solver_package mumps<br>
                                      </tt><br>
                                                      It takes 2 outer
                                                      iterations, as
                                                      expected. However
                                                      the fieldsplit_1
                                                      solve takes very
                                                      long.<br>
                                                    </div>
                                                  </blockquote>
                                                  <div><br>
                                                  </div>
                                                  <div>1) It should take
                                                    1 outer iterate, not
                                                    two. The problem is
                                                    that your Schur
                                                    tolerance is way too
                                                    high. Use</div>
                                                  <div><br>
                                                  </div>
                                                  <div> 
                                                    -fieldsplit_1_ksp_rtol
                                                    1e-10</div>
                                                  <div><br>
                                                  </div>
                                                  <div>or something like
                                                    that. Then it will
                                                    take 1 iterate.</div>
                                                </div>
                                              </div>
                                            </div>
                                          </blockquote>
                                          <br>
                                          Shouldn't it take 2 with a
                                          triangular Schur factorization
                                          and exact preconditioners, and
                                          1 with a full factorization?
                                          (cf. Benzi et al 2005, p.66, <a
                                            moz-do-not-send="true"
class="m_-7814857284138451345gmail-m_2691972541491180255gmail-m_1522616294910952114m_-1125133874872333755m_4366232618162032171moz-txt-link-freetext"
href="http://www.mathcs.emory.edu/%7Ebenzi/Web_papers/bgl05.pdf"
                                            target="_blank">http://www.mathcs.emory.edu/~benzi/Web_papers/bgl05.pdf</a>)<br>
                                          <br>
That's exactly what I set: <tt>
                                            -fieldsplit_1_ksp_rtol 1e-10
                                          </tt>, and the Schur residual
                                          does drop below rtol = 1e-10.<br>
                                        </div>
                                      </blockquote>
                                      <div><br>
                                      </div>
<div>Oh, yes. Take away the "upper"
                                        factorization until things are worked out.</div>
                                      <div><br>
                                      </div>
                                      <div>  Thanks,</div>
                                      <div><br>
                                      </div>
                                      <div>    Matt</div>
                                      <blockquote class="gmail_quote"
                                        style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
                                        <div bgcolor="#FFFFFF">
                                          <blockquote type="cite">
                                            <div>
                                              <div class="gmail_extra">
                                                <div class="gmail_quote">
                                                  <div><br>
                                                  </div>
                                                  <div>2) There is a
                                                    problem with the
                                                    Schur solve. Now
                                                    from the iterates</div>
                                                  <div><br>
                                                  </div>
                                                  <div><span
                                                      style="font-family:monospace">423
                                                      KSP preconditioned
                                                      resid norm
                                                      2.638419658982e-02
                                                      true resid norm
                                                      7.229653211635e-11
                                                      ||r(i)||/||b||
                                                      7.229653211635e-11</span><br>
                                                  </div>
                                                  <div><br>
                                                  </div>
                                                  <div>it is clear that
                                                    the preconditioner
                                                    is really screwing
                                                    stuff up. For
                                                    testing, you can use</div>
                                                  <div><br>
                                                  </div>
                                                  <div> 
                                                    -pc_fieldsplit_schur_precondition
                                                    full</div>
                                                  <div><br>
                                                  </div>
                                                  <div>and your same
                                                    setup here. It
                                                    should take one
                                                    iterate. I think
                                                    there is something
                                                    wrong with your</div>
                                                  <div>mass matrix.</div>
                                                </div>
                                              </div>
                                            </div>
                                          </blockquote>
                                          <br>
I agree. I forgot to mention
                                          that I am considering an
                                          "enclosed flow" problem, with
                                          u=0 on the whole boundary and a
                                          Dirichlet condition for the
                                          pressure at one point to fix
                                          the constant pressure. Maybe
                                          the preconditioner is not
                                          consistent with this setup;
                                          I need to check this...<br>
                                          <br>
                                          Thanks a lot<br>
                                          <br>
                                          <br>
                                          <blockquote type="cite">
                                            <div>
                                              <div class="gmail_extra">
                                                <div class="gmail_quote">
                                                  <div><br>
                                                  </div>
                                                  <div>  Thanks,</div>
                                                  <div><br>
                                                  </div>
                                                  <div>    Matt</div>
                                                  <div><br>
                                                  </div>
                                                  <blockquote
                                                    class="gmail_quote"
                                                    style="margin:0px
                                                    0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
                                                    <div
                                                      bgcolor="#FFFFFF">
                                                      <br>
                                                      <tt>  0 KSP
                                                        unpreconditioned
                                                        resid norm
                                                        4.038466809302e-03
                                                        true resid norm
4.038466809302e-03 ||r(i)||/||b|| 1.000000000000e+00</tt><tt><br>
                                                      </tt><tt>   
                                                        Residual norms
                                                        for
                                                        fieldsplit_1_
                                                        solve.</tt><tt><br>
                                                      </tt><tt>    0 KSP
                                                        preconditioned
                                                        resid norm
                                                        0.000000000000e+00
                                                        true resid norm
0.000000000000e+00 ||r(i)||/||b||           -nan</tt><tt><br>
                                                      </tt><tt>  Linear
                                                        fieldsplit_1_
                                                        solve converged
                                                        due to
                                                        CONVERGED_ATOL
                                                        iterations 0</tt><tt><br>
                                                      </tt><tt>  1 KSP
                                                        unpreconditioned
                                                        resid norm
                                                        4.860095964831e-06
                                                        true resid norm
4.860095964831e-06 ||r(i)||/||b|| 1.203450763452e-03</tt><tt><br>
                                                      </tt><tt>   
                                                        Residual norms
                                                        for
                                                        fieldsplit_1_
                                                        solve.</tt><tt><br>
                                                      </tt><tt>    0 KSP
                                                        preconditioned
                                                        resid norm
                                                        2.965546249872e+08
                                                        true resid norm
1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00</tt><tt><br>
                                                      </tt><tt>    1 KSP
                                                        preconditioned
                                                        resid norm
                                                        1.347596594634e+08
                                                        true resid norm
3.599678801575e-01 ||r(i)||/||b|| 3.599678801575e-01</tt><tt><br>
                                                      </tt><tt>    2 KSP
                                                        preconditioned
                                                        resid norm
                                                        5.913230136403e+07
                                                        true resid norm
2.364916760834e-01 ||r(i)||/||b|| 2.364916760834e-01</tt><tt><br>
                                                      </tt><tt>    3 KSP
                                                        preconditioned
                                                        resid norm
                                                        4.629700028930e+07
                                                        true resid norm
1.984444715595e-01 ||r(i)||/||b|| 1.984444715595e-01</tt><tt><br>
                                                      </tt><tt>    4 KSP
                                                        preconditioned
                                                        resid norm
                                                        3.804431276819e+07
                                                        true resid norm
1.747224559120e-01 ||r(i)||/||b|| 1.747224559120e-01</tt><tt><br>
                                                      </tt><tt>    5 KSP
                                                        preconditioned
                                                        resid norm
                                                        3.178769422140e+07
                                                        true resid norm
1.402254864444e-01 ||r(i)||/||b|| 1.402254864444e-01</tt><tt><br>
                                                      </tt><tt>    6 KSP
                                                        preconditioned
                                                        resid norm
                                                        2.648669043919e+07
                                                        true resid norm
1.191164310866e-01 ||r(i)||/||b|| 1.191164310866e-01</tt><tt><br>
                                                      </tt><tt>    7 KSP
                                                        preconditioned
                                                        resid norm
                                                        2.203522108614e+07
                                                        true resid norm
9.690500018007e-02 ||r(i)||/||b|| 9.690500018007e-02</tt><tt><br>
                                                            <...><br>
                                                            422 KSP
                                                        preconditioned
                                                        resid norm
                                                        2.984888715147e-02
                                                        true resid norm
8.598401046494e-11 ||r(i)||/||b|| 8.598401046494e-11<br>
                                                            423 KSP
                                                        preconditioned
                                                        resid norm
                                                        2.638419658982e-02
                                                        true resid norm
7.229653211635e-11 ||r(i)||/||b|| 7.229653211635e-11<br>
                                                          Linear
                                                        fieldsplit_1_
                                                        solve converged
                                                        due to
                                                        CONVERGED_RTOL
                                                        iterations 423<br>
                                                          2 KSP
                                                        unpreconditioned
                                                        resid norm
                                                        3.539889585599e-16
                                                        true resid norm
3.542279617063e-16 ||r(i)||/||b|| 8.771347603759e-14<br>
                                                        Linear solve
                                                        converged due to
                                                        CONVERGED_RTOL
                                                        iterations 2<br>
</tt><br>
                                                      Does the slow
                                                      convergence of the
                                                      Schur block mean
                                                      that my
                                                      preconditioning
                                                      matrix Sp is a
                                                      poor choice?<br>
                                                      <br>
                                                      Thanks,<br>
                                                      David<br>
                                                      <br>
                                                      <br>
                                                      <div
class="m_-7814857284138451345gmail-m_2691972541491180255gmail-m_1522616294910952114m_-1125133874872333755m_4366232618162032171gmail-m_5328507656823621836moz-cite-prefix">On
                                                        06/11/2017 08:53
                                                        AM, Matthew
                                                        Knepley wrote:<br>
                                                      </div>
                                                      <blockquote
                                                        type="cite">
                                                        <div>
                                                          <div
                                                          class="gmail_extra">
                                                          <div
                                                          class="gmail_quote">On
                                                          Sat, Jun 10,
                                                          2017 at 8:25
                                                          PM, David
                                                          Nolte <span><<a
moz-do-not-send="true" href="mailto:dnolte@dim.uchile.cl"
                                                          target="_blank">dnolte@dim.uchile.cl</a>></span>
                                                          wrote:<br>
                                                          <blockquote
                                                          class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">Dear
                                                          all,<br>
                                                          <br>
                                                          I am solving a
                                                          Stokes problem
                                                          in 3D aorta
                                                          geometries,
                                                          using a P2/P1<br>
                                                          finite
                                                          elements
                                                          discretization
                                                          on tetrahedral
                                                          meshes
                                                          resulting in<br>
                                                          ~1-1.5M DOFs.
                                                          Viscosity is
                                                          uniform (can
                                                          be adjusted
                                                          arbitrarily),
                                                          and<br>
                                                          the right hand
                                                          side is a
                                                          function of
                                                          noisy
                                                          measurement
                                                          data.<br>
                                                          <br>
                                                          In other
                                                          settings of
                                                          "standard"
                                                          Stokes flow
                                                          problems I
                                                          have obtained<br>
                                                          good
                                                          convergence
                                                          with an
                                                          "upper" Schur
                                                          complement
                                                          preconditioner,
                                                          using<br>
                                                          AMG (ML or
                                                          Hypre) on the
                                                          velocity block
                                                          and
                                                          approximating
                                                          the Schur<br>
                                                          complement
                                                          matrix by the
                                                          diagonal of
                                                          the pressure
                                                          mass matrix:<br>
                                                          <br>
                                                             
                                                          -ksp_converged_reason<br>
                                                             
                                                          -ksp_monitor_true_residual<br>
                                                             
                                                          -ksp_initial_guess_nonzero<br>
                                                             
                                                          -ksp_diagonal_scale<br>
                                                             
                                                          -ksp_diagonal_scale_fix<br>
                                                              -ksp_type
                                                          fgmres<br>
                                                              -ksp_rtol
                                                          1.0e-8<br>
                                                          <br>
                                                              -pc_type
                                                          fieldsplit<br>
                                                             
                                                          -pc_fieldsplit_type
                                                          schur<br>
                                                             
                                                          -pc_fieldsplit_detect_saddle_point<br>
                                                             
                                                          -pc_fieldsplit_schur_fact_type
                                                          upper<br>
                                                             
                                                          -pc_fieldsplit_schur_precondition
                                                          user    #
                                                          <--
                                                          pressure mass
                                                          matrix<br>
                                                          <br>
                                                             
                                                          -fieldsplit_0_ksp_type
                                                          preonly<br>
                                                             
                                                          -fieldsplit_0_pc_type
                                                          ml<br>
                                                          <br>
                                                             
                                                          -fieldsplit_1_ksp_type
                                                          preonly<br>
                                                             
                                                          -fieldsplit_1_pc_type
                                                          jacobi<br>
                                                          </blockquote>
                                                          <div><br>
                                                          </div>
                                                          <div>1) I
                                                          always
                                                          recommend
                                                          starting from
                                                          an exact
                                                          solver and
                                                          backing off in
                                                          small steps
                                                          for
                                                          optimization.
                                                          Thus</div>
                                                          <div>    I
                                                          would start
                                                          with LU on the
                                                          upper block
                                                          and GMRES/LU
with tolerance
                                                          1e-10 on the
                                                          Schur block.</div>
                                                          <div>    This
                                                          should
                                                          converge in 1
                                                          iterate.</div>
                                                          <div><br>
                                                          </div>
                                                          <div>2) I
                                                          don't think
                                                          you want
                                                          preonly on the
                                                          Schur system.
                                                          You might want
                                                          GMRES/Jacobi
                                                          to invert the
                                                          mass matrix.</div>
                                                          <div><br>
                                                          </div>
                                                          <div>3) You
                                                          probably want
                                                          to tighten the
                                                          tolerance on
                                                          the Schur
                                                          solve, at
                                                          least to
                                                          start, and
                                                          then slowly
                                                          let it out.
                                                          The</div>
                                                          <div>    tight
                                                          tolerance will
                                                          show you how
                                                          effective the
                                                          preconditioner
                                                          is using that
                                                          Schur
                                                          operator. Then
                                                          you can start</div>
<div>    to evaluate how effective the Schur linear solver is.</div>
                                                          <div><br>
                                                          </div>
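
The tolerance itself is a single knob; illustrative values (not recommendations) would be

    -fieldsplit_1_ksp_rtol 1e-10    (tight: isolates the quality of the Schur preconditioner)
    -fieldsplit_1_ksp_rtol 1e-2     (relaxed: tests the cost/accuracy trade-off of the full solver)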

Does this make sense?

  Thanks,

     Matt

In my present case this setup gives rather slow convergence (varying for different geometries between 200-500 and several thousand iterations!). I obtain better convergence with "-pc_fieldsplit_schur_precondition selfp" and using multigrid on S, with "-fieldsplit_1_pc_type ml" (I don't think this is optimal, though).
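
Spelled out, that variant corresponds roughly to these options (a sketch; the fieldsplit_1_ prefix is assumed, and selfp assembles an explicit approximation to S using inv(diag(A00)) in place of inv(A00), which is the matrix the multigrid then acts on):

    -pc_fieldsplit_schur_precondition selfp
    -fieldsplit_1_ksp_type preonly
    -fieldsplit_1_pc_type ml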

I don't understand why the pressure mass matrix approach performs so poorly, and I wonder what I could try to improve the convergence. Until now I have been using ML and Hypre BoomerAMG mostly with default parameters. Surely they can be improved by tuning some parameters. Which could be a good starting point? Are there other options I should consider?
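
As an illustration of the kind of knobs involved (example values only, here for BoomerAMG on the A00 block):

    -fieldsplit_0_pc_type hypre
    -fieldsplit_0_pc_hypre_boomeramg_strong_threshold 0.75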

With the above setup (jacobi), for a case that works better than others, the KSP terminates with

  467 KSP unpreconditioned resid norm 2.072014323515e-09 true resid norm 2.072014322600e-09 ||r(i)||/||b|| 9.939098100674e-09

You can find the output of -ksp_view below. Let me know if you need more details.

Thanks in advance for your advice!
Best wishes
David

                                                          KSP Object: 1
                                                          MPI processes<br>
                                                            type: fgmres<br>
                                                              GMRES:
                                                          restart=30,
                                                          using
                                                          Classical
                                                          (unmodified)
                                                          Gram-Schmidt<br>
Orthogonalization with no iterative refinement<br>
                                                              GMRES:
                                                          happy
                                                          breakdown
                                                          tolerance
                                                          1e-30<br>
                                                            maximum
                                                          iterations=10000<br>
                                                            tolerances: 
relative=1e-08, absolute=1e-50, divergence=10000.<br>
                                                            right
                                                          preconditioning<br>
                                                            diagonally
                                                          scaled system<br>
                                                            using
                                                          nonzero
                                                          initial guess<br>
                                                            using
                                                          UNPRECONDITIONED
                                                          norm type for
                                                          convergence
                                                          test<br>
                                                          PC Object: 1
                                                          MPI processes<br>
                                                            type:
                                                          fieldsplit<br>
                                                              FieldSplit
                                                          with Schur
                                                          preconditioner,
                                                          factorization
                                                          UPPER<br>
                                                             
                                                          Preconditioner
                                                          for the Schur
                                                          complement
                                                          formed from
                                                          user provided
                                                          matrix<br>
                                                              Split
                                                          info:<br>
                                                              Split
                                                          number 0
                                                          Defined by IS<br>
                                                              Split
                                                          number 1
                                                          Defined by IS<br>
                                                              KSP solver
                                                          for A00 block<br>
                                                                KSP
                                                          Object:     
                                                          (fieldsplit_0_) 
                                                               1 MPI
                                                          processes<br>
                                                                  type:
                                                          preonly<br>
                                                                 
                                                          maximum
                                                          iterations=10000,
                                                          initial guess
                                                          is zero<br>
                                                                 
                                                          tolerances: 
                                                          relative=1e-05,
absolute=1e-50, divergence=10000.<br>
                                                                  left
                                                          preconditioning<br>
                                                                  using
                                                          NONE norm type
                                                          for
                                                          convergence
                                                          test<br>
                                                                PC
                                                          Object:     
                                                          (fieldsplit_0_) 
                                                               1 MPI
                                                          processes<br>
                                                                  type:
                                                          ml<br>
                                                                    MG:
                                                          type is
                                                          MULTIPLICATIVE,
                                                          levels=5
                                                          cycles=v<br>
                                                                     
                                                          Cycles per
                                                          PCApply=1<br>
                                                                     
                                                          Using Galerkin
                                                          computed
                                                          coarse grid
                                                          matrices<br>
                                                                  Coarse
                                                          grid solver --
                                                          level
                                                          -------------------------------<br>
                                                                    KSP
                                                          Object:       
                                                           
                                                          (fieldsplit_0_mg_coarse_) 
                                                                   1 MPI<br>
                                                          processes<br>
                                                                     
                                                          type: preonly<br>
                                                                     
                                                          maximum
                                                          iterations=10000,
                                                          initial guess
                                                          is zero<br>
                                                                     
                                                          tolerances: 
                                                          relative=1e-05,
absolute=1e-50, divergence=10000.<br>
                                                                     
                                                          left
                                                          preconditioning<br>
                                                                     
                                                          using NONE
                                                          norm type for
                                                          convergence
                                                          test<br>
                                                                    PC
                                                          Object:       
                                                           
                                                          (fieldsplit_0_mg_coarse_) 
                                                                   1 MPI<br>
                                                          processes<br>
                                                                     
                                                          type: lu<br>
                                                                       
                                                          LU:
                                                          out-of-place
                                                          factorization<br>
                                                                       
                                                          tolerance for
                                                          zero pivot
                                                          2.22045e-14<br>
                                                                       
                                                          using diagonal
                                                          shift on
                                                          blocks to
                                                          prevent zero
                                                          pivot<br>
                                                          [INBLOCKS]<br>
                                                                       
                                                          matrix
                                                          ordering: nd<br>
                                                                       
                                                          factor fill
                                                          ratio given
                                                          5., needed 1.<br>
                                                                       
                                                            Factored
                                                          matrix
                                                          follows:<br>
                                                                       
                                                              Mat
                                                          Object:       
                                                                     1
                                                          MPI processes<br>
                                                                       
                                                                type:
                                                          seqaij<br>
                                                                       
                                                                rows=3,
                                                          cols=3<br>
                                                                       
                                                                package
                                                          used to
                                                          perform
                                                          factorization:
                                                          petsc<br>
                                                                       
                                                                total:
                                                          nonzeros=3,
                                                          allocated
                                                          nonzeros=3<br>
                                                                       
                                                                total
                                                          number of
                                                          mallocs used
                                                          during
                                                          MatSetValues<br>
                                                          calls =0<br>
                                                                       
                                                                  not
                                                          using I-node
                                                          routines<br>
                                                                     
                                                          linear system
                                                          matrix =
                                                          precond
                                                          matrix:<br>
                                                                     
                                                          Mat Object:   
                                                                   1 MPI
                                                          processes<br>
                                                                       
                                                          type: seqaij<br>
                                                                       
                                                          rows=3, cols=3<br>
                                                                       
                                                          total:
                                                          nonzeros=3,
                                                          allocated
                                                          nonzeros=3<br>
                                                                       
                                                          total number
                                                          of mallocs
                                                          used during
                                                          MatSetValues
                                                          calls =0<br>
                                                                       
                                                            not using
                                                          I-node
                                                          routines<br>
                                                                  Down
                                                          solver
                                                          (pre-smoother)
                                                          on level 1<br>
-------------------------------<br>
                                                                    KSP
                                                          Object:       
                                                           
                                                          (fieldsplit_0_mg_levels_1_) 
                                                                   1<br>
                                                          MPI processes<br>
                                                                     
                                                          type:
                                                          richardson<br>
                                                                       
                                                          Richardson:
                                                          damping
                                                          factor=1.<br>
                                                                     
                                                          maximum
                                                          iterations=2<br>
                                                                     
                                                          tolerances: 
                                                          relative=1e-05,
absolute=1e-50, divergence=10000.<br>
                                                                     
                                                          left
                                                          preconditioning<br>
                                                                     
                                                          using nonzero
                                                          initial guess<br>
                                                                     
                                                          using NONE
                                                          norm type for
                                                          convergence
                                                          test<br>
                                                                    PC
                                                          Object:       
                                                           
                                                          (fieldsplit_0_mg_levels_1_) 
                                                                   1<br>
                                                          MPI processes<br>
                                                                     
                                                          type: sor<br>
                                                                       
                                                          SOR: type =
                                                          local_symmetric,
                                                          iterations =
                                                          1, local<br>
                                                          iterations =
                                                          1, omega = 1.<br>
                                                                     
                                                          linear system
                                                          matrix =
                                                          precond
                                                          matrix:<br>
                                                                     
                                                          Mat Object:   
                                                                   1 MPI
                                                          processes<br>
                                                                       
                                                          type: seqaij<br>
                                                                       
                                                          rows=15,
                                                          cols=15<br>
                                                                       
                                                          total:
                                                          nonzeros=69,
                                                          allocated
                                                          nonzeros=69<br>
                                                                       
                                                          total number
                                                          of mallocs
                                                          used during
                                                          MatSetValues
                                                          calls =0<br>
                                                                       
                                                            not using
                                                          I-node
                                                          routines<br>
                                                                  Up
                                                          solver
                                                          (post-smoother)
                                                          same as down
                                                          solver
                                                          (pre-smoother)<br>
                                                                  Down
                                                          solver
                                                          (pre-smoother)
                                                          on level 2<br>
-------------------------------<br>
                                                                    KSP
                                                          Object:       
                                                           
                                                          (fieldsplit_0_mg_levels_2_) 
                                                                   1<br>
                                                          MPI processes<br>
                                                                     
                                                          type:
                                                          richardson<br>
                                                                       
                                                          Richardson:
                                                          damping
                                                          factor=1.<br>
                                                                     
                                                          maximum
                                                          iterations=2<br>
                                                                     
                                                          tolerances: 
                                                          relative=1e-05,
absolute=1e-50, divergence=10000.<br>
                                                                     
                                                          left
                                                          preconditioning<br>
                                                                     
                                                          using nonzero
                                                          initial guess<br>
                                                                     
                                                          using NONE
                                                          norm type for
                                                          convergence
                                                          test<br>
                                                                    PC
                                                          Object:       
                                                           
                                                          (fieldsplit_0_mg_levels_2_) 
                                                                   1<br>
                                                          MPI processes<br>
                                                                     
                                                          type: sor<br>
                                                                       
                                                          SOR: type =
                                                          local_symmetric,
                                                          iterations =
                                                          1, local<br>
                                                          iterations =
                                                          1, omega = 1.<br>
                                                                     
                                                          linear system
                                                          matrix =
                                                          precond
                                                          matrix:<br>
                                                                     
                                                          Mat Object:   
                                                                   1 MPI
                                                          processes<br>
                                                                       
                                                          type: seqaij<br>
                                                                       
                                                          rows=304,
                                                          cols=304<br>
                                                                       
                                                          total:
                                                          nonzeros=7354,
                                                          allocated
                                                          nonzeros=7354<br>
                                                                       
                                                          total number
                                                          of mallocs
                                                          used during
                                                          MatSetValues
                                                          calls =0<br>
                                                                       
                                                            not using
                                                          I-node
                                                          routines<br>
                                                                  Up
                                                          solver
                                                          (post-smoother)
                                                          same as down
                                                          solver
                                                          (pre-smoother)<br>
                                                                  Down
                                                          solver
                                                          (pre-smoother)
                                                          on level 3<br>
-------------------------------<br>
                                                                    KSP
                                                          Object:       
                                                           
                                                          (fieldsplit_0_mg_levels_3_) 
                                                                   1<br>
                                                          MPI processes<br>
                                                                     
                                                          type:
                                                          richardson<br>
                                                                       
                                                          Richardson:
                                                          damping
                                                          factor=1.<br>
                                                                     
                                                          maximum
                                                          iterations=2<br>
                                                                     
                                                          tolerances: 
                                                          relative=1e-05,
absolute=1e-50, divergence=10000.<br>
                                                                     
                                                          left
                                                          preconditioning<br>
                                                                     
                                                          using nonzero
                                                          initial guess<br>
                                                                     
                                                          using NONE
                                                          norm type for
                                                          convergence
                                                          test<br>
                                                                    PC
                                                          Object:       
                                                           
                                                          (fieldsplit_0_mg_levels_3_) 
                                                                   1<br>
                                                          MPI processes<br>
                                                                     
                                                          type: sor<br>
                                                                       
                                                          SOR: type =
                                                          local_symmetric,
                                                          iterations =
                                                          1, local<br>
                                                          iterations =
                                                          1, omega = 1.<br>
                                                                     
                                                          linear system
                                                          matrix =
                                                          precond
                                                          matrix:<br>
                                                                     
                                                          Mat Object:   
                                                                   1 MPI
                                                          processes<br>
                                                                       
                                                          type: seqaij<br>
                                                                       
                                                          rows=30236,
                                                          cols=30236<br>
                                                                       
                                                          total:
                                                          nonzeros=2730644,
                                                          allocated
                                                          nonzeros=2730644<br>
                                                                       
                                                          total number
                                                          of mallocs
                                                          used during
                                                          MatSetValues
                                                          calls =0<br>
                                                                       
                                                            not using
                                                          I-node
                                                          routines<br>
                                                                  Up
                                                          solver
                                                          (post-smoother)
                                                          same as down
                                                          solver
                                                          (pre-smoother)<br>
                                                                  Down
                                                          solver
                                                          (pre-smoother)
                                                          on level 4<br>
-------------------------------<br>
                                                                    KSP
                                                          Object:       
                                                           
                                                          (fieldsplit_0_mg_levels_4_) 
                                                                   1<br>
                                                          MPI processes<br>
                                                                     
                                                          type:
                                                          richardson<br>
                                                                       
                                                          Richardson:
                                                          damping
                                                          factor=1.<br>
                                                                     
                                                          maximum
                                                          iterations=2<br>
                                                                     
                                                          tolerances: 
                                                          relative=1e-05,
absolute=1e-50, divergence=10000.<br>
                                                                     
                                                          left
                                                          preconditioning<br>
                                                                     
                                                          using nonzero
                                                          initial guess<br>
                                                                     
                                                          using NONE
                                                          norm type for
                                                          convergence
                                                          test<br>
                                                                    PC
                                                          Object:       
                                                           
                                                          (fieldsplit_0_mg_levels_4_) 
                                                                   1<br>
                                                          MPI processes<br>
                                                                     
                                                          type: sor<br>
                                                                       
                                                          SOR: type =
                                                          local_symmetric,
                                                          iterations =
                                                          1, local<br>
                                                          iterations =
                                                          1, omega = 1.<br>
                                                                     
                                                          linear system
                                                          matrix =
                                                          precond
                                                          matrix:<br>
                                                                     
                                                          Mat Object:   
                                                                 
                                                          (fieldsplit_0_) 
                                                                     1
                                                          MPI<br>
                                                          processes<br>
                                                                       
                                                          type: seqaij<br>
                                                                       
                                                          rows=894132,
                                                          cols=894132<br>
                                                                       
                                                          total:
                                                          nonzeros=70684164,
                                                          allocated
                                                          nonzeros=70684164<br>
                                                                       
                                                          total number
                                                          of mallocs
                                                          used during
                                                          MatSetValues
                                                          calls =0<br>
                                                                       
                                                            not using
                                                          I-node
                                                          routines<br>
                                                                  Up
                                                          solver
                                                          (post-smoother)
                                                          same as down
                                                          solver
                                                          (pre-smoother)<br>
                                                                  linear
                                                          system matrix
                                                          = precond
                                                          matrix:<br>
                                                                  Mat
                                                          Object:       
(fieldsplit_0_)         1 MPI processes<br>
                                                                   
                                                          type: seqaij<br>
                                                                   
                                                          rows=894132,
                                                          cols=894132<br>
                                                                   
                                                          total:
                                                          nonzeros=70684164,
                                                          allocated
                                                          nonzeros=70684164<br>
                                                                   
                                                          total number
                                                          of mallocs
                                                          used during
                                                          MatSetValues
                                                          calls =0<br>
                                                                     
                                                          not using
                                                          I-node
                                                          routines<br>
                                                              KSP solver
                                                          for S = A11 -
                                                          A10 inv(A00)
                                                          A01<br>
                                                                KSP
                                                          Object:     
                                                          (fieldsplit_1_) 
                                                               1 MPI
                                                          processes<br>
                                                                  type:
                                                          preonly<br>
                                                                 
                                                          maximum
                                                          iterations=10000,
                                                          initial guess
                                                          is zero<br>
                                                                 
                                                          tolerances: 
                                                          relative=1e-05,
absolute=1e-50, divergence=10000.<br>
                                                                  left
                                                          preconditioning<br>
                                                                  using
                                                          NONE norm type
                                                          for
                                                          convergence
                                                          test<br>
                                                                PC
                                                          Object:     
                                                          (fieldsplit_1_) 
                                                               1 MPI
                                                          processes<br>
                                                                  type:
                                                          jacobi<br>
                                                                  linear
                                                          system matrix
                                                          followed by
                                                          preconditioner
                                                          matrix:<br>
                                                                  Mat
                                                          Object:       
(fieldsplit_1_)         1 MPI processes<br>
                                                                   
                                                          type:
                                                          schurcomplement<br>
                                                                   
                                                          rows=42025,
                                                          cols=42025<br>
                                                                     
                                                          Schur
                                                          complement A11
                                                          - A10 inv(A00)
                                                          A01<br>
                                                                     
                                                          A11<br>
                                                                       
<tt>Mat Object: (fieldsplit_1_) 1 MPI processes</tt><br>
<tt>  type: seqaij</tt><br>
<tt>  rows=42025, cols=42025</tt><br>
<tt>  total: nonzeros=554063, allocated nonzeros=554063</tt><br>
<tt>  total number of mallocs used during MatSetValues calls =0</tt><br>
<tt>    not using I-node routines</tt><br>
<tt>A10</tt><br>
<tt>Mat Object: 1 MPI processes</tt><br>
<tt>  type: seqaij</tt><br>
<tt>  rows=42025, cols=894132</tt><br>
<tt>  total: nonzeros=6850107, allocated nonzeros=6850107</tt><br>
<tt>  total number of mallocs used during MatSetValues calls =0</tt><br>
<tt>    not using I-node routines</tt><br>
<tt>KSP of A00</tt><br>
<tt>KSP Object: (fieldsplit_0_) 1 MPI processes</tt><br>
<tt>  type: preonly</tt><br>
<tt>  maximum iterations=10000, initial guess is zero</tt><br>
<tt>  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</tt><br>
<tt>  left preconditioning</tt><br>
<tt>  using NONE norm type for convergence test</tt><br>
<tt>PC Object: (fieldsplit_0_) 1 MPI processes</tt><br>
<tt>  type: ml</tt><br>
<tt>    MG: type is MULTIPLICATIVE, levels=5 cycles=v</tt><br>
<tt>      Cycles per PCApply=1</tt><br>
<tt>      Using Galerkin computed coarse grid matrices</tt><br>
<tt>  Coarse grid solver -- level -------------------------------</tt><br>
<tt>    KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes</tt><br>
<tt>      type: preonly</tt><br>
<tt>      maximum iterations=10000, initial guess is zero</tt><br>
<tt>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</tt><br>
<tt>      left preconditioning</tt><br>
<tt>      using NONE norm type for convergence test</tt><br>
<tt>    PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes</tt><br>
<tt>      type: lu</tt><br>
<tt>        LU: out-of-place factorization</tt><br>
<tt>        tolerance for zero pivot 2.22045e-14</tt><br>
<tt>        using diagonal shift on blocks to prevent zero pivot [INBLOCKS]</tt><br>
<tt>        matrix ordering: nd</tt><br>
<tt>        factor fill ratio given 5., needed 1.</tt><br>
<tt>          Factored matrix follows:</tt><br>
<tt>            Mat Object: 1 MPI processes</tt><br>
<tt>              type: seqaij</tt><br>
<tt>              rows=3, cols=3</tt><br>
<tt>              package used to perform factorization: petsc</tt><br>
<tt>              total: nonzeros=3, allocated nonzeros=3</tt><br>
<tt>              total number of mallocs used during MatSetValues calls =0</tt><br>
<tt>                not using I-node routines</tt><br>
</blockquote>
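<br>
For reference: the fieldsplit_0 sub-solver shown in the listing above corresponds to options along the following lines (a sketch only; the 5 multigrid levels are what ML's default coarsening produced on this mesh, and the preonly/LU coarse-level solve is the PETSc default for ML, listed here just for completeness rather than set explicitly):<br>
<br>
   <tt>  -fieldsplit_0_ksp_type preonly</tt><br>
   <tt>  -fieldsplit_0_pc_type ml</tt><br>
   <tt>  -fieldsplit_0_mg_coarse_ksp_type preonly</tt><br>
   <tt>  -fieldsplit_0_mg_coarse_pc_type lu</tt><br>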
    <br>
  </body>
</html>