[petsc-users] Advice on improving Stokes Schur preconditioners
David Nolte
dnolte at dim.uchile.cl
Wed Jun 14 12:41:55 CDT 2017
Dave, thanks a lot for your great answer and for sharing your
experience. I have a much clearer picture now. :)
The experiments from 3/ give the desired results for the cavity-flow examples.
The (1/mu-scaled) mass matrix seems OK.
I followed your and Matt's recommendations: I used a FULL Schur
factorization with LU in the 0th split, and gradually relaxed the tolerance
of GMRES/Jacobi in split 1 (and observed the gradual increase in outer
iterations). Then I replaced the split-0 LU with AMG (which further increased
the outer iterations and the iterations on the Schur complement).
In doing so I converged on hypre BoomerAMG (smooth_type Euclid,
strong_threshold 0.75) and 3 iterations of GMRES/Jacobi on the Schur
block, which gave the best time to solution in my particular setup and
convergence to rtol=1e-8 within 60 outer iterations.
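For reference, the option set I converged on looks roughly like this (a
sketch only; the 3 Schur iterations are enforced here via
-fieldsplit_1_ksp_max_it, and the hypre option names are the standard PETSc
ones):

-ksp_type fgmres
-ksp_rtol 1e-8
-pc_type fieldsplit
-pc_fieldsplit_type schur
-pc_fieldsplit_schur_fact_type full
-pc_fieldsplit_schur_precondition user    # <-- 1/mu-scaled pressure mass matrix
-fieldsplit_0_ksp_type preonly
-fieldsplit_0_pc_type hypre
-fieldsplit_0_pc_hypre_type boomeramg
-fieldsplit_0_pc_hypre_boomeramg_smooth_type Euclid
-fieldsplit_0_pc_hypre_boomeramg_strong_threshold 0.75
-fieldsplit_1_ksp_type gmres
-fieldsplit_1_ksp_max_it 3
-fieldsplit_1_pc_type jacobi
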
In my cases, using GMRES in the 0th split (with rtol 1e-1 or 1e-2)
instead of "preonly" did not help convergence (on the contrary).
I also repeated the experiments with "-pc_fieldsplit_schur_precondition
selfp", with hypre(ILU) in split 0 and hypre in split 1, just to check,
and somewhat disappointingly ( ;-) ) the wall time is less than half of
what it is with GMRES/Jacobi and Sp = mass matrix.
I am aware that this says nothing about scaling and robustness with
respect to h-refinement...
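Roughly, the selfp comparison was set up like this (again only a sketch; I
write "hypre(ILU)" here as hypre's Euclid):

-pc_fieldsplit_schur_precondition selfp
-fieldsplit_0_ksp_type preonly
-fieldsplit_0_pc_type hypre
-fieldsplit_0_pc_hypre_type euclid        # "hypre(ILU)"
-fieldsplit_1_ksp_type preonly
-fieldsplit_1_pc_type hypre
-fieldsplit_1_pc_hypre_type boomeramg
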
Would you agree that these configurations "make sense"?
Furthermore, does anyone have a hint on where to start tuning the multigrid?
So far hypre has worked better than ML, but I have not experimented much with
the parameters.
Thanks again for your help!
Best wishes,
David
On 06/12/2017 04:52 PM, Dave May wrote:
> I've been following the discussion and have a couple of comments:
>
> 1/ For the preconditioners that you are using (Schur factorisation
> LDU, or upper block triangular DU), the convergence properties (e.g. 1
> iterate for LDU and 2 iterates for DU) come from analysis involving
> exact inverses of A_00 and S.
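> 
> (Schematically, writing the saddle-point operator in block form, the
> factorisation behind LDU is
> 
>   K = \begin{pmatrix} A_{00} & A_{01} \\ A_{10} & A_{11} \end{pmatrix}
>     = \begin{pmatrix} I & 0 \\ A_{10} A_{00}^{-1} & I \end{pmatrix}
>       \begin{pmatrix} A_{00} & 0 \\ 0 & S \end{pmatrix}
>       \begin{pmatrix} I & A_{00}^{-1} A_{01} \\ 0 & I \end{pmatrix},
>   \qquad S = A_{11} - A_{10} A_{00}^{-1} A_{01},
> 
> and DU keeps only the last two (block upper triangular) factors; the
> 1- and 2-iterate results assume exact inverses of A_{00} and S.)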
>
> Once you switch from using exact inverses of A_00 and S, you have to
> rely on spectral equivalence of operators. That is fine, but the
> spectral equivalence does not tell you how many iterates LDU or DU
> will require to converge. What it does tell you is that if you
> have spectrally equivalent operators for A_00 and S (the Schur
> complement), then under mesh refinement your iteration count
> (whatever it was prior to refinement) will not increase.
>
> 2/ Looking at your first set of options, I see you have opted to use
> -fieldsplit_ksp_type preonly (for both split 0 and 1). That is nice as
> it creates a linear operator, so you don't need something like FGMRES
> or GCR applied to the saddle-point problem.
>
> Your choice for Schur is fine in the sense that the diagonal of M is
> spectrally equivalent to M, and M is spectrally equivalent to S.
> Whether it is "fine" in terms of the iteration count for Schur
> systems, we cannot say a priori (since the spectral equivalence doesn't
> give us direct info about the iterations we should expect).
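> 
> (To make "spectrally equivalent" concrete: for an inf-sup stable Stokes
> discretisation there exist mesh-independent constants $c_1, c_2 > 0$ with
> 
>   $c_1 \, (M_p q, q) \le (S q, q) \le c_2 \, (M_p q, q)$ for all discrete pressures $q$,
> 
> where $M_p$ is the 1/mu-scaled pressure mass matrix and the sign of $S$ is
> taken so that it is positive semi-definite. This bounds the spectrum of the
> preconditioned Schur complement, but not the iteration count directly.)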
>
> Your preconditioner for A_00 relies on AMG producing a spectrally
> equivalent operator with bounds which are tight enough to ensure
> convergence of the saddle-point problem. I'll try to explain this.
>
> In my experience, for many problems (unstructured FE with variable
> coefficients, structured FE meshes with variable coefficients) AMG and
> preonly is not a robust choice. To control the approximation (the
> spectral equiv bounds), I typically run a stationary or Krylov method
> on split 0 (e.g. -fieldsplit_0_ksp_type xxx -fieldsplit_0_ksp_rtol
> yyy). Since the AMG preconditioner generated is spectrally equivalent
> (usually!), these solves will converge to a chosen rtol in a constant
> number of iterates under h-refinement. In practice, if I don't enforce
> that I hit something like rtol=1.0e-1 (or 1.0e-2) on the 0th split,
> saddle point iterates will typically increase for "hard" problems
> under mesh refinement (1e4-1e7 coefficient variation), and may not
> even converge at all when just using -fieldsplit_0_ksp_type preonly.
> Failure ultimately depends on how "strong" the preconditioner for A_00
> block is (consider re-discretized geometric multigrid versus AMG).
> Running an iterative solve on the 0th split lets you control and
> recover from weak/poor, but spectrally equivalent preconditioners for
> A_00. Note that people hate this approach as it invariably nests
> Krylov methods, and subsequently adds more global reductions. However,
> it is scalable, optimal, tuneable and converges faster than the case
> which didn't converge at all :D
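> 
> For concreteness, a minimal sketch of that setup (FGMRES outside, since the
> tolerance-based inner solve makes the preconditioner nonlinear) would be
> something like
> 
> -ksp_type fgmres
> -pc_type fieldsplit
> -pc_fieldsplit_type schur
> -pc_fieldsplit_schur_fact_type full
> -fieldsplit_0_ksp_type gmres
> -fieldsplit_0_ksp_rtol 1.0e-2
> -fieldsplit_0_pc_type ml          # or hypre/gamg, whichever AMG you use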
>
> 3/ I agree with Matt's comments, but I'd do a couple of other things
> first.
>
> * I'd first check the discretization is implemented correctly. Your
> P2/P1 element is inf-sup stable - thus the condition number of S
> (unpreconditioned) should be independent of the mesh resolution (h).
> An easy way to verify this is to run either LDU (schur_fact_type full)
> or DU (schur_fact_type upper) and monitor the iterations required for
> those S solves. Use -fieldsplit_1_pc_type none -fieldsplit_1_ksp_rtol
> 1.0e-8 -fieldsplit_1_ksp_monitor_true_residual
> -fieldsplit_1_ksp_pc_side right -fieldsplit_1_ksp_type gmres
> -fieldsplit_0_pc_type lu
>
> Then refine the mesh (ideally via sub-division) and repeat the experiment.
> If the S iterates don't asymptote, but instead grow with each
> refinement, you likely have a problem with the discretisation.
>
> * Do the same experiment, but this time use your mass matrix as the
> preconditioner for S and use -fieldsplit_1_pc_type lu. If the
> iterates, compared with the previous experiment (without a Schur PC),
> have gone up, your mass matrix is not defined correctly. If in the
> previous experiment (without a Schur PC) the iterates on the S solves
> were bounded, but now when preconditioned with the mass matrix the
> iterates go up, then your mass matrix is definitely not correct.
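> 
> In option form, that second experiment is essentially (a sketch; "user"
> being your pressure mass matrix, as in your original options):
> 
> -pc_fieldsplit_schur_fact_type full
> -pc_fieldsplit_schur_precondition user
> -fieldsplit_0_ksp_type preonly
> -fieldsplit_0_pc_type lu
> -fieldsplit_1_ksp_type gmres
> -fieldsplit_1_ksp_rtol 1.0e-8
> -fieldsplit_1_ksp_pc_side right
> -fieldsplit_1_ksp_monitor_true_residual
> -fieldsplit_1_pc_type lu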
>
> 4/ Lastly, to finally get to your question: do 400+ iterates for
> solving the Schur system seem "reasonable", and what is
> "normal behaviour"?
>
> It seems "high" to me. However the specifics of your discretisation,
> mesh topology, element quality, boundary conditions render it almost
> impossible to say what should be expected. When I use a Q2-P2*
> discretisation on a structured mesh with a non-constant viscosity I'd
> expect something like 50-60 for 1.0e-10 with a mass matrix scaled by
> the inverse (local) viscosity. For constant viscosity maybe 30
> iterates. I think this kind of statement is not particularly useful or
> helpful though.
>
> Given you use an unstructured tet mesh, it is possible that some
> elements have very bad quality (high aspect ratio (AR), highly
> skewed). I am certain that P2/P1 has an inf-sup constant which is
> sensitive to the element aspect ratio (I don't recall the exact
> scaling wrt AR). From experience I know that using the mass matrix as
> a preconditioner for Schur is not robust as AR increases (e.g.
> iterations for the S solve grow). Hence, with a couple of "bad"
> elements in your mesh, I could imagine that you could end up having to
> perform 400+ iterations.
>
> 5/ Finally, definitely don't impose a single Dirichlet BC on the
> pressure to make the pressure unique. This really screws up all the
> nice properties of your matrices. Just enforce the constant null space
> for p. And as you noticed, GMRES magically handles it automatically if
> the RHS of your original system was consistent.
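> 
> For reference, one way to tell PETSc about the constant pressure null
> space is to attach a MatNullSpace to the operator of the pressure split
> once the fieldsplit PC is set up. A minimal sketch (error checking
> omitted; `pc` is your fieldsplit PC):
> 
> /* Attach the constant null space to the Schur complement operator
>    of split 1; the PC must have been set up already. */
> KSP          *subksp;
> PetscInt      nsplits;
> Mat           S;
> MatNullSpace  nsp;
> 
> PCSetUp(pc);
> PCFieldSplitGetSubKSP(pc, &nsplits, &subksp);   /* subksp[1] solves S */
> KSPGetOperators(subksp[1], &S, NULL);
> MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_TRUE, 0, NULL, &nsp);
> MatSetNullSpace(S, nsp);
> MatNullSpaceDestroy(&nsp);
> PetscFree(subksp);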
>
> Thanks,
> Dave
>
>
> On 12 June 2017 at 20:20, David Nolte <dnolte at dim.uchile.cl> wrote:
>
> Ok. With "-pc_fieldsplit_schur_fact_type full" the outer iteration
> converges in 1 step. The problem remains the Schur iterations.
>
> I was not sure if the problem was maybe the singular pressure or
> the pressure Dirichlet BC. I tested the solver with a standard
> Stokes flow in a pipe with a constriction (zero Neumann BC for the
> pressure at the outlet) and in a 3D cavity (enclosed flow, no
> pressure BC or fixed at one point). I am not sure if I need to
> attach the constant pressure nullspace to the matrix for GMRES.
> Not doing so does not alter the convergence of GMRES in the Schur
> solver (nor the pressure solution); using a pressure Dirichlet BC,
> however, slows down convergence (I suppose because of the scaling
> of the matrix).
>
> I also checked the pressure mass matrix that I give to PETSc; it
> looks correct.
>
> In all these cases, the solver behaves just as before. With LU in
> fieldsplit_0 and GMRES/LU with rtol 1e-10 in fieldsplit_1, it
> converges after 1 outer iteration, but the inner Schur solver
> converges slowly.
>
> How should the convergence of GMRES/LU of the Schur complement
> *normally* behave?
>
> Thanks again!
> David
>
>
>
>
> On 06/12/2017 12:41 PM, Matthew Knepley wrote:
>> On Mon, Jun 12, 2017 at 10:36 AM, David Nolte
>> <dnolte at dim.uchile.cl> wrote:
>>
>>
>> On 06/12/2017 07:50 AM, Matthew Knepley wrote:
>>> On Sun, Jun 11, 2017 at 11:06 PM, David Nolte
>>> <dnolte at dim.uchile.cl> wrote:
>>>
>>> Thanks Matt, makes sense to me!
>>>
>>> I skipped direct solvers at first because for these
>>> 'real' configurations LU (mumps/superlu_dist) usually
>>> runs out of memory (I have 32GB RAM). It would be
>>> reasonable to take one more step back and play with
>>> synthetic examples.
>>> I managed to run one case, though, with 936k dofs, using:
>>> ("user" = pressure mass matrix)
>>>
>>> <...>
>>> -pc_fieldsplit_schur_fact_type upper
>>> -pc_fieldsplit_schur_precondition user
>>> -fieldsplit_0_ksp_type preonly
>>> -fieldsplit_0_pc_type lu
>>> -fieldsplit_0_pc_factor_mat_solver_package mumps
>>>
>>> -fieldsplit_1_ksp_type gmres
>>> -fieldsplit_1_ksp_monitor_true_residual
>>> -fieldsplit_1_ksp_rtol 1e-10
>>> -fieldsplit_1_pc_type lu
>>> -fieldsplit_1_pc_factor_mat_solver_package mumps
>>>
>>> It takes 2 outer iterations, as expected. However, the
>>> fieldsplit_1 solve takes very long.
>>>
>>>
>>> 1) It should take 1 outer iterate, not two. The problem is
>>> that your Schur tolerance is way too high. Use
>>>
>>> -fieldsplit_1_ksp_rtol 1e-10
>>>
>>> or something like that. Then it will take 1 iterate.
>>
>> Shouldn't it take 2 with a triangular Schur factorization and
>> exact preconditioners, and 1 with a full factorization? (cf.
>> Benzi et al 2005, p.66,
>> http://www.mathcs.emory.edu/~benzi/Web_papers/bgl05.pdf)
>>
>> That's exactly what I had set: -fieldsplit_1_ksp_rtol 1e-10, and
>> the residual of the Schur solver does drop below rtol = 1e-10
>>
>>
>> Oh, yes. Take away the upper until things are worked out.
>>
>> Thanks,
>>
>> Matt
>>
>>>
>>> 2) There is a problem with the Schur solve. Now from the
>>> iterates
>>>
>>> 423 KSP preconditioned resid norm 2.638419658982e-02 true
>>> resid norm 7.229653211635e-11 ||r(i)||/||b|| 7.229653211635e-11
>>>
>>> it is clear that the preconditioner is really screwing stuff
>>> up. For testing, you can use
>>>
>>> -pc_fieldsplit_schur_precondition full
>>>
>>> and your same setup here. It should take one iterate. I
>>> think there is something wrong with your
>>> mass matrix.
>>
>> I agree. I forgot to mention that I am considering an
>> "enclosed flow" problem, with u=0 on the whole boundary and a
>> Dirichlet condition for the pressure at one point to fix
>> the constant pressure mode. Maybe the preconditioner is not
>> consistent with this setup; I need to check this.
>>
>> Thanks a lot
>>
>>
>>>
>>> Thanks,
>>>
>>> Matt
>>>
>>>
>>> 0 KSP unpreconditioned resid norm 4.038466809302e-03
>>> true resid norm 4.038466809302e-03 ||r(i)||/||b||
>>> 1.000000000000e+00
>>> Residual norms for fieldsplit_1_ solve.
>>> 0 KSP preconditioned resid norm 0.000000000000e+00
>>> true resid norm 0.000000000000e+00
>>> ||r(i)||/||b|| -nan
>>> Linear fieldsplit_1_ solve converged due to
>>> CONVERGED_ATOL iterations 0
>>> 1 KSP unpreconditioned resid norm 4.860095964831e-06
>>> true resid norm 4.860095964831e-06 ||r(i)||/||b||
>>> 1.203450763452e-03
>>> Residual norms for fieldsplit_1_ solve.
>>> 0 KSP preconditioned resid norm 2.965546249872e+08
>>> true resid norm 1.000000000000e+00 ||r(i)||/||b||
>>> 1.000000000000e+00
>>> 1 KSP preconditioned resid norm 1.347596594634e+08
>>> true resid norm 3.599678801575e-01 ||r(i)||/||b||
>>> 3.599678801575e-01
>>> 2 KSP preconditioned resid norm 5.913230136403e+07
>>> true resid norm 2.364916760834e-01 ||r(i)||/||b||
>>> 2.364916760834e-01
>>> 3 KSP preconditioned resid norm 4.629700028930e+07
>>> true resid norm 1.984444715595e-01 ||r(i)||/||b||
>>> 1.984444715595e-01
>>> 4 KSP preconditioned resid norm 3.804431276819e+07
>>> true resid norm 1.747224559120e-01 ||r(i)||/||b||
>>> 1.747224559120e-01
>>> 5 KSP preconditioned resid norm 3.178769422140e+07
>>> true resid norm 1.402254864444e-01 ||r(i)||/||b||
>>> 1.402254864444e-01
>>> 6 KSP preconditioned resid norm 2.648669043919e+07
>>> true resid norm 1.191164310866e-01 ||r(i)||/||b||
>>> 1.191164310866e-01
>>> 7 KSP preconditioned resid norm 2.203522108614e+07
>>> true resid norm 9.690500018007e-02 ||r(i)||/||b||
>>> 9.690500018007e-02
>>> <...>
>>> 422 KSP preconditioned resid norm 2.984888715147e-02
>>> true resid norm 8.598401046494e-11 ||r(i)||/||b||
>>> 8.598401046494e-11
>>> 423 KSP preconditioned resid norm 2.638419658982e-02
>>> true resid norm 7.229653211635e-11 ||r(i)||/||b||
>>> 7.229653211635e-11
>>> Linear fieldsplit_1_ solve converged due to
>>> CONVERGED_RTOL iterations 423
>>> 2 KSP unpreconditioned resid norm 3.539889585599e-16
>>> true resid norm 3.542279617063e-16 ||r(i)||/||b||
>>> 8.771347603759e-14
>>> Linear solve converged due to CONVERGED_RTOL iterations 2
>>>
>>>
>>> Does the slow convergence of the Schur block mean that
>>> my preconditioning matrix Sp is a poor choice?
>>>
>>> Thanks,
>>> David
>>>
>>>
>>> On 06/11/2017 08:53 AM, Matthew Knepley wrote:
>>>> On Sat, Jun 10, 2017 at 8:25 PM, David Nolte
>>>> <dnolte at dim.uchile.cl> wrote:
>>>>
>>>> Dear all,
>>>>
>>>> I am solving a Stokes problem in 3D aorta
>>>> geometries, using a P2/P1
>>>> finite element discretization on tetrahedral
>>>> meshes resulting in
>>>> ~1-1.5M DOFs. Viscosity is uniform (can be adjusted
>>>> arbitrarily), and
>>>> the right hand side is a function of noisy
>>>> measurement data.
>>>>
>>>> In other settings of "standard" Stokes flow
>>>> problems I have obtained
>>>> good convergence with an "upper" Schur complement
>>>> preconditioner, using
>>>> AMG (ML or Hypre) on the velocity block and
>>>> approximating the Schur
>>>> complement matrix by the diagonal of the pressure
>>>> mass matrix:
>>>>
>>>> -ksp_converged_reason
>>>> -ksp_monitor_true_residual
>>>> -ksp_initial_guess_nonzero
>>>> -ksp_diagonal_scale
>>>> -ksp_diagonal_scale_fix
>>>> -ksp_type fgmres
>>>> -ksp_rtol 1.0e-8
>>>>
>>>> -pc_type fieldsplit
>>>> -pc_fieldsplit_type schur
>>>> -pc_fieldsplit_detect_saddle_point
>>>> -pc_fieldsplit_schur_fact_type upper
>>>> -pc_fieldsplit_schur_precondition user # <--
>>>> pressure mass matrix
>>>>
>>>> -fieldsplit_0_ksp_type preonly
>>>> -fieldsplit_0_pc_type ml
>>>>
>>>> -fieldsplit_1_ksp_type preonly
>>>> -fieldsplit_1_pc_type jacobi
>>>>
>>>>
>>>> 1) I always recommend starting from an exact solver and
>>>> backing off in small steps for optimization. Thus
>>>> I would start with LU on the upper block and
>>>> GMRES/LU with tolerance 1e-10 on the Schur block.
>>>> This should converge in 1 iterate.
>>>>
>>>> 2) I don't think you want preonly on the Schur system.
>>>> You might want GMRES/Jacobi to invert the mass matrix.
>>>>
>>>> 3) You probably want to tighten the tolerance on the
>>>> Schur solve, at least to start, and then slowly let it
>>>> out. The
>>>> tight tolerance will show you how effective the
>>>> preconditioner is using that Schur operator. Then you
>>>> can start
>>>> to evaluate how effective the Schur linear solver is.
>>>>
>>>> Does this make sense?
>>>>
>>>> Thanks,
>>>>
>>>> Matt
>>>>
>>>>
>>>> In my present case this setup gives rather slow
>>>> convergence (varies for
>>>> different geometries between 200-500 or several
>>>> thousand iterations!). I obtain
>>>> better convergence with
>>>> "-pc_fieldsplit_schur_precondition selfp" and
>>>> using multigrid on S, with "-fieldsplit_1_pc_type
>>>> ml" (I don't think
>>>> this is optimal, though).
>>>>
>>>> I don't understand why the pressure mass matrix
>>>> approach performs so
>>>> poorly and wonder what I could try to improve the
>>>> convergence. Until now
>>>> I have been using ML and Hypre BoomerAMG mostly
>>>> with default parameters.
>>>> Surely they can be improved by tuning some
>>>> parameters. What would be a
>>>> good starting point? Are there other options I
>>>> should consider?
>>>>
>>>> With the above setup (jacobi) for a case that works
>>>> better than others,
>>>> the KSP terminates with
>>>> 467 KSP unpreconditioned resid norm
>>>> 2.072014323515e-09 true resid norm
>>>> 2.072014322600e-09 ||r(i)||/||b|| 9.939098100674e-09
>>>>
>>>> You can find the output of -ksp_view below. Let me
>>>> know if you need more
>>>> details.
>>>>
>>>> Thanks in advance for your advice!
>>>> Best wishes
>>>> David
>>>>
>>>>
>>>> KSP Object: 1 MPI processes
>>>> type: fgmres
>>>> GMRES: restart=30, using Classical (unmodified)
>>>> Gram-Schmidt
>>>> Orthogonalization with no iterative refinement
>>>> GMRES: happy breakdown tolerance 1e-30
>>>> maximum iterations=10000
>>>> tolerances: relative=1e-08, absolute=1e-50,
>>>> divergence=10000.
>>>> right preconditioning
>>>> diagonally scaled system
>>>> using nonzero initial guess
>>>> using UNPRECONDITIONED norm type for convergence test
>>>> PC Object: 1 MPI processes
>>>> type: fieldsplit
>>>> FieldSplit with Schur preconditioner,
>>>> factorization UPPER
>>>> Preconditioner for the Schur complement formed
>>>> from user provided matrix
>>>> Split info:
>>>> Split number 0 Defined by IS
>>>> Split number 1 Defined by IS
>>>> KSP solver for A00 block
>>>> KSP Object: (fieldsplit_0_) 1 MPI
>>>> processes
>>>> type: preonly
>>>> maximum iterations=10000, initial guess is zero
>>>> tolerances: relative=1e-05,
>>>> absolute=1e-50, divergence=10000.
>>>> left preconditioning
>>>> using NONE norm type for convergence test
>>>> PC Object: (fieldsplit_0_) 1 MPI
>>>> processes
>>>> type: ml
>>>> MG: type is MULTIPLICATIVE, levels=5 cycles=v
>>>> Cycles per PCApply=1
>>>> Using Galerkin computed coarse grid
>>>> matrices
>>>> Coarse grid solver -- level
>>>> -------------------------------
>>>> KSP Object:
>>>> (fieldsplit_0_mg_coarse_) 1 MPI
>>>> processes
>>>> type: preonly
>>>> maximum iterations=10000, initial guess
>>>> is zero
>>>> tolerances: relative=1e-05,
>>>> absolute=1e-50, divergence=10000.
>>>> left preconditioning
>>>> using NONE norm type for convergence test
>>>> PC Object:
>>>> (fieldsplit_0_mg_coarse_) 1 MPI
>>>> processes
>>>> type: lu
>>>> LU: out-of-place factorization
>>>> tolerance for zero pivot 2.22045e-14
>>>> using diagonal shift on blocks to
>>>> prevent zero pivot
>>>> [INBLOCKS]
>>>> matrix ordering: nd
>>>> factor fill ratio given 5., needed 1.
>>>> Factored matrix follows:
>>>> Mat Object: 1
>>>> MPI processes
>>>> type: seqaij
>>>> rows=3, cols=3
>>>> package used to perform
>>>> factorization: petsc
>>>> total: nonzeros=3, allocated
>>>> nonzeros=3
>>>> total number of mallocs used
>>>> during MatSetValues
>>>> calls =0
>>>> not using I-node routines
>>>> linear system matrix = precond matrix:
>>>> Mat Object: 1 MPI processes
>>>> type: seqaij
>>>> rows=3, cols=3
>>>> total: nonzeros=3, allocated nonzeros=3
>>>> total number of mallocs used during
>>>> MatSetValues calls =0
>>>> not using I-node routines
>>>> Down solver (pre-smoother) on level 1
>>>> -------------------------------
>>>> KSP Object:
>>>> (fieldsplit_0_mg_levels_1_) 1
>>>> MPI processes
>>>> type: richardson
>>>> Richardson: damping factor=1.
>>>> maximum iterations=2
>>>> tolerances: relative=1e-05,
>>>> absolute=1e-50, divergence=10000.
>>>> left preconditioning
>>>> using nonzero initial guess
>>>> using NONE norm type for convergence test
>>>> PC Object:
>>>> (fieldsplit_0_mg_levels_1_) 1
>>>> MPI processes
>>>> type: sor
>>>> SOR: type = local_symmetric,
>>>> iterations = 1, local
>>>> iterations = 1, omega = 1.
>>>> linear system matrix = precond matrix:
>>>> Mat Object: 1 MPI processes
>>>> type: seqaij
>>>> rows=15, cols=15
>>>> total: nonzeros=69, allocated nonzeros=69
>>>> total number of mallocs used during
>>>> MatSetValues calls =0
>>>> not using I-node routines
>>>> Up solver (post-smoother) same as down
>>>> solver (pre-smoother)
>>>> Down solver (pre-smoother) on level 2
>>>> -------------------------------
>>>> KSP Object:
>>>> (fieldsplit_0_mg_levels_2_) 1
>>>> MPI processes
>>>> type: richardson
>>>> Richardson: damping factor=1.
>>>> maximum iterations=2
>>>> tolerances: relative=1e-05,
>>>> absolute=1e-50, divergence=10000.
>>>> left preconditioning
>>>> using nonzero initial guess
>>>> using NONE norm type for convergence test
>>>> PC Object:
>>>> (fieldsplit_0_mg_levels_2_) 1
>>>> MPI processes
>>>> type: sor
>>>> SOR: type = local_symmetric,
>>>> iterations = 1, local
>>>> iterations = 1, omega = 1.
>>>> linear system matrix = precond matrix:
>>>> Mat Object: 1 MPI processes
>>>> type: seqaij
>>>> rows=304, cols=304
>>>> total: nonzeros=7354, allocated
>>>> nonzeros=7354
>>>> total number of mallocs used during
>>>> MatSetValues calls =0
>>>> not using I-node routines
>>>> Up solver (post-smoother) same as down
>>>> solver (pre-smoother)
>>>> Down solver (pre-smoother) on level 3
>>>> -------------------------------
>>>> KSP Object:
>>>> (fieldsplit_0_mg_levels_3_) 1
>>>> MPI processes
>>>> type: richardson
>>>> Richardson: damping factor=1.
>>>> maximum iterations=2
>>>> tolerances: relative=1e-05,
>>>> absolute=1e-50, divergence=10000.
>>>> left preconditioning
>>>> using nonzero initial guess
>>>> using NONE norm type for convergence test
>>>> PC Object:
>>>> (fieldsplit_0_mg_levels_3_) 1
>>>> MPI processes
>>>> type: sor
>>>> SOR: type = local_symmetric,
>>>> iterations = 1, local
>>>> iterations = 1, omega = 1.
>>>> linear system matrix = precond matrix:
>>>> Mat Object: 1 MPI processes
>>>> type: seqaij
>>>> rows=30236, cols=30236
>>>> total: nonzeros=2730644, allocated
>>>> nonzeros=2730644
>>>> total number of mallocs used during
>>>> MatSetValues calls =0
>>>> not using I-node routines
>>>> Up solver (post-smoother) same as down
>>>> solver (pre-smoother)
>>>> Down solver (pre-smoother) on level 4
>>>> -------------------------------
>>>> KSP Object:
>>>> (fieldsplit_0_mg_levels_4_) 1
>>>> MPI processes
>>>> type: richardson
>>>> Richardson: damping factor=1.
>>>> maximum iterations=2
>>>> tolerances: relative=1e-05,
>>>> absolute=1e-50, divergence=10000.
>>>> left preconditioning
>>>> using nonzero initial guess
>>>> using NONE norm type for convergence test
>>>> PC Object:
>>>> (fieldsplit_0_mg_levels_4_) 1
>>>> MPI processes
>>>> type: sor
>>>> SOR: type = local_symmetric,
>>>> iterations = 1, local
>>>> iterations = 1, omega = 1.
>>>> linear system matrix = precond matrix:
>>>> Mat Object: (fieldsplit_0_)
>>>> 1 MPI
>>>> processes
>>>> type: seqaij
>>>> rows=894132, cols=894132
>>>> total: nonzeros=70684164, allocated
>>>> nonzeros=70684164
>>>> total number of mallocs used during
>>>> MatSetValues calls =0
>>>> not using I-node routines
>>>> Up solver (post-smoother) same as down
>>>> solver (pre-smoother)
>>>> linear system matrix = precond matrix:
>>>> Mat Object: (fieldsplit_0_)
>>>> 1 MPI processes
>>>> type: seqaij
>>>> rows=894132, cols=894132
>>>> total: nonzeros=70684164, allocated
>>>> nonzeros=70684164
>>>> total number of mallocs used during
>>>> MatSetValues calls =0
>>>> not using I-node routines
>>>> KSP solver for S = A11 - A10 inv(A00) A01
>>>> KSP Object: (fieldsplit_1_) 1 MPI
>>>> processes
>>>> type: preonly
>>>> maximum iterations=10000, initial guess is zero
>>>> tolerances: relative=1e-05,
>>>> absolute=1e-50, divergence=10000.
>>>> left preconditioning
>>>> using NONE norm type for convergence test
>>>> PC Object: (fieldsplit_1_) 1 MPI
>>>> processes
>>>> type: jacobi
>>>> linear system matrix followed by
>>>> preconditioner matrix:
>>>> Mat Object: (fieldsplit_1_)
>>>> 1 MPI processes
>>>> type: schurcomplement
>>>> rows=42025, cols=42025
>>>> Schur complement A11 - A10 inv(A00) A01
>>>> A11
>>>> Mat Object:
>>>> (fieldsplit_1_) 1
>>>> MPI processes
>>>> type: seqaij
>>>> rows=42025, cols=42025
>>>> total: nonzeros=554063, allocated
>>>> nonzeros=554063
>>>> total number of mallocs used during
>>>> MatSetValues calls =0
>>>> not using I-node routines
>>>> A10
>>>> Mat Object: 1 MPI processes
>>>> type: seqaij
>>>> rows=42025, cols=894132
>>>> total: nonzeros=6850107, allocated
>>>> nonzeros=6850107
>>>> total number of mallocs used during
>>>> MatSetValues calls =0
>>>> not using I-node routines
>>>> KSP of A00
>>>> KSP Object:
>>>> (fieldsplit_0_) 1
>>>> MPI processes
>>>> type: preonly
>>>> maximum iterations=10000, initial
>>>> guess is zero
>>>> tolerances: relative=1e-05,
>>>> absolute=1e-50,
>>>> divergence=10000.
>>>> left preconditioning
>>>> using NONE norm type for
>>>> convergence test
>>>> PC Object:
>>>> (fieldsplit_0_) 1
>>>> MPI processes
>>>> type: ml
>>>> MG: type is MULTIPLICATIVE,
>>>> levels=5 cycles=v
>>>> Cycles per PCApply=1
>>>> Using Galerkin computed coarse
>>>> grid matrices
>>>> Coarse grid solver -- level
>>>> -------------------------------
>>>> KSP Object:
>>>> (fieldsplit_0_mg_coarse_) 1 MPI
>>>> processes
>>>> type: preonly
>>>> maximum iterations=10000,
>>>> initial guess is zero
>>>> tolerances: relative=1e-05,
>>>> absolute=1e-50,
>>>> divergence=10000.
>>>> left preconditioning
>>>> using NONE norm type for
>>>> convergence test
>>>> PC Object:
>>>> (fieldsplit_0_mg_coarse_) 1 MPI
>>>> processes
>>>> type: lu
>>>> LU: out-of-place factorization
>>>> tolerance for zero pivot
>>>> 2.22045e-14
>>>> using diagonal shift on
>>>> blocks to prevent zero
>>>> pivot [INBLOCKS]
>>>> matrix ordering: nd
>>>> factor fill ratio given 5.,
>>>> needed 1.
>>>> Factored matrix follows:
>>>> Mat Object:
>>>> 1 MPI
>>>> processes
>>>> type: seqaij
>>>> rows=3, cols=3
>>>> package used to perform
>>>> factorization: petsc
>>>> total: nonzeros=3,
>>>> allocated nonzeros=3
>>>> total number of mallocs
>>>> used during
>>>> MatSetValues calls =0
>>>> not using I-node routines
>>>> linear system matrix = precond
>>>> matrix:
>>>> Mat Object:
>>>> 1 MPI processes
>>>> type: seqaij
>>>> rows=3, cols=3
>>>> total: nonzeros=3, allocated
>>>> nonzeros=3
>>>> total number of mallocs used
>>>> during MatSetValues
>>>> calls =0
>>>> not using I-node routines
>>>> Down solver (pre-smoother) on level 1
>>>> -------------------------------
>>>> KSP Object:
>>>> (fieldsplit_0_mg_levels_1_) 1 MPI
>>>> processes
>>>> type: richardson
>>>> Richardson: damping factor=1.
>>>> maximum iterations=2
>>>> tolerances: relative=1e-05,
>>>> absolute=1e-50,
>>>> divergence=10000.
>>>> left preconditioning
>>>> using nonzero initial guess
>>>> using NONE norm type for
>>>> convergence test
>>>> PC Object:
>>>> (fieldsplit_0_mg_levels_1_) 1 MPI
>>>> processes
>>>> type: sor
>>>> SOR: type = local_symmetric,
>>>> iterations = 1, local
>>>> iterations = 1, omega = 1.
>>>> linear system matrix = precond
>>>> matrix:
>>>> Mat Object:
>>>> 1 MPI processes
>>>> type: seqaij
>>>> rows=15, cols=15
>>>> total: nonzeros=69, allocated
>>>> nonzeros=69
>>>> total number of mallocs used
>>>> during MatSetValues
>>>> calls =0
>>>> not using I-node routines
>>>> Up solver (post-smoother) same as
>>>> down solver (pre-smoother)
>>>> Down solver (pre-smoother) on level 2
>>>> -------------------------------
>>>> KSP Object:
>>>> (fieldsplit_0_mg_levels_2_) 1 MPI
>>>> processes
>>>> type: richardson
>>>> Richardson: damping factor=1.
>>>> maximum iterations=2
>>>> tolerances: relative=1e-05,
>>>> absolute=1e-50,
>>>> divergence=10000.
>>>> left preconditioning
>>>> using nonzero initial guess
>>>> using NONE norm type for
>>>> convergence test
>>>> PC Object:
>>>> (fieldsplit_0_mg_levels_2_) 1 MPI
>>>> processes
>>>> type: sor
>>>> SOR: type = local_symmetric,
>>>> iterations = 1, local
>>>> iterations = 1, omega = 1.
>>>> linear system matrix = precond
>>>> matrix:
>>>> Mat Object:
>>>> 1 MPI processes
>>>> type: seqaij
>>>> rows=304, cols=304
>>>> total: nonzeros=7354,
>>>> allocated nonzeros=7354
>>>> total number of mallocs used
>>>> during MatSetValues
>>>> calls =0
>>>> not using I-node routines
>>>> Up solver (post-smoother) same as
>>>> down solver (pre-smoother)
>>>> Down solver (pre-smoother) on level 3
>>>> -------------------------------
>>>> KSP Object:
>>>> (fieldsplit_0_mg_levels_3_) 1 MPI
>>>> processes
>>>> type: richardson
>>>> Richardson: damping factor=1.
>>>> maximum iterations=2
>>>> tolerances: relative=1e-05,
>>>> absolute=1e-50,
>>>> divergence=10000.
>>>> left preconditioning
>>>> using nonzero initial guess
>>>> using NONE norm type for
>>>> convergence test
>>>> PC Object:
>>>> (fieldsplit_0_mg_levels_3_) 1 MPI
>>>> processes
>>>> type: sor
>>>> SOR: type = local_symmetric,
>>>> iterations = 1, local
>>>> iterations = 1, omega = 1.
>>>> linear system matrix = precond
>>>> matrix:
>>>> Mat Object:
>>>> 1 MPI processes
>>>> type: seqaij
>>>> rows=30236, cols=30236
>>>> total: nonzeros=2730644,
>>>> allocated nonzeros=2730644
>>>> total number of mallocs used
>>>> during MatSetValues
>>>> calls =0
>>>> not using I-node routines
>>>> Up solver (post-smoother) same as
>>>> down solver (pre-smoother)
>>>> Down solver (pre-smoother) on level 4
>>>> -------------------------------
>>>> KSP Object:
>>>> (fieldsplit_0_mg_levels_4_) 1 MPI
>>>> processes
>>>> type: richardson
>>>> Richardson: damping factor=1.
>>>> maximum iterations=2
>>>> tolerances: relative=1e-05,
>>>> absolute=1e-50,
>>>> divergence=10000.
>>>> left preconditioning
>>>> using nonzero initial guess
>>>> using NONE norm type for
>>>> convergence test
>>>> PC Object:
>>>> (fieldsplit_0_mg_levels_4_) 1 MPI
>>>> processes
>>>> type: sor
>>>> SOR: type = local_symmetric,
>>>> iterations = 1, local
>>>> iterations = 1, omega = 1.
>>>> linear system matrix = precond
>>>> matrix:
>>>> Mat Object:
>>>> (fieldsplit_0_) 1 MPI processes
>>>> type: seqaij
>>>> rows=894132, cols=894132
>>>> total: nonzeros=70684164,
>>>> allocated nonzeros=70684164
>>>> total number of mallocs used
>>>> during MatSetValues
>>>> calls =0
>>>> not using I-node routines
>>>> Up solver (post-smoother) same as
>>>> down solver (pre-smoother)
>>>> linear system matrix = precond matrix:
>>>> Mat Object:
>>>> (fieldsplit_0_) 1 MPI processes
>>>> type: seqaij
>>>> rows=894132, cols=894132
>>>> total: nonzeros=70684164,
>>>> allocated nonzeros=70684164
>>>> total number of mallocs used
>>>> during MatSetValues calls =0
>>>> not using I-node routines
>>>> A01
>>>> Mat Object: 1 MPI processes
>>>> type: seqaij
>>>> rows=894132, cols=42025
>>>> total: nonzeros=6850107, allocated
>>>> nonzeros=6850107
>>>> total number of mallocs used during
>>>> MatSetValues calls =0
>>>> not using I-node routines
>>>> Mat Object: 1 MPI processes
>>>> type: seqaij
>>>> rows=42025, cols=42025
>>>> total: nonzeros=554063, allocated
>>>> nonzeros=554063
>>>> total number of mallocs used during
>>>> MatSetValues calls =0
>>>> not using I-node routines
>>>> linear system matrix = precond matrix:
>>>> Mat Object: 1 MPI processes
>>>> type: seqaij
>>>> rows=936157, cols=936157
>>>> total: nonzeros=84938441, allocated
>>>> nonzeros=84938441
>>>> total number of mallocs used during
>>>> MatSetValues calls =0
>>>> not using I-node routines
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> What most experimenters take for granted before they
>>>> begin their experiments is infinitely more interesting
>>>> than any results to which their experiments lead.
>>>> -- Norbert Wiener
>>>>
>>>> http://www.caam.rice.edu/~mk51/
>>>
>>>
>>>
>>>
>>> --
>>> What most experimenters take for granted before they begin
>>> their experiments is infinitely more interesting than any
>>> results to which their experiments lead.
>>> -- Norbert Wiener
>>>
>>> http://www.caam.rice.edu/~mk51/
>>
>>
>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to
>> which their experiments lead.
>> -- Norbert Wiener
>>
>> http://www.caam.rice.edu/~mk51/
>
>