[petsc-users] Advice on improving Stokes Schur preconditioners

David Nolte dnolte at dim.uchile.cl
Wed Jun 14 14:04:53 CDT 2017



On 06/14/2017 02:26 PM, Dave May wrote:
>
> On Wed, 14 Jun 2017 at 19:42, David Nolte <dnolte at dim.uchile.cl> wrote:
>
>     Dave, thanks a lot for your great answer and for sharing your
>     experience. I have a much clearer picture now. :)
>
>     The experiments 3/ give the desired results for examples of cavity
>     flow. The (1/mu scaled) mass matrix seems OK.
>
>     I followed your and Matt's recommendations, used a FULL Schur
>     factorization, LU in the 0th split, and gradually relaxed the
>     tolerance of GMRES/Jacobi in split 1 (observed the gradual
>     increase in outer iterations). Then I replaced the split_0 LU with
>     AMG (further increase of outer iterations and iterations on the
>     Schur complement).
>     Doing so, I converged on using hypre boomeramg (smooth_type Euclid,
>     strong_threshold 0.75) and 3 iterations of GMRES/Jacobi on the
>     Schur block, which gave the best time-to-solution in my particular
>     setup and convergence to rtol=1e-8 within 60 outer iterations.
>     In my cases, using GMRES in the 0th split (with rtol 1e-1 or 1e-2)
>     instead of "preonly" did not help convergence (on the contrary).
>
>     I also repeated the experiments with
>     "-pc_fieldsplit_schur_precondition selfp", with hypre(ilu) in
>     split 0 and hypre in split 1, just to check, and somewhat
>     disappointingly ( ;-) ) the wall time is less than half of that when
>     using GMRES/Jacobi and Sp = mass matrix.
>     I am aware that this says nothing about scaling and robustness
>     with respect to h-refinement...
>
>
> - selfp defines the Schur PC as A11 - A10 inv(diag(A00)) A01. This
> operator is not spectrally equivalent to S.
>
> - For split 0 did you use preonly-hypre(ilu)?
>
> - For split 1 did you also use hypre(ilu) (you just wrote hypre)?
>
> - What was the iteration count for the saddle point problem with hypre
> and selfp? Iterates will increase if you refine the mesh; a crossover
> will occur at some (unknown) resolution, beyond which the mass matrix
> variant will be faster.

Ok, this makes sense.

Split 1 uses hypre with the default smoother (Schwarz smoothers); the
setup is:

    -pc_type fieldsplit
    -pc_fieldsplit_type schur
    -pc_fieldsplit_detect_saddle_point
    -pc_fieldsplit_schur_fact_type full
    -pc_fieldsplit_schur_precondition selfp

    -fieldsplit_0_ksp_type richardson
    -fieldsplit_0_ksp_max_it 1
    -fieldsplit_0_pc_type hypre
    -fieldsplit_0_pc_hypre_type boomeramg
    -fieldsplit_0_pc_hypre_boomeramg_strong_threshold 0.75
    -fieldsplit_0_pc_hypre_boomeramg_smooth_type Euclid
    -fieldsplit_0_pc_hypre_boomeramg_eu_bj

    -fieldsplit_1_ksp_type richardson
    -fieldsplit_1_ksp_max_it 1
    -fieldsplit_1_pc_type hypre
    -fieldsplit_1_pc_hypre_type boomeramg
    -fieldsplit_1_pc_hypre_boomeramg_strong_threshold 0.75

Iteration counts in the two different cases were 90 and 113, while the mass
matrix variant (GMRES/Jacobi iterations on the Schur complement) took 56
and 59.
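
For reference, the mass matrix variant I compare against is set up roughly as
follows. This is only a sketch: it assumes the (1/mu scaled) pressure mass
matrix is passed to PETSc as the user-provided Schur preconditioner, it reuses
the hypre settings from the split-0 options above, and the outer solver is
FGMRES as before.

    -pc_type fieldsplit
    -pc_fieldsplit_type schur
    -pc_fieldsplit_detect_saddle_point
    -pc_fieldsplit_schur_fact_type full
    -pc_fieldsplit_schur_precondition user    # <-- (1/mu scaled) pressure mass matrix

    -fieldsplit_0_ksp_type richardson
    -fieldsplit_0_ksp_max_it 1
    -fieldsplit_0_pc_type hypre
    -fieldsplit_0_pc_hypre_type boomeramg
    -fieldsplit_0_pc_hypre_boomeramg_strong_threshold 0.75
    -fieldsplit_0_pc_hypre_boomeramg_smooth_type Euclid
    -fieldsplit_0_pc_hypre_boomeramg_eu_bj

    -fieldsplit_1_ksp_type gmres
    -fieldsplit_1_ksp_max_it 3                # ~3 GMRES/Jacobi iterations on the Schur block
    -fieldsplit_1_pc_type jacobi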


>     Would you agree that these configurations "make sense"?
>
>
> If you want to weak scale, the configuration with the mass matrix
> makes the most sense.
>
> If you are only interested in solving many problems on one mesh, then
> do whatever you can to make the solve time as fast as possible -
> including using preconditioners defined with non-spectrally equivalent
> operators :D
>

I see. That's exactly my case: many problems on one mesh (they are
generated from medical images with fixed resolution). The hypre/selfp
variant is 2-3x faster, so I'll just stick with that for the moment and
try tuning the hypre parameters.
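
As a starting point for that tuning, these are the boomeramg knobs I plan to
experiment with first (option names as exposed through the PETSc/hypre
interface; the values are just initial guesses for my problem, not
recommendations):

    -fieldsplit_0_pc_hypre_boomeramg_strong_threshold 0.75    # default 0.25; larger values often help in 3D
    -fieldsplit_0_pc_hypre_boomeramg_coarsen_type HMIS
    -fieldsplit_0_pc_hypre_boomeramg_interp_type ext+i
    -fieldsplit_0_pc_hypre_boomeramg_P_max 4                   # limit interpolation stencil size
    -fieldsplit_0_pc_hypre_boomeramg_agg_nl 1                  # levels of aggressive coarsening
    -fieldsplit_0_pc_hypre_boomeramg_smooth_type Euclid        # ILU smoothing, as in split 0 above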

Thanks again!



> Thanks,
>   Dave
>
>
>     Furthermore, does anyone have a hint on where to start tuning
>     multigrid? So far hypre has worked better than ML, but I have not
>     experimented much with the parameters.
>
>
>
>
>     Thanks again for your help!
>
>     Best wishes,
>     David
>
>
>
>
>     On 06/12/2017 04:52 PM, Dave May wrote:
>>     I've been following the discussion and have a couple of comments:
>>
>>     1/ For the preconditioners that you are using (Schur
>>     factorisation LDU, or upper block triangular DU), the convergence
>>     properties (e.g. 1 iterate for LDU and 2 iterates for DU) come
>>     from analysis involving exact inverses of A_00 and S.
>>
>>     Once you switch from using exact inverses of A_00 and S, you have
>>     to rely on spectral equivalence of operators. That is fine, but
>>     the spectral equivalence does not tell you how many iterates LDU
>>     or DU will require to converge. What it does inform you about is
>>     that if you have a spectrally equivalent operator for A_00 and S
>>     (Schur complement), then under mesh refinement, your iteration
>>     count (whatever it was prior to refinement) will not increase.
>>
>>     2/ Looking at your first set of options, I see you have opted to
>>     use -fieldsplit_ksp_type preonly (for both split 0 and 1). That
>>     is nice as it creates a linear operator, so you don't need
>>     something like FGMRES or GCR applied to the saddle point problem.
>>
>>     Your choice for Schur is fine in the sense that the diagonal of M
>>     is spectrally equivalent to M, and M is spectrally equivalent to
>>     S. Whether it is "fine" in terms of the iteration count for Schur
>>     systems, we cannot say a priori (since the spectral equivalence
>>     doesn't give us direct info about the iterations we should expect). 
>>
>>     Your preconditioner for A_00 relies on AMG producing a spectrally
>>     equivalent operator with bounds which are tight enough to ensure
>>     convergence of the saddle point problem. I'll try to explain this.
>>
>>     In my experience, for many problems (unstructured FE with
>>     variable coefficients, structured FE meshes with variable
>>     coefficients) AMG and preonly is not a robust choice. To control
>>     the approximation (the spectral equiv bounds), I typically run a
>>     stationary or Krylov method on split 0 (e.g.
>>     -fieldsplit_0_ksp_type xxx -fieldsplit_0_ksp_rtol yyy). Since the
>>     AMG preconditioner generated is spectrally equivalent (usually!),
>>     these solves will converge to a chosen rtol in a constant number
>>     of iterates under h-refinement. In practice, if I don't enforce
>>     that I hit something like rtol=1.0e-1 (or 1.0e-2) on the 0th
>>     split, saddle point iterates will typically increase for "hard"
>>     problems under mesh refinement (1e4-1e7 coefficient variation),
>>     and may not even converge at all when just using
>>     -fieldsplit_0_ksp_type preonly. Failure ultimately depends on how
>>     "strong" the preconditioner for the A_00 block is (consider
>>     re-discretized geometric multigrid versus AMG). Running an
>>     iterative solve on the 0th split lets you control and recover
>>     from weak/poor, but spectrally equivalent preconditioners for
>>     A_00. Note that people hate this approach as it invariably nests
>>     Krylov methods, and subsequently adds more global reductions.
>>     However, it is scalable, optimal, tuneable and converges faster
>>     than the case which didn't converge at all :D
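
My reading of this suggestion in options form, noted here for my own future
reference (a rough sketch only, with a loose GMRES solve on the velocity
block; the tolerance would need tuning, and the outer Krylov method then has
to be a flexible one):

    -ksp_type fgmres                  # flexible outer method, since split 0 is now an inexact solve

    -fieldsplit_0_ksp_type gmres
    -fieldsplit_0_ksp_rtol 1.0e-1     # or 1.0e-2, as suggested above
    -fieldsplit_0_pc_type hypre       # or any other (hopefully) spectrally equivalent AMG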
>>
>>     3/ I agree with Matt's comments, but I'd do a couple of other
>>     things first.
>>
>>     * I'd first check the discretization is implemented correctly.
>>     Your P2/P1 element is inf-sup stable - thus the condition number
>>     of S (unpreconditioned) should be independent of the mesh
>>     resolution (h). An easy way to verify this is to run either LDU
>>     (schur_fact_type full) or DU (schur_fact_type upper) and monitor
>>     the iterations required for those S solves. Use
>>     -fieldsplit_1_pc_type none -fieldsplit_1_ksp_rtol 1.0e-8
>>     -fieldsplit_1_ksp_monitor_true_residual
>>     -fieldsplit_1_ksp_pc_side right -fieldsplit_1_ksp_type gmres
>>     -fieldsplit_0_pc_type lu
>>
>>     Then refine the mesh (ideally via sub-division) and repeat the
>>     experiment.
>>     If the S iterates don't asymptote, but instead grow with each
>>     refinement - you likely have a problem with the discretisation.
>>
>>     * Do the same experiment, but this time use your mass matrix as
>>     the preconditioner for S and use -fieldsplit_1_pc_type lu. If the
>>     iterates have gone up compared with the previous experiment
>>     (without a Schur PC), your mass matrix is not defined correctly.
>>     In particular, if in the previous experiment (without a Schur PC)
>>     the iterates on the S solves were bounded, but they go up when
>>     preconditioned with the mass matrix, then your mass matrix is definitely
>>     not correct.
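
In options form, as I understand this second test (the previous experiment's
options, now with the pressure mass matrix supplied as the user Schur
preconditioner and factored exactly):

    -pc_fieldsplit_schur_fact_type full
    -pc_fieldsplit_schur_precondition user    # <-- pressure mass matrix
    -fieldsplit_0_pc_type lu
    -fieldsplit_1_ksp_type gmres
    -fieldsplit_1_ksp_pc_side right
    -fieldsplit_1_ksp_rtol 1.0e-8
    -fieldsplit_1_ksp_monitor_true_residual
    -fieldsplit_1_pc_type lu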
>>
>>     4/ To finally get to your question: do 400+ iterates for solving
>>     the Schur complement seem "reasonable", and what is "normal
>>     behaviour"?
>>
>>     It seems "high" to me. However, the specifics of your
>>     discretisation, mesh topology, element quality and boundary
>>     conditions render it almost impossible to say what should be
>>     expected. When I use a Q2-P2* discretisation on a structured mesh
>>     with a non-constant viscosity I'd expect something like 50-60
>>     iterates for rtol 1.0e-10, with a mass matrix scaled by the inverse (local)
>>     viscosity. For constant viscosity maybe 30 iterates. I think this
>>     kind of statement is not particularly useful or helpful though.
>>
>>     Given you use an unstructured tet mesh, it is possible that some
>>     elements have very bad quality (high aspect ratio (AR), highly
>>     skewed). I am certain that P2/P1 has an inf-sup constant which is
>>     sensitive to the element aspect ratio (I don't recall the exact
>>     scaling wrt AR). From experience I know that using the mass
>>     matrix as a preconditioner for Schur is not robust as AR
>>     increases (e.g. iterations for the S solve grow). Hence, with a
>>     couple of "bad" elements in your mesh, I could imagine that you
>>     could end up having to perform 400+ iterations.
>>
>>     5/ Lastly, definitely don't impose one Dirichlet BC on pressure
>>     to make the pressure unique. This really screws up all the nice
>>     properties of your matrices. Just enforce the constant null space
>>     for p. And as you noticed, GMRES magically just does it
>>     automatically if the RHS of your original system was consistent.
>>      
>>     Thanks,
>>       Dave
>>
>>
>>     On 12 June 2017 at 20:20, David Nolte <dnolte at dim.uchile.cl> wrote:
>>
>>         Ok. With "-pc_fieldsplit_schur_fact_type full" the outer
>>         iteration converges in 1 step. The problem remains the Schur
>>         iterations.
>>
>>         I was not sure whether the problem was the singular pressure
>>         or the pressure Dirichlet BC. I tested the solver with a
>>         standard Stokes flow in a pipe with a constriction (zero
>>         Neumann BC for the pressure at the outlet) and in a 3D cavity
>>         (enclosed flow, no pressure BC or fixed at one point). I am
>>         not sure if I need to attach the constant pressure nullspace
>>         to the matrix for GMRES. Not doing so does not alter the
>>         convergence of GMRES in the Schur solver (nor the pressure
>>         solution); using a pressure Dirichlet BC, however, slows down
>>         convergence (I suppose because of the scaling of the matrix).
>>
>>         I also checked the pressure mass matrix that I give PETSc; it
>>         looks correct.
>>
>>         In all these cases, the solver behaves just as before. With
>>         LU in fieldsplit_0 and GMRES/LU with rtol 1e-10 in
>>         fieldsplit_1, it converges after 1 outer iteration, but the
>>         inner Schur solver converges slowly.
>>
>>         How should the convergence of GMRES/LU of the Schur
>>         complement *normally* behave?
>>
>>         Thanks again!
>>         David
>>
>>
>>
>>
>>         On 06/12/2017 12:41 PM, Matthew Knepley wrote:
>>>         On Mon, Jun 12, 2017 at 10:36 AM, David Nolte <dnolte at dim.uchile.cl> wrote:
>>>
>>>
>>>             On 06/12/2017 07:50 AM, Matthew Knepley wrote:
>>>>             On Sun, Jun 11, 2017 at 11:06 PM, David Nolte <dnolte at dim.uchile.cl> wrote:
>>>>
>>>>                 Thanks Matt, makes sense to me!
>>>>
>>>>                 I skipped direct solvers at first because for these
>>>>                 'real' configurations LU (mumps/superlu_dist)
>>>>                 usually runs out of memory (I have 32GB RAM). It would
>>>>                 be reasonable to take one more step back and play
>>>>                 with synthetic examples.
>>>>                 I managed to run one case, though, with 936k dofs
>>>>                 using: ("user" = pressure mass matrix)
>>>>
>>>>                 <...>
>>>>                 -pc_fieldsplit_schur_fact_type upper
>>>>                 -pc_fieldsplit_schur_precondition user
>>>>                 -fieldsplit_0_ksp_type preonly  
>>>>                 -fieldsplit_0_pc_type lu
>>>>                 -fieldsplit_0_pc_factor_mat_solver_package mumps
>>>>
>>>>                 -fieldsplit_1_ksp_type gmres
>>>>                 -fieldsplit_1_ksp_monitor_true_residual
>>>>                 -fieldsplit_1_ksp_rtol 1e-10
>>>>                 -fieldsplit_1_pc_type lu
>>>>                 -fieldsplit_1_pc_factor_mat_solver_package mumps
>>>>
>>>>                 It takes 2 outer iterations, as expected. However,
>>>>                 the fieldsplit_1 solve takes a very long time.
>>>>
>>>>
>>>>             1) It should take 1 outer iterate, not two. The problem
>>>>             is that your Schur tolerance is way too high. Use
>>>>
>>>>               -fieldsplit_1_ksp_rtol 1e-10
>>>>
>>>>             or something like that. Then it will take 1 iterate.
>>>
>>>             Shouldn't it take 2 with a triangular Schur
>>>             factorization and exact preconditioners, and 1 with a
>>>             full factorization? (cf. Benzi et al 2005, p.66,
>>>             http://www.mathcs.emory.edu/~benzi/Web_papers/bgl05.pdf)
>>>
>>>             That's exactly what I set: -fieldsplit_1_ksp_rtol 1e-10
>>>             and the Schur solver does drop below "rtol < 1e-10".
>>>
>>>
>>>         Oh, yes. Take away the upper until things are worked out.
>>>
>>>           Thanks,
>>>
>>>             Matt
>>>
>>>>
>>>>             2) There is a problem with the Schur solve. Now from
>>>>             the iterates
>>>>
>>>>             423 KSP preconditioned resid norm 2.638419658982e-02
>>>>             true resid norm 7.229653211635e-11 ||r(i)||/||b||
>>>>             7.229653211635e-11
>>>>
>>>>             it is clear that the preconditioner is really screwing
>>>>             stuff up. For testing, you can use
>>>>
>>>>               -pc_fieldsplit_schur_precondition full
>>>>
>>>>             and your same setup here. It should take one iterate. I
>>>>             think there is something wrong with your
>>>>             mass matrix.
>>>
>>>             I agree. I forgot to mention that I am considering an
>>>             "enclosed flow" problem, with u=0 on the whole boundary
>>>             and a Dirichlet condition for the pressure at one point
>>>             to fix the constant pressure mode. Maybe the
>>>             preconditioner is not consistent with this setup; I need
>>>             to check this.
>>>
>>>             Thanks a lot
>>>
>>>
>>>>
>>>>               Thanks,
>>>>
>>>>                 Matt
>>>>
>>>>
>>>>                   0 KSP unpreconditioned resid norm
>>>>                 4.038466809302e-03 true resid norm
>>>>                 4.038466809302e-03 ||r(i)||/||b|| 1.000000000000e+00
>>>>                     Residual norms for fieldsplit_1_ solve.
>>>>                     0 KSP preconditioned resid norm
>>>>                 0.000000000000e+00 true resid norm
>>>>                 0.000000000000e+00 ||r(i)||/||b||           -nan
>>>>                   Linear fieldsplit_1_ solve converged due to
>>>>                 CONVERGED_ATOL iterations 0
>>>>                   1 KSP unpreconditioned resid norm
>>>>                 4.860095964831e-06 true resid norm
>>>>                 4.860095964831e-06 ||r(i)||/||b|| 1.203450763452e-03
>>>>                     Residual norms for fieldsplit_1_ solve.
>>>>                     0 KSP preconditioned resid norm
>>>>                 2.965546249872e+08 true resid norm
>>>>                 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00
>>>>                     1 KSP preconditioned resid norm
>>>>                 1.347596594634e+08 true resid norm
>>>>                 3.599678801575e-01 ||r(i)||/||b|| 3.599678801575e-01
>>>>                     2 KSP preconditioned resid norm
>>>>                 5.913230136403e+07 true resid norm
>>>>                 2.364916760834e-01 ||r(i)||/||b|| 2.364916760834e-01
>>>>                     3 KSP preconditioned resid norm
>>>>                 4.629700028930e+07 true resid norm
>>>>                 1.984444715595e-01 ||r(i)||/||b|| 1.984444715595e-01
>>>>                     4 KSP preconditioned resid norm
>>>>                 3.804431276819e+07 true resid norm
>>>>                 1.747224559120e-01 ||r(i)||/||b|| 1.747224559120e-01
>>>>                     5 KSP preconditioned resid norm
>>>>                 3.178769422140e+07 true resid norm
>>>>                 1.402254864444e-01 ||r(i)||/||b|| 1.402254864444e-01
>>>>                     6 KSP preconditioned resid norm
>>>>                 2.648669043919e+07 true resid norm
>>>>                 1.191164310866e-01 ||r(i)||/||b|| 1.191164310866e-01
>>>>                     7 KSP preconditioned resid norm
>>>>                 2.203522108614e+07 true resid norm
>>>>                 9.690500018007e-02 ||r(i)||/||b|| 9.690500018007e-02
>>>>                     <...>
>>>>                     422 KSP preconditioned resid norm
>>>>                 2.984888715147e-02 true resid norm
>>>>                 8.598401046494e-11 ||r(i)||/||b|| 8.598401046494e-11
>>>>                     423 KSP preconditioned resid norm
>>>>                 2.638419658982e-02 true resid norm
>>>>                 7.229653211635e-11 ||r(i)||/||b|| 7.229653211635e-11
>>>>                   Linear fieldsplit_1_ solve converged due to
>>>>                 CONVERGED_RTOL iterations 423
>>>>                   2 KSP unpreconditioned resid norm
>>>>                 3.539889585599e-16 true resid norm
>>>>                 3.542279617063e-16 ||r(i)||/||b|| 8.771347603759e-14
>>>>                 Linear solve converged due to CONVERGED_RTOL
>>>>                 iterations 2
>>>>
>>>>                                
>>>>                 Does the slow convergence of the Schur block mean
>>>>                 that my preconditioning matrix Sp is a poor choice?
>>>>
>>>>                 Thanks,
>>>>                 David
>>>>
>>>>
>>>>                 On 06/11/2017 08:53 AM, Matthew Knepley wrote:
>>>>>                 On Sat, Jun 10, 2017 at 8:25 PM, David Nolte <dnolte at dim.uchile.cl> wrote:
>>>>>
>>>>>                     Dear all,
>>>>>
>>>>>                     I am solving a Stokes problem in 3D aorta
>>>>>                     geometries, using a P2/P1
>>>>>                     finite element discretization on tetrahedral
>>>>>                     meshes resulting in
>>>>>                     ~1-1.5M DOFs. Viscosity is uniform (can be
>>>>>                     adjusted arbitrarily), and
>>>>>                     the right hand side is a function of noisy
>>>>>                     measurement data.
>>>>>
>>>>>                     In other settings of "standard" Stokes flow
>>>>>                     problems I have obtained
>>>>>                     good convergence with an "upper" Schur
>>>>>                     complement preconditioner, using
>>>>>                     AMG (ML or Hypre) on the velocity block and
>>>>>                     approximating the Schur
>>>>>                     complement matrix by the diagonal of the
>>>>>                     pressure mass matrix:
>>>>>
>>>>>                         -ksp_converged_reason
>>>>>                         -ksp_monitor_true_residual
>>>>>                         -ksp_initial_guess_nonzero
>>>>>                         -ksp_diagonal_scale
>>>>>                         -ksp_diagonal_scale_fix
>>>>>                         -ksp_type fgmres
>>>>>                         -ksp_rtol 1.0e-8
>>>>>
>>>>>                         -pc_type fieldsplit
>>>>>                         -pc_fieldsplit_type schur
>>>>>                         -pc_fieldsplit_detect_saddle_point
>>>>>                         -pc_fieldsplit_schur_fact_type upper
>>>>>                         -pc_fieldsplit_schur_precondition user   
>>>>>                     # <-- pressure mass matrix
>>>>>
>>>>>                         -fieldsplit_0_ksp_type preonly
>>>>>                         -fieldsplit_0_pc_type ml
>>>>>
>>>>>                         -fieldsplit_1_ksp_type preonly
>>>>>                         -fieldsplit_1_pc_type jacobi
>>>>>
>>>>>
>>>>>                 1) I always recommend starting from an exact
>>>>>                 solver and backing off in small steps for
>>>>>                 optimization. Thus
>>>>>                     I would start with LU on the upper block and
>>>>>                 GMRES/LU with tolerance 1e-10 on the Schur block.
>>>>>                     This should converge in 1 iterate.
>>>>>
>>>>>                 2) I don't think you want preonly on the Schur
>>>>>                 system. You might want GMRES/Jacobi to invert the
>>>>>                 mass matrix.
>>>>>
>>>>>                 3) You probably want to tighten the tolerance on
>>>>>                 the Schur solve, at least to start, and then
>>>>>                 slowly let it out. The
>>>>>                     tight tolerance will show you how effective
>>>>>                 the preconditioner is using that Schur operator.
>>>>>                 Then you can start
>>>>>                     to evaluate how effective the Schur linear
>>>>>                 solver is.
>>>>>
>>>>>                 Does this make sense?
>>>>>
>>>>>                   Thanks,
>>>>>
>>>>>                      Matt
>>>>>                  
>>>>>
>>>>>                     In my present case this setup gives rather
>>>>>                     slow convergence (varies for
>>>>>                     different geometries between 200-500 or
>>>>>                     several thousand iterations!). I obtain
>>>>>                     better convergence with
>>>>>                     "-pc_fieldsplit_schur_precondition selfp" and
>>>>>                     using multigrid on S, with
>>>>>                     "-fieldsplit_1_pc_type ml" (I don't think
>>>>>                     this is optimal, though).
>>>>>
>>>>>                     I don't understand why the pressure mass
>>>>>                     matrix approach performs so
>>>>>                     poorly and wonder what I could try to improve
>>>>>                     the convergence. Until now
>>>>>                     I have been using ML and Hypre BoomerAMG
>>>>>                     mostly with default parameters.
>>>>>                     Surely they can be improved by tuning some
>>>>>                     parameters. Which could be a
>>>>>                     good starting point? Are there other options I
>>>>>                     should consider?
>>>>>
>>>>>                     With the above setup (jacobi) for a case that
>>>>>                     works better than others,
>>>>>                     the KSP terminates with
>>>>>                     467 KSP unpreconditioned resid norm
>>>>>                     2.072014323515e-09 true resid norm
>>>>>                     2.072014322600e-09 ||r(i)||/||b||
>>>>>                     9.939098100674e-09
>>>>>
>>>>>                     You can find the output of -ksp_view below.
>>>>>                     Let me know if you need more
>>>>>                     details.
>>>>>
>>>>>                     Thanks in advance for your advice!
>>>>>                     Best wishes
>>>>>                     David
>>>>>
>>>>>
>>>>>                     KSP Object: 1 MPI processes
>>>>>                       type: fgmres
>>>>>                         GMRES: restart=30, using Classical
>>>>>                     (unmodified) Gram-Schmidt
>>>>>                     Orthogonalization with no iterative refinement
>>>>>                         GMRES: happy breakdown tolerance 1e-30
>>>>>                       maximum iterations=10000
>>>>>                       tolerances:  relative=1e-08, absolute=1e-50,
>>>>>                     divergence=10000.
>>>>>                       right preconditioning
>>>>>                       diagonally scaled system
>>>>>                       using nonzero initial guess
>>>>>                       using UNPRECONDITIONED norm type for
>>>>>                     convergence test
>>>>>                     PC Object: 1 MPI processes
>>>>>                       type: fieldsplit
>>>>>                         FieldSplit with Schur preconditioner,
>>>>>                     factorization UPPER
>>>>>                         Preconditioner for the Schur complement
>>>>>                     formed from user provided matrix
>>>>>                         Split info:
>>>>>                         Split number 0 Defined by IS
>>>>>                         Split number 1 Defined by IS
>>>>>                         KSP solver for A00 block
>>>>>                           KSP Object:      (fieldsplit_0_)       1
>>>>>                     MPI processes
>>>>>                             type: preonly
>>>>>                             maximum iterations=10000, initial
>>>>>                     guess is zero
>>>>>                             tolerances:  relative=1e-05,
>>>>>                     absolute=1e-50, divergence=10000.
>>>>>                             left preconditioning
>>>>>                             using NONE norm type for convergence test
>>>>>                           PC Object:      (fieldsplit_0_)       1
>>>>>                     MPI processes
>>>>>                             type: ml
>>>>>                               MG: type is MULTIPLICATIVE, levels=5
>>>>>                     cycles=v
>>>>>                                 Cycles per PCApply=1
>>>>>                                 Using Galerkin computed coarse
>>>>>                     grid matrices
>>>>>                             Coarse grid solver -- level
>>>>>                     -------------------------------
>>>>>                               KSP Object:         
>>>>>                     (fieldsplit_0_mg_coarse_)           1 MPI
>>>>>                     processes
>>>>>                                 type: preonly
>>>>>                                 maximum iterations=10000, initial
>>>>>                     guess is zero
>>>>>                                 tolerances:  relative=1e-05,
>>>>>                     absolute=1e-50, divergence=10000.
>>>>>                                 left preconditioning
>>>>>                                 using NONE norm type for
>>>>>                     convergence test
>>>>>                               PC Object:         
>>>>>                     (fieldsplit_0_mg_coarse_)           1 MPI
>>>>>                     processes
>>>>>                                 type: lu
>>>>>                                   LU: out-of-place factorization
>>>>>                                   tolerance for zero pivot 2.22045e-14
>>>>>                                   using diagonal shift on blocks
>>>>>                     to prevent zero pivot
>>>>>                     [INBLOCKS]
>>>>>                                   matrix ordering: nd
>>>>>                                   factor fill ratio given 5.,
>>>>>                     needed 1.
>>>>>                                     Factored matrix follows:
>>>>>                                       Mat Object:                 
>>>>>                      1 MPI processes
>>>>>                                         type: seqaij
>>>>>                                         rows=3, cols=3
>>>>>                                         package used to perform
>>>>>                     factorization: petsc
>>>>>                                         total: nonzeros=3,
>>>>>                     allocated nonzeros=3
>>>>>                                         total number of mallocs
>>>>>                     used during MatSetValues
>>>>>                     calls =0
>>>>>                                           not using I-node routines
>>>>>                                 linear system matrix = precond matrix:
>>>>>                                 Mat Object:             1 MPI
>>>>>                     processes
>>>>>                                   type: seqaij
>>>>>                                   rows=3, cols=3
>>>>>                                   total: nonzeros=3, allocated
>>>>>                     nonzeros=3
>>>>>                                   total number of mallocs used
>>>>>                     during MatSetValues calls =0
>>>>>                                     not using I-node routines
>>>>>                             Down solver (pre-smoother) on level 1
>>>>>                     -------------------------------
>>>>>                               KSP Object:         
>>>>>                     (fieldsplit_0_mg_levels_1_)           1
>>>>>                     MPI processes
>>>>>                                 type: richardson
>>>>>                                   Richardson: damping factor=1.
>>>>>                                 maximum iterations=2
>>>>>                                 tolerances:  relative=1e-05,
>>>>>                     absolute=1e-50, divergence=10000.
>>>>>                                 left preconditioning
>>>>>                                 using nonzero initial guess
>>>>>                                 using NONE norm type for
>>>>>                     convergence test
>>>>>                               PC Object:         
>>>>>                     (fieldsplit_0_mg_levels_1_)           1
>>>>>                     MPI processes
>>>>>                                 type: sor
>>>>>                                   SOR: type = local_symmetric,
>>>>>                     iterations = 1, local
>>>>>                     iterations = 1, omega = 1.
>>>>>                                 linear system matrix = precond matrix:
>>>>>                                 Mat Object:             1 MPI
>>>>>                     processes
>>>>>                                   type: seqaij
>>>>>                                   rows=15, cols=15
>>>>>                                   total: nonzeros=69, allocated
>>>>>                     nonzeros=69
>>>>>                                   total number of mallocs used
>>>>>                     during MatSetValues calls =0
>>>>>                                     not using I-node routines
>>>>>                             Up solver (post-smoother) same as down
>>>>>                     solver (pre-smoother)
>>>>>                             Down solver (pre-smoother) on level 2
>>>>>                     -------------------------------
>>>>>                               KSP Object:         
>>>>>                     (fieldsplit_0_mg_levels_2_)           1
>>>>>                     MPI processes
>>>>>                                 type: richardson
>>>>>                                   Richardson: damping factor=1.
>>>>>                                 maximum iterations=2
>>>>>                                 tolerances:  relative=1e-05,
>>>>>                     absolute=1e-50, divergence=10000.
>>>>>                                 left preconditioning
>>>>>                                 using nonzero initial guess
>>>>>                                 using NONE norm type for
>>>>>                     convergence test
>>>>>                               PC Object:         
>>>>>                     (fieldsplit_0_mg_levels_2_)           1
>>>>>                     MPI processes
>>>>>                                 type: sor
>>>>>                                   SOR: type = local_symmetric,
>>>>>                     iterations = 1, local
>>>>>                     iterations = 1, omega = 1.
>>>>>                                 linear system matrix = precond matrix:
>>>>>                                 Mat Object:             1 MPI
>>>>>                     processes
>>>>>                                   type: seqaij
>>>>>                                   rows=304, cols=304
>>>>>                                   total: nonzeros=7354, allocated
>>>>>                     nonzeros=7354
>>>>>                                   total number of mallocs used
>>>>>                     during MatSetValues calls =0
>>>>>                                     not using I-node routines
>>>>>                             Up solver (post-smoother) same as down
>>>>>                     solver (pre-smoother)
>>>>>                             Down solver (pre-smoother) on level 3
>>>>>                     -------------------------------
>>>>>                               KSP Object:         
>>>>>                     (fieldsplit_0_mg_levels_3_)           1
>>>>>                     MPI processes
>>>>>                                 type: richardson
>>>>>                                   Richardson: damping factor=1.
>>>>>                                 maximum iterations=2
>>>>>                                 tolerances:  relative=1e-05,
>>>>>                     absolute=1e-50, divergence=10000.
>>>>>                                 left preconditioning
>>>>>                                 using nonzero initial guess
>>>>>                                 using NONE norm type for
>>>>>                     convergence test
>>>>>                               PC Object:         
>>>>>                     (fieldsplit_0_mg_levels_3_)           1
>>>>>                     MPI processes
>>>>>                                 type: sor
>>>>>                                   SOR: type = local_symmetric,
>>>>>                     iterations = 1, local
>>>>>                     iterations = 1, omega = 1.
>>>>>                                 linear system matrix = precond matrix:
>>>>>                                 Mat Object:             1 MPI
>>>>>                     processes
>>>>>                                   type: seqaij
>>>>>                                   rows=30236, cols=30236
>>>>>                                   total: nonzeros=2730644,
>>>>>                     allocated nonzeros=2730644
>>>>>                                   total number of mallocs used
>>>>>                     during MatSetValues calls =0
>>>>>                                     not using I-node routines
>>>>>                             Up solver (post-smoother) same as down
>>>>>                     solver (pre-smoother)
>>>>>                             Down solver (pre-smoother) on level 4
>>>>>                     -------------------------------
>>>>>                               KSP Object:         
>>>>>                     (fieldsplit_0_mg_levels_4_)           1
>>>>>                     MPI processes
>>>>>                                 type: richardson
>>>>>                                   Richardson: damping factor=1.
>>>>>                                 maximum iterations=2
>>>>>                                 tolerances:  relative=1e-05,
>>>>>                     absolute=1e-50, divergence=10000.
>>>>>                                 left preconditioning
>>>>>                                 using nonzero initial guess
>>>>>                                 using NONE norm type for
>>>>>                     convergence test
>>>>>                               PC Object:         
>>>>>                     (fieldsplit_0_mg_levels_4_)           1
>>>>>                     MPI processes
>>>>>                                 type: sor
>>>>>                                   SOR: type = local_symmetric,
>>>>>                     iterations = 1, local
>>>>>                     iterations = 1, omega = 1.
>>>>>                                 linear system matrix = precond matrix:
>>>>>                                 Mat Object:           
>>>>>                     (fieldsplit_0_)             1 MPI
>>>>>                     processes
>>>>>                                   type: seqaij
>>>>>                                   rows=894132, cols=894132
>>>>>                                   total: nonzeros=70684164,
>>>>>                     allocated nonzeros=70684164
>>>>>                                   total number of mallocs used
>>>>>                     during MatSetValues calls =0
>>>>>                                     not using I-node routines
>>>>>                             Up solver (post-smoother) same as down
>>>>>                     solver (pre-smoother)
>>>>>                             linear system matrix = precond matrix:
>>>>>                             Mat Object:        (fieldsplit_0_)   
>>>>>                          1 MPI processes
>>>>>                               type: seqaij
>>>>>                               rows=894132, cols=894132
>>>>>                               total: nonzeros=70684164, allocated
>>>>>                     nonzeros=70684164
>>>>>                               total number of mallocs used during
>>>>>                     MatSetValues calls =0
>>>>>                                 not using I-node routines
>>>>>                         KSP solver for S = A11 - A10 inv(A00) A01
>>>>>                           KSP Object:      (fieldsplit_1_)       1
>>>>>                     MPI processes
>>>>>                             type: preonly
>>>>>                             maximum iterations=10000, initial
>>>>>                     guess is zero
>>>>>                             tolerances:  relative=1e-05,
>>>>>                     absolute=1e-50, divergence=10000.
>>>>>                             left preconditioning
>>>>>                             using NONE norm type for convergence test
>>>>>                           PC Object:      (fieldsplit_1_)       1
>>>>>                     MPI processes
>>>>>                             type: jacobi
>>>>>                             linear system matrix followed by
>>>>>                     preconditioner matrix:
>>>>>                             Mat Object:        (fieldsplit_1_)   
>>>>>                          1 MPI processes
>>>>>                               type: schurcomplement
>>>>>                               rows=42025, cols=42025
>>>>>                                 Schur complement A11 - A10
>>>>>                     inv(A00) A01
>>>>>                                 A11
>>>>>                                   Mat Object:             
>>>>>                     (fieldsplit_1_)               1
>>>>>                     MPI processes
>>>>>                                     type: seqaij
>>>>>                                     rows=42025, cols=42025
>>>>>                                     total: nonzeros=554063,
>>>>>                     allocated nonzeros=554063
>>>>>                                     total number of mallocs used
>>>>>                     during MatSetValues calls =0
>>>>>                                       not using I-node routines
>>>>>                                 A10
>>>>>                                   Mat Object:               1 MPI
>>>>>                     processes
>>>>>                                     type: seqaij
>>>>>                                     rows=42025, cols=894132
>>>>>                                     total: nonzeros=6850107,
>>>>>                     allocated nonzeros=6850107
>>>>>                                     total number of mallocs used
>>>>>                     during MatSetValues calls =0
>>>>>                                       not using I-node routines
>>>>>                                 KSP of A00
>>>>>                                   KSP Object:             
>>>>>                     (fieldsplit_0_)               1
>>>>>                     MPI processes
>>>>>                                     type: preonly
>>>>>                                     maximum iterations=10000,
>>>>>                     initial guess is zero
>>>>>                                     tolerances:  relative=1e-05,
>>>>>                     absolute=1e-50,
>>>>>                     divergence=10000.
>>>>>                                     left preconditioning
>>>>>                                     using NONE norm type for
>>>>>                     convergence test
>>>>>                                   PC Object:             
>>>>>                     (fieldsplit_0_)               1
>>>>>                     MPI processes
>>>>>                                     type: ml
>>>>>                                       MG: type is MULTIPLICATIVE,
>>>>>                     levels=5 cycles=v
>>>>>                                         Cycles per PCApply=1
>>>>>                                         Using Galerkin computed
>>>>>                     coarse grid matrices
>>>>>                                     Coarse grid solver -- level
>>>>>                     -------------------------------
>>>>>                                       KSP Object:
>>>>>                     (fieldsplit_0_mg_coarse_)                   1
>>>>>                     MPI processes
>>>>>                                         type: preonly
>>>>>                                         maximum iterations=10000,
>>>>>                     initial guess is zero
>>>>>                                         tolerances: 
>>>>>                     relative=1e-05, absolute=1e-50,
>>>>>                     divergence=10000.
>>>>>                                         left preconditioning
>>>>>                                         using NONE norm type for
>>>>>                     convergence test
>>>>>                                       PC Object:
>>>>>                     (fieldsplit_0_mg_coarse_)                   1
>>>>>                     MPI processes
>>>>>                                         type: lu
>>>>>                                           LU: out-of-place
>>>>>                     factorization
>>>>>                                           tolerance for zero pivot
>>>>>                     2.22045e-14
>>>>>                                           using diagonal shift on
>>>>>                     blocks to prevent zero
>>>>>                     pivot [INBLOCKS]
>>>>>                                           matrix ordering: nd
>>>>>                                           factor fill ratio given
>>>>>                     5., needed 1.
>>>>>                                             Factored matrix follows:
>>>>>                                               Mat Object:         
>>>>>                                      1 MPI
>>>>>                     processes
>>>>>                                                 type: seqaij
>>>>>                                                 rows=3, cols=3
>>>>>                                                 package used to
>>>>>                     perform factorization: petsc
>>>>>                                                 total: nonzeros=3,
>>>>>                     allocated nonzeros=3
>>>>>                                                 total number of
>>>>>                     mallocs used during
>>>>>                     MatSetValues calls =0
>>>>>                                                   not using I-node
>>>>>                     routines
>>>>>                              
>>>>>
