[petsc-users] Advice on improving Stokes Schur preconditioners
David Nolte
dnolte at dim.uchile.cl
Wed Jun 14 14:04:53 CDT 2017
On 06/14/2017 02:26 PM, Dave May wrote:
>
> On Wed, 14 Jun 2017 at 19:42, David Nolte <dnolte at dim.uchile.cl
> <mailto:dnolte at dim.uchile.cl>> wrote:
>
> Dave, thanks a lot for your great answer and for sharing your
> experience. I have a much clearer picture now. :)
>
> The experiments in 3/ give the desired results for the cavity flow
> examples. The (1/mu scaled) mass matrix seems OK.
>
> I followed your and Matt's recommendations, used a FULL Schur
> factorization, LU in the 0th split, and gradually relaxed the
> tolerance of GMRES/Jacobi in split 1 (observed the gradual
> increase in outer iterations). Then I replaced the split_0 LU with
> AMG (further increase of outer iterations and iterations on the
> Schur complement).
> Doing so, I converged on using hypre BoomerAMG (smooth_type Euclid,
> strong_threshold 0.75) and 3 iterations of GMRES/Jacobi on the
> Schur block, which gave the best time-to-solution in my particular
> setup and convergence to rtol=1e-8 within 60 outer iterations.
> In my cases, using GMRES in the 0th split (with rtol 1e-1 or 1e-2)
> instead of "preonly" did not help convergence (on the contrary).
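>
> For reference, this corresponds to roughly the following option set (a
> sketch only; the pressure mass matrix is supplied from the code as the
> "user" Schur preconditioning matrix):
>
> -ksp_type fgmres -ksp_rtol 1e-8
> -pc_type fieldsplit -pc_fieldsplit_type schur
> -pc_fieldsplit_detect_saddle_point
> -pc_fieldsplit_schur_fact_type full
> -pc_fieldsplit_schur_precondition user
> -fieldsplit_0_ksp_type preonly
> -fieldsplit_0_pc_type hypre -fieldsplit_0_pc_hypre_type boomeramg
> -fieldsplit_0_pc_hypre_boomeramg_strong_threshold 0.75
> -fieldsplit_0_pc_hypre_boomeramg_smooth_type Euclid
> -fieldsplit_1_ksp_type gmres -fieldsplit_1_ksp_max_it 3
> -fieldsplit_1_pc_type jacobi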
>
> I also repeated the experiments with
> "-pc_fieldsplit_schur_precondition selfp", with hypre(ilu) in
> split 0 and hypre in split 1, just to check, and somewhat
> disappointingly ( ;-) ) the wall time is less than half of that
> obtained when using GMRES/Jacobi and Sp = mass matrix.
> I am aware that this says nothing about scaling and robustness
> with respect to h-refinement...
>
>
> - selfp defines the schur pc as A10 inv(diag(A00)) A01. This operator
> is not spectrally equivalent to S
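> (For reference, the true complement is S = A11 - A10 inv(A00) A01;
> selfp assembles the approximation A11 - A10 inv(diag(A00)) A01, which
> is cheap to form explicitly but, as noted, not spectrally equivalent
> to S.)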
>
> - For split 0 did you use preonly-hypre(ilu)?
>
> - For split 1 did you also use hypre(ilu) (you just wrote hypre)?
>
> - What was the iteration count for the saddle point problem with hypre
> and selfp? Iterates will increase as you refine the mesh, and a
> crossover will occur at some (unknown) resolution beyond which the
> mass matrix variant will be faster.
OK, this makes sense.
Split 1 has hypre with the default smoother (Schwarz smoothers); the
setup is:
-pc_type fieldsplit
-pc_fieldsplit_type schur
-pc_fieldsplit_detect_saddle_point
-pc_fieldsplit_schur_fact_type full
-pc_fieldsplit_schur_precondition selfp
-fieldsplit_0_ksp_type richardson
-fieldsplit_0_ksp_max_it 1
-fieldsplit_0_pc_type hypre
-fieldsplit_0_pc_hypre_type boomeramg
-fieldsplit_0_pc_hypre_boomeramg_strong_threshold 0.75
-fieldsplit_0_pc_hypre_boomeramg_smooth_type Euclid
-fieldsplit_0_pc_hypre_boomeramg_eu_bj
-fieldsplit_1_ksp_type richardson
-fieldsplit_1_ksp_max_it 1
-fieldsplit_1_pc_type hypre
-fieldsplit_1_pc_hypre_type boomeramg
-fieldsplit_1_pc_hypre_boomeramg_strong_threshold 0.75
In two different cases the iteration counts were 90 and 113, while the
mass matrix variant (GMRES/Jacobi iterations on the Schur complement)
took 56 and 59.
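(For completeness, the mass matrix variant needs the matrix to be handed
to the fieldsplit PC from the application code. A minimal sketch in C,
using a hypothetical helper, assuming "ksp" is the outer solver and "Mp"
is the assembled, 1/mu-scaled pressure mass matrix:

  #include <petscksp.h>

  /* Sketch: precondition the Schur complement with the user-provided
     pressure mass matrix Mp instead of the selfp approximation. */
  PetscErrorCode UseMassMatrixForSchur(KSP ksp, Mat Mp)
  {
    PC             pc;
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
    ierr = PCFieldSplitSetSchurPre(pc, PC_FIELDSPLIT_SCHUR_PRE_USER, Mp);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

This is the matrix that the "user" Schur preconditioning option then
refers to.)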
> Would you agree that these configurations "make sense"?
>
>
> If you want to weak scale, the configuration with the mass matrix
> makes the most sense.
>
> If you are only interested in solving many problems on one mesh, then
> do what ever you can to make the solve time as fast as possible -
> including using preconditioners defined with non-spectrally equivalent
> operators :D
>
I see. That's exactly my case: many problems on one mesh (the meshes are
generated from medical images with fixed resolution). The hypre/selfp
variant is 2-3x faster, so I'll just stick with that for the moment and
try tuning the hypre parameters.
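For the record, the BoomerAMG knobs I plan to experiment with, through
the fieldsplit_0_ prefix, are the usual ones; the values are only
placeholders, not recommendations:

-fieldsplit_0_pc_hypre_boomeramg_strong_threshold <t>
-fieldsplit_0_pc_hypre_boomeramg_coarsen_type <Falgout|HMIS|PMIS>
-fieldsplit_0_pc_hypre_boomeramg_interp_type <classical|ext+i>
-fieldsplit_0_pc_hypre_boomeramg_agg_nl <n>    (levels of aggressive coarsening)
-fieldsplit_0_pc_hypre_boomeramg_P_max <m>     (max nonzeros per interpolation row)
-fieldsplit_0_pc_hypre_boomeramg_relax_type_all <smoother>
-fieldsplit_0_pc_hypre_boomeramg_max_levels <l>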
Thanks again!
> Thanks,
> Dave
>
>
> Furthermore, maybe someone has a hint on where to start tuning
> multigrid? So far hypre has worked better than ML, but I have not
> experimented much with the parameters.
>
>
>
>
> Thanks again for your help!
>
> Best wishes,
> David
>
>
>
>
> On 06/12/2017 04:52 PM, Dave May wrote:
>> I've been following the discussion and have a couple of comments:
>>
>> 1/ For the preconditioners that you are using (Schur
>> factorisation LDU, or upper block triangular DU), the convergence
>> properties (e.g. 1 iterate for LDU and 2 iterates for DU) come
>> from analysis involving exact inverses of A_00 and S.
>>
>> Once you switch from using exact inverses of A_00 and S, you have
>> to rely on spectral equivalence of operators. That is fine, but
>> the spectral equivalence does not tell you how many iterates LDU
>> or DU will require to converge. What it does inform you about is
>> that if you have a spectrally equivalent operator for A_00 and S
>> (Schur complement), then under mesh refinement, your iteration
>> count (whatever it was prior to refinement) will not increase.
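>> (Spectral equivalence here meaning there are mesh-independent
>> constants c1, c2 > 0 with c1 x^T B x <= x^T A x <= c2 x^T B x for the
>> operator A and its preconditioner B, so the condition number of the
>> preconditioned operator, and with it the iteration count, stays
>> bounded under h-refinement.)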
>>
>> 2/ Looking at your first set of options, I see you have opted to
>> use -fieldsplit_ksp_type preonly (for both split 0 and 1). That
>> is nice as it creates a linear operator, so you don't need
>> something like FGMRES or GCR applied to the saddle point problem.
>>
>> Your choice for Schur is fine in the sense that the diagonal of M
>> is spectrally equivalent to M, and M is spectrally equivalent to
>> S. Whether it is "fine" in terms of the iteration count for Schur
>> systems, we cannot say a priori (since the spectral equivalence
>> doesn't give us direct info about the iterations we should expect).
>>
>> Your preconditioner for A_00 relies on AMG producing a spectrally
>> equivalent operator with bounds which are tight enough to ensure
>> convergence of the saddle point problem. I'll try to explain this.
>>
>> In my experience, for many problems (unstructured FE with
>> variable coefficients, structured FE meshes with variable
>> coefficients) AMG and preonly is not a robust choice. To control
>> the approximation (the spectral equiv bounds), I typically run a
>> stationary or Krylov method on split 0 (e.g.
>> -fieldsplit_0_ksp_type xxx -fieldsplit_0_ksp_rtol yyy). Since the
>> AMG preconditioner generated is spectrally equivalent (usually!),
>> these solves will converge to a chosen rtol in a constant number
>> of iterates under h-refinement. In practice, if I don't enforce
>> that I hit something like rtol=1.0e-1 (or 1.0e-2) on the 0th
>> split, saddle point iterates will typically increase for "hard"
>> problems under mesh refinement (1e4-1e7 coefficient variation),
>> and may not even converge at all when just using
>> -fieldsplit_0_ksp_type preonly. Failure ultimately depends on how
>> "strong" the preconditioner for A_00 block is (consider
>> re-discretized geometric multigrid versus AMG). Running an
>> iterative solve on the 0th split lets you control and recover
>> from weak/poor, but spectrally equivalent preconditioners for
>> A_00. Note that people hate this approach as it invariably nests
>> Krylov methods, and subsequently adds more global reductions.
>> However, it is scalable, optimal, tuneable and converges faster
>> than the case which didn't converge at all :D
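>>
>> As a rough sketch (values are placeholders to be tuned), that control
>> looks like:
>>
>> -ksp_type fgmres   (the outer method must now be flexible)
>> -fieldsplit_0_ksp_type gmres
>> -fieldsplit_0_ksp_rtol 1.0e-2
>> -fieldsplit_0_ksp_converged_reason
>> -fieldsplit_0_pc_type ml   (or hypre/gamg)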
>>
>> 3/ I agree with Matt's comments, but I'd do a couple of other
>> things first.
>>
>> * I'd first check the discretization is implemented correctly.
>> Your P2/P1 element is inf-sup stable - thus the condition number
>> of S (unpreconditioned) should be independent of the mesh
>> resolution (h). An easy way to verify this is to run either LDU
>> (schur_fact_type full) or DU (schur_fact_type upper) and monitor
>> the iterations required for those S solves. Use
>> -fieldsplit_1_pc_type none -fieldsplit_1_ksp_rtol 1.0e-8
>> -fieldsplit_1_ksp_monitor_true_residual
>> -fieldsplit_1_ksp_pc_side right -fieldsplit_1_ksp_type gmres
>> -fieldsplit_0_pc_type lu
>>
>> Then refine the mesh (ideally via sub-division) and repeat the
>> experiment.
>> If the S iterates don't asymptote, but instead grow with each
>> refinement - you likely have a problem with the discretisation.
>>
>> * Do the same experiment, but this time use your mass matrix as
>> the preconditioner for S and use -fieldsplit_1_pc_type lu. If the
>> iterates have gone up compared with the previous experiment
>> (without a Schur PC), your mass matrix is not defined correctly.
>> That is, if in the previous experiment (without a Schur PC) the
>> iterates on the S solves were bounded, but they go up once S is
>> preconditioned with the mass matrix, then your mass matrix is
>> definitely not correct.
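>>
>> A sketch of that second experiment (same as above, but with the mass
>> matrix supplied as the "user" Schur preconditioning matrix):
>>
>> -pc_fieldsplit_schur_fact_type full
>> -pc_fieldsplit_schur_precondition user
>> -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type lu
>> -fieldsplit_1_ksp_type gmres -fieldsplit_1_ksp_rtol 1.0e-8
>> -fieldsplit_1_ksp_monitor_true_residual
>> -fieldsplit_1_pc_type lu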
>>
>> 4/ Lastly, to finally get to your question: do 400+ iterates for
>> solving the Schur complement seem "reasonable", and what is
>> "normal behaviour"?
>>
>> It seems "high" to me. However the specifics of your
>> discretisation, mesh topology, element quality, boundary
>> conditions render it almost impossible to say what should be
>> expected. When I use a Q2-P2* discretisation on a structured mesh
>> with a non-constant viscosity I'd expect something like 50-60 for
>> 1.0e-10 with a mass matrix scaled by the inverse (local)
>> viscosity. For constant viscosity maybe 30 iterates. I think this
>> kind of statement is not particularly useful or helpful though.
>>
>> Given you use an unstructured tet mesh, it is possible that some
>> elements have very bad quality (high aspect ratio (AR), highly
>> skewed). I am certain that P2/P1 has an inf-sup constant which is
>> sensitive to the element aspect ratio (I don't recall the exact
>> scaling wrt AR). From experience I know that using the mass
>> matrix as a preconditioner for Schur is not robust as AR
>> increases (e.g. iterations for the S solve grow). Hence, with a
>> couple of "bad" element in your mesh, I could imagine that you
>> could end up having to perform +400 iterations
>>
>> 5/ Lastly, definitely don't impose one Dirichlet BC on pressure
>> to make the pressure unique. This really screws up all the nice
>> properties of your matrices. Just enforce the constant null space
>> for p. And as you noticed, GMRES magically just does it
>> automatically if the RHS of your original system was consistent.
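>>
>> A minimal sketch of attaching that null space in PETSc (hypothetical
>> helper; it assumes A is the assembled saddle point matrix and is_p is
>> an index set over the pressure dofs, e.g. taken from your dof
>> ordering or from PCFieldSplitGetIS after setup):
>>
>>   #include <petscksp.h>
>>
>>   /* Sketch: build the constant-pressure null space vector (1 on the
>>      pressure dofs, 0 on the velocity dofs) and attach it to A so the
>>      Krylov method projects it out. */
>>   PetscErrorCode AttachPressureNullspace(Mat A, IS is_p)
>>   {
>>     Vec            v;
>>     MatNullSpace   nsp;
>>     PetscErrorCode ierr;
>>
>>     PetscFunctionBeginUser;
>>     ierr = MatCreateVecs(A, &v, NULL);CHKERRQ(ierr);
>>     ierr = VecSet(v, 0.0);CHKERRQ(ierr);
>>     ierr = VecISSet(v, is_p, 1.0);CHKERRQ(ierr);
>>     ierr = VecNormalize(v, NULL);CHKERRQ(ierr);
>>     ierr = MatNullSpaceCreate(PetscObjectComm((PetscObject)A),
>>                               PETSC_FALSE, 1, &v, &nsp);CHKERRQ(ierr);
>>     ierr = MatSetNullSpace(A, nsp);CHKERRQ(ierr);
>>     ierr = MatNullSpaceDestroy(&nsp);CHKERRQ(ierr);
>>     ierr = VecDestroy(&v);CHKERRQ(ierr);
>>     PetscFunctionReturn(0);
>>   }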
>>
>> Thanks,
>> Dave
>>
>>
>> On 12 June 2017 at 20:20, David Nolte <dnolte at dim.uchile.cl
>> <mailto:dnolte at dim.uchile.cl>> wrote:
>>
>> Ok. With "-pc_fieldsplit_schur_fact_type full" the outer
>> iteration converges in 1 step. The problem remains the Schur
>> iterations.
>>
>> I was not sure if the problem was maybe the singular pressure
>> or the pressure Dirichlet BC. I tested the solver with a
>> standard Stokes flow in a pipe with a constriction (zero
>> Neumann BC for the pressure at the outlet) and in a 3D cavity
>> (enclosed flow, no pressure BC or fixed at one point). I am
>> not sure if I need to attach the constant pressure nullspace
>> to the matrix for GMRES. Not doing so does not alter the
>> convergence of GMRES in the Schur solver (nor the pressure
>> solution); using a pressure Dirichlet BC, however, slows down
>> convergence (I suppose because of the scaling of the matrix).
>>
>> I also checked the pressure mass matrix that I give PETSc, it
>> looks correct.
>>
>> In all these cases, the solver behaves just as before. With
>> LU in fieldsplit_0 and GMRES/LU with rtol 1e-10 in
>> fieldsplit_1, it converges after 1 outer iteration, but the
>> inner Schur solver converges slowly.
>>
>> How should the convergence of GMRES/LU of the Schur
>> complement *normally* behave?
>>
>> Thanks again!
>> David
>>
>>
>>
>>
>> On 06/12/2017 12:41 PM, Matthew Knepley wrote:
>>> On Mon, Jun 12, 2017 at 10:36 AM, David Nolte
>>> <dnolte at dim.uchile.cl <mailto:dnolte at dim.uchile.cl>> wrote:
>>>
>>>
>>> On 06/12/2017 07:50 AM, Matthew Knepley wrote:
>>>> On Sun, Jun 11, 2017 at 11:06 PM, David Nolte
>>>> <dnolte at dim.uchile.cl <mailto:dnolte at dim.uchile.cl>> wrote:
>>>>
>>>> Thanks Matt, makes sense to me!
>>>>
>>>> I skipped direct solvers at first because for these
>>>> 'real' configurations LU (mumps/superlu_dist)
>>>> usually runs out of memory (I have 32GB RAM). It would
>>>> be reasonable to take one more step back and play
>>>> with synthetic examples.
>>>> I managed to run one case though with 936k dofs
>>>> using: ("user" =pressure mass matrix)
>>>>
>>>> <...>
>>>> -pc_fieldsplit_schur_fact_type upper
>>>> -pc_fieldsplit_schur_precondition user
>>>> -fieldsplit_0_ksp_type preonly
>>>> -fieldsplit_0_pc_type lu
>>>> -fieldsplit_0_pc_factor_mat_solver_package mumps
>>>>
>>>> -fieldsplit_1_ksp_type gmres
>>>> -fieldsplit_1_ksp_monitor_true_residual
>>>> -fieldsplit_1_ksp_rtol 1e-10
>>>> -fieldsplit_1_pc_type lu
>>>> -fieldsplit_1_pc_factor_mat_solver_package mumps
>>>>
>>>> It takes 2 outer iterations, as expected. However,
>>>> the fieldsplit_1 solve takes very long.
>>>>
>>>>
>>>> 1) It should take 1 outer iterate, not two. The problem
>>>> is that your Schur tolerance is way too high. Use
>>>>
>>>> -fieldsplit_1_ksp_rtol 1e-10
>>>>
>>>> or something like that. Then it will take 1 iterate.
>>>
>>> Shouldn't it take 2 with a triangular Schur
>>> factorization and exact preconditioners, and 1 with a
>>> full factorization? (cf. Benzi et al 2005, p.66,
>>> http://www.mathcs.emory.edu/~benzi/Web_papers/bgl05.pdf)
>>>
>>> That's exactly what I set: -fieldsplit_1_ksp_rtol 1e-10,
>>> and the Schur solve does drop below rtol = 1e-10.
>>>
>>>
>>> Oh, yes. Take away the upper (i.e. use the full
>>> factorization) until things are worked out.
>>>
>>> Thanks,
>>>
>>> Matt
>>>
>>>>
>>>> 2) There is a problem with the Schur solve. Now from
>>>> the iterates
>>>>
>>>> 423 KSP preconditioned resid norm 2.638419658982e-02
>>>> true resid norm 7.229653211635e-11 ||r(i)||/||b||
>>>> 7.229653211635e-11
>>>>
>>>> it is clear that the preconditioner is really screwing
>>>> stuff up. For testing, you can use
>>>>
>>>> -pc_fieldsplit_schur_precondition full
>>>>
>>>> and your same setup here. It should take one iterate. I
>>>> think there is something wrong with your
>>>> mass matrix.
>>>
>>> I agree. I forgot to mention that I am considering an
>>> "enclosed flow" problem, with u=0 on all the boundary
>>> and a Dirichlet condition for the pressure in one point
>>> for fixing the constant pressure. Maybe the
>>> preconditioner is not consistent with this setup, need
>>> to check this..
>>>
>>> Thanks a lot
>>>
>>>
>>>>
>>>> Thanks,
>>>>
>>>> Matt
>>>>
>>>>
>>>> 0 KSP unpreconditioned resid norm
>>>> 4.038466809302e-03 true resid norm
>>>> 4.038466809302e-03 ||r(i)||/||b|| 1.000000000000e+00
>>>> Residual norms for fieldsplit_1_ solve.
>>>> 0 KSP preconditioned resid norm
>>>> 0.000000000000e+00 true resid norm
>>>> 0.000000000000e+00 ||r(i)||/||b|| -nan
>>>> Linear fieldsplit_1_ solve converged due to
>>>> CONVERGED_ATOL iterations 0
>>>> 1 KSP unpreconditioned resid norm
>>>> 4.860095964831e-06 true resid norm
>>>> 4.860095964831e-06 ||r(i)||/||b|| 1.203450763452e-03
>>>> Residual norms for fieldsplit_1_ solve.
>>>> 0 KSP preconditioned resid norm
>>>> 2.965546249872e+08 true resid norm
>>>> 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00
>>>> 1 KSP preconditioned resid norm
>>>> 1.347596594634e+08 true resid norm
>>>> 3.599678801575e-01 ||r(i)||/||b|| 3.599678801575e-01
>>>> 2 KSP preconditioned resid norm
>>>> 5.913230136403e+07 true resid norm
>>>> 2.364916760834e-01 ||r(i)||/||b|| 2.364916760834e-01
>>>> 3 KSP preconditioned resid norm
>>>> 4.629700028930e+07 true resid norm
>>>> 1.984444715595e-01 ||r(i)||/||b|| 1.984444715595e-01
>>>> 4 KSP preconditioned resid norm
>>>> 3.804431276819e+07 true resid norm
>>>> 1.747224559120e-01 ||r(i)||/||b|| 1.747224559120e-01
>>>> 5 KSP preconditioned resid norm
>>>> 3.178769422140e+07 true resid norm
>>>> 1.402254864444e-01 ||r(i)||/||b|| 1.402254864444e-01
>>>> 6 KSP preconditioned resid norm
>>>> 2.648669043919e+07 true resid norm
>>>> 1.191164310866e-01 ||r(i)||/||b|| 1.191164310866e-01
>>>> 7 KSP preconditioned resid norm
>>>> 2.203522108614e+07 true resid norm
>>>> 9.690500018007e-02 ||r(i)||/||b|| 9.690500018007e-02
>>>> <...>
>>>> 422 KSP preconditioned resid norm
>>>> 2.984888715147e-02 true resid norm
>>>> 8.598401046494e-11 ||r(i)||/||b|| 8.598401046494e-11
>>>> 423 KSP preconditioned resid norm
>>>> 2.638419658982e-02 true resid norm
>>>> 7.229653211635e-11 ||r(i)||/||b|| 7.229653211635e-11
>>>> Linear fieldsplit_1_ solve converged due to
>>>> CONVERGED_RTOL iterations 423
>>>> 2 KSP unpreconditioned resid norm
>>>> 3.539889585599e-16 true resid norm
>>>> 3.542279617063e-16 ||r(i)||/||b|| 8.771347603759e-14
>>>> Linear solve converged due to CONVERGED_RTOL
>>>> iterations 2
>>>>
>>>>
>>>> Does the slow convergence of the Schur block mean
>>>> that my preconditioning matrix Sp is a poor choice?
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>
>>>> On 06/11/2017 08:53 AM, Matthew Knepley wrote:
>>>>> On Sat, Jun 10, 2017 at 8:25 PM, David Nolte
>>>>> <dnolte at dim.uchile.cl
>>>>> <mailto:dnolte at dim.uchile.cl>> wrote:
>>>>>
>>>>> Dear all,
>>>>>
>>>>> I am solving a Stokes problem in 3D aorta
>>>>> geometries, using a P2/P1
>>>>> finite element discretization on tetrahedral
>>>>> meshes resulting in
>>>>> ~1-1.5M DOFs. Viscosity is uniform (can be
>>>>> adjusted arbitrarily), and
>>>>> the right hand side is a function of noisy
>>>>> measurement data.
>>>>>
>>>>> In other settings of "standard" Stokes flow
>>>>> problems I have obtained
>>>>> good convergence with an "upper" Schur
>>>>> complement preconditioner, using
>>>>> AMG (ML or Hypre) on the velocity block and
>>>>> approximating the Schur
>>>>> complement matrix by the diagonal of the
>>>>> pressure mass matrix:
>>>>>
>>>>> -ksp_converged_reason
>>>>> -ksp_monitor_true_residual
>>>>> -ksp_initial_guess_nonzero
>>>>> -ksp_diagonal_scale
>>>>> -ksp_diagonal_scale_fix
>>>>> -ksp_type fgmres
>>>>> -ksp_rtol 1.0e-8
>>>>>
>>>>> -pc_type fieldsplit
>>>>> -pc_fieldsplit_type schur
>>>>> -pc_fieldsplit_detect_saddle_point
>>>>> -pc_fieldsplit_schur_fact_type upper
>>>>> -pc_fieldsplit_schur_precondition user
>>>>> # <-- pressure mass matrix
>>>>>
>>>>> -fieldsplit_0_ksp_type preonly
>>>>> -fieldsplit_0_pc_type ml
>>>>>
>>>>> -fieldsplit_1_ksp_type preonly
>>>>> -fieldsplit_1_pc_type jacobi
>>>>>
>>>>>
>>>>> 1) I always recommend starting from an exact
>>>>> solver and backing off in small steps for
>>>>> optimization. Thus
>>>>> I would start with LU on the upper block and
>>>>> GMRES/LU with tolerance 1e-10 on the Schur block.
>>>>> This should converge in 1 iterate.
>>>>>
>>>>> 2) I don't think you want preonly on the Schur
>>>>> system. You might want GMRES/Jacobi to invert the
>>>>> mass matrix.
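>>>>>
>>>>> For example (a sketch; the tolerance, or a small fixed
>>>>> iteration count, is a tuning choice):
>>>>>
>>>>> -fieldsplit_1_ksp_type gmres
>>>>> -fieldsplit_1_ksp_rtol 1.0e-10
>>>>> -fieldsplit_1_pc_type jacobi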
>>>>>
>>>>> 3) You probably want to tighten the tolerance on
>>>>> the Schur solve, at least to start, and then
>>>>> slowly let it out. The
>>>>> tight tolerance will show you how effective
>>>>> the preconditioner is using that Schur operator.
>>>>> Then you can start
>>>>> to evaluate how effective the Schur linear
>>>>> solver is.
>>>>>
>>>>> Does this make sense?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Matt
>>>>>
>>>>>
>>>>> In my present case this setup gives rather
>>>>> slow convergence (it varies between 200-500
>>>>> iterations or several thousand, depending on
>>>>> the geometry!). I obtain
>>>>> better convergence with
>>>>> "-pc_fieldsplit_schur_precondition selfp" and
>>>>> using multigrid on S, with
>>>>> "-fieldsplit_1_pc_type ml" (I don't think
>>>>> this is optimal, though).
>>>>>
>>>>> I don't understand why the pressure mass
>>>>> matrix approach performs so
>>>>> poorly and wonder what I could try to improve
>>>>> the convergence. Until now
>>>>> I have been using ML and Hypre BoomerAMG
>>>>> mostly with default parameters.
>>>>> Surely they can be improved by tuning some
>>>>> parameters. Which could be a
>>>>> good starting point? Are there other options I
>>>>> should consider?
>>>>>
>>>>> With the above setup (jacobi) for a case that
>>>>> works better than others,
>>>>> the KSP terminates with
>>>>> 467 KSP unpreconditioned resid norm
>>>>> 2.072014323515e-09 true resid norm
>>>>> 2.072014322600e-09 ||r(i)||/||b||
>>>>> 9.939098100674e-09
>>>>>
>>>>> You can find the output of -ksp_view below.
>>>>> Let me know if you need more
>>>>> details.
>>>>>
>>>>> Thanks in advance for your advice!
>>>>> Best wishes
>>>>> David
>>>>>
>>>>>
>>>>> KSP Object: 1 MPI processes
>>>>> type: fgmres
>>>>> GMRES: restart=30, using Classical
>>>>> (unmodified) Gram-Schmidt
>>>>> Orthogonalization with no iterative refinement
>>>>> GMRES: happy breakdown tolerance 1e-30
>>>>> maximum iterations=10000
>>>>> tolerances: relative=1e-08, absolute=1e-50,
>>>>> divergence=10000.
>>>>> right preconditioning
>>>>> diagonally scaled system
>>>>> using nonzero initial guess
>>>>> using UNPRECONDITIONED norm type for
>>>>> convergence test
>>>>> PC Object: 1 MPI processes
>>>>> type: fieldsplit
>>>>> FieldSplit with Schur preconditioner,
>>>>> factorization UPPER
>>>>> Preconditioner for the Schur complement
>>>>> formed from user provided matrix
>>>>> Split info:
>>>>> Split number 0 Defined by IS
>>>>> Split number 1 Defined by IS
>>>>> KSP solver for A00 block
>>>>> KSP Object: (fieldsplit_0_) 1
>>>>> MPI processes
>>>>> type: preonly
>>>>> maximum iterations=10000, initial
>>>>> guess is zero
>>>>> tolerances: relative=1e-05,
>>>>> absolute=1e-50, divergence=10000.
>>>>> left preconditioning
>>>>> using NONE norm type for convergence test
>>>>> PC Object: (fieldsplit_0_) 1
>>>>> MPI processes
>>>>> type: ml
>>>>> MG: type is MULTIPLICATIVE, levels=5
>>>>> cycles=v
>>>>> Cycles per PCApply=1
>>>>> Using Galerkin computed coarse
>>>>> grid matrices
>>>>> Coarse grid solver -- level
>>>>> -------------------------------
>>>>> KSP Object:
>>>>> (fieldsplit_0_mg_coarse_) 1 MPI
>>>>> processes
>>>>> type: preonly
>>>>> maximum iterations=10000, initial
>>>>> guess is zero
>>>>> tolerances: relative=1e-05,
>>>>> absolute=1e-50, divergence=10000.
>>>>> left preconditioning
>>>>> using NONE norm type for
>>>>> convergence test
>>>>> PC Object:
>>>>> (fieldsplit_0_mg_coarse_) 1 MPI
>>>>> processes
>>>>> type: lu
>>>>> LU: out-of-place factorization
>>>>> tolerance for zero pivot 2.22045e-14
>>>>> using diagonal shift on blocks
>>>>> to prevent zero pivot
>>>>> [INBLOCKS]
>>>>> matrix ordering: nd
>>>>> factor fill ratio given 5.,
>>>>> needed 1.
>>>>> Factored matrix follows:
>>>>> Mat Object:
>>>>> 1 MPI processes
>>>>> type: seqaij
>>>>> rows=3, cols=3
>>>>> package used to perform
>>>>> factorization: petsc
>>>>> total: nonzeros=3,
>>>>> allocated nonzeros=3
>>>>> total number of mallocs
>>>>> used during MatSetValues
>>>>> calls =0
>>>>> not using I-node routines
>>>>> linear system matrix = precond matrix:
>>>>> Mat Object: 1 MPI
>>>>> processes
>>>>> type: seqaij
>>>>> rows=3, cols=3
>>>>> total: nonzeros=3, allocated
>>>>> nonzeros=3
>>>>> total number of mallocs used
>>>>> during MatSetValues calls =0
>>>>> not using I-node routines
>>>>> Down solver (pre-smoother) on level 1
>>>>> -------------------------------
>>>>> KSP Object:
>>>>> (fieldsplit_0_mg_levels_1_) 1
>>>>> MPI processes
>>>>> type: richardson
>>>>> Richardson: damping factor=1.
>>>>> maximum iterations=2
>>>>> tolerances: relative=1e-05,
>>>>> absolute=1e-50, divergence=10000.
>>>>> left preconditioning
>>>>> using nonzero initial guess
>>>>> using NONE norm type for
>>>>> convergence test
>>>>> PC Object:
>>>>> (fieldsplit_0_mg_levels_1_) 1
>>>>> MPI processes
>>>>> type: sor
>>>>> SOR: type = local_symmetric,
>>>>> iterations = 1, local
>>>>> iterations = 1, omega = 1.
>>>>> linear system matrix = precond matrix:
>>>>> Mat Object: 1 MPI
>>>>> processes
>>>>> type: seqaij
>>>>> rows=15, cols=15
>>>>> total: nonzeros=69, allocated
>>>>> nonzeros=69
>>>>> total number of mallocs used
>>>>> during MatSetValues calls =0
>>>>> not using I-node routines
>>>>> Up solver (post-smoother) same as down
>>>>> solver (pre-smoother)
>>>>> Down solver (pre-smoother) on level 2
>>>>> -------------------------------
>>>>> KSP Object:
>>>>> (fieldsplit_0_mg_levels_2_) 1
>>>>> MPI processes
>>>>> type: richardson
>>>>> Richardson: damping factor=1.
>>>>> maximum iterations=2
>>>>> tolerances: relative=1e-05,
>>>>> absolute=1e-50, divergence=10000.
>>>>> left preconditioning
>>>>> using nonzero initial guess
>>>>> using NONE norm type for
>>>>> convergence test
>>>>> PC Object:
>>>>> (fieldsplit_0_mg_levels_2_) 1
>>>>> MPI processes
>>>>> type: sor
>>>>> SOR: type = local_symmetric,
>>>>> iterations = 1, local
>>>>> iterations = 1, omega = 1.
>>>>> linear system matrix = precond matrix:
>>>>> Mat Object: 1 MPI
>>>>> processes
>>>>> type: seqaij
>>>>> rows=304, cols=304
>>>>> total: nonzeros=7354, allocated
>>>>> nonzeros=7354
>>>>> total number of mallocs used
>>>>> during MatSetValues calls =0
>>>>> not using I-node routines
>>>>> Up solver (post-smoother) same as down
>>>>> solver (pre-smoother)
>>>>> Down solver (pre-smoother) on level 3
>>>>> -------------------------------
>>>>> KSP Object:
>>>>> (fieldsplit_0_mg_levels_3_) 1
>>>>> MPI processes
>>>>> type: richardson
>>>>> Richardson: damping factor=1.
>>>>> maximum iterations=2
>>>>> tolerances: relative=1e-05,
>>>>> absolute=1e-50, divergence=10000.
>>>>> left preconditioning
>>>>> using nonzero initial guess
>>>>> using NONE norm type for
>>>>> convergence test
>>>>> PC Object:
>>>>> (fieldsplit_0_mg_levels_3_) 1
>>>>> MPI processes
>>>>> type: sor
>>>>> SOR: type = local_symmetric,
>>>>> iterations = 1, local
>>>>> iterations = 1, omega = 1.
>>>>> linear system matrix = precond matrix:
>>>>> Mat Object: 1 MPI
>>>>> processes
>>>>> type: seqaij
>>>>> rows=30236, cols=30236
>>>>> total: nonzeros=2730644,
>>>>> allocated nonzeros=2730644
>>>>> total number of mallocs used
>>>>> during MatSetValues calls =0
>>>>> not using I-node routines
>>>>> Up solver (post-smoother) same as down
>>>>> solver (pre-smoother)
>>>>> Down solver (pre-smoother) on level 4
>>>>> -------------------------------
>>>>> KSP Object:
>>>>> (fieldsplit_0_mg_levels_4_) 1
>>>>> MPI processes
>>>>> type: richardson
>>>>> Richardson: damping factor=1.
>>>>> maximum iterations=2
>>>>> tolerances: relative=1e-05,
>>>>> absolute=1e-50, divergence=10000.
>>>>> left preconditioning
>>>>> using nonzero initial guess
>>>>> using NONE norm type for
>>>>> convergence test
>>>>> PC Object:
>>>>> (fieldsplit_0_mg_levels_4_) 1
>>>>> MPI processes
>>>>> type: sor
>>>>> SOR: type = local_symmetric,
>>>>> iterations = 1, local
>>>>> iterations = 1, omega = 1.
>>>>> linear system matrix = precond matrix:
>>>>> Mat Object:
>>>>> (fieldsplit_0_) 1 MPI
>>>>> processes
>>>>> type: seqaij
>>>>> rows=894132, cols=894132
>>>>> total: nonzeros=70684164,
>>>>> allocated nonzeros=70684164
>>>>> total number of mallocs used
>>>>> during MatSetValues calls =0
>>>>> not using I-node routines
>>>>> Up solver (post-smoother) same as down
>>>>> solver (pre-smoother)
>>>>> linear system matrix = precond matrix:
>>>>> Mat Object: (fieldsplit_0_)
>>>>> 1 MPI processes
>>>>> type: seqaij
>>>>> rows=894132, cols=894132
>>>>> total: nonzeros=70684164, allocated
>>>>> nonzeros=70684164
>>>>> total number of mallocs used during
>>>>> MatSetValues calls =0
>>>>> not using I-node routines
>>>>> KSP solver for S = A11 - A10 inv(A00) A01
>>>>> KSP Object: (fieldsplit_1_) 1
>>>>> MPI processes
>>>>> type: preonly
>>>>> maximum iterations=10000, initial
>>>>> guess is zero
>>>>> tolerances: relative=1e-05,
>>>>> absolute=1e-50, divergence=10000.
>>>>> left preconditioning
>>>>> using NONE norm type for convergence test
>>>>> PC Object: (fieldsplit_1_) 1
>>>>> MPI processes
>>>>> type: jacobi
>>>>> linear system matrix followed by
>>>>> preconditioner matrix:
>>>>> Mat Object: (fieldsplit_1_)
>>>>> 1 MPI processes
>>>>> type: schurcomplement
>>>>> rows=42025, cols=42025
>>>>> Schur complement A11 - A10
>>>>> inv(A00) A01
>>>>> A11
>>>>> Mat Object:
>>>>> (fieldsplit_1_) 1
>>>>> MPI processes
>>>>> type: seqaij
>>>>> rows=42025, cols=42025
>>>>> total: nonzeros=554063,
>>>>> allocated nonzeros=554063
>>>>> total number of mallocs used
>>>>> during MatSetValues calls =0
>>>>> not using I-node routines
>>>>> A10
>>>>> Mat Object: 1 MPI
>>>>> processes
>>>>> type: seqaij
>>>>> rows=42025, cols=894132
>>>>> total: nonzeros=6850107,
>>>>> allocated nonzeros=6850107
>>>>> total number of mallocs used
>>>>> during MatSetValues calls =0
>>>>> not using I-node routines
>>>>> KSP of A00
>>>>> KSP Object:
>>>>> (fieldsplit_0_) 1
>>>>> MPI processes
>>>>> type: preonly
>>>>> maximum iterations=10000,
>>>>> initial guess is zero
>>>>> tolerances: relative=1e-05,
>>>>> absolute=1e-50,
>>>>> divergence=10000.
>>>>> left preconditioning
>>>>> using NONE norm type for
>>>>> convergence test
>>>>> PC Object:
>>>>> (fieldsplit_0_) 1
>>>>> MPI processes
>>>>> type: ml
>>>>> MG: type is MULTIPLICATIVE,
>>>>> levels=5 cycles=v
>>>>> Cycles per PCApply=1
>>>>> Using Galerkin computed
>>>>> coarse grid matrices
>>>>> Coarse grid solver -- level
>>>>> -------------------------------
>>>>> KSP Object:
>>>>> (fieldsplit_0_mg_coarse_) 1
>>>>> MPI processes
>>>>> type: preonly
>>>>> maximum iterations=10000,
>>>>> initial guess is zero
>>>>> tolerances:
>>>>> relative=1e-05, absolute=1e-50,
>>>>> divergence=10000.
>>>>> left preconditioning
>>>>> using NONE norm type for
>>>>> convergence test
>>>>> PC Object:
>>>>> (fieldsplit_0_mg_coarse_) 1
>>>>> MPI processes
>>>>> type: lu
>>>>> LU: out-of-place
>>>>> factorization
>>>>> tolerance for zero pivot
>>>>> 2.22045e-14
>>>>> using diagonal shift on
>>>>> blocks to prevent zero
>>>>> pivot [INBLOCKS]
>>>>> matrix ordering: nd
>>>>> factor fill ratio given
>>>>> 5., needed 1.
>>>>> Factored matrix follows:
>>>>> Mat Object:
>>>>> 1 MPI
>>>>> processes
>>>>> type: seqaij
>>>>> rows=3, cols=3
>>>>> package used to
>>>>> perform factorization: petsc
>>>>> total: nonzeros=3,
>>>>> allocated nonzeros=3
>>>>> total number of
>>>>> mallocs used during
>>>>> MatSetValues calls =0
>>>>> not using I-node
>>>>> routines
>>>>>
>>>>>