<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On 4 February 2015 at 21:34, Fabian Gabel <span dir="ltr"><<a href="mailto:gabel.fabian@gmail.com" target="_blank">gabel.fabian@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Thank you for pointing me into the right direction. After some first<br>

tests on a test case with 2e6 cells (4dof) I could measure a slight<br>

improvement (25%) with respect to wall time, using a nested<br>

field split for the velocities:<span class=""><br></span></blockquote><div><br></div><div>Great. That's a good start. It can be made ever faster.<br></div><div><br> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span class="">

-coupledsolve_pc_type fieldsplit<br>

-coupledsolve_pc_fieldsplit_0_fields 0,1,2<br>

-coupledsolve_pc_fieldsplit_1_fields 3<br>

</span>-coupledsolve_pc_fieldsplit_type schur<br>

-coupledsolve_pc_fieldsplit_block_size 4<br>

-coupledsolve_fieldsplit_0_ksp_converged_reason<br>

-coupledsolve_fieldsplit_1_ksp_converged_reason<br>

-coupledsolve_fieldsplit_0_ksp_type gmres<br>

-coupledsolve_fieldsplit_0_pc_type fieldsplit<br>

-coupledsolve_fieldsplit_0_pc_fieldsplit_block_size 3<br>

-coupledsolve_fieldsplit_0_fieldsplit_0_pc_type ml<br>

-coupledsolve_fieldsplit_0_fieldsplit_1_pc_type ml<br>

-coupledsolve_fieldsplit_0_fieldsplit_2_pc_type ml<br>

<br>

Is it normal, that I have to explicitly specify the block size for each<br>

fieldsplit?<br></blockquote><div><br></div><div>No. You should be able to just specify<br><br></div><div>-coupledsolve_fieldsplit_ksp_converged <br>-coupledsolve_fieldsplit_0_fieldsplit_pc_type ml<br><br></div><div>and same options will be applied to all splits (0,1,2). <br></div><div>Does this functionality not work?<br><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

I attached the results (-converged_reason only for readability and another file<br>

solely for the output of -ksp_view). I am not sure if this result could<br>

be improved by modifying any of the solver options.<br></blockquote><div><br></div><div>Yes, I believe they can.<br><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

Are there any guidelines to follow that I could use to avoid taking wild<br>

guesses?<br></blockquote><div><br></div>Sure. There are lots of papers published on how to construct robust block preconditioners for saddle point problems arising from Navier Stokes. <br>I would start by looking at this book:<br><br></div><div class="gmail_quote">  Finite Elements and Fast Iterative Solvers<br></div><div class="gmail_quote">  Howard Elman, David Silvester and Andy Wathen<br></div><div class="gmail_quote">  Oxford University Press<br></div><div class="gmail_quote">  See chapters 6 and 8.<br></div><div><br></div><div class="gmail_extra">Your preconditioner is performing a full block LDU factorization with quite accurate inner solves. This is probably overkill - but it will be robust. <br><br>The most relaxed approach would be :<br></div></div><div class="gmail_extra">  -coupledsolve_fieldsplit_0_ksp_type preonly<br>  -coupledsolve_fieldsplit_1_ksp_type preonly<br>  -coupledsolve_pc_fieldsplit_schur_fact_type DIAG<br><br>Something more aggressive (but less stringent that your original test) would be:<br>  -coupledsolve_fieldsplit_0_ksp_type gmres<br>  -coupledsolve_fieldsplit_0_ksp_rtol 1.0e-2<br>  -coupledsolve_fieldsplit_1_ksp_type preonly<br>  -coupledsolve_pc_fieldsplit_schur_fact_type UPPER<br><br>When building the FS preconditioner, you can start with the absolute most robust choices and then relax those choices to improve speed and hopefully not destroy the convergence, or you can start with a light weight preconditioner and make it stronger to improve convergence. <br>Where the balance lies is very much problem dependent.<br><div><br> </div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<span class=""><br>

> Petsc has some support to generate approximate pressure schur<br>

> complements for you, but these will not be as good as the ones<br>

> specifically constructed for you particular discretization.<br>

<br>

</span>I came across a tutorial (/snes/examples/tutorials/ex70.c), which shows<br>

2 different approaches:<br>

<br>

1- provide a Preconditioner \hat{S}p for the approximation of the true<br>

Schur complement<br>

<br>

2- use another Matrix (in this case its the Matrix used for constructing<br>

the preconditioner in the former approach) as a new approximation of the<br>

Schur complement.<br>

<br>

Speaking in terms of the PETSc-manual p.87, looking at the factorization<br>

of the Schur field split preconditioner, approach 1 sets \hat{S}p while<br>

approach 2 furthermore sets \hat{S}. Is this correct?<br>

<span class=""><br></span></blockquote><div><br></div><div>No this is not correct.<br>\hat{S} is always constructed by PETSc as<br>  \hat{S} = A11 - A10 KSP(A00) A01<br></div><div>This is the definition of the pressure schur complement used by FieldSplit. <br>Note that it is inexact since the action <br>  y = inv(A00) x <br>is replaced by a Krylov solve, e.g. we solve A00 y = x for y<br></div><div><br></div><div>You have two choices in how to define the preconditioned, \hat{S_p}:<br></div><div>[1] Assemble you own matrix (as is done in ex70)<br></div><div>[2] Let PETSc build one. PETSc does this according to<br></div><div>  \hat{S_p} = A11 - A10 inv(diag(A00)) A01<br></div><div><br> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span class="">

><br>

> [2] If you assembled a different operator for your preconditioner in<br>

> which the B_pp slot contained a pressure schur complement<br>

</span>> approximation, you could use the simpler and likely more robust option<br>

<span class="">> (assuming you know of a decent schur complement approximation for you<br>

> discretisation and physical problem)<br>

<br>

> -coupledsolve_pc_type fieldsplit<br>

> -coupledsolve_pc_fieldsplit_type MULTIPLICATIVE<br>

><br>

> which include you U-p coupling, or just<br>

><br>

> -coupledsolve_pc_fieldsplit_type ADDITIVE<br>

><br>

><br>

> which would define the following preconditioner<br>

><br>

> inv(B) = diag( inv(B_uu,) , inv(B_vv) , inv(B_ww) , inv(B_pp) )<br>

<br>

<br>

</span>What do you refer to with "B_pp slot"? I don't understand this approach<br>

completely. What would I need a Schur complement approximation for, if I<br>

don't use a Schur complement preconditioner?<br>

<span class=""><br></span></blockquote><div><br></div><div>I was referring to constructing an operator which approximates the schur complement and inserting it into the pressure-pressure coupling block (pp slot)<br></div><div><br> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span class="">

> Option 2 would be better as your operator doesn't have an u_i-u_j, i !<br>

> = j coupling and you could use efficient AMG implementations for each<br>

> scalar terms associated with u-u, v-v, w-w coupled terms without<br>

> having to split again.<br>

><br>

> Also, fieldsplit will not be aware of the fact that the Auu, Avv, Aww<br>

> blocks are all identical - thus it cannot do anything "smart" in order<br>

> to save memory. Accordingly, the KSP defined for each u,v,w split will<br>

> be a unique KSP object. If your A_ii are all identical and you want to<br>

> save memory, you could use MatNest but as Matt will always yell out,<br>

> "MatNest is ONLY a memory optimization and should be ONLY be used once<br>

> all solver exploration/testing is performed".<br>

<br>

</span>Thanks, I will keep this in mind. Does this mean that I would only have<br>

to assemble one matrix for the velocities instead of three?<br></blockquote><div><br></div><div>Yes, exactly.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<span class=""><br>

><br>

>         - A_pp is defined as the matrix resulting from the<br>

>         discretization of the<br>

>         pressure equation that considers only the pressure related<br>

>         terms.<br>

><br>

><br>

> Hmm okay, i assumed for incompressible NS the pressure equation<br>

><br>

> that the pressure equation would be just \div(u) = 0.<br>

><br>

<br>

</span>Indeed, many finite element(!) formulations I found while researching use<br>

this approach, which leads to the block A_pp being zero. I however use a<br>

collocated finite volume formulation and, to avoid checkerboarding of the<br>

pressure field, I deploy a pressure weighted interpolation method to<br>

approximate the velocities surging from the discretisation of \div{u}.<br>

This gives me an equation with the pressure as the dominant variable.<br></blockquote><div><br></div><div>Ah okay, you stabilize the discretisation. <br>Now I understand why you have entries in the PP block.<br><br></div><div>Cheers<br></div><div>  Dave<br></div><div> </div></div><br></div></div>