Comments below:<br><br><div class="gmail_quote">On Thu, Dec 2, 2010 at 8:00 AM, Barry Smith <span dir="ltr"><<a href="mailto:bsmith@mcs.anl.gov">bsmith@mcs.anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<br>

<br>

Begin forwarded message:<br>

<br>

> From: Laurent Michel <<a href="mailto:laurent.michel@epfl.ch">laurent.michel@epfl.ch</a>><br>

> Date: December 2, 2010 5:25:38 AM CST<br>

> To: <a href="mailto:petsc-maint@mcs.anl.gov">petsc-maint@mcs.anl.gov</a><br>

> Subject: Re: [petsc-maint #57017] Hypre<br>

> Reply-To: <a href="mailto:petsc-maint@mcs.anl.gov">petsc-maint@mcs.anl.gov</a>, Laurent Michel <<a href="mailto:laurent.michel@epfl.ch">laurent.michel@epfl.ch</a>><br>

><br>

> Ok. That sounds much easier. So, during the building of the linear<br>

> system, I do something like (when computing the CSR format)<br>

><br>

>    PetscInt start, end; MatGetOwnershipRange(A, &start, &end);<br>

>    vector<int> idv, idp; // velocity and pressure indices<br>

>    int index = 0, row = 0;<br>

>    for (int i = 0; i < nbNodeP1; i++) { // Loop over the nodes of the<br>

> mesh (P1 nodes)<br>

>        for (int ndof = 0; ndof < 4; ndof++) { // Loop over the degrees<br>

> of freedom of each node; 3 for v, 1 for p<br>

>            if(!nodes[i].is_fixed(ndof)){ // if the node is free, then<br>

> put it in the linear system<br>

>                if(start <= row && row < end){ // each process does its<br>

> own part of the job<br>

><br>

>                    [stuff for CSR format]<br>

><br>

>                    if(0 <= ndof && ndof < 3) // velocity<br>

>                        idv.push_back(row);<br>

>                    else // pressure<br>

>                        idp.push_back(row);<br>

>                }<br>

><br>

>                row++;<br>

>            }<br>

>        }<br>

>    }</blockquote><div><br></div><div>Yes, you just captures the indices. You could even split up velocity components as well. I do</div><div>for elasticity, but for Laplace it probably not necessary.</div><div><br></div>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"> </blockquote><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

>    // conversion to arrays<br>

>    int *idp_ptr = new int [idp.size()], *idv_ptr = new int<br>

> [idv.size()]; // fieldsplit indices<br>

>    memcpy(idp_ptr, &idp[0], sizeof(int)*idp.size());<br>

>    memcpy(idv_ptr, &idv[0], sizeof(int)*idv.size()); </blockquote><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

>    // conversion to IS<br>

>    IS isv, isp;<br>

>    ISCreateGeneral(PETSC_COMM_WORLD, idp.size(), idp_ptr, &isp);<br>

>    ISCreateGeneral(PETSC_COMM_WORLD, idv.size(), idv_ptr, &isv);<br>

><br>

> Then, when I create the preconditioner, I say<br>

><br>

> PCSetType(pc, PCFIELDSPLIT);<br>

> PCFieldSplitSetIS(pc, isv);<br>

> PCFieldSplitSetIS(pc, isp);<br>

><br>

> I guess the order matters here. Then, when I call the code, I call it<br>

> among other things with<br>

<div class="im">><br>

> -pc_fieldsplit_type multiplicative -fieldsplit_0_pc_type ml<br>

> -fieldsplit_1_pc_type jacobi<br>

><br>

</div>> Ok. Now the thing is, I haven't compiled the ml library, and it seems<br>

> like the petsc servers are down (I can't download anything). Anyway, I<br>

> wanted to try this with something else. Hence, I tried in SEQUENTIAL<br>

><br>

> -pc_fieldsplit_type multiplicative -fieldsplit_0_pc_type ilu<br>

> -fieldsplit_0_pc_factor_levels 2 -fieldsplit_1_pc_type jacobi<br>

><br>

> and this works fine; in parallel, with<br>

><br>

> -pc_fieldsplit_type multiplicative -fieldsplit_0_pc_type asm<br>

<div class="im">> -fieldsplit_1_pc_type jacobi<br>

><br>

</div>> I get something working. I think I've been able to code this. I'll let<br>

> you know if I encounter some problems with the ml preconditioner (I have<br>

> no idea what it is, so I may have problems).<br></blockquote><div><br></div><div>ML is just Algebraic Multigrid, like BoomerAMG, but easier to install in my experience.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">


> Regarding all this, I have to mention that until now I've had to reorder<br>

> my matrix elements with<br>

><br>

> -pc_factor_mat_ordering_type rcm<br>

><br>

> because it seems like the nodes of my mesh are not numbered<br>

> appropriately. How should I proceed for parallel computations? In<br>

> sequential, I gain about a factor 8 in CPU time doing this.<br></blockquote><div><br></div><div>You can do that same thing on each process. However, I would reorder my mesh</div><div>up front.</div><div><br></div><div>

   Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

> Thanks,<br>

><br>

> L.<br>

<div class="im">><br>

><br>

><br>

> Matthew Knepley wrote:<br>

>> On Wed, Dec 1, 2010 at 8:38 AM, Laurent Michel <<a href="mailto:laurent.michel@epfl.ch">laurent.michel@epfl.ch</a><br>

</div><div><div></div><div class="h5">>> <mailto:<a href="mailto:laurent.michel@epfl.ch">laurent.michel@epfl.ch</a>>> wrote:<br>

>><br>

>>    Well I'm actually struggling finding a good, scalable<br>

>>    preconditioner for<br>

>>    my problem. This is one of the problems I have to solve these days.<br>

>><br>

>>    I have spoken to a few people here who are kind of experts in the<br>

>>    field<br>

>>    of hpc, and I have always got the same answer: the Stokes problem will<br>

>>    not scale well. I found some people working on more or less the same<br>

>>    problem as I do who actually have a scalable code (sisiphus for<br>

>>    instance; I am solving a glacier problem).<br>

>><br>

>>    I wanted to use the fieldsplit preconditioner, I already roughly<br>

>>    discussed this with Mr. Knepley, but I'm not quite sure yet how to do<br>

>>    this; I mean, I have an unstructured mesh, and I need to define<br>

>>    some DAs<br>

>>    to make the fieldsplit preconditioner work. It seems complicated.<br>

>><br>

>><br>

>> I did not do a good enough job explaining. FieldSplit should not be<br>

>> hard for your<br>

>> problem. You just need to create lists of the unknowns in each field.<br>

>> Each list<br>

>> will be an IS object. Then you call<br>

>><br>

>>  <a href="http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/PC/PCFieldSplitSetIS.html" target="_blank">http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/PC/PCFieldSplitSetIS.html</a><br>


>><br>

>> once for each field. Then you can independently handle the velocity and<br>

>> pressure preconditioners, and it should produce a scalable solver. I would<br>

>> try -pc_fieldsplit_type multiplicative -fieldsplit_0_pc_type ml<br>

>> -fieldsplit_1_pc_type jacobi<br>

>><br>

>>   Matt<br>

>><br>

>><br>

>>    In parallel to this, I have seen that decoupling the velocity from the<br>

>>    pressure field (with like an Uzawa method or something like that; I'm<br>

>>    not an expert here) may be more scalable. I gather this is the same as<br>

>>    your fieldsplit preconditioner. I'll have a more thorough look<br>

>>    into this<br>

>>    if so. There is to my knowledge (which is very small) no documentation<br>

>>    related to the handling of unstructured problems with your fieldsplit<br>

>>    preconditioner, which I think might be the best option for me. ASM<br>

>>    doesn't work well (I have awful results on my cluster; see<br>

>>    attachment).<br>

>>    I'm solving Ax = b here, with A being the whole finite element matrix,<br>

>>    containing the data related to all the fields (velocity and pressure).<br>

>><br>

>>    Any hint on what I could try? It's not central for my PhD thesis, but<br>

>>    I'd enjoy some faster code!<br>

>><br>

>>    Regards,<br>

>><br>

>>    L.<br>

>><br>

>>    Barry Smith wrote:<br>

>>> On Dec 1, 2010, at 2:44 AM, Laurent Michel wrote:<br>

>>><br>

>>><br>

>>>> Hi,<br>

>>>><br>

>>>>   * OS Version: Linux iacspc3 2.6.31-22-generic #68-Ubuntu SMP<br>

>>    Tue Oct<br>

>>>>     26 16:37:17 UTC 2010 x86_64 GNU/Linux<br>

>>>>   * PETSc Version: Fri Apr 30 20:23:44 CDT 2010<br>

>>>>   * MPI implementation: openmpi<br>

>>>>   * Compiler: g++<br>

>>>>   * Probable PETSc component: PC<br>

>>>><br>

>>>> My question is about the use of preconditioner Hypre. I have a<br>

>>    colleague<br>

>>>> who uses hypre directly and solves more or less the same kind<br>

>>    of problem<br>

>>>> as I do. His code scales pretty well (3D Navier-Stokes) and I<br>

>>    wanted to<br>

>>>> try hypre in mine, in order to see what happens. Indeed, for my<br>

>>    problem,<br>

>>>> it's known that the algebraic Schwarz method will not scale<br>

>>    that good<br>

>>>> (this also what I have noticed during my numerical<br>

>>    experiments). Anyway,<br>

>>>> I've tried to run my code with the following additional options:<br>

>>>><br>

>>>> -pc_type hypre -pc_hypre_type euclid<br>

>>>><br>

>>>> and it takes forever to solve the linear system (I mean, to<br>

>>    apply the<br>

>>>> preconditioner and then actually solve the system).<br>

>>>><br>

>>><br>

>>>   Likely Euclid is a bad preconditioner for your problem. Run<br>

>>    with -log_summary -ksp_converged_reason to completion and then<br>

>>    look at the output to see where it is spending all the time. If<br>

>>    the time is truely dominated by PCSetUp and or PCApply or requires<br>

>>    many many iterations then Euclid is the problem. We've found the<br>

>>    Euclid doesn't scale or perform particularlly well.<br>

>>><br>

</div></div><div class="im">>>>   If your problem is a "Stokes" like problem that just throwing<br>

>>    a generic preconditioner like Euclid at it without doing something<br>

>>    special for the fact that it is "Stokes like" problem won't work<br>

>>    well. Perhaps Jed has some suggestions.<br>

>>><br>

</div><div class="im">>>>   Barry<br>

>>><br>

>>><br>

>>>> I'm using the MPIAIJ matrix format, with<br>

>>    MatMPIAIJSetPreallocationCSR.<br>

>>>> Should I use another format? I've been struggling looking for<br>

>>>> documentation about this. Where can I find any further information?<br>

>>>><br>

>>>> Regards,<br>

>>>><br>

>>>> L.<br>

>>>><br>

>>>> --<br>

>>>> Laurent Michel<br>

>>>> PhD candidate<br>

>>>> EPFL SB IACS ASN<br>

>>>> MA C2 644<br>

>>>> +41 21 693 42 46<br>

>>>> +41 77 433 38 94<br>

>>>><br>

>>>><br>

>>>><br>

>>><br>

>>> .<br>

>>><br>

>>><br>

>><br>

>><br>

>>    --<br>

>>    Laurent Michel<br>

>>    PhD candidate<br>

>>    EPFL SB IACS ASN<br>

>>    MA C2 644<br>

>>    +41 21 693 42 46<br>

>>    +41 77 433 38 94<br>

>><br>

>><br>

>><br>

>><br>

>><br>

>> --<br>

>> What most experimenters take for granted before they begin their<br>

>> experiments is infinitely more interesting than any results to which<br>

>> their experiments lead.<br>

>> -- Norbert Wiener<br>

><br>

><br>

</div>> --<br>

<div><div></div><div class="h5">> Laurent Michel<br>

> PhD candidate<br>

> EPFL SB IACS ASN<br>

> MA C2 644<br>

> +41 21 693 42 46<br>

> +41 77 433 38 94<br>

><br>

><br>

<br>

</div></div></blockquote></div><br><br clear="all"><br>-- <br>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>

-- Norbert Wiener<br>