[petsc-dev] Fwd: [petsc-maint #57017] Hypre
Matthew Knepley
knepley at gmail.com
Thu Dec 2 11:21:14 CST 2010
Comments below:
On Thu, Dec 2, 2010 at 8:00 AM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>
> Begin forwarded message:
>
> > From: Laurent Michel <laurent.michel at epfl.ch>
> > Date: December 2, 2010 5:25:38 AM CST
> > To: petsc-maint at mcs.anl.gov
> > Subject: Re: [petsc-maint #57017] Hypre
> > Reply-To: petsc-maint at mcs.anl.gov, Laurent Michel <laurent.michel at epfl.ch>
> >
> > Ok. That sounds much easier. So, while building the linear system (when
> > computing the CSR format), I do something like this:
> >
> > PetscInt start, end;
> > MatGetOwnershipRange(A, &start, &end);
> > vector<int> idv, idp; // velocity and pressure indices
> > int index = 0, row = 0;
> > for (int i = 0; i < nbNodeP1; i++) {         // Loop over the nodes of the mesh (P1 nodes)
> >     for (int ndof = 0; ndof < 4; ndof++) {   // Loop over the degrees of freedom of each node; 3 for v, 1 for p
> >         if (!nodes[i].is_fixed(ndof)) {      // if the node is free, then put it in the linear system
> >             if (start <= row && row < end) { // each process does its own part of the job
> >
> >                 [stuff for CSR format]
> >
> >                 if (0 <= ndof && ndof < 3)   // velocity
> >                     idv.push_back(row);
> >                 else                         // pressure
> >                     idp.push_back(row);
> >             }
> >             row++;
> >         }
> >     }
> > }
Yes, you just capture the indices. You could even split up the velocity
components as well; I do that for elasticity, but for Laplace it is probably
not necessary.
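For reference, here is a minimal sketch of that per-component variant,
following the loop above. The names nbNodeP1, nodes, is_fixed, and A are taken
from the code in this message; everything else is an assumption, including the
petsc-3.1-era ISCreateGeneral, which copies the index array so the temporary
vectors can be discarded afterwards:

    #include <petscmat.h>
    #include <petscis.h>
    #include <vector>

    // Sketch: one index list per velocity component plus one for the pressure.
    PetscInt start, end;
    MatGetOwnershipRange(A, &start, &end);      // locally owned rows [start, end)

    std::vector<PetscInt> idv_comp[3], idp;
    PetscInt row = 0;
    for (int i = 0; i < nbNodeP1; i++) {        // loop over the P1 nodes
      for (int ndof = 0; ndof < 4; ndof++) {    // 3 velocity dofs + 1 pressure dof
        if (!nodes[i].is_fixed(ndof)) {         // only free dofs enter the system
          if (start <= row && row < end) {      // keep locally owned rows only
            if (ndof < 3) idv_comp[ndof].push_back(row); // velocity component ndof
            else          idp.push_back(row);            // pressure
          }
          row++;
        }
      }
    }

    // ISCreateGeneral copies the indices here, so the vectors can go out of scope.
    IS isv_comp[3], isp;
    for (int c = 0; c < 3; c++)
      ISCreateGeneral(PETSC_COMM_WORLD, (PetscInt)idv_comp[c].size(),
                      idv_comp[c].empty() ? NULL : &idv_comp[c][0], &isv_comp[c]);
    ISCreateGeneral(PETSC_COMM_WORLD, (PetscInt)idp.size(),
                    idp.empty() ? NULL : &idp[0], &isp);

Each IS would then get its own PCFieldSplitSetIS call, giving one split per
velocity component plus one for the pressure.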
> > // conversion to arrays
> > int *idp_ptr = new int[idp.size()], *idv_ptr = new int[idv.size()]; // fieldsplit indices
> > memcpy(idp_ptr, &idp[0], sizeof(int)*idp.size());
> > memcpy(idv_ptr, &idv[0], sizeof(int)*idv.size());
> > // conversion to IS
> > IS isv, isp;
> > ISCreateGeneral(PETSC_COMM_WORLD, idp.size(), idp_ptr, &isp);
> > ISCreateGeneral(PETSC_COMM_WORLD, idv.size(), idv_ptr, &isv);
> >
> > Then, when I create the preconditioner, I say
> >
> > PCSetType(pc, PCFIELDSPLIT);
> > PCFieldSplitSetIS(pc, isv);
> > PCFieldSplitSetIS(pc, isp);
> >
> > I guess the order matters here. Then, when I run the code, I pass it,
> > among other things, the following options:
> >
> > -pc_fieldsplit_type multiplicative -fieldsplit_0_pc_type ml
> > -fieldsplit_1_pc_type jacobi
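For reference, a minimal sketch of how this might be wired into the surrounding
KSP, assuming the petsc-3.1-era interfaces used in this thread (two-argument
PCFieldSplitSetIS, four-argument KSPSetOperators); the matrix A and the vectors
b and x are assumed to be assembled elsewhere. The split registered first
becomes fieldsplit_0, which is why the order matters:

    #include <petscksp.h>

    KSP ksp;
    PC  pc;
    KSPCreate(PETSC_COMM_WORLD, &ksp);
    KSPSetOperators(ksp, A, A, SAME_NONZERO_PATTERN); // A assembled elsewhere
    KSPGetPC(ksp, &pc);
    PCSetType(pc, PCFIELDSPLIT);
    PCFieldSplitSetIS(pc, isv);   // first split  -> fieldsplit_0 (velocity)
    PCFieldSplitSetIS(pc, isp);   // second split -> fieldsplit_1 (pressure)
    KSPSetFromOptions(ksp);       // picks up -pc_fieldsplit_type, -fieldsplit_*_pc_type, ...
    KSPSolve(ksp, b, x);          // b and x created elsewhere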
> >
> > Ok. Now the thing is, I haven't compiled the ml library, and it seems
> > like the petsc servers are down (I can't download anything). Anyway, I
> > wanted to try this with something else. Hence, I tried in SEQUENTIAL
> >
> > -pc_fieldsplit_type multiplicative -fieldsplit_0_pc_type ilu
> > -fieldsplit_0_pc_factor_levels 2 -fieldsplit_1_pc_type jacobi
> >
> > and this works fine; in parallel, with
> >
> > -pc_fieldsplit_type multiplicative -fieldsplit_0_pc_type asm
> > -fieldsplit_1_pc_type jacobi
> >
> > I get something working. I think I've been able to code this. I'll let
> > you know if I encounter some problems with the ml preconditioner (I have
> > no idea what it is, so I may have problems).
>
ML is just Algebraic Multigrid, like BoomerAMG, but easier to install in my
experience.
> > Regarding all this, I have to mention that until now I've had to reorder
> > my matrix elements with
> >
> > -pc_factor_mat_ordering_type rcm
> >
> > because it seems like the nodes of my mesh are not numbered
> > appropriately. How should I proceed for parallel computations? In
> > sequential, I gain about a factor of 8 in CPU time doing this.
>
You can do the same thing on each process. However, I would reorder the mesh
up front.
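A hedged note for the parallel case above: with ASM on the velocity block, the
RCM ordering presumably has to be requested for each subdomain factorization
through the sub-solver prefix, along the lines of

    -fieldsplit_0_sub_pc_type ilu -fieldsplit_0_sub_pc_factor_mat_ordering_type rcm

Running with -ksp_view is a quick way to confirm that the ordering option is
actually picked up by the sub-solvers.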
Matt
> > Thanks,
> >
> > L.
> >
> >
> >
> > Matthew Knepley wrote:
> >> On Wed, Dec 1, 2010 at 8:38 AM, Laurent Michel <laurent.michel at epfl.ch> wrote:
> >>
> >> Well, I'm actually struggling to find a good, scalable preconditioner
> >> for my problem. This is one of the problems I have to solve these days.
> >>
> >> I have spoken to a few people here who are kind of experts in the field
> >> of HPC, and I have always got the same answer: the Stokes problem will
> >> not scale well. I found some people working on more or less the same
> >> problem as I do who actually have a scalable code (sisiphus for
> >> instance; I am solving a glacier problem).
> >>
> >> I wanted to use the fieldsplit preconditioner (I already roughly
> >> discussed this with Mr. Knepley), but I'm not quite sure yet how to do
> >> it; I mean, I have an unstructured mesh, and I need to define some DAs
> >> to make the fieldsplit preconditioner work. It seems complicated.
> >>
> >>
> >> I did not do a good enough job explaining. FieldSplit should not be
> >> hard for your problem. You just need to create lists of the unknowns
> >> in each field. Each list will be an IS object. Then you call
> >>
> >> http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/PC/PCFieldSplitSetIS.html
> >>
> >> once for each field. Then you can independently handle the velocity and
> >> pressure preconditioners, and it should produce a scalable solver. I
> >> would try -pc_fieldsplit_type multiplicative -fieldsplit_0_pc_type ml
> >> -fieldsplit_1_pc_type jacobi
> >>
> >> Matt
> >>
> >>
> >> In parallel to this, I have seen that decoupling the velocity from the
> >> pressure field (with something like an Uzawa method; I'm not an expert
> >> here) may be more scalable. I gather this is the same as your fieldsplit
> >> preconditioner; if so, I'll have a more thorough look into it. To my
> >> (very limited) knowledge, there is no documentation on handling
> >> unstructured problems with your fieldsplit preconditioner, which I think
> >> might be the best option for me. ASM doesn't work well (I have awful
> >> results on my cluster; see attachment). I'm solving Ax = b here, with A
> >> being the whole finite element matrix, containing the data related to
> >> all the fields (velocity and pressure).
> >>
> >> Any hint on what I could try? It's not central for my PhD thesis, but
> >> I'd enjoy some faster code!
> >>
> >> Regards,
> >>
> >> L.
> >>
> >> Barry Smith wrote:
> >>> On Dec 1, 2010, at 2:44 AM, Laurent Michel wrote:
> >>>
> >>>
> >>>> Hi,
> >>>>
> >>>> * OS Version: Linux iacspc3 2.6.31-22-generic #68-Ubuntu SMP Tue Oct 26 16:37:17 UTC 2010 x86_64 GNU/Linux
> >>>> * PETSc Version: Fri Apr 30 20:23:44 CDT 2010
> >>>> * MPI implementation: openmpi
> >>>> * Compiler: g++
> >>>> * Probable PETSc component: PC
> >>>>
> >>>> My question is about the use of the Hypre preconditioners. I have a
> >>>> colleague who uses hypre directly and solves more or less the same
> >>>> kind of problem as I do. His code scales pretty well (3D
> >>>> Navier-Stokes), and I wanted to try hypre in mine, in order to see
> >>>> what happens. Indeed, for my problem, it's known that the algebraic
> >>>> Schwarz method will not scale that well (this is also what I have
> >>>> noticed during my numerical experiments). Anyway, I've tried to run
> >>>> my code with the following additional options:
> >>>>
> >>>> -pc_type hypre -pc_hypre_type euclid
> >>>>
> >>>> and it takes forever to solve the linear system (I mean, to apply the
> >>>> preconditioner and then actually solve the system).
> >>>>
> >>>
> >>> Likely Euclid is a bad preconditioner for your problem. Run with
> >>> -log_summary -ksp_converged_reason to completion and then look at the
> >>> output to see where it is spending all the time. If the time is truly
> >>> dominated by PCSetUp and/or PCApply, or the solve requires many, many
> >>> iterations, then Euclid is the problem. We've found that Euclid doesn't
> >>> scale or perform particularly well.
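For reference, such a diagnostic run could look like the following; the
executable name and process count are placeholders:

    mpiexec -n 8 ./my_solver -pc_type hypre -pc_hypre_type euclid \
        -ksp_converged_reason -log_summary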
> >>>
> >>> If your problem is a "Stokes"-like problem, then just throwing a
> >>> generic preconditioner like Euclid at it, without doing something
> >>> special for the fact that it is a "Stokes"-like problem, won't work
> >>> well. Perhaps Jed has some suggestions.
> >>>
> >>> Barry
> >>>
> >>>
> >>>> I'm using the MPIAIJ matrix format, with
> >>>> MatMPIAIJSetPreallocationCSR. Should I use another format? I've been
> >>>> struggling to find documentation about this. Where can I find any
> >>>> further information?
> >>>>
> >>>> Regards,
> >>>>
> >>>> L.
> >>>>
> >>>> --
> >>>> Laurent Michel
> >>>> PhD candidate
> >>>> EPFL SB IACS ASN
> >>>> MA C2 644
> >>>> +41 21 693 42 46
> >>>> +41 77 433 38 94
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>
> >>
> >> --
> >> Laurent Michel
> >> PhD candidate
> >> EPFL SB IACS ASN
> >> MA C2 644
> >> +41 21 693 42 46
> >> +41 77 433 38 94
> >>
> >>
> >>
> >>
> >>
> >> --
> >> What most experimenters take for granted before they begin their
> >> experiments is infinitely more interesting than any results to which
> >> their experiments lead.
> >> -- Norbert Wiener
> >
> >
> > --
> > Laurent Michel
> > PhD candidate
> > EPFL SB IACS ASN
> > MA C2 644
> > +41 21 693 42 46
> > +41 77 433 38 94
> >
> >
>
>
--
What most experimenters take for granted before they begin their experiments
is infinitely more interesting than any results to which their experiments
lead.
-- Norbert Wiener