<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

  <head>

    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">

  </head>

  <body text="#000000" bgcolor="#ffffff">

    Hello,<br>

    <br>

    On 10/07/2011 05:00 AM, Jed Brown wrote:

    <blockquote

cite="mid:CAM9tzSk7gFH=AM2ZLJpW_wvVPDCgesN4py53e=CzA_cR+iFTQA@mail.gmail.com"

      type="cite">[Cc'ing petsc-dev]<br>

      <br>

      <div class="gmail_quote">On Thu, Oct 6, 2011 at 05:14, Jakub

        Sistek <span dir="ltr"><<a moz-do-not-send="true"

            href="mailto:sistek@math.cas.cz" target="_blank">sistek@math.cas.cz</a>></span>

        wrote:<br>

        <blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt

          0.8ex; border-left: 1px solid rgb(204, 204, 204);

          padding-left: 1ex;">

          <div text="#000000" bgcolor="#ffffff"> Dear Jed,<br>

            <br>

            thank you for you explanations. In the meantime, I have

            checked out the petsc-dev to look a bit at the components

            you mention. The support of distributed meshes in PETSc is

            one of the many things I do not know and I have never used.<br>

          </div>

        </blockquote>

        <div><br>

        </div>

        <div>The unstructured mesh component of PETSc, "Sieve", is not

          widely used at present, so we should not rely on it. Most

          users have their own unstructured mesh component, so we want

          to boil that down into common algebraic interfaces instead of

          depending on any particular mesh.</div>

        <div> </div>

        <blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt

          0.8ex; border-left: 1px solid rgb(204, 204, 204);

          padding-left: 1ex;">

          <div text="#000000" bgcolor="#ffffff"> <br>

            After reading your email, I think that your idea of writing

            something like PCBDDC is quite possible. But before I start

            individual comments, let me recall my idea of the core for

            the coarse space in BDDC, which we should discuss in detail.<br>

            As you know, the coarse space in BDDC is given by a small

            set of constraints on continuity of the approximate solution

            among subdomains. These encapsulate the subdomain matrices

            into the saddle point problem, which is (hopefully) regular

            and solving these problems substitutes the Neumann solves in

            standard DD methods. This block of constraints on subdomain

            i is usually called C_i in literature. Now the constraints

            could be divided into two groups: <br>

            (i) point-wise continuity - these translate to a single

            entry of 1 in the corresponding row of C and are usually

            related to so called corners<br>

            (ii) generalized constrains - these are typically

            equivalence of some averages between subdomains. These

            averages might be arithmetic, or some weighted averages.

            These translate into full rows of C comprised of weights of

            the averages.<br>

            <br>

            It is sometimes beneficial to distinguish between the two in

            implementations. Group (ii) can be to some extent automated

            - faces, edges, and vertices of subdomains can be recognised

            automatically in the preconditioner by counting the presence

            of variables in subdomains.<br>

            Group (i) is somehow more open. Identifying corners with

            verices is usually not sufficient. Algorithms for better

            selection depend on the problem type and are somewhat

            problematic to automate.<br>

            If there are enough constraints of type (ii), group (i) is

            not necessary. On the other hand, some implementations rely

            on the fact, that group (i) is already sufficient to make

            the subdomain matrix regular and on the fact that these

            constraints can be implemented as Dirichlet boundary

            conditions to the subdomain problem. This then allows using

            standard (even inexact) direct solvers for factorization of

            subdomain matrix. Constraints (ii) my then be enforced via

            lagrange multipliers or some other way.<br>

            I have used this in the past when the aim was to use frontal

            out-of-core solver (element by element loading) for

            subdomain problems. <br>

            I have to admit that I haven't implemented the method with

            inexact solvers. In a more recent implementation, I have

            this implemented by forming the entire saddle point problem

            with C with both (i) and (ii) and using an indefinite direct

            solver (MUMPS) even for SPD problems. I think this is not

            possible for inexact local solves, but the previous approach

            should be. <br>

          </div>

        </blockquote>

        <div><br>

        </div>

        <div>Enforcing the general constraints is an important issue.

          Klawonn & Widlund 2006 "Dual-primal FETI methods for

          linear elasticity" has a section describing ways to perform

          the change of basis. This is probably just different language,

          but they change of basis keeps the subdomain problems

          definite. Even when Lagrange multipliers are left in, they

          describe how the unknowns can be ordered so that subdomain

          direct solves can always be done without pivoting. I would

          usually prefer to remove the constraints because iterative

          solvers for saddle point problems are more technical.</div>

      </div>

    </blockquote>

    Well, let me share my experience here. Mathematically, they are just

    a different language. But not from the point of view of

    implementation. In this respect, I like work of C. Dohrmann (2003,

    2007), which is in my eye more implementation aware. I have already

    tried to implement three ways to solve these saddle point problems,

    changing of basis among them (A). The other two were splitting the

    constraints into (i) and (ii), enforcing group (i) by changing the

    system as Dirichlet boundary conditions and group (ii) by Lagrange

    multipliers (B), and solving the saddle point system as such by a

    direct method with suitable pivoting (C). Let me summarize (up to my

    understanding to the approaches) my current (and very personal)

    feeling of (A), (B) and (C):<br>

    <br>

    (A) <br>

    (+) allows implementation of the preconditioner by solving system

    with a large (assembled in subdomains and at coarse dofs)

    distributed matrix<br>

    (-) changing of basis produces a lot of fill into the matrix, which

    becomes very expensive especially for faces among subdomains, making

    direct solves after the change a lot more expensive<br>

    (-) does not provide explicit coarse problem, which is the basis for

    multilevel method<br>

    (-) not simple to generalize for arbitrary weighted averages on

    edges/faces (this may be my limited understanding)<br>

    <br>

    (B)<br>

    (+) averages on faces do not present a problem<br>

    (+) allows using variety of solvers for the block K (inexact,

    iterative, ...)<br>

    (-) requires two solves of the problem with K in each application of

    preconditioner (one in constructing the RHS for Lagrange multipliers

    problem, another one for actual solve with corrected RHS)<br>

    (-) requires enough point constraints to regularize K for direct

    solvers<br>

    <br>

    (C) <br>

    (+) averages on faces do not present a problem<br>

    (+) simplicity of implementation - no distinction between group (i)

    and (ii) needed<br>

    (-) quite restrictive for subdomain solvers, which need to be robust

    for saddle-point type of systems<br>

    <br>

    Since inexact solvers seem to me now quite important, I would be

    inclined to try (B) next time I would implement this task. This is

    in my opinion pretty much the approach of Dohrmann (2007) in "An

    approximate BDDC preconditioner".  <br>

    <blockquote

cite="mid:CAM9tzSk7gFH=AM2ZLJpW_wvVPDCgesN4py53e=CzA_cR+iFTQA@mail.gmail.com"

      type="cite">

      <div class="gmail_quote">

        <div><br>

        </div>

        <blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt

          0.8ex; border-left: 1px solid rgb(204, 204, 204);

          padding-left: 1ex;">

          <div text="#000000" bgcolor="#ffffff">

            <div> <br>

              <blockquote type="cite">

                <div class="gmail_quote">

                  <blockquote class="gmail_quote" style="margin: 0pt 0pt

                    0pt 0.8ex; border-left: 1px solid rgb(204, 204,

                    204); padding-left: 1ex;">

                    <div text="#000000" bgcolor="#ffffff"> One thing I

                      particularly enjoy on PETSc is the quick

                      interchangeability of preconditioners and Krylov

                      methods within the KSP object. But I can see this

                      possible through strictly algebraic nature of the

                      approach, where only matrix object is passed. <br>

                    </div>

                  </blockquote>

                  <div><br>

                  </div>

                  <div>The KSP and PC objects have two "slots", the

                    Krylov operator A and the "preconditioning matrix"

                    B. I take a very liberal view of what B is. I

                    consider it to be a container into which any

                    problem/state-dependent information needed by the

                    preconditioner should be placed. Topological and

                    geometric information needed by the preconditioner

                    does not change between nonlinear iterations/time

                    steps/etc, so it can be given to the PC directly

                    (e.g. PCBDDCSetCoarseSpaceCandidates() or something

                    like that, though this could also be attached to the

                    Mat).</div>

                </div>

              </blockquote>

            </div>

            This would make some sense to me with respect to group (i)

            of constraints, for example as index set of unknows to fix

            on the subdomain.

            <div><br>

              <blockquote type="cite">

                <div class="gmail_quote">

                  <blockquote class="gmail_quote" style="margin: 0pt 0pt

                    0pt 0.8ex; border-left: 1px solid rgb(204, 204,

                    204); padding-left: 1ex;">

                    <div text="#000000" bgcolor="#ffffff"> On the other

                      hand, all of the FETI-DP and BDDC implementations

                      I have heard of are related to FEM computations

                      and make the mesh somewhat accessible to the

                      solver. Although I do not like this, also my third

                      generation of implementation of the BDDC method

                      still needs some limited information on geometry.

                      Not really for construction of the coarse basis

                      functions (this is algebraic in BDDC), but rather

                      indirectly for the selection of coarse degrees of

                      freedom. I am not aware of any existing approach

                      to selection of coarse DOFs at the moment, that

                      would not require some information on geometry for

                      robust selection on unstructured 3D meshes. I

                      could imagine that the required information could

                      be limited to positions of unknowns and some

                      information of the problem which is solved (the

                      nullspace size), the topology of the mesh is not

                      really necessary.<br>

                    </div>

                  </blockquote>

                  <div><br>

                  </div>

                  <div>We recently introduced MatSetNearNullSpace()

                    which is also needed by smoothed aggregation

                    algebraic multigrid. (We decided that this belonged

                    on the Mat because there are problems for which the

                    near null space could change depending on the

                    nonlinear regime, thus needing updating within a

                    nonlinear iteration. For multiphysics problems, it

                    is fragile to depend on access to the PC used for a

                    particular "block" (if it exists), so I prefer to

                    put information that may eventually need to be

                    composed with or interact with other "blocks" into

                    the Mat.)</div>

                </div>

              </blockquote>

            </div>

            May I know what is the difference between nullspace and

            nearnullspace? I am not sure if this could be useful here...</div>

        </blockquote>

        <div><br>

        </div>

        <div>NullSpace is specifically for singular operators (e.g. the

          global problem is floating). NearNullSpace is just describing

          the low-energy modes (which need to be suitably represented in

          the coarse space to have a scalable method).</div>

      </div>

    </blockquote>

    Thank you for explanation. These would probably correspond to

    subdomain coarse basis functions, which are computed in BDDC by

    these saddle point problems rather than from an exact knowledge. The

    coarse space here contains all the nullspaces of subdomains and

    typically is much larger than exact nullspace of subdomain matrices

    (e.g. for elasticity in 3D with cubic subdomains considering

    arithmetic averages on faces and edges, each subdomain has 26*3 = 78

    coarse basis functions compared to dimension 6 of exact nullspace.<br>

    <blockquote

cite="mid:CAM9tzSk7gFH=AM2ZLJpW_wvVPDCgesN4py53e=CzA_cR+iFTQA@mail.gmail.com"

      type="cite">

      <div class="gmail_quote">

        <div> </div>

        <blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt

          0.8ex; border-left: 1px solid rgb(204, 204, 204);

          padding-left: 1ex;">

          <div text="#000000" bgcolor="#ffffff">

            <div><br>

              <blockquote type="cite">

                <div class="gmail_quote">

                  <div> </div>

                  <blockquote class="gmail_quote" style="margin: 0pt 0pt

                    0pt 0.8ex; border-left: 1px solid rgb(204, 204,

                    204); padding-left: 1ex;">

                    <div text="#000000" bgcolor="#ffffff"> For this

                      difficulty, I do not see it simple to write

                      something like PCBDDC preconditioner that would

                      simply interchange with PCASM and others. The

                      situation would be simpler for BDDC if the

                      preconditioner could use also some kind of mesh

                      description.<br>

                    </div>

                  </blockquote>

                  <div><br>

                  </div>

                  <div>I agree that it may always be necessary to

                    provide extra information in order to use PCBDDC.

                    The goal would not be to have a solver that only

                    needs a (partially) assembled sparse matrix, but

                    rather to have a purely algebraic interface by which

                    that information can be provided.</div>

                  <div><br>

                  </div>

                  <div>Another way for the PC to access grid information

                    is through PCSetDM(). From the perspective of the

                    solver, the DM is just an interface for providing

                    grid- and discretization-dependent algebraic

                    ingredients to the solver. This enables users of DM

                    to have preconditioners automatically set up.</div>

                </div>

              </blockquote>

            </div>

            This could be one option. If one would let the user to

            provide set of indices to constraint (essentially corners),

            then the mesh would probably not be necessary.

            <div><br>

              <blockquote type="cite">

                <div class="gmail_quote">

                  <div> </div>

                  <blockquote class="gmail_quote" style="margin: 0pt 0pt

                    0pt 0.8ex; border-left: 1px solid rgb(204, 204,

                    204); padding-left: 1ex;">

                    <div text="#000000" bgcolor="#ffffff"> <br>

                      The other issue I can see to be a bit conflicting

                      with the KSP approach of PETSc might be the fact,

                      that BDDC implementations introduce some coupling

                      between preconditioner and Krylov method, which is

                      in fact run only for the Schur complement problem

                      at the interface among subdomains. Multiplication

                      by the system matrix in Krylov method is performed

                      by Dirichlet solves on each subdomain, which

                      corresponds to passing a special matrix-vector

                      multiplying routine to the Krylov method - at

                      least, this is the approach I follow in my last

                      implementation of BDDC, in the BDDCML code, where

                      essentially the preconditioner provides the A*x

                      function to the Krylov method. <br>

                      I have seen this circumvented in PCNN by resolving

                      the vectors to the original size after each

                      application of the preconditioner, but in my

                      opinion, this approach then loses some of the

                      efficiency of running Krylov method on the Schur

                      complement problem instead of the original

                      problem, which usually has a great effect on

                      convergence by itself.<br>

                    </div>

                  </blockquote>

                  <div><br>

                  </div>

                  <div>There are tradeoffs both ways because iterating

                    in the full space can accommodate inexact subdomain

                    solves. There are a bunch of algorithms that use the

                    same ingredients and are essentially equivalent when

                    direct solvers are used, but different when inexact

                    solvers are used:</div>

                  <div><br>

                  </div>

                  <div>BDDC: iterate in interface space, needs exact

                    subdomain and coarse solves</div>

                  <div>BDDC/primal: iterate in interface space plus

                    coarse primal dofs, tolerant of inexact coarse level

                    solve</div>

                  <div>BDDC/full: iterate in full space, tolerant of

                    inexact subdomain and coarse solves</div>

                  <div>FETI-DP: iterate in space of Lagrange

                    multipliers, much like BDDC above</div>

                  <div>iFETI-DP: iterate in space of subdomains with

                    duplicate interface dofs, coarse primal dofs, and

                    Lagrange multipliers. tolerant of inexact subdomain

                    and coarse level solves</div>

                  <div>irFETI-DP: iterate in space of Lagrange

                    multipliers and coarse dofs, tolerant of inexact

                    coarse solves</div>

                  <div><br>

                  </div>

                  <div>One advantage of iterating in the full space is

                    that the method can naturally be used to

                    precondition a somewhat different matrix (e.g. a

                    higher order discretization of the same physics on

                    the same mesh) which can be applied matrix-free. Any

                    method that iterates in a reduced space simply

                    contains another KSP for that purpose.</div>

                </div>

              </blockquote>

            </div>

            I agree, that in the context of PETSc, one should probably

            go for the full space after each application of the

            preconditioner. <br>

          </div>

        </blockquote>

        <div><br>

        </div>

        <div>The main disadvantage being that the Krylov space requires

          more storage.</div>

        <div> </div>

        <blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt

          0.8ex; border-left: 1px solid rgb(204, 204, 204);

          padding-left: 1ex;">

          <div text="#000000" bgcolor="#ffffff">

            <div>

              <blockquote type="cite">

                <div class="gmail_quote">

                  <blockquote class="gmail_quote" style="margin: 0pt 0pt

                    0pt 0.8ex; border-left: 1px solid rgb(204, 204,

                    204); padding-left: 1ex;">

                    <div text="#000000" bgcolor="#ffffff"> <br>

                      Regarding problem types, I have little experience

                      with using BDDC beyond Poisson problems and

                      elasticity. Recently, I have done some tests with

                      Stokes problems and incompressible Navier-Stokes

                      problems, using "brute force" rather than any

                      delicacy you may have in mind. The initial

                      experience with Stokes problem using Taylor-Hood

                      elements is quite good, things get worse for

                      Navier-Stokes where the convergence, with the

                      current simple coarse problem, deteriorates

                      quickly with increasing Reynolds number. However,

                      all these things should be better tested and, as

                      you probably know, are rather recent topic of

                      research and no clear conclusions have been really

                      achieved.<br>

                    </div>

                  </blockquote>

                  <div><br>

                  </div>

                  <div>I'm curious about what sort of problems you

                    tested with Stokes. In particular, I'm interested in

                    problems containing thin structures with large jumps

                    in coefficients (e.g. 10^8). (In this application,

                    I'm only interested in the Stokes problem, Re=1e-20

                    for these problems.)</div>

                  <div><br>

                  </div>

                </div>

              </blockquote>

            </div>

            Well, I have tested the solver only on various modifications

            of the 3D lid driven cavity problem (cubic geometry) for

            Stokes. My set-up were Taylor-Hood elements (serendipity or

            full). The problems were divided into regular or irregular

            subdomains. I have tested the results up to some 512, resp.

            1024 CPUs and ~28M dof. For all these, the method almost did

            not increased number of iterations when H/h was kept fixed

            and the overall solver also scaled quite well. This was a

            bit surprising for me and required some development - now I

            use arithmetic averages for each velocity component and

            pressure as constraints. Without pressure averages, it did

            not work so nicely. Also, as far as I know, these cases are

            not covered by the theory for Stokes, which usually assumes

            elements with discontinuous pressure.<br>

            I haven't tested any contrast in coefficients for Stokes,

            but I have done a few of such tests for elasticity and the

            experience says that BDDC (with exact subdomain solvers)

            behaves well with respect to coefficient jumps inside

            subdomains, or aligned to subdomain boundaries. There are

            problems with jumps going through faces of subdomains.<br>

          </div>

        </blockquote>

        <div><br>

        </div>

        <div>How bad do you consider these problems to be? I've heard

          rather mixed reports from different groups.</div>

      </div>

    </blockquote>

    OK. As I have mentioned, I think that jumps inside domains are

    naturally taken care for by subdomain solves, so very well for exact

    solvers. Jumps aligned by interfaces (for example coefficients

    constant for subdomains) are handled well by stiffness scaled

    averaging operator. The remaining group present troubles. We have

    created a small test problem (100k dofs) in a recent paper

    (<a class="moz-txt-link-freetext" href="http://dx.doi.org/10.1016/j.matcom.2011.03.014">http://dx.doi.org/10.1016/j.matcom.2011.03.014</a>) for elasticity

    computation, where much stiffer 'rods' embedded in softer material

    were punching through faces of subdomains. There, the contrast in

    coefficients was only 10^5 and the resulting estimated condition

    number even after preconditioning ~2x10^4 when using arithmetic

    averages on faces. I know it is not a complete test of the

    phenomena, but one can conclude that the standard type of coarse

    problem does not suffice here. I can remember that by playing with

    the difference in coefficients, we were essentially able to make the

    solver stagnate. Hopefully, I should look into it again in a month

    or so, so I can keep you updated of a bit more systematic tests.

    Papers by Pechstein and Scheichl (2009 a,b) contain systematic tests

    already.<br>

    <br>

    <blockquote

cite="mid:CAM9tzSk7gFH=AM2ZLJpW_wvVPDCgesN4py53e=CzA_cR+iFTQA@mail.gmail.com"

      type="cite">

      <div class="gmail_quote">

        <div> </div>

        <blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt

          0.8ex; border-left: 1px solid rgb(204, 204, 204);

          padding-left: 1ex;">

          <div text="#000000" bgcolor="#ffffff"> Some of the Stokes

            results were included in my recent talk (<a

              moz-do-not-send="true"

href="http://www.math.cas.cz/%7Esistek/talks/Sistek-2011-ENUMATH-talk.pdf"

              target="_blank">http://www.math.cas.cz/~sistek/talks/Sistek-2011-ENUMATH-talk.pdf</a>)

            if you were interested. But I was more focused there about

            timings of the solver when using more coarse levels. There,

            the largest problems are not mentioned simply because in

            those tables, I was always missing some case I did not

            compute. <br>

            <br>

            I am wondering how you would like to proceed with the BDDC

            implementation to PETSc, especially how users should specify

            the coarse problem.<br>

          </div>

        </blockquote>

        <div><br>

        </div>

        <div>I'm inclined to start by having the user specify the coarse

          space with the matrix C as shown above. Once the basic method

          is working, we can add more automatic support for defining C.</div>

      </div>

    </blockquote>

    This sounds as a good starting point. I think that it would be

    reasonable to start from the PCNN implementation, where most

    components are already implemented.<br>

    <blockquote

cite="mid:CAM9tzSk7gFH=AM2ZLJpW_wvVPDCgesN4py53e=CzA_cR+iFTQA@mail.gmail.com"

      type="cite">

      <div class="gmail_quote">

        <div><br>

        </div>

        <div>I periodically entertain plans of making a partially

          assembled format so that the user could assemble directly into

          the hierarchical format that has done a change of basis to

          make all coarse primal dofs explicit.</div>

        <div><br>

        </div>

        <div>One concern I have is how to apply the method recursively

          when rigid body modes are represented explicitly in coarse

          spaces. That is, the coarse levels start having a different

          number of dofs at different "nodes" (e.g. 6 dofs at a face,

          but only 3 at a node).</div>

      </div>

    </blockquote>

    <br>

    I haven't been as accurate in our implementation of multilevel BDDC.

    Although nullspaces of subdomains are contained, the coarse space is

    typically a lot larger in applications I have tested. Averages are

    generated over those "nodes" that have sufficient number of

    unknowns. We have mostly used arithmetic averages (where we have 3

    dofs on each face/edge/corner), and a bit of adaptive averages,

    where it is unclear in advance, how many averages are going to be

    enforced at a face, the upper limit of their number being the size

    of the face. I am here more used to the simpler view of constraints

    on continuity rather than coarse space properties - the more

    constraints, the closer functions are at neighbouring subdomains,

    and the better approximation one gets.<br>

    <br>

    Best regards,<br>

    <br>

    Jakub<br>

    <br>

    <pre class="moz-signature" cols="72">-- 

Jakub Sistek, Ph.D.

postdoctoral researcher

Institute of Mathematics of the AS CR

<a class="moz-txt-link-freetext" href="http://www.math.cas.cz/~sistek/">http://www.math.cas.cz/~sistek/</a>

tel: (+420) 222 090 710</pre>

  </body>

</html>