[petsc-dev] [petsc-users] user experience with PCNN

Jed Brown jedbrown at mcs.anl.gov
Thu Oct 6 22:00:47 CDT 2011


[Cc'ing petsc-dev]

On Thu, Oct 6, 2011 at 05:14, Jakub Sistek <sistek at math.cas.cz> wrote:

> Dear Jed,
>
> thank you for your explanations. In the meantime, I have checked out
> petsc-dev to look a bit at the components you mention. Support for
> distributed meshes in PETSc is one of the many things I do not know and
> have never used.
>

The unstructured mesh component of PETSc, "Sieve", is not widely used at
present, so we should not rely on it. Most users have their own unstructured
mesh component, so we want to boil that down into common algebraic
interfaces instead of depending on any particular mesh.


>
> After reading your email, I think that your idea of writing something like
> PCBDDC is quite feasible. But before I get to individual comments, let me
> recall my view of the core of the coarse space in BDDC, which we should
> discuss in detail.
> As you know, the coarse space in BDDC is given by a small set of
> constraints on continuity of the approximate solution among subdomains.
> These encapsulate the subdomain matrices into a saddle point problem,
> which is (hopefully) regular; solving these saddle point problems
> substitutes for the Neumann solves in standard DD methods. This block of
> constraints on subdomain i is usually called C_i in the literature. The
> constraints can be divided into two groups:
> (i) point-wise continuity - these translate to a single entry of 1 in the
> corresponding row of C and are usually related to so-called corners;
> (ii) generalized constraints - these are typically equivalence of some
> averages between subdomains. The averages might be arithmetic or weighted,
> and they translate into full rows of C comprised of the weights of the
> averages.
>
> It is sometimes beneficial to distinguish between the two in
> implementations. Group (ii) can be automated to some extent - faces,
> edges, and vertices of subdomains can be recognised automatically in the
> preconditioner by counting in how many subdomains each variable is
> present. Group (i) is somewhat more open. Identifying corners with
> vertices is usually not sufficient. Algorithms for better selection
> depend on the problem type and are somewhat problematic to automate.
> If there are enough constraints of type (ii), group (i) is not necessary.
> On the other hand, some implementations rely on the fact that group (i)
> alone is sufficient to make the subdomain matrix regular, and that these
> constraints can be implemented as Dirichlet boundary conditions on the
> subdomain problem. This then allows using standard (even inexact) direct
> solvers for the factorization of the subdomain matrix. Constraints of
> type (ii) may then be enforced via Lagrange multipliers or in some other
> way. I have used this approach in the past when the aim was to use a
> frontal out-of-core solver (element-by-element loading) for the subdomain
> problems. I have to admit that I haven't implemented the method with
> inexact solvers. In a more recent implementation, I form the entire
> saddle point problem with C containing both (i) and (ii) and use an
> indefinite direct solver (MUMPS) even for SPD problems. I think this is
> not possible with inexact local solves, but the previous approach should
> be.
>

Enforcing the general constraints is an important issue. Klawonn & Widlund
2006, "Dual-primal FETI methods for linear elasticity", has a section
describing ways to perform the change of basis. This is probably just
different language, but their change of basis keeps the subdomain problems
definite. Even when Lagrange multipliers are left in, they describe how the
unknowns can be ordered so that subdomain direct solves can always be done
without pivoting. I would usually prefer to remove the constraints, because
iterative solvers for saddle point problems are more technical.
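
For concreteness, here is a sketch of the per-subdomain saddle point system
referred to above, in common BDDC notation (K_i is the local subdomain
matrix and \lambda_i are multipliers enforcing C_i u_i = 0):

  \begin{pmatrix} K_i & C_i^T \\ C_i & 0 \end{pmatrix}
  \begin{pmatrix} u_i \\ \lambda_i \end{pmatrix} =
  \begin{pmatrix} f_i \\ 0 \end{pmatrix}

A group (i) constraint contributes a row of C_i with a single 1; a group
(ii) average contributes a full row of weights. The change of basis
mentioned above transforms this system so that the leading block stays
definite.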


>>  One thing I particularly enjoy about PETSc is the quick
>> interchangeability of preconditioners and Krylov methods within the KSP
>> object. But I can see this is possible thanks to the strictly algebraic
>> nature of the approach, where only a matrix object is passed.
>>
>
>  The KSP and PC objects have two "slots", the Krylov operator A and the
> "preconditioning matrix" B. I take a very liberal view of what B is. I
> consider it to be a container into which any problem/state-dependent
> information needed by the preconditioner should be placed. Topological and
> geometric information needed by the preconditioner does not change between
> nonlinear iterations/time steps/etc, so it can be given to the PC directly
> (e.g. PCBDDCSetCoarseSpaceCandidates() or something like that, though this
> could also be attached to the Mat).
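
As a minimal sketch of the two "slots" (in the petsc-dev of this era,
KSPSetOperators() still takes a MatStructure flag; the PCBDDC call is the
hypothetical one mentioned above):

    Mat A, B;   /* Krylov operator and "preconditioning matrix" container */
    KSP ksp;
    PC  pc;
    KSPCreate(PETSC_COMM_WORLD, &ksp);
    KSPSetOperators(ksp, A, B, SAME_NONZERO_PATTERN);
    KSPGetPC(ksp, &pc);
    /* Topological/geometric data that does not change between nonlinear
       iterations could be given to the PC directly, e.g. the hypothetical
       PCBDDCSetCoarseSpaceCandidates(pc, candidates), or attached to B. */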
>
> This would make some sense to me with respect to group (i) of constraints,
> for example as an index set of unknowns to fix on the subdomain.
>
>>  On the other hand, all of the FETI-DP and BDDC implementations I have
>> heard of are related to FEM computations and make the mesh somewhat
>> accessible to the solver. Although I do not like this, even my third
>> generation of BDDC implementation still needs some limited information on
>> geometry - not really for the construction of the coarse basis functions
>> (this is algebraic in BDDC), but rather indirectly for the selection of
>> coarse degrees of freedom. I am not aware of any existing approach to the
>> selection of coarse DOFs that would not require some information on
>> geometry for robust selection on unstructured 3D meshes. I could imagine
>> that the required information could be limited to the positions of the
>> unknowns and some information about the problem being solved (the
>> nullspace size); the topology of the mesh is not really necessary.
>>
>
>  We recently introduced MatSetNearNullSpace(), which is also needed by
> smoothed aggregation algebraic multigrid. (We decided that this belonged on
> the Mat because there are problems for which the near null space could
> change depending on the nonlinear regime, thus needing updating within a
> nonlinear iteration. For multiphysics problems, it is fragile to depend on
> access to the PC used for a particular "block" (if it exists), so I prefer
> to put information that may eventually need to be composed with or interact
> with other "blocks" into the Mat.)
>
> May I ask what the difference is between the null space and the near null
> space? I am not sure whether this could be useful here...
>

NullSpace is specifically for singular operators (e.g. the global problem is
floating). NearNullSpace just describes the low-energy modes, which need to
be suitably represented in the coarse space to have a scalable method.
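
A minimal sketch of the difference in the API (in the petsc-dev of this time
the null space is attached via KSPSetNullSpace(), while the near null space
goes on the Mat; the rigid body modes here are assumed to be built by the
application):

    Mat          A;        /* assumed: assembled operator */
    KSP          ksp;      /* assumed: the Krylov solver */
    Vec          modes[6]; /* assumed: application-provided rigid body modes */
    MatNullSpace nsp, nearnsp;

    /* Singular operator (e.g. floating Poisson problem): the constants
       span the null space, so inform the Krylov solver. */
    MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_TRUE, 0, PETSC_NULL, &nsp);
    KSPSetNullSpace(ksp, nsp);

    /* Nonsingular elasticity: rigid body modes are merely low-energy, but
       the coarse space must represent them for a scalable method. */
    MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_FALSE, 6, modes, &nearnsp);
    MatSetNearNullSpace(A, nearnsp);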


>
>
>
>>  Because of this difficulty, I do not see it as simple to write something
>> like a PCBDDC preconditioner that would simply interchange with PCASM and
>> others. The situation would be simpler for BDDC if the preconditioner
>> could also use some kind of mesh description.
>>
>
>  I agree that it may always be necessary to provide extra information in
> order to use PCBDDC. The goal would not be to have a solver that only needs
> a (partially) assembled sparse matrix, but rather to have a purely algebraic
> interface by which that information can be provided.
>
>  Another way for the PC to access grid information is through PCSetDM().
> From the perspective of the solver, the DM is just an interface for
> providing grid- and discretization-dependent algebraic ingredients to the
> solver. This enables users of DM to have preconditioners automatically set
> up.
>
> This could be one option. If one let the user provide a set of indices to
> constrain (essentially corners), then the mesh would probably not be
> necessary.
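
As a minimal sketch, such input could be as simple as an index set (the
index values and the PCBDDC setter below are purely hypothetical):

    IS       corners;
    PetscInt idx[] = {12, 47, 93}; /* hypothetical global corner dofs */
    ISCreateGeneral(PETSC_COMM_WORLD, 3, idx, PETSC_COPY_VALUES, &corners);
    /* hypothetical interface: PCBDDCSetCornerIndices(pc, corners); */
    ISDestroy(&corners);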
>
>
>
>>
>> The other issue I see as somewhat in conflict with the KSP approach of
>> PETSc is that BDDC implementations introduce some coupling between the
>> preconditioner and the Krylov method, which is in fact run only on the
>> Schur complement problem at the interface between subdomains.
>> Multiplication by the system matrix in the Krylov method is performed by
>> Dirichlet solves on each subdomain, which corresponds to passing a special
>> matrix-vector multiplication routine to the Krylov method - at least, this
>> is the approach I follow in my latest implementation of BDDC, the BDDCML
>> code, where essentially the preconditioner provides the A*x function to
>> the Krylov method.
>> I have seen this circumvented in PCNN by expanding the vectors back to the
>> original size after each application of the preconditioner, but in my
>> opinion, this approach loses some of the efficiency of running the Krylov
>> method on the Schur complement problem instead of the original problem,
>> which by itself usually has a great effect on convergence.
>>
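
(For reference, writing \Gamma for interface and I for interior dofs, the
interface Schur complement being iterated on is

  S = A_{\Gamma\Gamma} - A_{\Gamma I} A_{II}^{-1} A_{I\Gamma},

so each multiplication by S requires a Dirichlet solve with A_{II} on each
subdomain, as described above.)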
>
>  There are tradeoffs both ways because iterating in the full space can
> accommodate inexact subdomain solves. There are a bunch of algorithms that
> use the same ingredients and are essentially equivalent when direct solvers
> are used, but different when inexact solvers are used:
>
>  - BDDC: iterate in the interface space; needs exact subdomain and coarse
>    solves
>  - BDDC/primal: iterate in the interface space plus coarse primal dofs;
>    tolerant of an inexact coarse level solve
>  - BDDC/full: iterate in the full space; tolerant of inexact subdomain and
>    coarse solves
>  - FETI-DP: iterate in the space of Lagrange multipliers; much like BDDC
>    above
>  - iFETI-DP: iterate in the space of subdomains with duplicated interface
>    dofs, coarse primal dofs, and Lagrange multipliers; tolerant of inexact
>    subdomain and coarse level solves
>  - irFETI-DP: iterate in the space of Lagrange multipliers and coarse
>    dofs; tolerant of inexact coarse solves
>
>  One advantage of iterating in the full space is that the method can
> naturally be used to precondition a somewhat different matrix (e.g. a higher
> order discretization of the same physics on the same mesh) which can be
> applied matrix-free. Any method that iterates in a reduced space simply
> contains another KSP for that purpose.
>
> I agree that, in the context of PETSc, one should probably return to the
> full space after each application of the preconditioner.
>

The main disadvantage is that the Krylov space requires more storage.


>
>> Regarding problem types, I have little experience with using BDDC beyond
>> Poisson problems and elasticity. Recently, I have done some tests with
>> Stokes problems and incompressible Navier-Stokes problems, using "brute
>> force" rather than any delicacy you may have in mind. The initial
>> experience with the Stokes problem using Taylor-Hood elements is quite
>> good; things get worse for Navier-Stokes, where the convergence, with the
>> current simple coarse problem, deteriorates quickly with increasing
>> Reynolds number. However, all these things should be tested more
>> thoroughly and, as you probably know, they are a rather recent topic of
>> research, and no clear conclusions have really been reached.
>>
>
>  I'm curious about what sort of problems you tested with Stokes. In
> particular, I'm interested in problems containing thin structures with large
> jumps in coefficients (e.g. 10^8). (In this application, I'm only interested
> in the Stokes problem, Re=1e-20 for these problems.)
>
>   Well, for Stokes I have tested the solver only on various modifications
> of the 3D lid driven cavity problem (cubic geometry). My set-up was
> Taylor-Hood elements (serendipity or full). The problems were divided into
> regular or irregular subdomains. I have tested the results up to some
> 512-1024 CPUs and ~28M dofs. For all these, the number of iterations
> barely increased when H/h was kept fixed, and the overall solver also
> scaled quite well. This was a bit surprising to me and required some
> development - now I use arithmetic averages of each velocity component and
> of pressure as constraints. Without the pressure averages, it did not work
> so nicely. Also, as far as I know, these cases are not covered by the
> theory for Stokes, which usually assumes elements with discontinuous
> pressure.
> I haven't tested any contrast in coefficients for Stokes, but I have done
> a few such tests for elasticity, and the experience is that BDDC (with
> exact subdomain solvers) behaves well with respect to coefficient jumps
> inside subdomains or aligned with subdomain boundaries. There are problems
> with jumps going through faces of subdomains.
>

How bad do you consider these problems to be? I've heard rather mixed
reports from different groups.


> Some of the Stokes results were included in my recent talk (
> http://www.math.cas.cz/~sistek/talks/Sistek-2011-ENUMATH-talk.pdf) in case
> you are interested. But there I focused more on timings of the solver when
> using more coarse levels. The largest problems are not mentioned there
> simply because in those tables I was always missing some case I had not
> computed.
>
> I am wondering how you would like to proceed with the BDDC implementation
> in PETSc, especially how users should specify the coarse problem.
>

I'm inclined to start by having the user specify the coarse space with the
matrix C as shown above. Once the basic method is working, we can add more
automatic support for defining C.
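
Concretely, a minimal sketch of what assembling such a C for one subdomain
might look like (all sizes and indices here are hypothetical; one row per
constraint, one column per local dof):

    Mat      C;
    PetscInt j, nface = 4;
    PetscInt face[] = {3, 7, 21, 35}; /* hypothetical dofs of one face */
    /* 2 constraints over 100 local dofs, at most nface nonzeros per row */
    MatCreateSeqAIJ(PETSC_COMM_SELF, 2, 100, nface, PETSC_NULL, &C);
    /* row 0: point-wise (corner) constraint, a single entry of 1 */
    MatSetValue(C, 0, 12, 1.0, INSERT_VALUES);
    /* row 1: arithmetic average over the face, a full row of weights */
    for (j = 0; j < nface; j++)
      MatSetValue(C, 1, face[j], 1.0/nface, INSERT_VALUES);
    MatAssemblyBegin(C, MAT_FINAL_ASSEMBLY);
    MatAssemblyEnd(C, MAT_FINAL_ASSEMBLY);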

I periodically entertain plans of making a partially assembled format so
that the user could assemble directly into a hierarchical format in which
the change of basis has already been done, making all coarse primal dofs
explicit.

One concern I have is how to apply the method recursively when rigid body
modes are represented explicitly in coarse spaces. That is, the coarse
levels start having a different number of dofs at different "nodes" (e.g. 6
dofs at a face, but only 3 at a node).