<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#ffffff">
Hello,<br>
<br>
On 10/07/2011 05:00 AM, Jed Brown wrote:
<blockquote
cite="mid:CAM9tzSk7gFH=AM2ZLJpW_wvVPDCgesN4py53e=CzA_cR+iFTQA@mail.gmail.com"
type="cite">[Cc'ing petsc-dev]<br>
<br>
<div class="gmail_quote">On Thu, Oct 6, 2011 at 05:14, Jakub
Sistek <span dir="ltr"><<a moz-do-not-send="true"
href="mailto:sistek@math.cas.cz" target="_blank">sistek@math.cas.cz</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt
0.8ex; border-left: 1px solid rgb(204, 204, 204);
padding-left: 1ex;">
<div text="#000000" bgcolor="#ffffff"> Dear Jed,<br>
<br>
thank you for you explanations. In the meantime, I have
checked out the petsc-dev to look a bit at the components
you mention. The support of distributed meshes in PETSc is
one of the many things I do not know and I have never used.<br>
</div>
</blockquote>
<div><br>
</div>
<div>The unstructured mesh component of PETSc, "Sieve", is not
widely used at present, so we should not rely on it. Most
users have their own unstructured mesh component, so we want
to boil that down into common algebraic interfaces instead of
depending on any particular mesh.</div>
<div> </div>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt
0.8ex; border-left: 1px solid rgb(204, 204, 204);
padding-left: 1ex;">
<div text="#000000" bgcolor="#ffffff"> <br>
After reading your email, I think that your idea of writing
something like PCBDDC is quite possible. But before I start
individual comments, let me recall my idea of the core for
the coarse space in BDDC, which we should discuss in detail.<br>
As you know, the coarse space in BDDC is given by a small
set of constraints on continuity of the approximate solution
among subdomains. These encapsulate the subdomain matrices
into the saddle point problem, which is (hopefully) regular
and solving these problems substitutes the Neumann solves in
standard DD methods. This block of constraints on subdomain
i is usually called C_i in literature. Now the constraints
could be divided into two groups: <br>
(i) point-wise continuity - these translate to a single
entry of 1 in the corresponding row of C and are usually
related to so called corners<br>
(ii) generalized constrains - these are typically
equivalence of some averages between subdomains. These
averages might be arithmetic, or some weighted averages.
These translate into full rows of C comprised of weights of
the averages.<br>
<br>
It is sometimes beneficial to distinguish between the two in
implementations. Group (ii) can be to some extent automated
- faces, edges, and vertices of subdomains can be recognised
automatically in the preconditioner by counting the presence
of variables in subdomains.<br>
Group (i) is somehow more open. Identifying corners with
verices is usually not sufficient. Algorithms for better
selection depend on the problem type and are somewhat
problematic to automate.<br>
If there are enough constraints of type (ii), group (i) is
not necessary. On the other hand, some implementations rely
on the fact, that group (i) is already sufficient to make
the subdomain matrix regular and on the fact that these
constraints can be implemented as Dirichlet boundary
conditions to the subdomain problem. This then allows using
standard (even inexact) direct solvers for factorization of
subdomain matrix. Constraints (ii) my then be enforced via
lagrange multipliers or some other way.<br>
I have used this in the past when the aim was to use frontal
out-of-core solver (element by element loading) for
subdomain problems. <br>
I have to admit that I haven't implemented the method with
inexact solvers. In a more recent implementation, I have
this implemented by forming the entire saddle point problem
with C with both (i) and (ii) and using an indefinite direct
solver (MUMPS) even for SPD problems. I think this is not
possible for inexact local solves, but the previous approach
should be. <br>
</div>
</blockquote>
<div><br>
</div>
<div>Enforcing the general constraints is an important issue.
Klawonn & Widlund 2006 "Dual-primal FETI methods for
linear elasticity" has a section describing ways to perform
the change of basis. This is probably just different language,
but they change of basis keeps the subdomain problems
definite. Even when Lagrange multipliers are left in, they
describe how the unknowns can be ordered so that subdomain
direct solves can always be done without pivoting. I would
usually prefer to remove the constraints because iterative
solvers for saddle point problems are more technical.</div>
</div>
</blockquote>
Well, let me share my experience here. Mathematically, they are just
a different language. But not from the point of view of
implementation. In this respect, I like work of C. Dohrmann (2003,
2007), which is in my eye more implementation aware. I have already
tried to implement three ways to solve these saddle point problems,
changing of basis among them (A). The other two were splitting the
constraints into (i) and (ii), enforcing group (i) by changing the
system as Dirichlet boundary conditions and group (ii) by Lagrange
multipliers (B), and solving the saddle point system as such by a
direct method with suitable pivoting (C). Let me summarize (up to my
understanding to the approaches) my current (and very personal)
feeling of (A), (B) and (C):<br>
<br>
(A) <br>
(+) allows implementation of the preconditioner by solving system
with a large (assembled in subdomains and at coarse dofs)
distributed matrix<br>
(-) changing of basis produces a lot of fill into the matrix, which
becomes very expensive especially for faces among subdomains, making
direct solves after the change a lot more expensive<br>
(-) does not provide explicit coarse problem, which is the basis for
multilevel method<br>
(-) not simple to generalize for arbitrary weighted averages on
edges/faces (this may be my limited understanding)<br>
<br>
(B)<br>
(+) averages on faces do not present a problem<br>
(+) allows using variety of solvers for the block K (inexact,
iterative, ...)<br>
(-) requires two solves of the problem with K in each application of
preconditioner (one in constructing the RHS for Lagrange multipliers
problem, another one for actual solve with corrected RHS)<br>
(-) requires enough point constraints to regularize K for direct
solvers<br>
<br>
(C) <br>
(+) averages on faces do not present a problem<br>
(+) simplicity of implementation - no distinction between group (i)
and (ii) needed<br>
(-) quite restrictive for subdomain solvers, which need to be robust
for saddle-point type of systems<br>
<br>
Since inexact solvers seem to me now quite important, I would be
inclined to try (B) next time I would implement this task. This is
in my opinion pretty much the approach of Dohrmann (2007) in "An
approximate BDDC preconditioner". <br>
<blockquote
cite="mid:CAM9tzSk7gFH=AM2ZLJpW_wvVPDCgesN4py53e=CzA_cR+iFTQA@mail.gmail.com"
type="cite">
<div class="gmail_quote">
<div><br>
</div>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt
0.8ex; border-left: 1px solid rgb(204, 204, 204);
padding-left: 1ex;">
<div text="#000000" bgcolor="#ffffff">
<div> <br>
<blockquote type="cite">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin: 0pt 0pt
0pt 0.8ex; border-left: 1px solid rgb(204, 204,
204); padding-left: 1ex;">
<div text="#000000" bgcolor="#ffffff"> One thing I
particularly enjoy on PETSc is the quick
interchangeability of preconditioners and Krylov
methods within the KSP object. But I can see this
possible through strictly algebraic nature of the
approach, where only matrix object is passed. <br>
</div>
</blockquote>
<div><br>
</div>
<div>The KSP and PC objects have two "slots", the
Krylov operator A and the "preconditioning matrix"
B. I take a very liberal view of what B is. I
consider it to be a container into which any
problem/state-dependent information needed by the
preconditioner should be placed. Topological and
geometric information needed by the preconditioner
does not change between nonlinear iterations/time
steps/etc, so it can be given to the PC directly
(e.g. PCBDDCSetCoarseSpaceCandidates() or something
like that, though this could also be attached to the
Mat).</div>
</div>
</blockquote>
</div>
This would make some sense to me with respect to group (i)
of constraints, for example as index set of unknows to fix
on the subdomain.
<div><br>
<blockquote type="cite">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin: 0pt 0pt
0pt 0.8ex; border-left: 1px solid rgb(204, 204,
204); padding-left: 1ex;">
<div text="#000000" bgcolor="#ffffff"> On the other
hand, all of the FETI-DP and BDDC implementations
I have heard of are related to FEM computations
and make the mesh somewhat accessible to the
solver. Although I do not like this, also my third
generation of implementation of the BDDC method
still needs some limited information on geometry.
Not really for construction of the coarse basis
functions (this is algebraic in BDDC), but rather
indirectly for the selection of coarse degrees of
freedom. I am not aware of any existing approach
to selection of coarse DOFs at the moment, that
would not require some information on geometry for
robust selection on unstructured 3D meshes. I
could imagine that the required information could
be limited to positions of unknowns and some
information of the problem which is solved (the
nullspace size), the topology of the mesh is not
really necessary.<br>
</div>
</blockquote>
<div><br>
</div>
<div>We recently introduced MatSetNearNullSpace()
which is also needed by smoothed aggregation
algebraic multigrid. (We decided that this belonged
on the Mat because there are problems for which the
near null space could change depending on the
nonlinear regime, thus needing updating within a
nonlinear iteration. For multiphysics problems, it
is fragile to depend on access to the PC used for a
particular "block" (if it exists), so I prefer to
put information that may eventually need to be
composed with or interact with other "blocks" into
the Mat.)</div>
</div>
</blockquote>
</div>
May I know what is the difference between nullspace and
nearnullspace? I am not sure if this could be useful here...</div>
</blockquote>
<div><br>
</div>
<div>NullSpace is specifically for singular operators (e.g. the
global problem is floating). NearNullSpace is just describing
the low-energy modes (which need to be suitably represented in
the coarse space to have a scalable method).</div>
</div>
</blockquote>
Thank you for explanation. These would probably correspond to
subdomain coarse basis functions, which are computed in BDDC by
these saddle point problems rather than from an exact knowledge. The
coarse space here contains all the nullspaces of subdomains and
typically is much larger than exact nullspace of subdomain matrices
(e.g. for elasticity in 3D with cubic subdomains considering
arithmetic averages on faces and edges, each subdomain has 26*3 = 78
coarse basis functions compared to dimension 6 of exact nullspace.<br>
<blockquote
cite="mid:CAM9tzSk7gFH=AM2ZLJpW_wvVPDCgesN4py53e=CzA_cR+iFTQA@mail.gmail.com"
type="cite">
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt
0.8ex; border-left: 1px solid rgb(204, 204, 204);
padding-left: 1ex;">
<div text="#000000" bgcolor="#ffffff">
<div><br>
<blockquote type="cite">
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote" style="margin: 0pt 0pt
0pt 0.8ex; border-left: 1px solid rgb(204, 204,
204); padding-left: 1ex;">
<div text="#000000" bgcolor="#ffffff"> For this
difficulty, I do not see it simple to write
something like PCBDDC preconditioner that would
simply interchange with PCASM and others. The
situation would be simpler for BDDC if the
preconditioner could use also some kind of mesh
description.<br>
</div>
</blockquote>
<div><br>
</div>
<div>I agree that it may always be necessary to
provide extra information in order to use PCBDDC.
The goal would not be to have a solver that only
needs a (partially) assembled sparse matrix, but
rather to have a purely algebraic interface by which
that information can be provided.</div>
<div><br>
</div>
<div>Another way for the PC to access grid information
is through PCSetDM(). From the perspective of the
solver, the DM is just an interface for providing
grid- and discretization-dependent algebraic
ingredients to the solver. This enables users of DM
to have preconditioners automatically set up.</div>
</div>
</blockquote>
</div>
This could be one option. If one would let the user to
provide set of indices to constraint (essentially corners),
then the mesh would probably not be necessary.
<div><br>
<blockquote type="cite">
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote" style="margin: 0pt 0pt
0pt 0.8ex; border-left: 1px solid rgb(204, 204,
204); padding-left: 1ex;">
<div text="#000000" bgcolor="#ffffff"> <br>
The other issue I can see to be a bit conflicting
with the KSP approach of PETSc might be the fact,
that BDDC implementations introduce some coupling
between preconditioner and Krylov method, which is
in fact run only for the Schur complement problem
at the interface among subdomains. Multiplication
by the system matrix in Krylov method is performed
by Dirichlet solves on each subdomain, which
corresponds to passing a special matrix-vector
multiplying routine to the Krylov method - at
least, this is the approach I follow in my last
implementation of BDDC, in the BDDCML code, where
essentially the preconditioner provides the A*x
function to the Krylov method. <br>
I have seen this circumvented in PCNN by resolving
the vectors to the original size after each
application of the preconditioner, but in my
opinion, this approach then loses some of the
efficiency of running Krylov method on the Schur
complement problem instead of the original
problem, which usually has a great effect on
convergence by itself.<br>
</div>
</blockquote>
<div><br>
</div>
<div>There are tradeoffs both ways because iterating
in the full space can accommodate inexact subdomain
solves. There are a bunch of algorithms that use the
same ingredients and are essentially equivalent when
direct solvers are used, but different when inexact
solvers are used:</div>
<div><br>
</div>
<div>BDDC: iterate in interface space, needs exact
subdomain and coarse solves</div>
<div>BDDC/primal: iterate in interface space plus
coarse primal dofs, tolerant of inexact coarse level
solve</div>
<div>BDDC/full: iterate in full space, tolerant of
inexact subdomain and coarse solves</div>
<div>FETI-DP: iterate in space of Lagrange
multipliers, much like BDDC above</div>
<div>iFETI-DP: iterate in space of subdomains with
duplicate interface dofs, coarse primal dofs, and
Lagrange multipliers. tolerant of inexact subdomain
and coarse level solves</div>
<div>irFETI-DP: iterate in space of Lagrange
multipliers and coarse dofs, tolerant of inexact
coarse solves</div>
<div><br>
</div>
<div>One advantage of iterating in the full space is
that the method can naturally be used to
precondition a somewhat different matrix (e.g. a
higher order discretization of the same physics on
the same mesh) which can be applied matrix-free. Any
method that iterates in a reduced space simply
contains another KSP for that purpose.</div>
</div>
</blockquote>
</div>
I agree, that in the context of PETSc, one should probably
go for the full space after each application of the
preconditioner. <br>
</div>
</blockquote>
<div><br>
</div>
<div>The main disadvantage being that the Krylov space requires
more storage.</div>
<div> </div>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt
0.8ex; border-left: 1px solid rgb(204, 204, 204);
padding-left: 1ex;">
<div text="#000000" bgcolor="#ffffff">
<div>
<blockquote type="cite">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin: 0pt 0pt
0pt 0.8ex; border-left: 1px solid rgb(204, 204,
204); padding-left: 1ex;">
<div text="#000000" bgcolor="#ffffff"> <br>
Regarding problem types, I have little experience
with using BDDC beyond Poisson problems and
elasticity. Recently, I have done some tests with
Stokes problems and incompressible Navier-Stokes
problems, using "brute force" rather than any
delicacy you may have in mind. The initial
experience with Stokes problem using Taylor-Hood
elements is quite good, things get worse for
Navier-Stokes where the convergence, with the
current simple coarse problem, deteriorates
quickly with increasing Reynolds number. However,
all these things should be better tested and, as
you probably know, are rather recent topic of
research and no clear conclusions have been really
achieved.<br>
</div>
</blockquote>
<div><br>
</div>
<div>I'm curious about what sort of problems you
tested with Stokes. In particular, I'm interested in
problems containing thin structures with large jumps
in coefficients (e.g. 10^8). (In this application,
I'm only interested in the Stokes problem, Re=1e-20
for these problems.)</div>
<div><br>
</div>
</div>
</blockquote>
</div>
Well, I have tested the solver only on various modifications
of the 3D lid driven cavity problem (cubic geometry) for
Stokes. My set-up were Taylor-Hood elements (serendipity or
full). The problems were divided into regular or irregular
subdomains. I have tested the results up to some 512, resp.
1024 CPUs and ~28M dof. For all these, the method almost did
not increased number of iterations when H/h was kept fixed
and the overall solver also scaled quite well. This was a
bit surprising for me and required some development - now I
use arithmetic averages for each velocity component and
pressure as constraints. Without pressure averages, it did
not work so nicely. Also, as far as I know, these cases are
not covered by the theory for Stokes, which usually assumes
elements with discontinuous pressure.<br>
I haven't tested any contrast in coefficients for Stokes,
but I have done a few of such tests for elasticity and the
experience says that BDDC (with exact subdomain solvers)
behaves well with respect to coefficient jumps inside
subdomains, or aligned to subdomain boundaries. There are
problems with jumps going through faces of subdomains.<br>
</div>
</blockquote>
<div><br>
</div>
<div>How bad do you consider these problems to be? I've heard
rather mixed reports from different groups.</div>
</div>
</blockquote>
OK. As I have mentioned, I think that jumps inside domains are
naturally taken care for by subdomain solves, so very well for exact
solvers. Jumps aligned by interfaces (for example coefficients
constant for subdomains) are handled well by stiffness scaled
averaging operator. The remaining group present troubles. We have
created a small test problem (100k dofs) in a recent paper
(<a class="moz-txt-link-freetext" href="http://dx.doi.org/10.1016/j.matcom.2011.03.014">http://dx.doi.org/10.1016/j.matcom.2011.03.014</a>) for elasticity
computation, where much stiffer 'rods' embedded in softer material
were punching through faces of subdomains. There, the contrast in
coefficients was only 10^5 and the resulting estimated condition
number even after preconditioning ~2x10^4 when using arithmetic
averages on faces. I know it is not a complete test of the
phenomena, but one can conclude that the standard type of coarse
problem does not suffice here. I can remember that by playing with
the difference in coefficients, we were essentially able to make the
solver stagnate. Hopefully, I should look into it again in a month
or so, so I can keep you updated of a bit more systematic tests.
Papers by Pechstein and Scheichl (2009 a,b) contain systematic tests
already.<br>
<br>
<blockquote
cite="mid:CAM9tzSk7gFH=AM2ZLJpW_wvVPDCgesN4py53e=CzA_cR+iFTQA@mail.gmail.com"
type="cite">
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt
0.8ex; border-left: 1px solid rgb(204, 204, 204);
padding-left: 1ex;">
<div text="#000000" bgcolor="#ffffff"> Some of the Stokes
results were included in my recent talk (<a
moz-do-not-send="true"
href="http://www.math.cas.cz/%7Esistek/talks/Sistek-2011-ENUMATH-talk.pdf"
target="_blank">http://www.math.cas.cz/~sistek/talks/Sistek-2011-ENUMATH-talk.pdf</a>)
if you were interested. But I was more focused there about
timings of the solver when using more coarse levels. There,
the largest problems are not mentioned simply because in
those tables, I was always missing some case I did not
compute. <br>
<br>
I am wondering how you would like to proceed with the BDDC
implementation to PETSc, especially how users should specify
the coarse problem.<br>
</div>
</blockquote>
<div><br>
</div>
<div>I'm inclined to start by having the user specify the coarse
space with the matrix C as shown above. Once the basic method
is working, we can add more automatic support for defining C.</div>
</div>
</blockquote>
This sounds as a good starting point. I think that it would be
reasonable to start from the PCNN implementation, where most
components are already implemented.<br>
<blockquote
cite="mid:CAM9tzSk7gFH=AM2ZLJpW_wvVPDCgesN4py53e=CzA_cR+iFTQA@mail.gmail.com"
type="cite">
<div class="gmail_quote">
<div><br>
</div>
<div>I periodically entertain plans of making a partially
assembled format so that the user could assemble directly into
the hierarchical format that has done a change of basis to
make all coarse primal dofs explicit.</div>
<div><br>
</div>
<div>One concern I have is how to apply the method recursively
when rigid body modes are represented explicitly in coarse
spaces. That is, the coarse levels start having a different
number of dofs at different "nodes" (e.g. 6 dofs at a face,
but only 3 at a node).</div>
</div>
</blockquote>
<br>
I haven't been as accurate in our implementation of multilevel BDDC.
Although nullspaces of subdomains are contained, the coarse space is
typically a lot larger in applications I have tested. Averages are
generated over those "nodes" that have sufficient number of
unknowns. We have mostly used arithmetic averages (where we have 3
dofs on each face/edge/corner), and a bit of adaptive averages,
where it is unclear in advance, how many averages are going to be
enforced at a face, the upper limit of their number being the size
of the face. I am here more used to the simpler view of constraints
on continuity rather than coarse space properties - the more
constraints, the closer functions are at neighbouring subdomains,
and the better approximation one gets.<br>
<br>
Best regards,<br>
<br>
Jakub<br>
<br>
<pre class="moz-signature" cols="72">--
Jakub Sistek, Ph.D.
postdoctoral researcher
Institute of Mathematics of the AS CR
<a class="moz-txt-link-freetext" href="http://www.math.cas.cz/~sistek/">http://www.math.cas.cz/~sistek/</a>
tel: (+420) 222 090 710</pre>
</body>
</html>