[petsc-dev] API changes in MatIS

Stefano Zampini stefano.zampini at gmail.com
Mon May 21 15:45:46 CDT 2012


2012/5/21 Jed Brown <jedbrown at mcs.anl.gov>

> On Sat, May 12, 2012 at 7:09 AM, Stefano Zampini <
> stefano.zampini at gmail.com> wrote:
>
>> First, sorry for the long email...
>>
>> I think the construction of the prolongation/restriction operators, the
>> local part of the coarse matrix, and the assembly (or subassembly) of
>> the global coarse matrix should all belong to PCIS (with PCBDDC and PCNN as
>> subclasses). In fact, for both PCBDDC and PCNN, everything involved in the
>> preconditioner application can be viewed as a subassembled matrix
>> (prolongations/restrictions, and static condensation too). This would
>> require changing the current structure of MATIS to allow the creation of
>> rectangular operators mapping between two different spaces; MATIS creation
>> would thus need both a LocalToGlobalMapping (row mapping) and a
>> GlobalToLocalMapping (column mapping) argument.
>
>
> How is the column mapping a global-to-local?
>
> Global-to-local is a *bad* thing and best avoided by algorithms.
> Fortunately, I don't think you actually mean that, you just mean a column
> scatter which is perfectly scalable.
>

Yes. In the rectangular case you will need both global-to-local and
local-to-global scatters. The square case is the special case in which they
are the same scatter (just performed in SCATTER_FORWARD or
SCATTER_REVERSE mode). With rectangular matrices, you can perform MatMult
using SCATTER_FORWARD mode for the two different scatters, and
MatMultTranspose using SCATTER_REVERSE for both.
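
To make the intended usage concrete, here is a minimal sketch of how
MatMult could look for a rectangular MATIS. The struct fields rscatter,
cscatter, xl, yl and the function name are hypothetical, not existing
PETSc code; the column scatter is assumed to be created global->local and
the row scatter local->global, so both run in SCATTER_FORWARD here:

/* Hedged sketch, not actual PETSc source; assumes a Mat_IS-like
   struct with the hypothetical fields named below */
static PetscErrorCode MatMult_IS_Rectangular(Mat A, Vec x, Vec y)
{
  Mat_IS        *is = (Mat_IS *)A->data;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  /* gather the needed entries of the global column vector x into the
     subdomain-local work vector (column scatter, global->local) */
  ierr = VecScatterBegin(is->cscatter, x, is->xl, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterEnd(is->cscatter, x, is->xl, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
  /* local (subdomain) multiply */
  ierr = MatMult(is->A, is->xl, is->yl);CHKERRQ(ierr);
  /* add the local contributions into the global row vector y
     (row scatter, local->global) */
  ierr = VecSet(y, 0.0);CHKERRQ(ierr);
  ierr = VecScatterBegin(is->rscatter, is->yl, y, ADD_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterEnd(is->rscatter, is->yl, y, ADD_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

MatMultTranspose would run the same two scatters in SCATTER_REVERSE, in
the opposite order.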


>
>
>> The new logic I would like for the PCIS setup could be
>>
>> PCISSetup() /* common to all PCIS methods */
>> "PCISDealWithAllLocalStuffNeededByTheSpecificNonOverlappingMethod()"
>> PCISCreateRestrictionAndProlongationOperators(pc)
>> PCISAssembleLocalCoarseMat(pc)
>> PCISCreatePartitionOfCoarseMesh(pc,&partition)
>> PCISAssembleGlobalCoarseMat(pc,partition)
>>
>
> So this is starting to look rather similar to multigrid. I think that's a
> good thing, there is extra flexibility and consistency in making everything
> look like a variant of multigrid.
>
>
>>
>> "PCISDealWithAllLocalStuffNeededByTheSpecificNonOverlappingMethod()"
>> will contain the construction of the "Neumann" solver (for BDDC, it is
>> actually a saddle point problem)
>>
>
> I think what you're talking about is the pinned Neumann problem. Dohrmann
> solves this by first dropping the corners to make the remaining patch
> non-singular, factoring the result, and then enforcing the integral
> constraints with Lagrange multipliers (the Schur complement can be formed
> and factored). Although this is practical, especially for 1-1
> process-subdomain mapping, I'd rather not hard-code it. I can imagine
> having huge multi-process subdomains with many integral constraints as well
> as very small subdomains where it's more practical to just factor directly,
> perhaps after a change of basis.
>

Dohrmann's approach for solving the local saddle point problems is
closely related to how they are currently solved in PCBDDC. I'm planning to
add a new feature to PCBDDC giving the option to factor and solve the
local saddle point problems directly (i.e., factoring the whole local
saddle point matrix). I used the change of basis approach in the past,
but I'm more inclined to adopt the saddle point approach, because it lets
you switch to inexact subdomain solvers (almost) easily. However, I think
the change of basis could also be added to the current method with minor
code changes.
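
For concreteness, the local problem in question is the constrained energy
minimization of Dohrmann 2007, sketched here in my own notation (not taken
from the code): K is the subdomain "Neumann" matrix, C the sparse
constraint matrix, \Psi the coarse basis functions, \Lambda the Lagrange
multipliers:

  [ K  C^T ] [ \Psi    ]   [ 0 ]
  [ C   0  ] [ \Lambda ] = [ I ]

Factoring this indefinite matrix as a whole is the "direct" option I
mention above, while the change of basis instead transforms K so that the
constraints become explicit degrees of freedom.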


>
> Anyway, if I'm thinking of these methods as multigrid, we have:
>
> 1. composite smoother:
>     weighted split of load, optionally harmonically extend with Dirichlet
> problems, solve pinned Neumann problem, balance correction
>
> 2. restriction/prolongation:
>     formed by solving a series of pinned Neumann problems, Eq 2 of
> Dohrmann 2007.
>
> 3. Constructing coarse grid. Some write it as a Schur complement (which I
> was thinking of earlier because it's convenient for analysis), but Dohrmann
> writes it as PtAP. I realize that it's implemented differently from normal
> PtAP because it's done as a local dense operation that gets (perhaps
> partially) assembled.
>
>
Actually, in PCBDDC the local coarse matrix is computed by exploiting an
algebraic equivalence between the PtAP operation on the coarse basis and
the unassembled MATIS matrix (the coarse basis functions are continuous
only at vertex and constraint dofs). The explicit PtAP (where P is dense)
is avoided purely because of its computational cost.
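
The equivalence is, I believe, just the standard identity for the
energy-minimizing coarse basis (a hedged sketch using the notation of the
saddle point system above; sign conventions vary in the literature): since
K\Psi = -C^T\Lambda and C\Psi = I,

  \Psi^T K \Psi = -\Psi^T C^T \Lambda = -(C\Psi)^T \Lambda = -\Lambda

so the local coarse matrix falls out of the multipliers that were already
computed when building \Psi, and the dense triple product is never formed.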


> The coarse grid should be self-similar so it would reuse the components
> above.
>
> For my variants, I want to be able to insert an overlapping strip
> correction to the smoother and to add an operator-dependent adaptive
> process to coarse space enrichment. I believe that the smoother above also
> makes sense as a one-level method, when combined with some other correction
> that allows the coarse points to move. I'd like to eventually factor the
> code so that such things are easy.
>


I'm not familiar with multigrid, but if you can implement an MG framework
for the application of a nonoverlapping (or strip) preconditioner, that
would be great. I can help you with the Neumann-Neumann and BDDC cases.
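
Just so we are talking about the same wiring: I imagine a two-level method
phrased through PCMG roughly as follows. This is a hedged sketch under my
assumptions -- A, P, Acoarse and SetupTwoLevelMG are illustrative names for
the subassembled operator, the subassembled prolongator, the assembled
coarse matrix and the setup routine, and the "smoother" slot hosts the
subdomain correction; none of this is an existing PCBDDC code path:

#include <petscksp.h>

PetscErrorCode SetupTwoLevelMG(PC pc, Mat A, Mat P, Mat Acoarse)
{
  KSP            smoother, coarse;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = PCSetType(pc, PCMG);CHKERRQ(ierr);
  ierr = PCMGSetLevels(pc, 2, NULL);CHKERRQ(ierr);
  ierr = PCMGSetType(pc, PC_MG_ADDITIVE);CHKERRQ(ierr);   /* or multiplicative */
  ierr = PCMGSetInterpolation(pc, 1, P);CHKERRQ(ierr);    /* subassembled P */
  ierr = PCMGGetSmoother(pc, 1, &smoother);CHKERRQ(ierr); /* subdomain correction */
  ierr = KSPSetOperators(smoother, A, A, SAME_NONZERO_PATTERN);CHKERRQ(ierr);
  ierr = PCMGGetCoarseSolve(pc, &coarse);CHKERRQ(ierr);
  ierr = KSPSetOperators(coarse, Acoarse, Acoarse, SAME_NONZERO_PATTERN);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}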


>
>> For PCBDDC and PCNN:
>>
>> PCISCreateRestrictionAndProlongation_NN will create a MATIS representing
>> a default P which, in the case of a scalar PDE, will be the constant
>> function scaled by the partition of unity operator, with global dimensions
>> N x \sum_{i=1}^N pcis->n, where N is the number of subdomains and pcis->n
>> is the size of the local MATIS matrix (local Dirichlet plus interface
>> nodes); in the case of more complex vector-valued PDEs it will need a
>> MatNearNullSpace object, as already implemented in BDDC.
>>
>
> This sounds good. We can MatConvert this MATIS to something else if we
> want it assembled. As mentioned earlier, I want MATIS to support
> multiprocess subdomains as well as multiple subdomains per process.
>

I think that is a problem. All MATIS code is inherently written for
uniprocessor subdomains and one subdomain per process. Maybe it would be
better to think about a brand new class of matrices (your PA, I think) on
which to build the new class and subclasses of preconditioners. Maybe such
a new class could be implemented with SF instead of VecScatter.
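
Something along these lines is what I have in mind for the communication
layer (a hedged sketch with made-up names: roots are the owned global
dofs, leaves the subdomain-local copies, however many processes share a
subdomain):

#include <petscsf.h>

/* Hedged sketch, not an existing PETSc API: build the SF describing
   which global dof (root) each subdomain-local dof (leaf) copies. */
PetscErrorCode SubassembledSFCreate(MPI_Comm comm, PetscInt nroots,
                                    PetscInt nleaves, PetscSFNode *remotes,
                                    PetscSF *sf)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = PetscSFCreate(comm, sf);CHKERRQ(ierr);
  /* leaves are contiguous (ilocal = NULL); remotes[i] holds the
     (rank, index) of the root owning leaf i */
  ierr = PetscSFSetGraph(*sf, nroots, nleaves, NULL, PETSC_OWN_POINTER,
                         remotes, PETSC_OWN_POINTER);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

MatMult would then be PetscSFBcastBegin/End (roots to leaves), the local
multiplies, and PetscSFReduceBegin/End with MPIU_SUM (leaves back to
roots), independently of how many processes share a subdomain.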

>
>
>> PCISCreateRestrictionAndProlongation_BDDC (in case of exact solvers for
>> the Dirichlet problems): default P will be of size
>> n_coarse_dofs x \sum_{i=1}^N pcis->n_B, with the local matrices of P being
>> the actual pcbddc->coarse_phi_B
>>
>> PCISCreateRestrictionAndProlongation_BDDC (in case of inexact solvers
>> for the Dirichlet problems): default P will be of size
>> n_coarse_dofs x \sum_{i=1}^N pcis->n, with the local matrices of P being
>> the actual pcbddc->coarse_phi_B concatenated with pcbddc->coarse_phi_D
>> (pcis->n = pcis->n_B + pcis->n_D)
>>
>
> This sounds okay. The reduced space iteration with exact Dirichlet solvers
> bothers me somewhat. We should be able to implement it by running
> PCFieldSplit to restrict the inner iteration to the interface problem, but
> with our current data structures, that may have thrown away the interior
> information that we need.
>

I'm not getting the point here. Can you clarify?


>
>
>>
>> PCISAssembleLocalCoarseMat will assemble the sequential matrix
>> representing each subdomain's contribution to the global coarse matrix
>> (the _NN and _BDDC cases can easily be written using already existing code)
>>
>> PCISAssembleCoarseMat(pc,IS partition) would then decide how to finally
>> assemble the coarse matrix depending on the partition passed in (and
>> possibly change the row mapping of the default prolongation operators).
>>
>> Does this logic fit what you have in mind?
>
>
> Yeah, this sounds fine. I guess "local" here means "to the subdomain"
> rather than process?
>

Yes. I'm thinking of MATIS matrices, so "local" is equivalent to "to the
subdomain".

>
>
>>
>>
>> 2012/5/12 Jed Brown <jedbrown at mcs.anl.gov>
>>
>>> On Fri, May 11, 2012 at 6:11 PM, Stefano Zampini <
>>> stefano.zampini at gmail.com> wrote:
>>>
>>>> Isn't PtAP still the right operation, you just have a particular
>>>>> implementation that takes advantage of structure?
>>>>>
>>>>
>>>> yes it is, but since it is an expensive operation (P is dense), in
>>>> BDDC, once you have solved the local problems to create P, you obtain
>>>> almost immediately (and at very low cost) the columns of the coarse
>>>> matrix. The latter can be obtained (as it is implemented in the code) as
>>>> C^T\Lambda, where C is the local sparse matrix of constraints, and
>>>> \Lambda is a dense and small matrix of Lagrange multipliers.
>>>>
>>>>> I know you can also assemble B A^{-1} B^T, which is the same thing,
>>>>> and maybe we should provide a generic op for that.
>>>>>
>>>> What is B? the jump operator?
>>>>
>>>
>>> Your C above.
>>>
>>> I have other algorithms in mind where the interpolants would be
>>> constructed somewhat differently. I may need to think a bit about what the
>>> right common operation is for that case. I just feel like we may be getting
>>> too tightly dependent on the specific BDDC algorithm (which has exponential
>>> condition number growth in the number of levels), which we probably want to
>>> generalize in the future.
>>>
>>
>>
>>
>> --
>> Stefano
>>
>
>


-- 
Stefano