MatMerge_SeqsToMPI

Fri May 23 11:20:14 CDT 2008

Thank you for the useful comments. For sure I will consider them. I've just starting my research by writing a DDM substructuring code which scales for now up-to 60 CPUs using petsc KSP solver for the interface problem and Lapack direct factorization for the interior problem. I split the domain using METIS library and assign each subdomain to one process then solve the global Schur complement using parallel preconditioned iterative solver. As an initial attempt, I solved a 2D elasticity problem (about 100,000 DOFs) within seconds using this algorithm. I notice Lapack solver for the interior problem takes a lot of time compare to the iterative solver for the interface, so now I am replacing the direct factorization with petsc KSP solver.

I would like very much to have a look at your implementation, and I think that will be very useful to me.

Thanks 

Waad

Lisandro Dalcin <dalcinl at gmail.com> wrote: On 5/20/08, Waad Subber  wrote:
> 1) How do you actually get the local Schur complements. You
> explicitelly compute its entries, or do you compute it after computing
> the inverse (or LU factors) of a 'local' matrix?
>
> I construct the local Schur complement matrices after getting the inversion
> of A_II matrix for each subdomain.

Fine,

> 2) Your R_i matrix is actually a matrix? In that case, it is a trivial
> restrinction operation with ones and zeros? Or R_i is actually a
> VecScatter?
>
> R_i is the restriction matrix maps the global boundary nodes to the local
> boundary nodes and its entries is zero and one I store it as spare matrix,
> so only I need to store the nonzero entries which one entry per a row

I believe a VecScatter will perform much better for this task.

> And finally: are you trying to apply a Krylov method over the global
> Schur complement? In such a case, are you going to implement a
> preconditioner for it?
>
> Yes, that what I am trying to do

Well, please let me make some comments. I've spent many days and month
optimizing Schur complement iterations, and I ended giving up. I was
never able to get it perform better than ASM preconditioner (iff
appropriatelly used, ie. solving local problems with LU, and
implementing subdomain subpartitioning the smart way, not the way
currently implemented in PETSc, were subpartitioning is done by chunks
of continuous rows).

If you are doing research on this, I would love to know your
conclusion when you get your work done. If you are doing all this just
with the hope of getting better running times, well, remember my above
comments but also remember that I do not consider myself a smart guy
;-)

As I said before, I worked hard for implementing general Schur
complement iteration. All this code is avalable in the SVN repository
of petsc4py (PETSc for Python), but it could be easily stripped out
for use in any PETSc-based code in C/C++. This implementation requires
the use of a MATIS matrix type (there is also a separate
implementation for MATMPIAIJ maatrices), I've implemented subdomain
subpartitioning (using a simple recursive graph splitting procedure
reusing matrix reordering routines built-in in PETSc, could be done
better with METIS); when the A_ii problems are large, their LU
factorization can be a real bootleneck.  I've even implemented a
global preconditioner operation for the interface problem, based on
iterating over a 'strip' of nodes around the interface; it improves
convergence and is usefull for ill-conditioned systems, but the costs
are increased.

If you ever want to take a look at my implemention for try to use it,
or perhaps take ideas for your own implementation, let me know.

> > Now having the Schur complement matrix for each subdomain, I need to solve
> > the interface problem
> (Sum[R_i^T*S_i*R_i])u=Sum[R_i^T*g_i],
> > .. i=1.. to No. of process (subdomains) in parallel.
> >
> > For the global vector I construct one MPI vector and use VecGetArray ()
> for
> > each of the sequential vector then use VecSetValues () to add the values
> > into the global MPI vector. That works fine.
> >
> > However for the global schur complement matix I try the same idea by
> > creating one parallel MPIAIJ matrix and using MatGetArray( ) and
> > MatSetValues () in order to add the values to the global matrix.
> > MatGetArray( ) gives me only the values without indices, so I don't know
> how
> > to add these valuse to the global MPI matrix.
> >
> > Thanks agin
> >
> > Waad
> >
> > Barry Smith wrote:
> >
> > On May 20, 2008, at 3:16 PM, Waad Subber wrote:
> >
> > > Thank you Matt,
> > >
> > > Any suggestion to solve the problem I am trying to tackle. I want to
> > > solve a linear system:
> > >
> > > Sum(A_i) u= Sum(f_i) , i=1.... to No. of CPUs.
> > >
> > > Where A_i is a sparse sequential matrix and f_i is a sequential
> > > vector. Each CPU has one matrix and one vector of the same size. Now
> > > I want to sum up and solve the system in parallel.
> >
> > Does each A_i have nonzero entries (mostly) associated with one
> > part of the matrix? Or does each process have values
> > scattered all around the matrix?
> >
> > In the former case you should simply create one parallel MPIAIJ
> > matrix and call MatSetValues() to put the values
> > into it. We don't have any kind of support for the later case, perhaps
> > if you describe how the matrix entries come about someone
> > would have suggestions on how to proceed.
> >
> > Barry
> >
> > >
> > >
> > > Thanks again
> > >
> > > Waad
> > >
> > > Matthew Knepley wrote: On Tue, May 20, 2008 at
> > > 2:12 PM, Waad Subber wrote:
> > > > Hi,
> > > >
> > > > I am trying to construct a sparse parallel matrix (MPIAIJ) by
> > > adding up
> > > > sparse sequential matrices (SeqAIJ) from each CPU. I am using
> > > >
> > > > MatMerge_SeqsToMPI(MPI_Comm comm,Mat seqmat,PetscInt m,PetscInt
> > > n,MatReuse
> > > > scall,Mat *mpimat)
> > > >
> > > > to do that. However, when I compile the code I get the following
> > > >
> > > > undefined reference to `matmerge_seqstompi_'
> > > > collect2: ld returned 1 exit status
> > > > make: *** [all] Error 1
> > > >
> > > > Am I using this function correctly ?
> > >
> > > These have no Fortran bindings right now.
> > >
> > > Matt
> > >
> > > > Thanks
> > > >
> > > > Waad
> > > >
> > >
> > >
> > >
> > > --
> > > What most experimenters take for granted before they begin their
> > > experiments is infinitely more interesting than any results to which
> > > their experiments lead.
> > > -- Norbert Wiener
> > >
> > >
> > >
> >
> >
> >
> >
> >
>
>
> --
> Lisandro Dalcín
> ---------------
> Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
> Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
> Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
> PTLC - Güemes 3450, (3000) Santa Fe, Argentina
> Tel/Fax: +54-(0)342-451.1594
>
>
>
>
>

-- 
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080523/46b8cef3/attachment.htm>