MatMerge_SeqsToMPI
Lisandro Dalcin
dalcinl at gmail.com
Fri May 23 13:45:22 CDT 2008
On 5/23/08, Barry Smith <bsmith at mcs.anl.gov> wrote:
> > I notice the LAPACK solver for the interior problem takes a lot of
> > time compared to the iterative solver for the interface, so now I am
> > replacing the direct factorization with a PETSc KSP solver.
> >
>
> In my experience you will want to use a KSP direct solve, for example
> just -ksp_type preonly -pc_type lu; using an iterative solver in such a
> Schur complement is always way too slow.
Indeed!!
Waad, do not do that. If your interior problem is big, the only way to
go is to implement subdomain subpartitioning. That is, in your local
subdomain, you have to select some DOFs and 'mark' them as 'interface'
nodes. Then you end up having many A_ii's in a local subdomain, each
corresponding to a sub-subdomain. Of course, implementing all this
logic is not trivial at all.
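
To make Barry's suggestion above concrete: a local interior solver set
up that way could look roughly like the sketch below. This is only an
illustration (PETSc 2.3.3-era calling conventions, which were current
when this thread was written; the names A_II and ksp are placeholders,
and error checking is omitted):

#include "petscksp.h"

/* Sketch: configure a "direct" KSP for the local interior problem
   A_II x = b, i.e. the equivalent of -ksp_type preonly -pc_type lu.
   The matrix lives on a single process, hence PETSC_COMM_SELF.
   Placeholder names; error checking omitted for brevity. */
PetscErrorCode SetupInteriorSolver(Mat A_II, KSP *ksp)
{
  PC pc;

  KSPCreate(PETSC_COMM_SELF, ksp);
  KSPSetOperators(*ksp, A_II, A_II, SAME_NONZERO_PATTERN);
  KSPSetType(*ksp, KSPPREONLY);   /* no Krylov iteration, just apply the PC */
  KSPGetPC(*ksp, &pc);
  PCSetType(pc, PCLU);            /* the "preconditioner" is a full LU solve */
  KSPSetFromOptions(*ksp);        /* still overridable from the command line */
  return 0;
}

Every subsequent KSPSolve() with this KSP reuses the same LU factors,
which is exactly what you want when A_II has to be solved against many
right-hand sides (as happens when building a Schur complement).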
My implementation of all this stuff can be found in the petsc4py SVN
repository hosted at Google Code. Here is the (long) link:
http://petsc4py.googlecode.com/svn/trunk/petsc/lib/ext/petsc/src/ksp/pc/impls/
In the directory 'schur' you have a version requiring the MATIS matrix
type (this is somewhat natural in the context of substructuring and
finite element methods). This corresponds to what Y. Saad's book calls
'edge-based' partitionings.
In the directory 'schur_aij' you have a version (never carefully
tested) working with MATMPIAIJ matrices. This corresponds to what
Y. Saad's book calls 'vertex-based' partitionings (more typical of
finite difference methods).
> > I would like very much to have a look at your implementation, and I
> > think that will be very useful to me.
> >
> > Thanks
> >
> > Waad
> >
> > Lisandro Dalcin <dalcinl at gmail.com> wrote: On 5/20/08, Waad Subber wrote:
> > > 1) How do you actually get the local Schur complements? Do you
> > > explicitly compute their entries, or do you compute them after
> > > computing the inverse (or LU factors) of a 'local' matrix?
> > >
> > > I construct the local Schur complement matrices after computing the
> > > inverse of the A_II matrix for each subdomain.
> >
> > Fine,
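
For reference only (this is not meant to describe Waad's code): one way
to form S_i = A_BB - A_BI * inv(A_II) * A_IB explicitly, while reusing a
single LU factorization of A_II through a preonly/LU KSP as sketched
earlier, is to build it column by column. All names are placeholders,
and S is assumed to start as a dense copy of A_BB (e.g. obtained with
MatConvert); 2008-era PETSc calls, no error checking:

#include "petscksp.h"

/* Sketch: S = A_BB - A_BI * inv(A_II) * A_IB, built one column at a time.
   'ksp_II' is a preonly/LU KSP on A_II; 'S' starts as a dense copy of
   A_BB.  Placeholder names; error checking omitted. */
PetscErrorCode FormLocalSchur(Mat A_IB, Mat A_BI, KSP ksp_II, Mat S)
{
  PetscInt     j, nI, nB, *rows;
  PetscScalar *ya;
  Vec          b, x, y;

  MatGetSize(A_IB, &nI, &nB);           /* A_IB is nI x nB               */
  VecCreateSeq(PETSC_COMM_SELF, nI, &b);
  VecDuplicate(b, &x);
  VecCreateSeq(PETSC_COMM_SELF, nB, &y);
  PetscMalloc(nB*sizeof(PetscInt), &rows);
  for (j = 0; j < nB; j++) rows[j] = j;

  for (j = 0; j < nB; j++) {
    MatGetColumnVector(A_IB, b, j);     /* b = A_IB e_j                  */
    KSPSolve(ksp_II, b, x);             /* x = inv(A_II) b, LU reused    */
    MatMult(A_BI, x, y);                /* y = A_BI inv(A_II) A_IB e_j   */
    VecScale(y, -1.0);
    VecGetArray(y, &ya);
    MatSetValues(S, nB, rows, 1, &j, ya, ADD_VALUES);  /* S(:,j) -= y    */
    VecRestoreArray(y, &ya);
  }
  MatAssemblyBegin(S, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(S, MAT_FINAL_ASSEMBLY);
  PetscFree(rows);
  VecDestroy(b); VecDestroy(x); VecDestroy(y);
  return 0;
}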
> >
> > > 2) Is your R_i matrix actually a matrix? In that case, is it a trivial
> > > restriction operation with ones and zeros? Or is R_i actually a
> > > VecScatter?
> > >
> > > R_i is the restriction matrix that maps the global boundary nodes to
> > > the local boundary nodes; its entries are zeros and ones. I store it
> > > as a sparse matrix, so I only need to store the nonzero entries,
> > > which is one entry per row.
> >
> > I believe a VecScatter will perform much better for this task.
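
For reference, setting up a VecScatter for this job could look roughly
like the sketch below; 'nb' (number of local interface DOFs) and
'bdry_global[]' (their global indices) are placeholders supplied by the
application, the calls follow the 2008-era PETSc API, and error checking
is omitted:

#include "petscvec.h"

/* Sketch: a VecScatter playing the role of the 0/1 restriction matrix R_i.
   Placeholder names; error checking omitted. */
PetscErrorCode SetupInterfaceScatter(Vec u_global, PetscInt nb,
                                     PetscInt bdry_global[],
                                     Vec *u_local, VecScatter *sct)
{
  IS is_bdry;

  VecCreateSeq(PETSC_COMM_SELF, nb, u_local);
  ISCreateGeneral(PETSC_COMM_SELF, nb, bdry_global, &is_bdry);
  VecScatterCreate(u_global, is_bdry, *u_local, PETSC_NULL, sct);
  ISDestroy(is_bdry);
  return 0;
}

Then VecScatterBegin/End with INSERT_VALUES and SCATTER_FORWARD is the
action of R_i (gather the interface values the subdomain needs), and
the same scatter with ADD_VALUES and SCATTER_REVERSE acts as R_i^T,
summing the local contributions back into the global interface vector.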
> >
> >
> > > And finally: are you trying to apply a Krylov method over the global
> > > Schur complement? In such a case, are you going to implement a
> > > preconditioner for it?
> > >
> > > Yes, that is what I am trying to do
> >
> > Well, please let me make some comments. I've spent many days and
> > months optimizing Schur complement iterations, and I ended up giving
> > up. I was never able to get it to perform better than the ASM
> > preconditioner (if appropriately used, i.e. solving local problems
> > with LU, and implementing subdomain subpartitioning the smart way, not
> > the way currently implemented in PETSc, where subpartitioning is done
> > by chunks of contiguous rows).
> >
> > If you are doing research on this, I would love to know your
> > conclusion when you get your work done. If you are doing all this just
> > with the hope of getting better running times, well, remember my above
> > comments but also remember that I do not consider myself a smart guy
> > ;-)
> >
> > As I said before, I worked hard on implementing a general Schur
> > complement iteration. All this code is available in the SVN repository
> > of petsc4py (PETSc for Python), but it could easily be stripped out
> > for use in any PETSc-based code in C/C++. This implementation requires
> > the use of a MATIS matrix type (there is also a separate
> > implementation for MATMPIAIJ matrices). I've implemented subdomain
> > subpartitioning (using a simple recursive graph-splitting procedure
> > reusing the matrix reordering routines built into PETSc; it could be
> > done better with METIS); when the A_ii problems are large, their LU
> > factorization can be a real bottleneck. I've even implemented a
> > global preconditioner operation for the interface problem, based on
> > iterating over a 'strip' of nodes around the interface; it improves
> > convergence and is useful for ill-conditioned systems, but the costs
> > are increased.
> >
> > If you ever want to take a look at my implementation to try to use
> > it, or perhaps take ideas for your own implementation, let me know.
> >
> >
> >
> >
> >
> > > > Now, having the Schur complement matrix for each subdomain, I need
> > > > to solve the interface problem
> > > > (Sum_i R_i^T S_i R_i) u = Sum_i R_i^T g_i,
> > > > for i = 1 to the number of processes (subdomains), in parallel.
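
As a side note (not something from this thread, just a sketch): the
operator Sum_i R_i^T S_i R_i never has to be assembled to run a Krylov
method on it; a MATSHELL whose MatMult applies the interface scatter,
the local S_i, and the reverse scatter is enough. Everything named
below is a placeholder:

#include "petscksp.h"

/* Sketch: matrix-free application of Sum_i R_i^T S_i R_i with a MATSHELL,
   using a VecScatter in place of R_i as above.  Placeholder names. */
typedef struct {
  Mat        S_local;            /* local Schur complement S_i (sequential) */
  VecScatter scatter;            /* plays the role of R_i / R_i^T           */
  Vec        x_local, y_local;   /* sequential work vectors                 */
} SchurCtx;

PetscErrorCode SchurMult(Mat A, Vec x, Vec y)
{
  SchurCtx *ctx;

  MatShellGetContext(A, (void **)&ctx);
  /* x_local = R_i x */
  VecScatterBegin(ctx->scatter, x, ctx->x_local, INSERT_VALUES, SCATTER_FORWARD);
  VecScatterEnd(ctx->scatter, x, ctx->x_local, INSERT_VALUES, SCATTER_FORWARD);
  /* y_local = S_i x_local */
  MatMult(ctx->S_local, ctx->x_local, ctx->y_local);
  /* y = Sum_i R_i^T y_local; ADD_VALUES sums the subdomain contributions */
  VecSet(y, 0.0);
  VecScatterBegin(ctx->scatter, ctx->y_local, y, ADD_VALUES, SCATTER_REVERSE);
  VecScatterEnd(ctx->scatter, ctx->y_local, y, ADD_VALUES, SCATTER_REVERSE);
  return 0;
}

/* Usage sketch (nb_owned = interface unknowns owned by this process,
   NB = global interface size; both placeholders):
     MatCreateShell(PETSC_COMM_WORLD, nb_owned, nb_owned, NB, NB, &ctx, &S_glob);
     MatShellSetOperation(S_glob, MATOP_MULT, (void (*)(void))SchurMult);
   and then hand S_glob to a KSP for the interface iteration. */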
> > > >
> > > > For the global vector, I construct one MPI vector and use
> > > > VecGetArray() on each of the sequential vectors, then use
> > > > VecSetValues() to add the values into the global MPI vector. That
> > > > works fine.
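
Spelled out, that vector assembly amounts to something like the
following sketch (placeholder names, error checking omitted):

#include "petscvec.h"

/* Sketch: add a sequential vector g_local into a parallel vector g_global
   at the global indices bdry_global[] (placeholder data). */
PetscErrorCode AddLocalToGlobal(Vec g_local, PetscInt bdry_global[], Vec g_global)
{
  PetscInt     n;
  PetscScalar *a;

  VecGetLocalSize(g_local, &n);
  VecGetArray(g_local, &a);                               /* local values ...   */
  VecSetValues(g_global, n, bdry_global, a, ADD_VALUES);  /* ... added by index */
  VecRestoreArray(g_local, &a);
  VecAssemblyBegin(g_global);   /* communicates the off-process contributions */
  VecAssemblyEnd(g_global);
  return 0;
}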
> > > >
> > > > However, for the global Schur complement matrix I tried the same
> > > > idea by creating one parallel MPIAIJ matrix and using MatGetArray()
> > > > and MatSetValues() in order to add the values to the global matrix.
> > > > MatGetArray() gives me only the values without indices, so I don't
> > > > know how to add these values to the global MPI matrix.
> > > >
> > > > Thanks again
> > > >
> > > > Waad
> > > >
> > > > Barry Smith wrote:
> > > >
> > > > On May 20, 2008, at 3:16 PM, Waad Subber wrote:
> > > >
> > > > > Thank you Matt,
> > > > >
> > > > > Any suggestions for the problem I am trying to tackle? I want to
> > > > > solve a linear system:
> > > > >
> > > > > Sum(A_i) u = Sum(f_i),   i = 1 to the number of CPUs.
> > > > >
> > > > > Where A_i is a sparse sequential matrix and f_i is a sequential
> > > > > vector. Each CPU has one matrix and one vector of the same size. Now
> > > > > I want to sum up and solve the system in parallel.
> > > >
> > > > Does each A_i have nonzero entries (mostly) associated with one
> > > > part of the matrix? Or does each process have values
> > > > scattered all around the matrix?
> > > >
> > > > In the former case you should simply create one parallel MPIAIJ
> > > > matrix and call MatSetValues() to put the values
> > > > into it. We don't have any kind of support for the latter case; perhaps
> > > > if you describe how the matrix entries come about someone
> > > > would have suggestions on how to proceed.
> > > >
> > > > Barry
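
To connect this with the MatGetArray() problem above: MatGetRow()
returns both the column indices and the values of a SeqAIJ row, so the
local matrix can be added into a parallel MPIAIJ matrix row by row,
roughly as in this sketch (the local-to-global index map l2g[] is a
placeholder; error checking omitted):

#include "petscmat.h"

/* Sketch: add a sequential matrix S_local into a parallel MPIAIJ matrix
   S_global with MatSetValues(), as Barry suggests.  'l2g[]' maps local
   row/column numbers to global ones (placeholder). */
PetscErrorCode AddSeqIntoMPI(Mat S_local, PetscInt l2g[], Mat S_global)
{
  PetscInt           i, j, m, n, ncols, grow, *gcols;
  const PetscInt    *cols;
  const PetscScalar *vals;

  MatGetSize(S_local, &m, &n);
  PetscMalloc(n*sizeof(PetscInt), &gcols);
  for (i = 0; i < m; i++) {
    MatGetRow(S_local, i, &ncols, &cols, &vals);
    for (j = 0; j < ncols; j++) gcols[j] = l2g[cols[j]];   /* local -> global cols */
    grow = l2g[i];                                         /* local -> global row  */
    MatSetValues(S_global, 1, &grow, ncols, gcols, vals, ADD_VALUES);
    MatRestoreRow(S_local, i, &ncols, &cols, &vals);
  }
  PetscFree(gcols);
  MatAssemblyBegin(S_global, MAT_FINAL_ASSEMBLY);  /* sums overlapping entries */
  MatAssemblyEnd(S_global, MAT_FINAL_ASSEMBLY);
  return 0;
}

The ADD_VALUES mode is what makes the Sum(A_i) in the equation above
come out right where subdomains contribute to the same global entries.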
> > > >
> > > > >
> > > > >
> > > > > Thanks again
> > > > >
> > > > > Waad
> > > > >
> > > > > Matthew Knepley wrote: On Tue, May 20, 2008 at
> > > > > 2:12 PM, Waad Subber wrote:
> > > > > > Hi,
> > > > > >
> > > > > > I am trying to construct a sparse parallel matrix (MPIAIJ) by
> > > > > > adding up sparse sequential matrices (SeqAIJ) from each CPU. I am
> > > > > > using
> > > > > >
> > > > > > MatMerge_SeqsToMPI(MPI_Comm comm, Mat seqmat, PetscInt m,
> > > > > >                    PetscInt n, MatReuse scall, Mat *mpimat)
> > > > > >
> > > > > > to do that. However, when I compile the code I get the following
> > > > > >
> > > > > > undefined reference to `matmerge_seqstompi_'
> > > > > > collect2: ld returned 1 exit status
> > > > > > make: *** [all] Error 1
> > > > > >
> > > > > > Am I using this function correctly?
> > > > >
> > > > > These have no Fortran bindings right now.
> > > > >
> > > > > Matt
> > > > >
> > > > > > Thanks
> > > > > >
> > > > > > Waad
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > What most experimenters take for granted before they begin their
> > > > > experiments is infinitely more interesting than any results to which
> > > > > their experiments lead.
> > > > > -- Norbert Wiener
> > > > >
--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594