[petsc-users] MatMult with one sequential and one parallel vector

Junchao Zhang junchao.zhang at gmail.com
Fri Oct 8 10:08:46 CDT 2021


Hi, Claudio,

You might be aware that petsc internally splits a MATMPIAIJ matrix A in two
MATSEQAIJ matrices Ad and Ao,  for the *diagonal* portion and the *off-diagonal
*portion respectively, see
https://petsc.org/release/docs/manualpages/Mat/MatCreateAIJ.html#MatCreateAIJ

There are tricks in MatMult().  Ao is a  'reduced' matrix, i.e., using
local column indices. Ao's column size (as returned by MatGetLocalSize) is
the number of nonzero columns in the off-diagonal portion.

petsc kind of implements MatMult(A,x, y) as y = Ad*x + Ao*lvec, where lvec
is a sequential vector with entries gathered/communicated from other
processes. petsc only communicates entries corresponding to nonzero
columns.

For your experimental purpose, you can un-split the matrix
with MatMPIAIJGetLocalMat(A,scall,&Aloc), and do  y2 = Aloc * x2.  Note y2
is a sequential vector, since Aloc and x2 are sequential.

You can alias y2 with y, by letting them share the data array. Basically,
VecGetArray(y,&a) and VecCreateSeqWithArray(..,a,&y2).  That is easy and
you can search petsc doc to find info.

For benchmarking, you need to consider the MatMPIAIJGetLocalMat() cost. It
does memory copying to merge Ad and Ao into Aloc.

Hope that helps.

--Junchao Zhang


On Fri, Oct 8, 2021 at 5:29 AM Claudio Kozický <koziccla at fit.cvut.cz> wrote:

> Hello,
>
> I am using PETSc in a performance comparison that evaluates the
> performance of parallel sparse matrix-vector multiplication (SpMV).  For
> this purpose I have implemented a simple SpMV operation using PETSc,
> which multiplies parallel matrix A (type MatAIJ) with parallel vector x1
> and stores the result in parallel vector y1.  Thus I perform SpMV using
> PETSc as MatMult(A, x1, y1).  This part works without any problems.
>
> I would also like to implement SpMV operation y2 = A * x2, where x2 is a
> sequential vector (i.e. created using VecCreateSeq) but where A and y2
> are still parallel.  The resulting implementation would be something
> like:
>
> MatCreateAIJ(..., &A);     // a parallel matrix
> VecCreateSeq(..., &x2);    // a per-process _sequential_ vector
> VecCreateMPI(..., &y2);    // a parallel vector
> MatMult(A, x2, y2);
>
> The motivation of storing all of x2 in each process to is remove the
> need of broadcasting any elements of x2 (this approach makes sense in
> the context of what I am benchmarking).  However I cannot seem to get
> this approach to work in PETSc.  For example when I try this approach
> with a 4-by-4 matrix, with two order-4 vectors and using two MPI
> processes, then PETSc prints:
>
> Nonconforming object sizes
> Mat mat,Vec x: local dim 2 4
>
> I have attached a minimal working example that demonstrates what I am
> attempting to perform.  Could it be that PETSc does not support
> combining a parallel and sequential vector in a single MatMult call?
>
> I have found functions for scattering and gathering vectors in the
> documentation of PETSc, but these do not seem to be a good match for
> what I am trying to benchmark.  My intention is for each process to keep
> an identical copy of vector x2 and therefore the necessity to scatter or
> gather values in x2 should never arise.
>
> I would appreciate if somebody could help point me in the right
> direction regarding my failing MatMult call.  Thanks!
>
> --
> Claudio Kozický
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20211008/4137a481/attachment.html>


More information about the petsc-users mailing list