PETSc on GPUs

Richard Tran Mills rmills at climate.ornl.gov
Thu Sep 18 14:18:40 CDT 2008


Ahmed,

If you take a look in

   src/mat/impls/aij/mpi/

at mpiaij.h and mpiaij.c, you can see that an MPIAIJ matrix is stored locally 
as two "sequential" matrices, A and B.  A contains the "diagonal" portion of 
the matrix (the square block whose columns correspond to the rows owned by 
the local process), and B contains the "off-diagonal" portion (the remaining 
columns of those rows).  See page 59 of the PETSc User manual for a quick 
illustration of how this partitioning works.
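Incidentally, the same split is visible at the user level: when you 
preallocate an MPIAIJ matrix, you give nonzero counts for the two blocks 
separately.  A minimal fragment (assuming a Mat 'M' that has already been 
created and sized; the d_nnz and o_nnz arrays here are just illustrative):

   ierr = MatMPIAIJSetPreallocation(M,
            0, d_nnz,   /* per-row nonzeros landing in the "diagonal" block, A     */
            0, o_nnz);  /* per-row nonzeros landing in the "off-diagonal" block, B */
   CHKERRQ(ierr);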
Because vectors are partitioned row-wise among processes in the same manner 
as the matrix, each process already has the vector entries needed to form 
the portion of the matrix-vector product ("mat-vec") involving the diagonal 
portion 'A'; that part can be done entirely with data that reside locally.  
The off-diagonal portion 'B', however, requires gathering vector entries 
that reside off-process.  If you look in mpiaij.c at MatMult_MPIAIJ(), you 
will see the following lines:

   ierr = VecScatterBegin(a->Mvctx,xx,a->lvec,INSERT_VALUES,SCATTER_FORWARD);CHKERRQ(ierr);
   ierr = (*a->A->ops->mult)(a->A,xx,yy);CHKERRQ(ierr);
   ierr = VecScatterEnd(a->Mvctx,xx,a->lvec,INSERT_VALUES,SCATTER_FORWARD);CHKERRQ(ierr);
   ierr = (*a->B->ops->multadd)(a->B,a->lvec,yy,yy);CHKERRQ(ierr);

The mat-vec involving a->A can be completed without any VecScatter, but the 
one involving a->B cannot begin until the VecScatter has completed.  In 
effect, each process computes yy = A*xx + B*lvec, where lvec holds the 
gathered off-process entries of xx.  (Note that the VecScatterBegin() occurs 
before the mat-vec for a->A so that there may be some overlap of 
communication and computation.)
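If you want to watch this code path run, here is a little driver I sketched 
(my own example, not from the PETSc source tree, and written against recent 
PETSc calling sequences such as MatCreateVecs(); adjust for your version).  
It assembles a distributed 1-D Laplacian and calls MatMult(); on two or more 
processes the MatMult_MPIAIJ() routine above is what executes:

   static char help[] = "Sketch: exercise MatMult_MPIAIJ with a 1-D Laplacian.\n";

   #include <petscmat.h>

   int main(int argc, char **argv)
   {
     Mat            M;
     Vec            x, y;
     PetscInt       i, rstart, rend, N = 100;
     PetscErrorCode ierr;

     ierr = PetscInitialize(&argc, &argv, NULL, help);CHKERRQ(ierr);

     ierr = MatCreate(PETSC_COMM_WORLD, &M);CHKERRQ(ierr);
     ierr = MatSetSizes(M, PETSC_DECIDE, PETSC_DECIDE, N, N);CHKERRQ(ierr);
     ierr = MatSetFromOptions(M);CHKERRQ(ierr);  /* default type is AIJ (MPIAIJ in parallel) */
     ierr = MatSetUp(M);CHKERRQ(ierr);

     /* Tridiagonal stencil: couplings to rows i-1 and i+1 that are owned
        by a neighboring process land in the off-diagonal block B. */
     ierr = MatGetOwnershipRange(M, &rstart, &rend);CHKERRQ(ierr);
     for (i = rstart; i < rend; i++) {
       PetscScalar v[3]    = {-1.0, 2.0, -1.0};
       PetscInt    cols[3] = {i - 1, i, i + 1};
       if (i == 0) {
         ierr = MatSetValues(M, 1, &i, 2, &cols[1], &v[1], INSERT_VALUES);CHKERRQ(ierr);
       } else if (i == N - 1) {
         ierr = MatSetValues(M, 1, &i, 2, cols, v, INSERT_VALUES);CHKERRQ(ierr);
       } else {
         ierr = MatSetValues(M, 1, &i, 3, cols, v, INSERT_VALUES);CHKERRQ(ierr);
       }
     }
     ierr = MatAssemblyBegin(M, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
     ierr = MatAssemblyEnd(M, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

     ierr = MatCreateVecs(M, &x, &y);CHKERRQ(ierr);
     ierr = VecSet(x, 1.0);CHKERRQ(ierr);

     /* On >1 process this dispatches to MatMult_MPIAIJ() and runs the
        scatter / mult / multadd sequence quoted above. */
     ierr = MatMult(M, x, y);CHKERRQ(ierr);

     ierr = VecDestroy(&x);CHKERRQ(ierr);
     ierr = VecDestroy(&y);CHKERRQ(ierr);
     ierr = MatDestroy(&M);CHKERRQ(ierr);
     ierr = PetscFinalize();
     return ierr;
   }

Run it with something like 'mpiexec -n 2 ./ex -info'; among much other 
output, -info should report the VecScatter that gets built for the gather.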

Hopefully this helps elucidate what Matt meant, and didn't just confuse you.

--Richard

Ahmed El Zein wrote:
> On Thu, 2008-09-18 at 10:20 -0500, Matthew Knepley wrote:
>>> A question that I had regarding the PETSc code when I was thinking
>>> about this was:
>>> You have the SeqAIJ matrix type and the MPIAIJ type built around it
>>> (or that is what I understand from the code). So basically you
>>> implement the SeqAIJ type for the GPU and you get the MPI type for
>>> free?
>> Yes, that is true. However, note that in the MPI step, you will need
>> a gather operation to get the second matrix multiply to work.
> Matt,
> Could you explain a bit more?
> 
> Ahmed

-- 
Richard Tran Mills, Ph.D.            |   E-mail: rmills at climate.ornl.gov
Computational Scientist              |   Phone:  (865) 241-3198
Computational Earth Sciences Group   |   Fax:    (865) 574-0405
Oak Ridge National Laboratory        |   http://climate.ornl.gov/~rmills



