[petsc-dev] a few issues with current CUDA code for Mat.

Lisandro Dalcin dalcinl at gmail.com
Thu Aug 19 18:59:06 CDT 2010


I bet everyone involved is already aware of most of these issues, but just in case:

* MatAssemblyEnd_SeqAIJCUDA: what about mode=MAT_FLUSH_ASSEMBLY? What is
the point of copying to the GPU for an intermediate flush assembly? See
the sketch below.
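
I would have expected something like the following (MatCUDACopyToGPU is
just a made-up name here for whatever routine does the host-to-device
copy):

PetscErrorCode MatAssemblyEnd_SeqAIJCUDA(Mat A, MatAssemblyType mode)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = MatAssemblyEnd_SeqAIJ(A, mode);CHKERRQ(ierr);
  /* a flush assembly is only an intermediate step; the values can
     still change, so skip the host-to-device copy until the final
     assembly */
  if (mode == MAT_FLUSH_ASSEMBLY) PetscFunctionReturn(0);
  ierr = MatCUDACopyToGPU(A);CHKERRQ(ierr); /* hypothetical helper */
  PetscFunctionReturn(0);
}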

* MatAssemblyEnd_SeqAIJCUDA: the 'tempvec' cusp array is always
allocated, but it is not used by MatMult when there are no compressed
rows. Of course, this issue is very low priority.

* MatAssemblyEnd_SeqAIJCUDA: perhaps memory allocation on the GPU is
cheap, but if the nonzero pattern does not change, we could avoid
re-creating the GPU matrix from scratch, as sketched below.
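
For example, inside MatAssemblyEnd_SeqAIJCUDA one could do a
values-only refresh when the structure is unchanged. The typedef and
the helper name below are just for illustration; 'gpumat' stands for
the cusp matrix already cached in the SeqAIJCUDA data structure:

typedef cusp::csr_matrix<PetscInt,PetscScalar,cusp::device_memory> CUSPMatrix;

static PetscErrorCode MatCUDARefreshValues(Mat A, CUSPMatrix *gpumat)
{
  Mat_SeqAIJ *a = (Mat_SeqAIJ*)A->data;

  PetscFunctionBegin;
  if (gpumat && (PetscInt)gpumat->num_entries == a->nz
             && (PetscInt)gpumat->num_rows    == A->rmap->n) {
    /* same nonzero structure: overwrite only the device values array */
    thrust::copy(a->a, a->a + a->nz, gpumat->values.begin());
  } else {
    /* structure changed (or first assembly): rebuild the device matrix */
  }
  PetscFunctionReturn(0);
}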

* Some calls operate on already-assembled matrices (MatScale,
MatZeroRows, MatDiagonalScale, etc.). These operations need to
resynchronize the GPU copy, e.g. as in the sketch below. Am I missing
something?
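
The obvious (eager) fix would be to wrap each of these operations and
resynchronize afterwards; again, MatCUDACopyToGPU is a made-up name:

PetscErrorCode MatScale_SeqAIJCUDA(Mat A, PetscScalar alpha)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = MatScale_SeqAIJ(A, alpha);CHKERRQ(ierr); /* scales the host values */
  ierr = MatCUDACopyToGPU(A);CHKERRQ(ierr);       /* device copy is stale now */
  PetscFunctionReturn(0);
}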

* MatShift: seqaij does not implement MatShift, so the generic fallback
calls MatSetValues in a loop and then re-assembles the matrix (see the
sketch below). This causes an extra copy to the GPU (keep in mind that
user code has already assembled the matrix before calling MatShift).
Other calls, such as MatDiagonalSet, suffer from the same issue.
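
As far as I can tell, the fallback path amounts to something like this
(the function name is made up), which for SeqAIJCUDA means one more
host-to-device copy:

static PetscErrorCode MatShift_Fallback(Mat Y, PetscScalar alpha)
{
  PetscErrorCode ierr;
  PetscInt       i, rstart, rend;

  PetscFunctionBegin;
  ierr = MatGetOwnershipRange(Y, &rstart, &rend);CHKERRQ(ierr);
  /* add alpha to every diagonal entry, one MatSetValues call per row */
  for (i = rstart; i < rend; i++) {
    ierr = MatSetValues(Y, 1, &i, 1, &i, &alpha, ADD_VALUES);CHKERRQ(ierr);
  }
  /* re-assembly triggers another GPU copy for SeqAIJCUDA */
  ierr = MatAssemblyBegin(Y, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(Y, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}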

* MatGetArray: if the user updates values through the returned pointer,
the device copy becomes stale and we are in trouble; see the example
below.
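
For example, given an assembled Mat A and work vectors x, y, nothing
prevents user code from doing this, leaving the device values out of
date:

PetscScalar *vals;
ierr = MatGetArray(A, &vals);CHKERRQ(ierr);
vals[0] *= 2.0;                        /* host-side modification */
ierr = MatRestoreArray(A, &vals);CHKERRQ(ierr);
ierr = MatMult(A, x, y);CHKERRQ(ierr); /* may still use the old GPU values */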

All that being said, I'm still unsure why the GPU copying was
implemented in MatAssemblyEnd_SeqAIJCUDA. What about using a
valid_GPU_data flag for Mat, setting it to false in MatAssemblyBegin,
and doing the GPU copy at the time MatMult_SeqAIJCUDA is called, as
sketched below? Of course, such an approach would not solve all of the
previous issues... I'm just asking about the rationale for the current
approach.
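
A rough sketch of what I mean (valid_GPU_data and MatCUDACopyToGPU are
made-up names; the flag could live in Mat or in the SeqAIJCUDA data
structure):

/* mark the device copy invalid whenever a new assembly starts ... */
PetscErrorCode MatAssemblyBegin_SeqAIJCUDA(Mat A, MatAssemblyType mode)
{
  PetscFunctionBegin;
  /* (chain to any existing SeqAIJ assemblybegin here, if needed) */
  A->valid_GPU_data = PETSC_FALSE;   /* hypothetical flag */
  PetscFunctionReturn(0);
}

/* ... and copy lazily, only when the matrix is actually needed on the GPU */
PetscErrorCode MatMult_SeqAIJCUDA(Mat A, Vec x, Vec y)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  if (!A->valid_GPU_data) {
    ierr = MatCUDACopyToGPU(A);CHKERRQ(ierr); /* hypothetical helper */
    A->valid_GPU_data = PETSC_TRUE;
  }
  /* ... existing cusp-based multiply goes here ... */
  PetscFunctionReturn(0);
}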


-- 
Lisandro Dalcin
---------------
CIMEC (INTEC/CONICET-UNL)
Predio CONICET-Santa Fe
Colectora RN 168 Km 472, Paraje El Pozo
Tel: +54-342-4511594 (ext 1011)
Tel/Fax: +54-342-4511169


