[petsc-dev] patch for BiCG on GPUs.
Paul Mullowney
paulm at txcorp.com
Mon Jan 7 10:19:04 CST 2013
No problem. Thanks for the feedback.
> Sorry about the slow reply. Yes, petsc-dev is a good place for this. I
> have a couple comments. First, it's /much/ better to split the
> "organization" part of the patch (which should be benign) from the new
> functionality. Small independent patches are so much easier to
> evaluate than monolithic patches. When in doubt, split them apart.
>
Ok.
I looked at the code for VecCUSPGetArrayWrite() in cuspvecimpl.h, and I
can't see why the compiler complains about uninitialized values. As far
as I can tell, it only happens when I compile with double complex.
>
> - CUSPARRAY *xarray,*yarray;
> + CUSPARRAY *xarray,*yarray=PETSC_NULL;
> This is a very bad method for squashing warnings because it prevents
> valgrind and static analysis from checking for uninitialized access.
> It's much better to go to the routine that initializes yarray and have
> it initialize explicitly at the start. This fixes the warnings in all
> the cases I've encountered. (Hopefully this bug, in which the compiler
> sometimes does not figure out that a return value is checked so
> that a variable is never used uninitialized, will eventually be fixed
> so this problem goes away.)
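Just so we're on the same page, here is roughly what I understand the
preferred fix to look like inside the accessor itself (a sketch only;
the real VecCUSPGetArrayWrite() in cuspvecimpl.h does more work, and
VecCUSPAllocateCheck()/Vec_CUSP are names from memory, so treat them as
approximate):

  PETSC_STATIC_INLINE PetscErrorCode VecCUSPGetArrayWrite(Vec v,CUSPARRAY **a)
  {
    PetscErrorCode ierr;

    PetscFunctionBegin;
    *a   = PETSC_NULL;                            /* initialize the output explicitly so no caller ever sees an uninitialized pointer */
    ierr = VecCUSPAllocateCheck(v);CHKERRQ(ierr); /* make sure the device array exists before handing it back */
    *a   = ((Vec_CUSP*)v->spptr)->GPUarray;
    PetscFunctionReturn(0);
  }

With the initialization done here, the *yarray=PETSC_NULL at the call
sites is unnecessary and valgrind/static analysis keep their teeth.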
>
> + }
> + else {
>
Ok. Will fix.
> The PETSc developer's manual says to use '} else {' instead.
>
> +PetscErrorCode MatSeqAIJCUSPARSEAnalyzeTransposeForSolve(Mat A)
> +{
> + Mat_SeqAIJCUSPARSETriFactors *cusparseTriFactors = (Mat_SeqAIJCUSPARSETriFactors*)A->spptr;
> + GPU_Matrix_Ifc* cusparseMatLo = (GPU_Matrix_Ifc*)cusparseTriFactors->loTriFactorPtr;
> + GPU_Matrix_Ifc* cusparseMatUp = (GPU_Matrix_Ifc*)cusparseTriFactors->upTriFactorPtr;
> + cusparseStatus_t stat;
> + PetscFunctionBegin;
> + stat = cusparseMatLo->initializeCusparseMat(MAT_cusparseHandle,
> ...
> PetscErrorCode MatSolveTranspose_SeqAIJCUSPARSE(Mat A,Vec bb,Vec xx)
> +{
> + Mat_SeqAIJ *a = (Mat_SeqAIJ*)A->data;
> + PetscErrorCode ierr;
> + CUSPARRAY *xGPU=PETSC_NULL, *bGPU;
> + cusparseStatus_t stat;
> + Mat_SeqAIJCUSPARSETriFactors *cusparseTriFactors = (Mat_SeqAIJCUSPARSETriFactors*)A->spptr;
> + GPU_Matrix_Ifc *cusparseMatLo = (GPU_Matrix_Ifc*)cusparseTriFactors->loTriFactorPtr;
> + GPU_Matrix_Ifc *cusparseMatUp = (GPU_Matrix_Ifc*)cusparseTriFactors->upTriFactorPtr;
> + CUSPARRAY * tempGPU = (CUSPARRAY*) cusparseTriFactors->tempvec;
> +
> + PetscFunctionBegin;
> + /* Analyze the matrix ... on the fly */
> + ierr = MatSeqAIJCUSPARSEAnalyzeTransposeForSolve(A);CHKERRQ(ierr);
>
Yes, this should only be called once; I will add a fix for it. There is
no memory leak, though, since I check for this under the hood in
txpetscgpu.
Is there any way to determine from a Mat object whether a transpose solve
is needed? That is, is there any global information I can access that
tells me I'm running BiCG and thus need a transpose solve? If so, then I
could do all of this in the setup functions (i.e. the functions that do
the ILU factorization and then push the data down to the GPU).
> Is this a memory leak on repeat solves? In any case, shouldn't this be
> done once rather than every time MatSolveTranspose() is called? Same
> story with repeat calls to MatSeqAIJCUSPARSEGenerateTransposeForMult?
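For the once-only fix, I'm thinking of something along these lines
(sketch only; in the patch the bookkeeping actually lives inside
txpetscgpu, and the transposeAnalysisDone flag here is a made-up name,
not a field that exists today):

  PetscErrorCode MatSolveTranspose_SeqAIJCUSPARSE(Mat A,Vec bb,Vec xx)
  {
    PetscErrorCode                ierr;
    Mat_SeqAIJCUSPARSETriFactors *cusparseTriFactors = (Mat_SeqAIJCUSPARSETriFactors*)A->spptr;

    PetscFunctionBegin;
    if (!cusparseTriFactors->transposeAnalysisDone) { /* hypothetical flag; see note above */
      /* build and analyze the transposed factors only on the first transpose solve */
      ierr = MatSeqAIJCUSPARSEAnalyzeTransposeForSolve(A);CHKERRQ(ierr);
      cusparseTriFactors->transposeAnalysisDone = PETSC_TRUE;
    }
    /* ... lower/upper triangular solves against the transposed factors ... */
    PetscFunctionReturn(0);
  }

A similar guard would take care of the repeat calls to
MatSeqAIJCUSPARSEGenerateTransposeForMult.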
>
> + ierr = VecCUSPGetArrayRead(xin,&xarray);CHKERRQ(ierr);
> +#if defined(PETSC_USE_COMPLEX)
> + thrust::transform(xarray->begin(), xarray->end(), xarray->begin(), conjugate());
> +#endif
> + ierr = VecCUSPRestoreArrayRead(xin,&xarray);CHKERRQ(ierr);
>
Oops. Will fix.
> WAT? The *Read* accessor returns a non-const pointer and you're using
> it to modify the data? Oh the humanity.
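Right, and the fix is small. It will look something like this (sketch;
I'm assuming the VecCUSPGetArrayReadWrite()/VecCUSPRestoreArrayReadWrite()
pair is the right accessor for modifying in place, and conjugate() is the
functor defined in veccusp.cu as part of this patch):

  ierr = VecCUSPGetArrayReadWrite(xin,&xarray);CHKERRQ(ierr);
  #if defined(PETSC_USE_COMPLEX)
    /* conjugate each entry in place on the device */
    thrust::transform(xarray->begin(), xarray->end(), xarray->begin(), conjugate());
  #endif
  ierr = VecCUSPRestoreArrayReadWrite(xin,&xarray);CHKERRQ(ierr);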
>
>
> On Tue, Jan 1, 2013 at 2:20 PM, Paul Mullowney <paulm at txcorp.com> wrote:
>
> Hi,
>
>     Here's a patch for running BiCG with ILU(0) preconditioners on
>     GPUs for the aijcusparse.cu class. I needed to add
>     (1) a VecConjugate implementation in veccusp.cu
>     (2) various methods in aijcusparse.cu for building the transpose
>     matrices used by the MatSolveTranspose* methods.
>     The implementation of the solves is done under the hood in the
>     txpetscgpu library.
>
>     Everything builds and runs fine on my end. Let me know if this is
>     ok to push.
>
> Also, do patches like this go to petsc-maint or petsc-dev?
>
> Thanks,
> -Paul
>
>