[petsc-dev] patch for BiCG on GPUs.

Matthew Knepley knepley at gmail.com
Mon Jan 7 10:29:33 CST 2013


On Mon, Jan 7, 2013 at 10:19 AM, Paul Mullowney <paulm at txcorp.com> wrote:

>  No problem. Thanks for the feedback.
>
> Sorry about the slow reply. Yes, petsc-dev is a good place for this. I
> have a couple comments. First, it's *much* better to split the
> "organization" part of the patch (which should be benign) from the new
> functionality. Small independent patches are so much easier to evaluate
> than monolithic patches. When in doubt, split them apart.
>
>   Ok.
>
> I looked at the code for VecCUSPGetArrayWrite() in cuspvecimpl.h. I can't
> see why the compiler complains about uninitialized values although this
> only happens, I think, when I compile in double complex.
>
>
>  -  CUSPARRAY      *xarray,*yarray;
> +  CUSPARRAY      *xarray,*yarray=PETSC_NULL;
>
> This is a very bad method for squashing warnings because it prevents
> valgrind and static analysis from checking for uninitialized access. It's
> much better to go to the routine that initializes yarray and have it
> initialize explicitly at the start. This fixes the warnings in all the
> cases I've encountered. (Hopefully this bug, in which the compiler
> sometimes does not figure out that that a return value is checked so that a
> variable is never used uninitialized, will eventually be fixed so this
> problem goes away.)
>
>  +  }
> +  else {
>
>    Ok. Will fix.
>
>   The PETSc developer's manual says to use '} else {' instead.
>
>  +PetscErrorCode MatSeqAIJCUSPARSEAnalyzeTransposeForSolve(Mat A)
> +{
> +  Mat_SeqAIJCUSPARSETriFactors *cusparseTriFactors  =
> (Mat_SeqAIJCUSPARSETriFactors*)A->spptr;
> +  GPU_Matrix_Ifc* cusparseMatLo  =
> (GPU_Matrix_Ifc*)cusparseTriFactors->loTriFactorPtr;
> +  GPU_Matrix_Ifc* cusparseMatUp  =
> (GPU_Matrix_Ifc*)cusparseTriFactors->upTriFactorPtr;
> +  cusparseStatus_t stat;
> +  PetscFunctionBegin;
> +  stat = cusparseMatLo->initializeCusparseMat(MAT_cusparseHandle,
> ...
>  PetscErrorCode MatSolveTranspose_SeqAIJCUSPARSE(Mat A,Vec bb,Vec xx)
> +{
> +  Mat_SeqAIJ     *a = (Mat_SeqAIJ*)A->data;
> +  PetscErrorCode ierr;
> +  CUSPARRAY      *xGPU=PETSC_NULL, *bGPU;
> +  cusparseStatus_t stat;
> +  Mat_SeqAIJCUSPARSETriFactors *cusparseTriFactors  =
> (Mat_SeqAIJCUSPARSETriFactors*)A->spptr;
> +  GPU_Matrix_Ifc *cusparseMatLo  =
> (GPU_Matrix_Ifc*)cusparseTriFactors->loTriFactorPtr;
> +  GPU_Matrix_Ifc *cusparseMatUp  =
> (GPU_Matrix_Ifc*)cusparseTriFactors->upTriFactorPtr;
> +  CUSPARRAY * tempGPU = (CUSPARRAY*) cusparseTriFactors->tempvec;
> +
> +  PetscFunctionBegin;
> +  /* Analyze the matrix ... on the fly */
> +  ierr = MatSeqAIJCUSPARSEAnalyzeTransposeForSolve(A);CHKERRQ(ierr);
>
>
> Yes this should only be called once, I will add a fix to this. Although
> there is no memory leak as I check this under the hood in txpetscgpu.
>
> Is there any way to determine from a MAT object whether a Tranpose solve
> is needed? That is, is there any global information that I can access that
> tells my I'm running BiCG and thus I need a Transpose Solve?  If so, then I
> could do all of this in the Setup functions (i.e. the functions that do the
> ILU factorization and then push the data down to the GPU.)
>

A Mat can tell you whether TransposeSolve is supported by the format. The
KSP could potentially tell you whether TransposeSolve is
used in the solve, although I do not think it does so now. Perhaps we could
add a query to the KSP interface.

   Matt


>    Is this a memory leak on repeat solves? In any case, shouldn't this be
> done once rather than every time MatSolveTranspose() is called? Same story
> with repeat calls to MatSeqAIJCUSPARSEGenerateTransposeForMult?
>
> +  ierr = VecCUSPGetArrayRead(xin,&xarray);CHKERRQ(ierr);
> +#if defined(PETSC_USE_COMPLEX)
> +  thrust::transform(xarray->begin(), xarray->end(), xarray->begin(),
> conjugate());
> +#endif
> +  ierr = VecCUSPRestoreArrayRead(xin,&xarray);CHKERRQ(ierr);
>
>
> Oops. Will fix.
>
>   WAT? The *Read* accessor returns a non-const pointer and you're using
> it to modify the data? Oh the humanity.
>
>
>  On Tue, Jan 1, 2013 at 2:20 PM, Paul Mullowney <paulm at txcorp.com> wrote:
>
>> Hi,
>>
>> Here's a patch for running BiCG on GPUs (with ILU(0)) preconditioners on
>> GPUs for the aijcusparse.cu class. I needed to add
>> (1) a VecConjugate implementation in veccusp.cu
>> (2) various methods in aijcusparse.cu for building the transpose
>> matrices for MatSolveTranspose* methods. The implementation of the solves
>> is done under the hood in the txpetscgpu library.
>>
>> Everything builds and runs fine on my end. Let me know if this ok to push.
>>
>> Also, do patches like this go to petsc-maint or petsc-dev?
>>
>> Thanks,
>> -Paul
>>
>>
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20130107/4dcb5eb0/attachment.html>


More information about the petsc-dev mailing list