[petsc-dev] patch for BiCG on GPUs.
Paul Mullowney
paulm at txcorp.com
Mon Jan 7 10:19:04 CST 2013
No problem. Thanks for the feedback.
> Sorry about the slow reply. Yes, petsc-dev is a good place for this. I
> have a couple comments. First, it's /much/ better to split the
> "organization" part of the patch (which should be benign) from the new
> functionality. Small independent patches are so much easier to
> evaluate than monolithic patches. When in doubt, split them apart.
>
Ok.
I looked at the code for VecCUSPGetArrayWrite() in cuspvecimpl.h, and I
can't see why the compiler complains about uninitialized values. As far
as I can tell, it only happens when I compile with double complex.
>
> - CUSPARRAY *xarray,*yarray;
> + CUSPARRAY *xarray,*yarray=PETSC_NULL;
> This is a very bad method for squashing warnings because it prevents
> valgrind and static analysis from checking for uninitialized access.
> It's much better to go to the routine that initializes yarray and have
> it initialize explicitly at the start. This fixes the warnings in all
> the cases I've encountered. (Hopefully this bug, in which the compiler
> sometimes does not figure out that a return value is checked so
> that a variable is never used uninitialized, will eventually be fixed
> so this problem goes away.)
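Just so we're on the same page, here is roughly what I understand the
preferred fix to look like inside the accessor itself (a sketch only;
the real VecCUSPGetArrayWrite() in cuspvecimpl.h does more work, and
VecCUSPAllocateCheck()/Vec_CUSP are names from memory, so treat them as
approximate):

  PETSC_STATIC_INLINE PetscErrorCode VecCUSPGetArrayWrite(Vec v,CUSPARRAY **a)
  {
    PetscErrorCode ierr;

    PetscFunctionBegin;
    *a   = PETSC_NULL;                            /* initialize the output explicitly so no caller ever sees an uninitialized pointer */
    ierr = VecCUSPAllocateCheck(v);CHKERRQ(ierr); /* make sure the device array exists before handing it back */
    *a   = ((Vec_CUSP*)v->spptr)->GPUarray;
    PetscFunctionReturn(0);
  }

With the initialization done here, the *yarray=PETSC_NULL at the call
sites is unnecessary and valgrind/static analysis keep their teeth.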
>
> + }
> + else {
>
Ok. Will fix.
> The PETSc developer's manual says to use '} else {' instead.
>
> +PetscErrorCode MatSeqAIJCUSPARSEAnalyzeTransposeForSolve(Mat A)
> +{
> + Mat_SeqAIJCUSPARSETriFactors *cusparseTriFactors = (Mat_SeqAIJCUSPARSETriFactors*)A->spptr;
> + GPU_Matrix_Ifc* cusparseMatLo = (GPU_Matrix_Ifc*)cusparseTriFactors->loTriFactorPtr;
> + GPU_Matrix_Ifc* cusparseMatUp = (GPU_Matrix_Ifc*)cusparseTriFactors->upTriFactorPtr;
> + cusparseStatus_t stat;
> + PetscFunctionBegin;
> + stat = cusparseMatLo->initializeCusparseMat(MAT_cusparseHandle,
> ...
> PetscErrorCode MatSolveTranspose_SeqAIJCUSPARSE(Mat A,Vec bb,Vec xx)
> +{
> + Mat_SeqAIJ *a = (Mat_SeqAIJ*)A->data;
> + PetscErrorCode ierr;
> + CUSPARRAY *xGPU=PETSC_NULL, *bGPU;
> + cusparseStatus_t stat;
> + Mat_SeqAIJCUSPARSETriFactors *cusparseTriFactors = (Mat_SeqAIJCUSPARSETriFactors*)A->spptr;
> + GPU_Matrix_Ifc *cusparseMatLo = (GPU_Matrix_Ifc*)cusparseTriFactors->loTriFactorPtr;
> + GPU_Matrix_Ifc *cusparseMatUp = (GPU_Matrix_Ifc*)cusparseTriFactors->upTriFactorPtr;
> + CUSPARRAY * tempGPU = (CUSPARRAY*) cusparseTriFactors->tempvec;
> +
> + PetscFunctionBegin;
> + /* Analyze the matrix ... on the fly */
> + ierr = MatSeqAIJCUSPARSEAnalyzeTransposeForSolve(A);CHKERRQ(ierr);
>
Yes, this should only be called once; I will add a fix for it. There is
no memory leak, though, since I check for this under the hood in
txpetscgpu.
Is there any way to determine from a Mat object whether a transpose solve
is needed? That is, is there any global information I can access that
tells me I'm running BiCG and thus need a transpose solve? If so, then I
could do all of this in the setup functions (i.e. the functions that do
the ILU factorization and then push the data down to the GPU).
> Is this a memory leak on repeat solves? In any case, shouldn't this be
> done once rather than every time MatSolveTranspose() is called? Same
> story with repeat calls to MatSeqAIJCUSPARSEGenerateTransposeForMult?
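For the once-only fix, I'm thinking of something along these lines
(sketch only; in the patch the bookkeeping actually lives inside
txpetscgpu, and the transposeAnalysisDone flag here is a made-up name,
not a field that exists today):

  PetscErrorCode MatSolveTranspose_SeqAIJCUSPARSE(Mat A,Vec bb,Vec xx)
  {
    PetscErrorCode                ierr;
    Mat_SeqAIJCUSPARSETriFactors *cusparseTriFactors = (Mat_SeqAIJCUSPARSETriFactors*)A->spptr;

    PetscFunctionBegin;
    if (!cusparseTriFactors->transposeAnalysisDone) { /* hypothetical flag; see note above */
      /* build and analyze the transposed factors only on the first transpose solve */
      ierr = MatSeqAIJCUSPARSEAnalyzeTransposeForSolve(A);CHKERRQ(ierr);
      cusparseTriFactors->transposeAnalysisDone = PETSC_TRUE;
    }
    /* ... lower/upper triangular solves against the transposed factors ... */
    PetscFunctionReturn(0);
  }

A similar guard would take care of the repeat calls to
MatSeqAIJCUSPARSEGenerateTransposeForMult.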
>
> + ierr = VecCUSPGetArrayRead(xin,&xarray);CHKERRQ(ierr);
> +#if defined(PETSC_USE_COMPLEX)
> + thrust::transform(xarray->begin(), xarray->end(), xarray->begin(), conjugate());
> +#endif
> + ierr = VecCUSPRestoreArrayRead(xin,&xarray);CHKERRQ(ierr);
>
Oops. Will fix.
> WAT? The *Read* accessor returns a non-const pointer and you're using
> it to modify the data? Oh the humanity.
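Right, and the fix is small. It will look something like this (sketch;
I'm assuming the VecCUSPGetArrayReadWrite()/VecCUSPRestoreArrayReadWrite()
pair is the right accessor for modifying in place, and conjugate() is the
functor defined in veccusp.cu as part of this patch):

  ierr = VecCUSPGetArrayReadWrite(xin,&xarray);CHKERRQ(ierr);
  #if defined(PETSC_USE_COMPLEX)
    /* conjugate each entry in place on the device */
    thrust::transform(xarray->begin(), xarray->end(), xarray->begin(), conjugate());
  #endif
  ierr = VecCUSPRestoreArrayReadWrite(xin,&xarray);CHKERRQ(ierr);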
>
>
> On Tue, Jan 1, 2013 at 2:20 PM, Paul Mullowney <paulm at txcorp.com> wrote:
>
> Hi,
>
>     Here's a patch for running BiCG with ILU(0) preconditioners on
>     GPUs for the aijcusparse.cu class. I needed to add
>     (1) a VecConjugate implementation in veccusp.cu
>     (2) various methods in aijcusparse.cu for building the transpose
>     matrices used by the MatSolveTranspose* methods.
>     The implementation of the solves is done under the hood in the
>     txpetscgpu library.
>
>     Everything builds and runs fine on my end. Let me know if this is
>     ok to push.
>
> Also, do patches like this go to petsc-maint or petsc-dev?
>
> Thanks,
> -Paul
>
>