[petsc-dev] patch for BiCG on GPUs.
Matthew Knepley
knepley at gmail.com
Mon Jan 7 10:29:33 CST 2013
On Mon, Jan 7, 2013 at 10:19 AM, Paul Mullowney <paulm at txcorp.com> wrote:
> No problem. Thanks for the feedback.
>
> Sorry about the slow reply. Yes, petsc-dev is a good place for this. I
> have a couple comments. First, it's *much* better to split the
> "organization" part of the patch (which should be benign) from the new
> functionality. Small independent patches are so much easier to evaluate
> than monolithic patches. When in doubt, split them apart.
>
> Ok.
>
> I looked at the code for VecCUSPGetArrayWrite() in cuspvecimpl.h. I can't
> see why the compiler complains about uninitialized values although this
> only happens, I think, when I compile in double complex.
>
>
> - CUSPARRAY *xarray,*yarray;
> + CUSPARRAY *xarray,*yarray=PETSC_NULL;
>
> This is a very bad method for squashing warnings because it prevents
> valgrind and static analysis from checking for uninitialized access. It's
> much better to go to the routine that initializes yarray and have it
> initialize explicitly at the start. This fixes the warnings in all the
> cases I've encountered. (Hopefully this bug, in which the compiler
> sometimes does not figure out that that a return value is checked so that a
> variable is never used uninitialized, will eventually be fixed so this
> problem goes away.)
>
> + }
> + else {
>
> Ok. Will fix.
>
> The PETSc developer's manual says to use '} else {' instead.
>
> +PetscErrorCode MatSeqAIJCUSPARSEAnalyzeTransposeForSolve(Mat A)
> +{
> + Mat_SeqAIJCUSPARSETriFactors *cusparseTriFactors =
> (Mat_SeqAIJCUSPARSETriFactors*)A->spptr;
> + GPU_Matrix_Ifc* cusparseMatLo =
> (GPU_Matrix_Ifc*)cusparseTriFactors->loTriFactorPtr;
> + GPU_Matrix_Ifc* cusparseMatUp =
> (GPU_Matrix_Ifc*)cusparseTriFactors->upTriFactorPtr;
> + cusparseStatus_t stat;
> + PetscFunctionBegin;
> + stat = cusparseMatLo->initializeCusparseMat(MAT_cusparseHandle,
> ...
> PetscErrorCode MatSolveTranspose_SeqAIJCUSPARSE(Mat A,Vec bb,Vec xx)
> +{
> + Mat_SeqAIJ *a = (Mat_SeqAIJ*)A->data;
> + PetscErrorCode ierr;
> + CUSPARRAY *xGPU=PETSC_NULL, *bGPU;
> + cusparseStatus_t stat;
> + Mat_SeqAIJCUSPARSETriFactors *cusparseTriFactors =
> (Mat_SeqAIJCUSPARSETriFactors*)A->spptr;
> + GPU_Matrix_Ifc *cusparseMatLo =
> (GPU_Matrix_Ifc*)cusparseTriFactors->loTriFactorPtr;
> + GPU_Matrix_Ifc *cusparseMatUp =
> (GPU_Matrix_Ifc*)cusparseTriFactors->upTriFactorPtr;
> + CUSPARRAY * tempGPU = (CUSPARRAY*) cusparseTriFactors->tempvec;
> +
> + PetscFunctionBegin;
> + /* Analyze the matrix ... on the fly */
> + ierr = MatSeqAIJCUSPARSEAnalyzeTransposeForSolve(A);CHKERRQ(ierr);
>
>
> Yes this should only be called once, I will add a fix to this. Although
> there is no memory leak as I check this under the hood in txpetscgpu.
>
> Is there any way to determine from a MAT object whether a Tranpose solve
> is needed? That is, is there any global information that I can access that
> tells my I'm running BiCG and thus I need a Transpose Solve? If so, then I
> could do all of this in the Setup functions (i.e. the functions that do the
> ILU factorization and then push the data down to the GPU.)
>
A Mat can tell you whether TransposeSolve is supported by the format. The
KSP could potentially tell you whether TransposeSolve is
used in the solve, although I do not think it does so now. Perhaps we could
add a query to the KSP interface.
Matt
> Is this a memory leak on repeat solves? In any case, shouldn't this be
> done once rather than every time MatSolveTranspose() is called? Same story
> with repeat calls to MatSeqAIJCUSPARSEGenerateTransposeForMult?
>
> + ierr = VecCUSPGetArrayRead(xin,&xarray);CHKERRQ(ierr);
> +#if defined(PETSC_USE_COMPLEX)
> + thrust::transform(xarray->begin(), xarray->end(), xarray->begin(),
> conjugate());
> +#endif
> + ierr = VecCUSPRestoreArrayRead(xin,&xarray);CHKERRQ(ierr);
>
>
> Oops. Will fix.
>
> WAT? The *Read* accessor returns a non-const pointer and you're using
> it to modify the data? Oh the humanity.
>
>
> On Tue, Jan 1, 2013 at 2:20 PM, Paul Mullowney <paulm at txcorp.com> wrote:
>
>> Hi,
>>
>> Here's a patch for running BiCG on GPUs (with ILU(0)) preconditioners on
>> GPUs for the aijcusparse.cu class. I needed to add
>> (1) a VecConjugate implementation in veccusp.cu
>> (2) various methods in aijcusparse.cu for building the transpose
>> matrices for MatSolveTranspose* methods. The implementation of the solves
>> is done under the hood in the txpetscgpu library.
>>
>> Everything builds and runs fine on my end. Let me know if this ok to push.
>>
>> Also, do patches like this go to petsc-maint or petsc-dev?
>>
>> Thanks,
>> -Paul
>>
>>
>
>
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20130107/4dcb5eb0/attachment.html>
More information about the petsc-dev
mailing list