[petsc-dev] patch for BiCG on GPUs (reworked).

Karl Rupp rupp at mcs.anl.gov
Sat Feb 2 16:46:54 CST 2013


Hi,

alright, here we go:
https://bitbucket.org/petsc/petsc-dev/commits/ccdf0150dce67cfc50e1ec80872f3d5d

Satish, could you please upload txpetscgpu-0.0.9.tar.gz (and, if needed,
update the download URL in the build system)?

Thanks and best regards,
Karli


On 02/02/2013 02:04 PM, Paul Mullowney wrote:
> Hi Karl,
>
> I pulled from petsc-dev this morning and reworked the patch. Everything
> is working as expected. Regarding your comments, the CUSPARRAY *
> variables are initialized correctly in VecCUSPGetArrayRead() and
> VecCUSPGetArrayWrite(). Thus the initializations to PETSC_NULL are not
> required, and the compiler warnings are gone. In this patch, I fixed
> the initialization in VecCUSPGetArrayWrite() (ArrayRead() was already
> working correctly before this patch).
>
> Regarding your second comment, the PETSc KSP algorithms use an identity
> when doing Hermitian solves and multiplies. In particular, the
> conjugation of the input and output vectors is done so that one should
> only do the Transpose multiply and solve. For instance in bicg.c, one has
>
>    ierr = VecConjugate(Rl);CHKERRQ(ierr);
>    ierr = KSP_PCApplyTranspose(ksp,Rl,Zl);CHKERRQ(ierr);
>    ierr = VecConjugate(Rl);CHKERRQ(ierr);
>    ierr = VecConjugate(Zl);CHKERRQ(ierr);
>
> The conjugation of the input and output vectors forces one to use the
> Transpose solve and not the Hermitian solve. The same holds for the
> multiplies.
>
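
The identity Paul refers to is A^H x = conj(A^T conj(x)): conjugating the
input, applying the plain transpose, and conjugating the output gives the
Hermitian-transpose product. A minimal sketch in plain C99 (this is not
PETSc code; the dense row-major storage and all function names below are
made up for illustration):

```c
#include <complex.h>
#include <stddef.h>

/* y = A^T x for a dense n-by-n row-major matrix (no conjugation). */
void mult_transpose(size_t n, const double complex *A,
                    const double complex *x, double complex *y)
{
  for (size_t j = 0; j < n; ++j) {
    y[j] = 0.0;
    for (size_t i = 0; i < n; ++i) y[j] += A[i * n + j] * x[i];
  }
}

/* y = A^H x computed directly, as a reference. */
void mult_hermitian(size_t n, const double complex *A,
                    const double complex *x, double complex *y)
{
  for (size_t j = 0; j < n; ++j) {
    y[j] = 0.0;
    for (size_t i = 0; i < n; ++i) y[j] += conj(A[i * n + j]) * x[i];
  }
}

/* In-place complex conjugation of a vector. */
void vec_conjugate(size_t n, double complex *v)
{
  for (size_t i = 0; i < n; ++i) v[i] = conj(v[i]);
}

/* Same call pattern as the bicg.c excerpt above: conjugate the input,
   apply the transpose, restore the input, conjugate the output.
   On return y = A^H x and x is unchanged. */
void mult_hermitian_via_transpose(size_t n, const double complex *A,
                                  double complex *x, double complex *y)
{
  vec_conjugate(n, x);
  mult_transpose(n, A, x, y);
  vec_conjugate(n, x);
  vec_conjugate(n, y);
}
```

For real scalars the conjugations are no-ops and both routines reduce to
the transpose product, which would be consistent with a single TRANSPOSE
code path covering both the real and the complex build.
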
> Also attached is a new tarball for download once this patch is pushed.
> Thanks,
> -Paul
>
>> Hi Paul,
>>
>> just a few questions on your patch:
>>
>> I've spotted a few replacements of the kind:
>>   -  CUSPARRAY      *xGPU=PETSC_NULL, *bGPU=PETSC_NULL;
>>   +  CUSPARRAY      *xGPU, *bGPU;
>> Is this intentional? This is likely to lead to warnings. I skipped
>> these changes.
>>
>> Also, there is
>> -#if !defined(PETSC_USE_COMPLEX)
>>      ierr = cusparseMat->mat->multiply(...,TRANSPOSE);...
>> -#else
>> -    ierr = cusparseMat->mat->multiply(...,HERMITIAN);...
>> -#endif
>> Is it safe to throw out the Hermitian transpose here? I've seen that
>> the patch adds a kernel for the Hermitian transpose, but I want to make
>> sure this does not cause any side effects.
>>
>> A patch for the current tip is attached, including the removal of the
>> preprocessor switch for PETSC_USE_COMPLEX. However, I can't test it on
>> my AMD machine right now...
>>
>> Best regards,
>> Karli
>>
>>
>>
>>
>> On 02/01/2013 06:41 PM, Jed Brown wrote:
>>> That's gonna suck. Karl, can you apply his patch to the old code, run
>>> uncrustify on it, then send out the diff (which should apply cleanly to
>>> head).
>>>
>>> On Feb 1, 2013 6:32 PM, "Karl Rupp" <rupp at mcs.anl.gov> wrote:
>>>
>>>     Hi Paul,
>>>
>>>     I just uncrustified src/mat/impls/aij/* and pushed it to petsc-dev.
>>>     Could you please re-generate your patch based on the latest commit?
>>>
>>>     Thanks and best regards,
>>>     Karli
>>>
>>>
>>>     On 02/01/2013 06:11 PM, Paul Mullowney wrote:
>>>
>>>         Hi,
>>>
>>>         Here's a reworked patch for running BiCG (with ILU(0)
>>>         preconditioners) on GPUs for the aijcusparse.cu class. I
>>>         addressed the comments from the previous emails on this patch.
>>>         In particular, I added
>>>
>>>         (1) A VecConjugate implementation in veccusp.cu with the correct
>>>         method for getting the device ptr (VecCUSPGetArrayReadWrite()).
>>>         (2) Various methods in aijcusparse.cu for building the transpose
>>>         matrices for MatSolveTranspose* methods. The implementation of
>>>         the solves is done under the hood in the txpetscgpu library. A
>>>         protection was added to ensure the matrix generation routines
>>>         are only called once.
>>>         (3) I fixed the uninitialized compiler warning when building in
>>>         double complex. This required a slight fix in
>>>         VecCUSPGetArrayWrite().
>>>         (4) Small style fixes.
>>>
>>>         It wasn't clear to me how to break this patch up into a small
>>>         organizational patch and then a large implementation patch. If
>>>         you have suggestions on what corresponds to organization and
>>>         what corresponds to implementation, I can try to do that in
>>>         subsequent patches.
>>>
>>>         Everything builds and runs fine on my end.
>>>
>>>         Thanks,
>>>         -Paul
>>>
>>>
>>
>