[petsc-dev] patch for BiCG on GPUs (reworked).

Paul Mullowney paulm at txcorp.com
Sat Feb 2 14:04:06 CST 2013


Hi Karl,

I pulled from petsc-dev this morning and reworked the patch. Everything 
is working as expected. Regarding your comments, the CUSPARRAY * 
variables are initialized correctly in VecCUSPGetArrayRead() and 
VecCUSPGetArrayWrite(). Thus the initializations to PETSC_NULL are not 
required, and the compiler warnings are gone. In this patch, I fixed 
the initialization in VecCUSPGetArrayWrite() (ArrayRead() was working 
correctly prior to this patch).

Regarding your second comment, the PETSc KSP algorithms use an identity 
when doing Hermitian solves and multiplies: A^H x = conj(A^T conj(x)). 
The input and output vectors are conjugated explicitly, so one should 
only do the Transpose multiply and solve. For instance in bicg.c, one has

   ierr = VecConjugate(Rl);CHKERRQ(ierr);
   ierr = KSP_PCApplyTranspose(ksp,Rl,Zl);CHKERRQ(ierr);
   ierr = VecConjugate(Rl);CHKERRQ(ierr);
   ierr = VecConjugate(Zl);CHKERRQ(ierr);

The conjugation of the input and output vectors forces one to use the 
Transpose solve and not the Hermitian solve. The same holds for the 
multiplies.
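
To make the identity concrete, here is a small standalone C sketch. It is 
not PETSc code: the dense 3x3 matrix, the vector, and all function names 
are made up for illustration. It compares a direct A^H x against 
conj(A^T conj(x)), using the same conjugate / transpose / conjugate 
sequence as the bicg.c lines above.

```c
#include <complex.h>
#include <stddef.h>

#define N 3 /* illustrative size, not taken from PETSc */

/* y = A^T x: plain transpose, entries are NOT conjugated. */
static void mult_transpose(const double complex A[N][N],
                           const double complex x[N], double complex y[N])
{
  for (size_t j = 0; j < N; j++) {
    y[j] = 0.0;
    for (size_t i = 0; i < N; i++) y[j] += A[i][j] * x[i];
  }
}

/* y = A^H x: transpose with conjugated entries (reference result). */
static void mult_hermitian(const double complex A[N][N],
                           const double complex x[N], double complex y[N])
{
  for (size_t j = 0; j < N; j++) {
    y[j] = 0.0;
    for (size_t i = 0; i < N; i++) y[j] += conj(A[i][j]) * x[i];
  }
}

/* In-place conjugation, playing the role of VecConjugate(). */
static void conjugate(double complex v[N])
{
  for (size_t i = 0; i < N; i++) v[i] = conj(v[i]);
}

/* Returns the largest squared magnitude of A^H x - conj(A^T conj(x)). */
static double identity_residual_sq(void)
{
  const double complex A[N][N] = {
    {1.0 + 2.0 * I, 0.5 - 1.0 * I, 0.0 + 3.0 * I},
    {2.0 - 0.5 * I, 4.0 + 0.0 * I, 1.5 + 1.0 * I},
    {0.0 - 1.0 * I, 3.0 + 2.0 * I, 2.5 - 4.0 * I}};
  double complex x[N] = {1.0 - 1.0 * I, 0.25 + 2.0 * I, -3.0 + 0.5 * I};
  double complex ref[N], y[N];
  double worst = 0.0;

  mult_hermitian(A, x, ref); /* reference: A^H x computed directly */

  conjugate(x);              /* x <- conj(x)                */
  mult_transpose(A, x, y);   /* y <- A^T conj(x)            */
  conjugate(x);              /* restore the input vector    */
  conjugate(y);              /* y <- conj(A^T conj(x))      */

  for (size_t i = 0; i < N; i++) {
    double complex d = ref[i] - y[i];
    double e = creal(d) * creal(d) + cimag(d) * cimag(d);
    if (e > worst) worst = e;
  }
  return worst;
}
```

Calling identity_residual_sq() from a small driver should give a value at 
rounding-error level, which is why a separate Hermitian kernel is not 
needed once the vectors are conjugated around the transpose operation.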

Also attached is a new tarball for download once this patch is pushed.
Thanks,
-Paul

> Hi Paul,
>
> just a few questions on your patch:
>
> I've spotted a few replacements of the kind:
>   -  CUSPARRAY      *xGPU=PETSC_NULL, *bGPU=PETSC_NULL;
>   +  CUSPARRAY      *xGPU, *bGPU;
> Is this intentional? This is likely to lead to warnings. I skipped 
> these changes.
>
> Also, there is
> -#if !defined(PETSC_USE_COMPLEX)
>      ierr = cusparseMat->mat->multiply(...,TRANSPOSE);...
> -#else
> -    ierr = cusparseMat->mat->multiply(...,HERMITIAN);...
> -#endif
> Is it safe to throw out the Hermitian transpose here? I've seen that 
> the patch adds a kernel for the Hermitian transpose, but I want to make 
> sure this does not cause any side effects.
>
> A patch for the current tip is attached, including the removal of the 
> preprocessor switch for PETSC_USE_COMPLEX. However, I can't test it on 
> my AMD machine right now...
>
> Best regards,
> Karli
>
>
>
>
> On 02/01/2013 06:41 PM, Jed Brown wrote:
>> That's gonna suck. Karl, can you apply his patch to the old code, run
>> uncrustify on it, then send out the diff (which should apply cleanly to
>> head).
>>
>> On Feb 1, 2013 6:32 PM, "Karl Rupp" <rupp at mcs.anl.gov> wrote:
>>
>>     Hi Paul,
>>
>>     I just uncrustified src/mat/impls/aij/* and pushed it to petsc-dev.
>>     Could you please re-generate your patch based on the latest commit?
>>
>>     Thanks and best regards,
>>     Karli
>>
>>
>>     On 02/01/2013 06:11 PM, Paul Mullowney wrote:
>>
>>         Hi,
>>
>>         Here's a reworked patch for running BiCG on GPUs (with ILU(0)
>>         preconditioners) for the aijcusparse.cu class. I addressed the
>>         comments from the previous emails on this patch. In particular,
>>         I added
>>
>>         (1) VecConjugate implementation in veccusp.cu with the correct
>>         method for getting the device ptr (VecCUSPGetArrayReadWrite()).
>>         (2) Various methods in aijcusparse.cu for building the transpose
>>         matrices for MatSolveTranspose* methods. The implementation of
>>         the solves is done under the hood in the txpetscgpu library. A
>>         protection was added to ensure the matrix generation routines
>>         are only called once.
>>         (3) I fixed the uninitialized compiler warning when building in
>>         double complex. This required a slight fix in
>>         VecCUSPGetArrayWrite().
>>         (4) Small style fixes.
>>
>>         It wasn't clear to me how to break this patch up into a small
>>         organizational patch and then a large implementation patch. If
>>         you have suggestions on what corresponds to organization and
>>         what corresponds to implementation, I can try to do that in
>>         subsequent patches.
>>
>>         Everything builds and runs fine on my end.
>>
>>         Thanks,
>>         -Paul
>>
>>
>

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: bicg-complex-3rdtry.patch
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20130202/f69489d1/attachment.ksh>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: txpetscgpu-0.0.9.tar.gz
Type: application/x-compressed-tar
Size: 23028 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20130202/f69489d1/attachment.bin>