[petsc-dev] patch for BiCG on GPUs (reworked).

Karl Rupp rupp at mcs.anl.gov
Fri Feb 1 21:21:42 CST 2013


Hi Paul,

just a few questions on your patch:

I've spotted a few replacements of the kind:
   -  CUSPARRAY      *xGPU=PETSC_NULL, *bGPU=PETSC_NULL;
   +  CUSPARRAY      *xGPU, *bGPU;
Is this intentional? Dropping the initializers is likely to lead to
compiler warnings about possibly uninitialized variables, so I skipped
these changes.
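
For illustration, a minimal C sketch of the pattern behind such warnings.
The type FakeArray and the functions get_array and use_array are
hypothetical stand-ins, not PETSc or CUSP code. When a pointer is only
assigned inside a callee that can return early, a later read may see an
indeterminate value unless the declaration carries an initializer.
Whether a given compiler actually warns depends on its flags and
optimization level.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a GPU array handle (not the real CUSPARRAY). */
typedef struct { double data[4]; } FakeArray;

/* Hypothetical getter: sets *a only on success, returns nonzero on error. */
static int get_array(FakeArray *storage, int fail, FakeArray **a)
{
  if (fail) return 1;      /* error path: *a is left untouched */
  *a = storage;
  return 0;
}

/* Returns 1 if an array was obtained, 0 otherwise. */
int use_array(FakeArray *storage, int fail)
{
  FakeArray *x = NULL;     /* without "= NULL", x is indeterminate on the error path */
  int       err;

  assert(storage != NULL);
  err = get_array(storage, fail, &x);
  (void)err;               /* a real caller would propagate the error code */
  return x != NULL;
}
```

With the initializer in place, the final check is well defined on both
the success and the error path.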

Also, there is
-#if !defined(PETSC_USE_COMPLEX)
      ierr = cusparseMat->mat->multiply(...,TRANSPOSE);...
-#else
-    ierr = cusparseMat->mat->multiply(...,HERMITIAN);...
-#endif
Is it safe to throw out the Hermitian transpose here? I've seen that the 
patch adds a kernel for the Hermitian transpose, but I want to make sure 
this does not cause any side effects.
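
To make the distinction concrete, here is a small C99 sketch. The
function names are hypothetical, and a dense row-major matrix stands in
for the sparse GPU format, so this is not the cusparseMat code path. The
plain transpose product uses each entry as stored, while the Hermitian
transpose product conjugates it. For real scalars the two coincide, so
the choice only matters in complex builds.

```c
#include <assert.h>
#include <complex.h>

/* y = A^T x for an n-by-n row-major dense matrix (no conjugation). */
void mult_transpose(int n, const double complex *A,
                    const double complex *x, double complex *y)
{
  assert(n > 0);
  for (int j = 0; j < n; j++) {
    y[j] = 0;
    for (int i = 0; i < n; i++) y[j] += A[i * n + j] * x[i];
  }
}

/* y = A^H x: same traversal, but every matrix entry is conjugated. */
void mult_hermitian_transpose(int n, const double complex *A,
                              const double complex *x, double complex *y)
{
  assert(n > 0);
  for (int j = 0; j < n; j++) {
    y[j] = 0;
    for (int i = 0; i < n; i++) y[j] += conj(A[i * n + j]) * x[i];
  }
}
```

Applying both routines to a matrix with a nonzero imaginary entry gives
results that differ in the sign of the imaginary part.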

A patch for the current tip is attached, including the removal of the 
preprocessor switch for PETSC_USE_COMPLEX. However, I can't test it on 
my AMD machine right now...

Best regards,
Karli




On 02/01/2013 06:41 PM, Jed Brown wrote:
> That's gonna suck. Karl, can you apply his patch to the old code, run
> uncrustify on it, then send out the diff (which should apply cleanly to
> head).
>
> On Feb 1, 2013 6:32 PM, "Karl Rupp" <rupp at mcs.anl.gov> wrote:
>
>     Hi Paul,
>
>     I just uncrustified src/mat/impls/aij/* and pushed it to petsc-dev.
>     Could you please re-generate your patch based on the latest commit?
>
>     Thanks and best regards,
>     Karli
>
>
>     On 02/01/2013 06:11 PM, Paul Mullowney wrote:
>
>         Hi,
>
>         Here's a reworked patch for running BiCG on GPUs (with ILU(0)
>         preconditioners) for the aijcusparse.cu class. I addressed the
>         comments from the previous emails on this patch. In particular,
>         I added
>
>         (1) VecConjugate implementation in veccusp.cu with the correct
>         method for getting the device ptr (VecCUSPGetArrayReadWrite()).
>         (2) Various methods in aijcusparse.cu for building the transpose
>         matrices for MatSolveTranspose* methods. The implementation of the
>         solves is done under the hood in the txpetscgpu library. A
>         protection was added to ensure the matrix generation routines are
>         only called once.
>         (3) I fixed the uninitialized compiler warning when building in
>         double complex. This required a slight fix in
>         VecCUSPGetArrayWrite().
>         (4) Small style fixes.
>
>         It wasn't clear to me how to break this patch up into a small
>         organizational patch and then a large implementation patch. If
>         you have suggestions on what corresponds to organization and
>         what corresponds to implementation, I can try to do that in
>         subsequent patches.
>
>         Everything builds and runs fine on my end.
>
>         Thanks,
>         -Paul
>
>
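
The build-once protection described in item (2) of the quoted message
could look roughly like the following C sketch. FakeMat,
build_transpose, and solve_transpose are hypothetical names. The actual
patch and the txpetscgpu library are not shown here and may work
differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical matrix wrapper that caches its transpose data. */
typedef struct {
  int transpose_built;   /* nonzero once the transpose has been generated */
  int build_count;       /* how often the (expensive) build ran */
} FakeMat;

/* Stand-in for the expensive generation of the transpose matrices. */
static void build_transpose(FakeMat *m)
{
  m->build_count++;
  m->transpose_built = 1;
}

/* Every transpose solve checks the flag, so the build runs at most once. */
void solve_transpose(FakeMat *m)
{
  assert(m != NULL);
  if (!m->transpose_built) build_transpose(m);
  /* ... the actual transpose solve would go here ... */
}
```

Calling solve_transpose repeatedly on the same FakeMat triggers the
build only on the first call. Code that changes the matrix entries would
also have to reset the flag.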

-------------- next part --------------
A non-text attachment was scrubbed...
Name: bicg-complex-2ndtry-rewrite.patch
Type: text/x-patch
Size: 17064 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20130201/a8f9f514/attachment.bin>

