[petsc-dev] patch for BiCG on GPUs (reworked).

Satish Balay balay at mcs.anl.gov
Sat Feb 2 21:37:48 CST 2013


uploaded tarball and pushed
https://bitbucket.org/petsc/petsc-dev/commits/e95fd54300be1e05489068b844fd57c7

satish

On Sat, 2 Feb 2013, Karl Rupp wrote:

> Hi,
> 
> alright, here we go:
> https://bitbucket.org/petsc/petsc-dev/commits/ccdf0150dce67cfc50e1ec80872f3d5d
> 
> Satish, could you please upload txpetscgpu-0.0.9.tar.gz (and eventually update
> the download URL in the build system)?
> 
> Thanks and best regards,
> Karli
> 
> 
> On 02/02/2013 02:04 PM, Paul Mullowney wrote:
> > Hi Karl,
> > 
> > I pulled from petsc-dev this morning and reworked the patch. Everything
> > is working as expected. Regarding your comments, the CUSPARRAY *
> > variables are initialized correctly in VecCUSPGetArrayRead() and
> > VecCUSPGetArrayWrite(). Thus the initializations to PETSC_NULL are not
> > required and the compiler warnings are removed. In this patch, I fixed
> > the initialization in VecCUSPGetArrayWrite() (ArrayRead() was working
> > correctly prior to this patch).
> > 
> > Regarding your second comment, the PETSc KSP algorithms use an identity
> > when doing Hermitian solves and multiplies. In particular, the input and
> > output vectors are conjugated so that only the transpose multiply and
> > solve are needed. For instance, in bicg.c one has
> > 
> >    ierr = VecConjugate(Rl);CHKERRQ(ierr);
> >    ierr = KSP_PCApplyTranspose(ksp,Rl,Zl);CHKERRQ(ierr);
> >    ierr = VecConjugate(Rl);CHKERRQ(ierr);
> >    ierr = VecConjugate(Zl);CHKERRQ(ierr);
> > 
> > The conjugation of the input and output vectors forces one to use the
> > Transpose solve and not the Hermitian solve. The same holds for the
> > multiplies.
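The identity referred to above is A^H x = conj(A^T conj(x)). Here is a minimal sketch of it in plain C99, without PETSc, assuming a small dense matrix purely for illustration. All helper names (mult_transpose, mult_hermitian, mult_hermitian_via_transpose, identity_holds) are hypothetical and not part of PETSc or txpetscgpu. The conjugate / transpose-apply / conjugate / conjugate sequence only mirrors the shape of the bicg.c fragment quoted above.

```c
#include <complex.h>

#define N 2

/* y = A^T x (plain transpose, no conjugation). */
static void mult_transpose(const double complex A[N][N],
                           const double complex x[N], double complex y[N])
{
  for (int j = 0; j < N; j++) {
    y[j] = 0.0;
    for (int i = 0; i < N; i++) y[j] += A[i][j] * x[i];
  }
}

/* y = A^H x computed directly (conjugate transpose), for reference. */
static void mult_hermitian(const double complex A[N][N],
                           const double complex x[N], double complex y[N])
{
  for (int j = 0; j < N; j++) {
    y[j] = 0.0;
    for (int i = 0; i < N; i++) y[j] += conj(A[i][j]) * x[i];
  }
}

/* In-place complex conjugation of a vector. */
static void conjugate(double complex v[N])
{
  for (int i = 0; i < N; i++) v[i] = conj(v[i]);
}

/* y = A^H x using only the transpose multiply:
   conjugate the input, apply A^T, restore the input, conjugate the output. */
static void mult_hermitian_via_transpose(const double complex A[N][N],
                                         double complex x[N], double complex y[N])
{
  conjugate(x);
  mult_transpose(A, x, y);
  conjugate(x); /* restore x to its original value */
  conjugate(y);
}

/* Returns 1 if both ways of forming A^H x agree on a sample problem. */
int identity_holds(void)
{
  const double complex A[N][N] = {{1.0 + 2.0 * I, 3.0 - 1.0 * I},
                                  {0.5 * I, 2.0 + 0.0 * I}};
  double complex x[N] = {1.0 - 1.0 * I, 2.0 + 3.0 * I};
  double complex direct[N], via_transpose[N];

  mult_hermitian(A, x, direct);
  mult_hermitian_via_transpose(A, x, via_transpose);

  for (int i = 0; i < N; i++) {
    double complex d  = direct[i] - via_transpose[i];
    double         re = creal(d), im = cimag(d);
    if (re * re + im * im > 1e-24) return 0;
  }
  return 1;
}
```

In this sketch no separate Hermitian kernel is needed once the input and output are conjugated around the transpose operation, which is the point made above for the complex build.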
> > 
> > Also attached is a new tarball for download once this patch is pushed.
> > Thanks,
> > -Paul
> > 
> > > Hi Paul,
> > > 
> > > just a few questions on your patch:
> > > 
> > > I've spotted a few replacements of the kind:
> > >   -  CUSPARRAY      *xGPU=PETSC_NULL, *bGPU=PETSC_NULL;
> > >   +  CUSPARRAY      *xGPU, *bGPU;
> > > Is this intentional? This is likely to lead to warnings. I skipped
> > > these changes.
> > > 
> > > Also, there is
> > > -#if !defined(PETSC_USE_COMPLEX)
> > >      ierr = cusparseMat->mat->multiply(...,TRANSPOSE);...
> > > -#else
> > > -    ierr = cusparseMat->mat->multiply(...,HERMITIAN);...
> > > -#endif
> > > Is it safe to throw out the Hermitian transpose here? I've seen that
> > > > the patch adds a kernel for Hermitian transpose, but I want to make
> > > sure this does not cause any side effects.
> > > 
> > > A patch for the current tip is attached, including the removal of the
> > > preprocessor switch for PETSC_USE_COMPLEX. However, I can't test it on
> > > my AMD machine right now...
> > > 
> > > Best regards,
> > > Karli
> > > 
> > > 
> > > 
> > > 
> > > On 02/01/2013 06:41 PM, Jed Brown wrote:
> > > > That's gonna suck. Karl, can you apply his patch to the old code, run
> > > > uncrustify on it, then send out the diff (which should apply cleanly to
> > > > > head)?
> > > > 
> > > > > On Feb 1, 2013 6:32 PM, "Karl Rupp" <rupp at mcs.anl.gov> wrote:
> > > > 
> > > >     Hi Paul,
> > > > 
> > > >     I just uncrustified src/mat/impls/aij/* and pushed it to petsc-dev.
> > > >     Could you please re-generate your patch based on the latest commit?
> > > > 
> > > >     Thanks and best regards,
> > > >     Karli
> > > > 
> > > > 
> > > >     On 02/01/2013 06:11 PM, Paul Mullowney wrote:
> > > > 
> > > >         Hi,
> > > > 
> > > > >         Here's a reworked patch for running BiCG on GPUs (with ILU(0)
> > > > >         preconditioners) for the aijcusparse.cu class. I addressed the
> > > > >         comments from the previous emails on this patch. In particular,
> > > > >         I added
> > > > > 
> > > > >         (1) A VecConjugate implementation in veccusp.cu with the correct
> > > > >         method for getting the device ptr (VecCUSPGetArrayReadWrite()).
> > > > >         (2) Various methods in aijcusparse.cu for building the transpose
> > > > >         matrices for MatSolveTranspose* methods. The implementation of
> > > > >         the solves is done under the hood in the txpetscgpu library. A
> > > > >         protection was added to ensure the matrix generation routines
> > > > >         are only called once.
> > > > >         (3) I fixed the uninitialized compiler warning when building in
> > > > >         double complex. This required a slight fix in
> > > > >         VecCUSPGetArrayWrite().
> > > > >         (4) Small style fixes.
> > > > 
> > > > >         It wasn't clear to me how to break this patch up into a small
> > > > >         organizational patch and then a large implementation patch. If
> > > > >         you have suggestions on what corresponds to organization and
> > > > >         what corresponds to implementation, I can try to do that in
> > > > >         subsequent patches.
> > > > 
> > > >         Everything builds and runs fine on my end.
> > > > 
> > > >         Thanks,
> > > >         -Paul
> > > > 
> > > > 
> > > 
> > 
> 
> 



