[petsc-dev] patch for BiCG on GPUs (reworked).
Satish Balay
balay at mcs.anl.gov
Sat Feb 2 21:37:48 CST 2013
uploaded tarball and pushed
https://bitbucket.org/petsc/petsc-dev/commits/e95fd54300be1e05489068b844fd57c7
satish
On Sat, 2 Feb 2013, Karl Rupp wrote:
> Hi,
>
> alright, here we go:
> https://bitbucket.org/petsc/petsc-dev/commits/ccdf0150dce67cfc50e1ec80872f3d5d
>
> Satish, could you please upload txpetscgpu-0.0.9.tar.gz (and eventually update
> the download URL in the build system)?
>
> Thanks and best regards,
> Karli
>
>
> On 02/02/2013 02:04 PM, Paul Mullowney wrote:
> > Hi Karl,
> >
> > I pulled from petsc-dev this morning and reworked the patch. Everything
> > is working as expected. Regarding your comments, the initialization of
> > the CUSPARRAY * variables is done correctly in VecCUSPGetArrayRead() and
> > VecCUSPGetArrayWrite(). Thus the initializations to PETSC_NULL are not
> > required and the compiler warnings are removed. In this patch, I fixed
> > the initialization in VecCUSPGetArrayWrite() (ArrayRead() was working
> > correctly prior to this patch).
> >
> > Regarding your second comment, the PETSc KSP algorithms use an identity
> > when doing Hermitian solves and multiplies. In particular, the input
> > and output vectors are conjugated so that only the transpose multiply
> > and solve are needed. For instance, in bicg.c one has
> >
> > ierr = VecConjugate(Rl);CHKERRQ(ierr);
> > ierr = KSP_PCApplyTranspose(ksp,Rl,Zl);CHKERRQ(ierr);
> > ierr = VecConjugate(Rl);CHKERRQ(ierr);
> > ierr = VecConjugate(Zl);CHKERRQ(ierr);
> >
> > The conjugation of the input and output vectors forces one to use the
> > Transpose solve and not the Hermitian solve. The same holds for the
> > multiplies.
> >
> > Also attached is a new tarball for download once this patch is pushed.
> > Thanks,
> > -Paul
> >
> > > Hi Paul,
> > >
> > > just a few questions on your patch:
> > >
> > > I've spotted a few replacements of the kind:
> > > - CUSPARRAY *xGPU=PETSC_NULL, *bGPU=PETSC_NULL;
> > > + CUSPARRAY *xGPU, *bGPU;
> > > Is this intentional? This is likely to lead to warnings. I skipped
> > > these changes.
> > >
> > > Also, there is
> > > -#if !defined(PETSC_USE_COMPLEX)
> > > ierr = cusparseMat->mat->multiply(...,TRANSPOSE);...
> > > -#else
> > > - ierr = cusparseMat->mat->multiply(...,HERMITIAN);...
> > > -#endif
> > > Is it safe to throw out the Hermitian transpose here? I've seen that
> > > the patch adds a kernel for hermitian transpose, but I want to make
> > > sure this does not cause any side effects.
> > >
> > > A patch for the current tip is attached, including the removal of the
> > > preprocessor switch for PETSC_USE_COMPLEX. However, I can't test it on
> > > my AMD machine right now...
> > >
> > > Best regards,
> > > Karli
> > >
> > >
> > >
> > >
> > > On 02/01/2013 06:41 PM, Jed Brown wrote:
> > > > That's gonna suck. Karl, can you apply his patch to the old code, run
> > > > uncrustify on it, then send out the diff (which should apply cleanly to
> > > > head)?
> > > >
> > > > On Feb 1, 2013 6:32 PM, "Karl Rupp" <rupp at mcs.anl.gov> wrote:
> > > >
> > > > Hi Paul,
> > > >
> > > > I just uncrustified src/mat/impls/aij/* and pushed it to petsc-dev.
> > > > Could you please re-generate your patch based on the latest commit?
> > > >
> > > > Thanks and best regards,
> > > > Karli
> > > >
> > > >
> > > > On 02/01/2013 06:11 PM, Paul Mullowney wrote:
> > > >
> > > > Hi,
> > > >
> > > > Here's a reworked patch for running BiCG (with ILU(0)
> > > > preconditioners) on GPUs for the aijcusparse.cu class. I fixed the
> > > > comments from the previous emails on this patch. In particular,
> > > > I added
> > > >
> > > > (1) VecConjugate implementation in veccusp.cu with the correct method
> > > > for getting the device ptr (VecCUSPGetArrayReadWrite()).
> > > > (2) Various methods in aijcusparse.cu for building the transpose
> > > > matrices for MatSolveTranspose* methods. The implementation of the
> > > > solves is done under the hood in the txpetscgpu library. A protection
> > > > was added to ensure the matrix generation routines are only called
> > > > once.
> > > > (3) I fixed the uninitialized-variable compiler warning when building
> > > > in double complex. This required a slight fix in
> > > > VecCUSPGetArrayWrite().
> > > > (4) Small style fixes.
> > > >
> > > > It wasn't clear to me how to break this patch up into a small
> > > > organizational patch and then a large implementation patch. If you
> > > > have suggestions on what corresponds to organization and what
> > > > corresponds to implementation, I can try to do that in subsequent
> > > > patches.
> > > >
> > > > Everything builds and runs fine on my end.
> > > >
> > > > Thanks,
> > > > -Paul
> > > >
> > > >
> > >
> >
>
>
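
For readers following the Hermitian-versus-transpose question in Paul's reply: the identity he describes is A^H x = conj(A^T conj(x)), which is why bicg.c can wrap a transpose apply in VecConjugate() calls. The block below is only a minimal standalone sketch of that identity on a small dense matrix, not PETSc or txpetscgpu code. All function names in it are hypothetical and chosen for illustration; the dense row-major storage is an assumption made to keep the example short.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* y = A^T x for a dense n-by-n row-major matrix (plain transpose multiply). */
static void mult_transpose(size_t n, const double complex *A,
                           const double complex *x, double complex *y)
{
  for (size_t j = 0; j < n; j++) {
    y[j] = 0.0;
    for (size_t i = 0; i < n; i++) y[j] += A[i * n + j] * x[i];
  }
}

/* y = A^H x computed directly; used here only as a reference result. */
static void mult_hermitian(size_t n, const double complex *A,
                           const double complex *x, double complex *y)
{
  for (size_t j = 0; j < n; j++) {
    y[j] = 0.0;
    for (size_t i = 0; i < n; i++) y[j] += conj(A[i * n + j]) * x[i];
  }
}

/* In-place conjugation, the role VecConjugate() plays in the bicg.c excerpt. */
static void conjugate(size_t n, double complex *v)
{
  for (size_t i = 0; i < n; i++) v[i] = conj(v[i]);
}

/* y = A^H x using only the transpose multiply:
   conjugate x, apply A^T, restore x, then conjugate y. */
static void mult_hermitian_via_transpose(size_t n, const double complex *A,
                                         double complex *x, double complex *y)
{
  assert(x != y); /* input and output must not alias */
  conjugate(n, x);
  mult_transpose(n, A, x, y);
  conjugate(n, x);
  conjugate(n, y);
}

/* Largest squared magnitude of the difference between the direct and the
   transpose-based result on a fixed 2x2 example (values are arbitrary). */
static double hermitian_identity_error(void)
{
  const double complex A[4] = {1.0 + 2.0 * I, 3.0 - 1.0 * I,
                               0.5 * I,       2.0 + 0.0 * I};
  double complex x[2] = {1.0 - 1.0 * I, 2.0 + 3.0 * I};
  double complex direct[2], via[2];
  double worst = 0.0;

  mult_hermitian(2, A, x, direct);
  mult_hermitian_via_transpose(2, A, x, via);
  for (size_t k = 0; k < 2; k++) {
    double complex d = direct[k] - via[k];
    double err = creal(d) * creal(d) + cimag(d) * cimag(d);
    if (err > worst) worst = err;
  }
  return worst;
}
```

Calling hermitian_identity_error() from a small driver should return a value at rounding-error level, since both routines compute the same product. The same reasoning applies to the solves: conjugating the right-hand side and the solution around a transpose solve gives the Hermitian solve.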