[petsc-dev] patch for BiCG on GPUs (reworked).
Karl Rupp
rupp at mcs.anl.gov
Sat Feb 2 16:46:54 CST 2013
Hi,
alright, here we go:
https://bitbucket.org/petsc/petsc-dev/commits/ccdf0150dce67cfc50e1ec80872f3d5d
Satish, could you please upload txpetscgpu-0.0.9.tar.gz (and eventually
update the download URL in the build system)?
Thanks and best regards,
Karli
On 02/02/2013 02:04 PM, Paul Mullowney wrote:
> Hi Karl,
>
> I pulled from petsc-dev this morning and reworked the patch. Everything
> is working as expected. Regarding your comments, the CUSPARRAY *
> variables are initialized correctly in VecCUSPGetArrayRead() and
> VecCUSPGetArrayWrite(). Thus the initializations to PETSC_NULL are not
> required and the compiler warnings are removed. In this patch, I fixed
> the initialization in VecCUSPGetArrayWrite() (ArrayRead() was working
> correctly prior to this patch).
>
> Regarding your second comment, the PETSc KSP algorithms use an identity
> when doing Hermitian solves and multiplies: A^H x = conj(A^T conj(x)).
> The input and output vectors are conjugated explicitly, so one should
> only do the Transpose multiply and solve. For instance in bicg.c, one has
>
> ierr = VecConjugate(Rl);CHKERRQ(ierr);
> ierr = KSP_PCApplyTranspose(ksp,Rl,Zl);CHKERRQ(ierr);
> ierr = VecConjugate(Rl);CHKERRQ(ierr);
> ierr = VecConjugate(Zl);CHKERRQ(ierr);
>
> The conjugation of the input and output vectors forces one to use the
> Transpose solve and not the Hermitian solve. The same holds for the
> multiplies.
>
> Also attached is a new tarball for download once this patch is pushed.
> Thanks,
> -Paul
>
>> Hi Paul,
>>
>> just a few questions on your patch:
>>
>> I've spotted a few replacements of the kind:
>> - CUSPARRAY *xGPU=PETSC_NULL, *bGPU=PETSC_NULL;
>> + CUSPARRAY *xGPU, *bGPU;
>> Is this intentional? This is likely to lead to warnings. I skipped
>> these changes.
>>
>> Also, there is
>> -#if !defined(PETSC_USE_COMPLEX)
>> ierr = cusparseMat->mat->multiply(...,TRANSPOSE);...
>> -#else
>> - ierr = cusparseMat->mat->multiply(...,HERMITIAN);...
>> -#endif
>> Is it safe to throw out the Hermitian transpose here? I've seen that
>> the patch adds a kernel for Hermitian transpose, but I want to make
>> sure this does not cause any side effects.
>>
>> A patch for the current tip is attached, including the removal of the
>> preprocessor switch for PETSC_USE_COMPLEX. However, I can't test it on
>> my AMD machine right now...
>>
>> Best regards,
>> Karli
>>
>>
>>
>>
>> On 02/01/2013 06:41 PM, Jed Brown wrote:
>>> That's gonna suck. Karl, can you apply his patch to the old code, run
>>> uncrustify on it, then send out the diff (which should apply cleanly to
>>> head).
>>>
>>> On Feb 1, 2013 6:32 PM, "Karl Rupp" <rupp at mcs.anl.gov> wrote:
>>>
>>> Hi Paul,
>>>
>>> I just uncrustified src/mat/impls/aij/* and pushed it to petsc-dev.
>>> Could you please re-generate your patch based on the latest commit?
>>>
>>> Thanks and best regards,
>>> Karli
>>>
>>>
>>> On 02/01/2013 06:11 PM, Paul Mullowney wrote:
>>>
>>> Hi,
>>>
>>> Here's a reworked patch for running BiCG on GPUs (with ILU(0)
>>> preconditioners) for the aijcusparse.cu class. I addressed the
>>> comments from the previous emails on this patch. In particular,
>>> I added
>>>
>>> (1) VecConjugate implementation in veccusp.cu with the correct method
>>> for getting the device ptr (VecCUSPGetArrayReadWrite()).
>>> (2) Various methods in aijcusparse.cu for building the transpose
>>> matrices for MatSolveTranspose* methods. The implementation of the
>>> solves is done under the hood in the txpetscgpu library. A protection
>>> was added to ensure the matrix generation routines are only
>>> called once.
>>> (3) I fixed the uninitialized compiler warning when building in double
>>> complex. This required a slight fix in VecCUSPGetArrayWrite().
>>> (4) Small style fixes.
>>>
>>> It wasn't clear to me how to break this patch up into a small
>>> organizational patch and then a large implementation patch. If you have
>>> suggestions on what corresponds to organization and what corresponds to
>>> implementation, I can try to do that in subsequent patches.
>>>
>>> Everything builds and runs fine on my end.
>>>
>>> Thanks,
>>> -Paul
>>>
>>>
>>
>
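The conjugation sequence quoted from bicg.c above relies on the identity A^H x = conj(A^T conj(x)), which is why a transpose-only kernel is enough for complex builds. The following is a minimal sketch of that identity in plain C99, not PETSc code: the dense 3x3 matrix, the helper names (mult_transpose, mult_hermitian, vec_conjugate, mult_hermitian_via_transpose, hermitian_identity_error) and the test values are all hypothetical stand-ins for MatMultTranspose-style kernels and VecConjugate().

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

#define N 3 /* size of the illustrative dense system (assumption) */

/* y = A^T x: plain transpose multiply, no conjugation of the entries. */
static void mult_transpose(const double complex A[N][N],
                           const double complex *x, double complex *y)
{
  for (size_t j = 0; j < N; j++) {
    y[j] = 0.0;
    for (size_t i = 0; i < N; i++) y[j] += A[i][j] * x[i];
  }
}

/* y = A^H x: direct Hermitian (conjugate transpose) multiply. */
static void mult_hermitian(const double complex A[N][N],
                           const double complex *x, double complex *y)
{
  for (size_t j = 0; j < N; j++) {
    y[j] = 0.0;
    for (size_t i = 0; i < N; i++) y[j] += conj(A[i][j]) * x[i];
  }
}

/* In-place conjugation; hypothetical stand-in for VecConjugate(). */
static void vec_conjugate(double complex *x)
{
  for (size_t i = 0; i < N; i++) x[i] = conj(x[i]);
}

/* y = A^H x computed with the transpose kernel only, following the
   same four-step pattern as the bicg.c snippet: conjugate the input,
   apply the transpose, restore the input, conjugate the output. */
static void mult_hermitian_via_transpose(const double complex A[N][N],
                                         double complex *x, double complex *y)
{
  assert(x != NULL && y != NULL);
  vec_conjugate(x);
  mult_transpose(A, x, y);
  vec_conjugate(x); /* x is back to its original values */
  vec_conjugate(y);
}

/* Largest squared magnitude of the difference between both routes,
   evaluated on one fixed, made-up test problem. */
double hermitian_identity_error(void)
{
  const double complex A[N][N] = {
    {1.0 + 2.0 * I, 0.5 - 1.0 * I, 0.0 + 3.0 * I},
    {2.0 - 0.5 * I, 4.0 + 0.0 * I, 1.0 + 1.0 * I},
    {0.0 - 2.0 * I, 3.0 + 1.5 * I, 2.0 - 2.0 * I}};
  double complex x[N] = {1.0 - 1.0 * I, 0.5 + 2.0 * I, -3.0 + 0.25 * I};
  double complex y_direct[N], y_ident[N];
  double err = 0.0;

  mult_hermitian(A, x, y_direct);
  mult_hermitian_via_transpose(A, x, y_ident);

  for (size_t i = 0; i < N; i++) {
    const double complex d = y_direct[i] - y_ident[i];
    const double m = creal(d) * creal(d) + cimag(d) * cimag(d);
    if (m > err) err = m;
  }
  return err;
}
```

Calling hermitian_identity_error() from a test driver should return a value at rounding level. The design point is the one Paul makes: as long as the caller conjugates the input and output vectors, the backend needs only the transpose multiply and solve, so a separate HERMITIAN branch under PETSC_USE_COMPLEX would conjugate twice.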