[petsc-dev] refactoring petsccusp.h needed
Jose E. Roman
jroman at dsic.upv.es
Sat Mar 16 06:17:23 CDT 2013
On 16/03/2013, at 00:46, Karl Rupp wrote:
> Hi Paul,
>
>> For GMRES, the current performance of VecMDot_SeqCUSP sucks. I have a
>> solution, but I haven't tested all cases yet.
>> For BCGS, some part of the algorithm is broken but I don't know what it
>> is. By broken, I mean that CPU and GPU residuals diverge fairly quickly.
>
> Since I just stumbled over VecMDot_SeqCUSP() when interfacing ViennaCL: Do you know what was the reason why the 'old' version was replaced by this expensive call to gemv() including the creation of temporaries, etc.? Just writing a custom kernel with one work group per dot-product should do the job perfectly, shouldn't it?
>
> Best regards,
> Karli
My fault: https://bitbucket.org/petsc/petsc-hg/commits/ec7a7de2acd477e5edd24cc5a3af441ce7a68a36
The motivation was that the previous version was even worse for me (VecMDot is used a lot in SLEPc and GPU performance was really bad). At that time I did not have the time to write a custom kernel. If you write one, I could help in testing and measuring performance.
Jose
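
For reference, the kind of custom kernel Karl describes (one work group, i.e. one CUDA thread block, per dot product) could look roughly like the sketch below. This is not the PETSc or CUSP implementation. The names mdot_kernel and mdot, the contiguous layout of the vectors in Y, the block size of 256, and the restriction to real double-precision scalars are all assumptions made for illustration, and error checking of the CUDA calls is omitted.

```cuda
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

#define BLOCK 256  // threads per block; must be a power of two for the tree reduction

// Block j computes result[j] = sum_i x[i] * Y[j*n + i].
// Assumption: the nv vectors are stored back to back in Y (layout chosen for this sketch).
__global__ void mdot_kernel(const double *x, const double *Y, double *result, int n)
{
    __shared__ double partial[BLOCK];
    const double *y = Y + (size_t)blockIdx.x * n;

    // Each thread accumulates a strided slice of the dot product.
    double sum = 0.0;
    for (int i = threadIdx.x; i < n; i += blockDim.x) sum += x[i] * y[i];
    partial[threadIdx.x] = sum;
    __syncthreads();

    // Tree reduction in shared memory within the block.
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (threadIdx.x < s) partial[threadIdx.x] += partial[threadIdx.x + s];
        __syncthreads();
    }
    if (threadIdx.x == 0) result[blockIdx.x] = partial[0];
}

// Hypothetical host helper: copies the data to the GPU, launches one block per
// vector, and returns the nv dot products. No error checking, for brevity.
std::vector<double> mdot(const std::vector<double> &x, const std::vector<double> &Y, int nv)
{
    const int n = (int)x.size();
    double *dx, *dY, *dr;
    cudaMalloc(&dx, n * sizeof(double));
    cudaMalloc(&dY, (size_t)nv * n * sizeof(double));
    cudaMalloc(&dr, nv * sizeof(double));
    cudaMemcpy(dx, x.data(), n * sizeof(double), cudaMemcpyHostToDevice);
    cudaMemcpy(dY, Y.data(), (size_t)nv * n * sizeof(double), cudaMemcpyHostToDevice);

    mdot_kernel<<<nv, BLOCK>>>(dx, dY, dr, n);

    std::vector<double> r(nv);
    cudaMemcpy(r.data(), dr, nv * sizeof(double), cudaMemcpyDeviceToHost);
    cudaFree(dx); cudaFree(dY); cudaFree(dr);
    return r;
}
```

In this form every block re-reads x from global memory, so a tuned version would likely have each block handle several vectors at once to reuse x. It does avoid the temporaries that the gemv()-based version creates.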