[petsc-dev] refactoring petsccusp.h needed

Jose E. Roman jroman at dsic.upv.es
Sat Mar 16 06:17:23 CDT 2013


On 16/03/2013, at 00:46, Karl Rupp wrote:

> Hi Paul,
> 
> >> For GMRES, the current performance of VecMDot_SeqCUSP sucks. I have a
> >> solution, but I haven't tested all cases yet.
>> For BCGS, some part of the algorithm is broken but I don't know what it
>> is. By broken, I mean that CPU and GPU residuals diverge fairly quickly.
> 
> Since I just stumbled over VecMDot_SeqCUSP() when interfacing ViennaCL: do you know why the 'old' version was replaced by this expensive call to gemv(), including the creation of temporaries, etc.? Just writing a custom kernel with one work group per dot product should do the job perfectly, shouldn't it?
> 
> Best regards,
> Karli

My fault: https://bitbucket.org/petsc/petsc-hg/commits/ec7a7de2acd477e5edd24cc5a3af441ce7a68a36

The motivation was that the previous version was even worse for me (VecMDot is used a lot in SLEPc and GPU performance was really bad). At that time I did not have the time to write a custom kernel. If you write one, I could help in testing and measuring performance.
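
For reference, a minimal sketch of the kind of kernel Karl describes, with one thread block (a "work group" in OpenCL terms) per dot product. This is only an illustration, not PETSc or CUSP code: the names mdot_kernel and mdot are hypothetical, the entries are assumed to be real doubles (no complex conjugation), the m vectors are assumed to be stored back to back in one array Y (PETSc keeps them as separate Vec objects), and CUDA error checking is omitted.

```cuda
#include <vector>
#include <cuda_runtime.h>

constexpr int BLOCK = 256;  // threads per block; must be a power of two

// Hypothetical kernel: block b computes result[b] = sum_i x[i] * Y[b*n + i].
__global__ void mdot_kernel(const double *x, const double *Y,
                            double *result, int n)
{
  __shared__ double partial[BLOCK];
  const double *y = Y + (size_t)blockIdx.x * n;  // vector handled by this block

  // Each thread accumulates a strided slice of the dot product.
  double sum = 0.0;
  for (int i = threadIdx.x; i < n; i += blockDim.x) sum += x[i] * y[i];
  partial[threadIdx.x] = sum;
  __syncthreads();

  // Tree reduction in shared memory.
  for (int s = blockDim.x / 2; s > 0; s >>= 1) {
    if (threadIdx.x < s) partial[threadIdx.x] += partial[threadIdx.x + s];
    __syncthreads();
  }
  if (threadIdx.x == 0) result[blockIdx.x] = partial[0];
}

// Hypothetical host wrapper: Y holds m vectors of length x.size(), back to back.
std::vector<double> mdot(const std::vector<double> &x,
                         const std::vector<double> &Y, int m)
{
  const int n = (int)x.size();
  double *dx, *dY, *dr;
  cudaMalloc(&dx, n * sizeof(double));
  cudaMalloc(&dY, (size_t)m * n * sizeof(double));
  cudaMalloc(&dr, m * sizeof(double));
  cudaMemcpy(dx, x.data(), n * sizeof(double), cudaMemcpyHostToDevice);
  cudaMemcpy(dY, Y.data(), (size_t)m * n * sizeof(double),
             cudaMemcpyHostToDevice);

  mdot_kernel<<<m, BLOCK>>>(dx, dY, dr, n);  // one block per dot product

  std::vector<double> r(m);
  cudaMemcpy(r.data(), dr, m * sizeof(double), cudaMemcpyDeviceToHost);
  cudaFree(dx);
  cudaFree(dY);
  cudaFree(dr);
  return r;
}
```

Calling mdot(x, Y, m) launches a single kernel for all m dot products and needs no temporaries beyond the result array. Whether this actually beats the gemv() route would have to be measured; with only a few vectors, m blocks may not be enough to keep the GPU busy.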

Jose



