[petsc-dev] Using cuBlas as the vendor blas for PETSc

Jack Poulson jack.poulson at gmail.com
Fri Feb 24 21:16:13 CST 2012


Dave,

That is probably not a good idea: for small problems, the overhead of
transferring data to and from the GPU costs more than the computation
itself. The issue can be partly avoided by writing thin wrappers for
routines like dgemm that run the multiply on the GPU only when the problem
dimensions exceed some threshold, but that takes somewhat more work than
simply replacing BLAS with CUBLAS.
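
For illustration only, here is a rough sketch in C of what such a threshold
wrapper might look like. Every name in it (gemm_fn, host_gemm, device_gemm,
gpu_min_dim, use_device, dispatch_gemm) is hypothetical, the signature is a
simplified dgemm with no transpose flags, and the cutoff of 512 is an
assumed placeholder that would have to be measured on the actual machine.
The triple loop stands in for the host BLAS, and the GPU path is left as an
empty slot where a routine built on cublasDgemm could be plugged in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical common signature, simplified from dgemm:
   C = alpha*A*B + beta*C, column-major storage, no transposes. */
typedef void (*gemm_fn)(int m, int n, int k, double alpha,
                        const double *A, int lda,
                        const double *B, int ldb,
                        double beta, double *C, int ldc);

/* Plain triple loop standing in for the host BLAS dgemm. */
static void host_gemm(int m, int n, int k, double alpha,
                      const double *A, int lda,
                      const double *B, int ldb,
                      double beta, double *C, int ldc)
{
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            double sum = 0.0;
            for (int p = 0; p < k; ++p)
                sum += A[i + (size_t)p * lda] * B[p + (size_t)j * ldb];
            double *c = &C[i + (size_t)j * ldc];
            /* As in BLAS, C is not read when beta is zero. */
            *c = alpha * sum + (beta == 0.0 ? 0.0 : beta * *c);
        }
    }
}

/* Slot for a GPU backend (assumption: a routine that copies A and B to the
   device, calls cublasDgemm, and copies C back). NULL means no GPU path. */
static gemm_fn device_gemm = NULL;

/* Assumed cutoff; the real crossover point is machine-dependent. */
static int gpu_min_dim = 512;

/* Use the GPU only if a backend is installed and every dimension is large. */
static int use_device(int m, int n, int k)
{
    return device_gemm != NULL &&
           m >= gpu_min_dim && n >= gpu_min_dim && k >= gpu_min_dim;
}

/* The wrapper: same call for every size, backend chosen per call. */
void dispatch_gemm(int m, int n, int k, double alpha,
                   const double *A, int lda,
                   const double *B, int ldb,
                   double beta, double *C, int ldc)
{
    assert(m >= 0 && n >= 0 && k >= 0);
    gemm_fn impl = use_device(m, n, k) ? device_gemm : host_gemm;
    impl(m, n, k, alpha, A, lda, B, ldb, beta, C, ldc);
}
```

With something like this, 32x32 blocks would always stay on the host. A
fuller version would need the same treatment for every BLAS routine the
solver calls, which is where the extra work comes from.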

Jack

On Fri, Feb 24, 2012 at 8:28 PM, Dave Nystrom <Dave.Nystrom at tachyonlogic.com> wrote:

> I was wondering if anyone had ever tried using cuBlas as a substitute for
> something like MKL with PETSc.  I've been wondering if it would give better
> performance than MKL for my direct solves with cholmod even though the
> block sizes are small for cholmod, i.e. 32x32 is the default I believe.
> If so, were there any tricky aspects to using cuBlas in this way?
>
> Thanks,
>
> Dave
>