[petsc-dev] SuiteSparse 4.0
dnystrom1 at comcast.net
Wed Jun 13 09:38:17 CDT 2012
I understand the issue of some uses of blas being too small to justify shipping to the gpu.
So I was thinking maybe petsc could choose at runtime, based on problem size, whether
to use a cpu or a gpu implementation of the blas. That seems feasible, though more
complex than just having a petsc interface that substitutes cublas for everything.
Yes, I will back off tonight and try to work with a petsc example problem. I had to add
an additional library to CHOLMOD.py to satisfy an unsatisfied external. I think it was
named libsuitesparseconfig.a and was needed to resolve the SuiteSparse_time symbol.
Would probably be good to have a SuiteSparse.py module. I'll send more info if I can
reproduce the problem with a petsc example.
Thanks,
Dave
----- Original Message -----
From: "Jed Brown" <jedbrown at mcs.anl.gov>
To: "Dave Nystrom" <dnystrom1 at comcast.net>
Cc: "For users of the development version of PETSc" <petsc-dev at mcs.anl.gov>
Sent: Wednesday, June 13, 2012 7:41:05 AM
Subject: Re: [petsc-dev] SuiteSparse 4.0
You can't use cublas everywhere because it's better to do small sizes on the CPU, and different threads need to be able to operate independently (without multiple devices).
Yes, try a PETSc example before your own code. Did petsc build without any changes to the cholmod interface? ALWAYS send the error message/stack trace---even if we don't have an answer, it usually tells us how difficult the problem is likely to be to fix.
On Jun 13, 2012 8:09 AM, "Dave Nystrom" < dnystrom1 at comcast.net > wrote:
Well, I tried it last night but without success. I got a memory corruption
error when I tried running my app. I suppose I should try first to get it
working with a petsc example. I'm building SuiteSparse but only using
cholmod out of that package. Several of the SuiteSparse packages use blas
but cholmod is the only one so far that has the option to use cublas. Tim
gets about a 10x speedup over the single core result when using cublas on an
example problem that spends a lot of time in blas. So it seems worth trying.
Anyway, I think I must be doing something wrong so far. But for the same
problem, Tim gets about a 20+ percent speedup over using multi-threaded Goto
blas.
Probably a better solution for petsc would be support for using cublas for
all of the petsc blas needs. Why should just one petsc blas client have
access to cublas? But I'm not sure how much work that involves - I see it
is on the petsc ToDo list. I will try again tonight but would welcome advice
or experiences from anyone else who has tried the new cholmod.
Dave
Jed Brown writes:
> Nope, why don't you try it and send us a patch if you get it working.
>
> On Wed, Jun 13, 2012 at 12:49 AM, Dave Nystrom < dnystrom1 at comcast.net >wrote:
>
> > Has anyone tried building petsc-dev to use cholmod-2.0 which is part of
> > SuiteSparse-4.0 with the cholmod support for using cublas enabled? I am
> > interested in trying the cublas support for cholmod to see how it compares
> > with mkl and goto for my solves.
> >
> > Thanks,
> >
> > Dave
> >