[petsc-dev] [GPU - slepc] Hands-on exercise 4 (SVD) not working with GPU and default configurations
Matthew Knepley
knepley at gmail.com
Mon Aug 10 08:27:53 CDT 2015
On Mon, Aug 10, 2015 at 7:47 AM, Leoni, Massimiliano <
Massimiliano.Leoni at rolls-royce.com> wrote:
> > -----Original Message-----
> > From: Karl Rupp [mailto:rupp at iue.tuwien.ac.at]
> > Sent: 10 August 2015 11:54
> > To: Leoni, Massimiliano
> > Cc: slepc-maint at upv.es; petsc-dev at mcs.anl.gov
> > Subject: Re: [petsc-dev] [GPU - slepc] Hands-on exercise 4 (SVD) not working
> > with GPU and default configurations
>
> > The use of aijcusp instead of a dense matrix type certainly adds to the issue.
> I know, but I couldn't find a dense gpu type in the petsc manual, please
> correct me if there is any.
>
> > Please send the output of -log_summary so that we can see where most
> > time is spent.
> I am unable to do that as somehow I am having no output when I use that
> option. I also tried to explicitly call PetscLogView but still nothing is
> printed out.
> If I try with one of the slepc examples, I get the output.
> Why is this happening? If I run my code with -info or -log_trace I see
> their output, only -log_summary is shy!
>
Try calling PetscLogBegin() right after PetscInitialize(). We have since added
an error for the case where logging is not initialized correctly.
I agree with Karl that not much speedup can be expected with GPUs. This is
the fault of dishonest marketing. None of the computations in PETSc are
limited by the computation rate; rather, they are limited by memory bandwidth.
GPU memory bandwidth is at best 2-3x better than CPU bandwidth, and the
advantage is smaller against modern CPUs. The dense SVD can do better than
this, but you are eventually limited by offload times and memory latency. The
story of 100x, or even 10x, speedups is just a fraud.
Thanks,
Matt
> > If you have good (recent) CPUs in dual-socket configuration, it's more than
> > unlikely that you will gain anything beyond ~2x with an optimized GPU setup.
> > Even that ~2x may only be possible with heavily tweaking the current SVD
> > implementation in SLEPc, of which I don't know the details.
> I used Xeon processors from 2010, just like the GPUs.
> This is not good news, as my supervisor is really optimistic about using
> GPUs and getting high speed-ups!
> Anyway, at the moment my gpu version is several times slower than the cpu
> version, so even a 2x would be a win now :D
>
>
> Massimiliano
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener