[petsc-dev] [GPU - slepc] Hands-on exercise 4 (SVD) not working with GPU and default configurations

Leoni, Massimiliano Massimiliano.Leoni at Rolls-Royce.com
Mon Aug 10 08:55:18 CDT 2015


> -----Original Message-----
> From: Karl Rupp [mailto:rupp at iue.tuwien.ac.at]
> Sent: 10 August 2015 14:13
> To: Leoni, Massimiliano
> Cc: slepc-maint at upv.es; petsc-dev at mcs.anl.gov
> Subject: Re: [petsc-dev] [GPU - slepc] Hands-on exercise 4 (SVD) not working with GPU and default configurations



> Maybe you forgot to call SlepcFinalize()?

Unfortunately that's not it: if I omit SlepcFinalize(), an error message shows up at runtime to remind me.



> Ok, this is actually a relatively GPU-friendly setup, because CPUs have
> reduced the gap in terms of FLOPs quite a bit (see for example
> http://www.karlrupp.net/2013/06/cpu-gpu-and-mic-hardware-characteristics-over-time/ )

Read, thanks for sharing!

> I'd suggest to convince your supervisor into buying/using a cluster with
> current hardware and enjoy a higher speedup compared to what you could
> get in an ideal setting with a GPU from 2010 anyway ;-)

This could partly be overcome: I was told I *might* eventually have access to a big cluster with many NVIDIA Tesla K20 GPUs.



> (Having said that, I carefully estimate that you can get some
> performance gains for SVD if you deep-dive into the existing SVD
> implementation, carefully redesign it to minimize CPU<->GPU
> communication, and use optimized library routines from the BLAS 3
> operations. Currently there is not enough GPU-infrastructure in PETSc to
> achieve this via command line parameters only.)

Mmm, can you give a rough estimate of the effort involved in this?





> From: Matthew Knepley [mailto:knepley at gmail.com]
> Sent: 10 August 2015 14:28
> To: Leoni, Massimiliano
> Cc: Karl Rupp; slepc-maint at upv.es; petsc-dev at mcs.anl.gov
> Subject: Re: [petsc-dev] [GPU - slepc] Hands-on exercise 4 (SVD) not working with GPU and default configurations



>Try calling PetscLogBegin() after PetscInitialize(). We have now put in an error if this is not initialized correctly.

This didn’t do the trick, unfortunately.

Do I have to pull from the repo and rebuild?



[In general, can I pull and rebuild without running configure again?]



> I agree with Karl that not much speedup can be expected with GPUs. This is the fault of dishonest marketing. None
> of the computations in PETSc are limited by the computation rate, rather they are limited by memory bandwidth. The
> bandwidth is at best 2-3x better, and less for modern CPUs. The dense SVD can be better than this, but you are
> eventually limited by offload times and memory latency. The story of 100x, or even 10x, speedups is just a fraud.

I remember reading this in one of the PETSc reports [the “Preliminary evaluation” one?].

I’ll see what I can do.
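
To make the bandwidth argument quoted above concrete, here is a minimal sketch in C of a roofline-style estimate: the attainable rate is the smaller of the peak FLOP rate and (memory bandwidth x arithmetic intensity), and any host<->device transfer time is added to the GPU side. The function names are hypothetical and nothing here is a PETSc or SLEPc API; it is only a back-of-the-envelope model, not a measurement.

```c
#include <stdio.h>

/* Roofline-style bound (illustrative model, not a PETSc routine):
 * a kernel cannot run faster than the peak FLOP rate, nor faster than
 * the memory system can feed it (bandwidth * flops per byte moved). */
double attainable_gflops(double peak_gflops, double bandwidth_gbs,
                         double flops_per_byte)
{
    double bandwidth_bound = bandwidth_gbs * flops_per_byte;
    return bandwidth_bound < peak_gflops ? bandwidth_bound : peak_gflops;
}

/* Speedup seen by the application when the GPU run also has to pay
 * for moving data between host and device (all times in seconds). */
double speedup_with_offload(double t_cpu, double t_gpu_compute,
                            double t_transfer)
{
    return t_cpu / (t_gpu_compute + t_transfer);
}

/* Print a CPU vs GPU comparison for one kernel with the given intensity. */
void report(double cpu_peak, double cpu_bw, double gpu_peak, double gpu_bw,
            double flops_per_byte)
{
    double cpu = attainable_gflops(cpu_peak, cpu_bw, flops_per_byte);
    double gpu = attainable_gflops(gpu_peak, gpu_bw, flops_per_byte);
    printf("CPU %.2f GFLOP/s, GPU %.2f GFLOP/s, speedup %.2fx "
           "(peak ratio %.2fx)\n", cpu, gpu, gpu / cpu, gpu_peak / cpu_peak);
}
```

As an assumed example (the numbers are made up for illustration, not taken from any hardware in this thread): calling report(200.0, 50.0, 1000.0, 150.0, 2.0 / 12.0) models a sparse matrix-vector product at roughly 2 flops per 12 bytes, with a GPU that has 5x the peak rate but 3x the bandwidth. The model then gives a 3x speedup, set by the bandwidth ratio and not by the peak ratio, and speedup_with_offload() shrinks that further once transfer time is counted.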



Best regards,

Massimiliano
