[petsc-dev] [GPU - slepc] Hands-on exercise 4 (SVD) not working with GPU and default configurations

Jose E. Roman jroman at dsic.upv.es
Mon Aug 10 11:59:03 CDT 2015


Massimiliano,

You should not be getting slower times on the GPU. I tried on hardware similar to what you describe, running SVD on a dense square matrix stored as aij, and also on sparse rectangular matrices. In all cases, runs on the GPU were roughly 2x faster than on the CPU. Are you running with an optimized build? There might be something wrong with your code. I would need to know the exact options that you are using. Maybe you can share your code with us, or even the matrix.

For the case of a dense matrix, one could create a customized shell matrix that stores its data on the GPU and uses cuBLAS for the matrix-vector product. We have recently done this for a different problem and the results were quite good. However, it requires much lower-level programming than just setting the AIJCUSP type for the matrix.

Jose



> On 10/8/2015, at 15:55, Leoni, Massimiliano <Massimiliano.Leoni at rolls-royce.com> wrote:
> 
> > -----Original Message-----
> > From: Karl Rupp [mailto:rupp at iue.tuwien.ac.at]
> > Sent: 10 August 2015 14:13
> > To: Leoni, Massimiliano
> > Cc: slepc-maint at upv.es; petsc-dev at mcs.anl.gov
> > Subject: Re: [petsc-dev] [GPU - slepc] Hands-on exercise 4 (SVD) not working
> > with GPU and default configurations
>  
> > Maybe you forgot to call SlepcFinalize()?
> Unfortunately that's not it: if I omit SlepcFinalize(), an error message shows up at runtime to remind me.
>  
> > Ok, this is actually a relatively GPU-friendly setup, because CPUs have
> > reduced the gap in terms of FLOPs quite a bit (see for example
> > http://www.karlrupp.net/2013/06/cpu-gpu-and-mic-hardware-characteristics-over-time/ )
> Read, thanks for sharing!
> > I'd suggest convincing your supervisor to buy or use a cluster with
> > current hardware and enjoying a higher speedup than you could
> > get in an ideal setting with a GPU from 2010 anyway ;-)
> This could partly be overcome as I was told I *might*, eventually, have access to a big cluster with many NVIDIA Tesla K20.
>  
> > (Having said that, I cautiously estimate that you can get some
> > performance gains for SVD if you deep-dive into the existing SVD
> > implementation, carefully redesign it to minimize CPU<->GPU
> > communication, and use optimized library routines for the BLAS 3
> > operations. Currently there is not enough GPU infrastructure in PETSc to
> > achieve this via command line parameters only.)
> Mmm, can you give a rough estimate of the effort involved in this?
>  
>  
> >From: Matthew Knepley [mailto:knepley at gmail.com] 
> >Sent: 10 August 2015 14:28
> >To: Leoni, Massimiliano
> >Cc: Karl Rupp; slepc-maint at upv.es; petsc-dev at mcs.anl.gov
> >Subject: Re: [petsc-dev] [GPU - slepc] Hands-on exercise 4 (SVD) not working with GPU and default configurations
>  
> >Try calling PetscLogBegin() after PetscInitialize(). We have now put in an error if this is not initialized correctly.
> This didn’t do the trick, unfortunately.
> Do I have to pull from the repo and rebuild?
>  
> [In general, can I pull and rebuild without running configure again?]
>  
> >I agree with Karl that not much speedup can be expected with GPUs. This is the fault of dishonest marketing. None
> >of the computations in PETSc are limited by the computation rate; rather, they are limited by memory bandwidth. GPU
> >memory bandwidth is at best 2-3x better than the CPU's, and the gap is smaller for modern CPUs. The dense SVD can do better than this, but you are
> >eventually limited by offload times and memory latency. The story of 100x, or even 10x, speedups is just a fraud.
> I remember reading this in one of the PETSc reports [the “Preliminary evaluation” one?].
> I’ll see what I can do.
>  
> Best regards,
> Massimiliano
> 



