[petsc-dev] Fwd: Pflotran perf on GPU

Barry Smith bsmith at mcs.anl.gov
Thu Apr 7 07:27:14 CDT 2011


  I sent the following email but forgot to forward the attachments; here they are.

Jeswin,

   Thanks for the numbers. The linear solver is doing very well, and you can really see Amdahl's Law kicking in: even when KSPSolve is several times faster because of the GPU, the whole application code is NOT several times faster, because the function evaluations and Jacobian evaluations, which are not on the GPU, are holding the speedup down. Look, for example, at the row labeled KSPSolve() and the row labeled SNESSolve().
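
   To make the Amdahl's Law point concrete, here is a minimal Python sketch. The 60% share of runtime spent in KSPSolve and the 5x solver speedup are made-up illustrative values, not numbers taken from the logs.

```python
def amdahl_speedup(accelerated_fraction, accelerated_speedup):
    """Overall speedup when only a fraction of the runtime is accelerated."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if accelerated_speedup <= 0.0:
        raise ValueError("accelerated_speedup must be positive")
    # Time left on the CPU (function and Jacobian evaluations, for example).
    unaccelerated = 1.0 - accelerated_fraction
    # New total time, relative to an original total time of 1.
    new_time = unaccelerated + accelerated_fraction / accelerated_speedup
    return 1.0 / new_time


# Hypothetical values: 60% of the run in the linear solve, solver 5x faster.
overall = amdahl_speedup(0.6, 5.0)
print(f"overall speedup: {overall:.2f}x")
```

   With these assumed values the whole run is only about 1.9 times faster, and even an infinitely fast solver could not push it past 2.5 times.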

   The reason the tracer code is not showing great improvement is that the problem is WAY WAY too small to run on the GPU. GPUs only work well with lots and lots of data; this is a fact of life and cannot be changed by better software.
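
   The size effect can be read off the wall-clock times in Jeswin's table quoted below. Here is a small Python sketch that divides the CPU time by the GPU time for each run; the times are copied from that table, while the variable and function names are illustrative only.

```python
# (Nx, Ny, Nz), time without GPU in seconds, time with GPU in seconds,
# copied from the table in the forwarded message.
runs = {
    "100_100_100": [
        ((32, 32, 32), 4.58e1, 3.82e1),
        ((64, 64, 64), 6.83e2, 3.64e2),
        ((96, 96, 96), 3.56e3, 1.27e3),
    ],
    "1-D tracer": [
        ((1000, 1, 1), 3.43e0, 2.62e1),
        ((5000, 1, 1), 1.12e2, 2.02e2),
        ((8000, 1, 1), 3.97e2, 4.73e2),
    ],
}


def speedups(rows):
    """Return (number of grid cells, CPU time / GPU time) for each run."""
    return [(nx * ny * nz, cpu / gpu) for (nx, ny, nz), cpu, gpu in rows]


for name, rows in runs.items():
    for cells, ratio in speedups(rows):
        # A ratio below 1 means the GPU run was slower than the CPU run.
        print(f"{name:12s} cells={cells:7d} speedup={ratio:.2f}")
```

   On these numbers the 96x96x96 run (about 885,000 cells) is roughly 2.8 times faster with the GPU, while every tracer run (at most 8,000 cells) is slower with the GPU than without it.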

  This is very exciting; the GPU is really helping the linear solver as much as one could hope for. I'm glad you got it running.


  Barry


Begin forwarded message:

> From: Jeswin Godwin <jeswingodwin at gmail.com>
> Date: April 6, 2011 3:46:55 PM CDT
> To: Barry Smith <bsmith at mcs.anl.gov>, Victor Minden <victorminden at gmail.com>, Matthew Knepley <knepley at gmail.com>
> Subject: Fwd: Pflotran perf on GPU
> 
> 
> 
> PETSc Team,
> 
> I am a student at Ohio State University working in the HPCRL lab.
> 
>     
> 
> -- This shows the performance of Pflotran on the GPU.
> 
> -- I am also attaching the -log_summary output for the first case of both examples.
> 
> -- I built PETSc with --with-debugging=0.
> 
> -- I did not use a preconditioner for any of the cases.
> 
> -- Would you be able to explain the performance in these cases? In particular, 100_100_100 performs better on the GPU, whereas the 1-D tracer does not.
> 
> Performance of Pflotran on the GPU.
> 
> Example 100_100_100
> 
>                     No GPU                  GPU
>   Nx    Ny    Nz    Time (sec)  Gflops      Time (sec)  Gflops
>   32    32    32    4.58E+01    1.78E+08    3.82E+01    2.19E+08
>   64    64    64    6.83E+02    2.66E+08    3.64E+02    5.14E+08
>   96    96    96    3.56E+03    3.32E+08    1.27E+03    9.53E+08
> 
> Example 1-D Tracer
> 
>                     No GPU                  GPU
>   Nx    Ny    Nz    Time (sec)  Gflops      Time (sec)  Gflops
>   1000  1     1     3.43E+00    5.04E+08    2.62E+01    7.21E+07
>   5000  1     1     1.12E+02    6.51E+08    2.02E+02    3.74E+08
>   8000  1     1     3.97E+02    6.41E+08    4.73E+02    5.67E+08
> 
> Thank you,
> 
> Jeswin
> 
>  
> 

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: output_100_100_100.txt
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20110407/d2d0b157/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: output_seqcusp100_100_100.txt
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20110407/d2d0b157/attachment-0001.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: output_seqcuspcuda.txt
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20110407/d2d0b157/attachment-0002.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: output_tracer.txt
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20110407/d2d0b157/attachment-0003.txt>

