[petsc-dev] Parallel calculation on GPU

Projet_TRIOU triou at cea.fr
Wed Aug 20 09:39:22 CDT 2014

On 08/20/14 16:03, Karl Rupp wrote:
>>> What you could do with 4N procs for PETSc is to define your own matrix
>>> layout, where only one out of four processes actually owns part of the
>>> matrix. After MatAssemblyBegin()/MatAssemblyEnd() the full data gets
>>> correctly transferred to N procs, with the other 3*N procs being
>>> 'empty'. You should then be able to run the solver with all 4*N
>>> processors, but only N of them actually do the work on the GPUs.
>> OK, I understand your solution, as I was already thinking about that;
>> thanks for confirming it. But my fear was that performance would not
>> improve. Indeed, I still don't understand (even after
>> analyzing -log_summary profiles and searching the petsc-dev archives)
>> what slows things down when several MPI tasks share one GPU, compared to
>> one MPI task working with one GPU...
>> In the proposed solution, 4*N processes will still exchange MPI messages
>> during a KSP iteration, and the amount of data copied between GPU and
>> CPU(s) will be the same, so if you could enlighten
>> me, I would be glad.
> One of the causes of the performance penalty you observe is the higher 
> PCI-Express communication: If four ranks share a single GPU, then each 
> matrix-vector product requires at least 8 vector transfers between 
> host and device, rather than just 2 with a single MPI rank. Similarly, 
> you have four times the number of kernel launches. It may well be that 
> these overheads just eat up all the performance gains you could 
> otherwise obtain. I don't know your profiling data, so I can't be more 
> specific at this point. 
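
For readers following the thread, the two points above (the "one owner out of four" layout and the transfer count) can be illustrated with a rough sketch in plain C. It is only an illustration under stated assumptions, not code from Trio_U or PETSc: the helper names local_rows and min_transfers_per_matvec are hypothetical, the owners are assumed to be ranks 0, 4, 8, ..., and the transfer count assumes two host/device vector copies per rank per matrix-vector product, as the figures quoted above imply.

```c
/* Hypothetical helpers; neither is a PETSc function. */

/* Rows owned by `rank` when only every `stride`-th rank (0, stride,
 * 2*stride, ...) owns part of the matrix. All other ranks own 0 rows. */
long local_rows(long global_rows, int rank, int nprocs, int stride)
{
    if (rank % stride != 0)
        return 0;                                 /* "empty" rank */

    int  owners = (nprocs + stride - 1) / stride; /* number of owning ranks */
    int  idx    = rank / stride;                  /* position among owners */
    long base   = global_rows / owners;           /* even share per owner */
    long rem    = global_rows % owners;           /* leftover rows */

    /* The first `rem` owners take one extra row each. */
    return base + (idx < rem ? 1 : 0);
}

/* Assumed lower bound on host<->device vector transfers per
 * matrix-vector product: two per rank sharing the GPU. */
int min_transfers_per_matvec(int ranks_per_gpu)
{
    return 2 * ranks_per_gpu;
}
```

With 8 processes and a stride of 4, local_rows(10, rank, 8, 4) gives 5 rows on ranks 0 and 4 and 0 rows elsewhere. In a PETSc code one would presumably pass this value as the local row size to MatSetSizes() on each rank, with the global size unchanged; that usage is an assumption and has not been tested here. min_transfers_per_matvec(4) returns 8 and min_transfers_per_matvec(1) returns 2, matching the numbers in the message above.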

Thanks a lot Karli for the explanations. I am currently trying your 

> Best regards,
> Karli

*Trio_U support team*
Marthe ROUX (01 69 08 00 02) Saclay
Pierre LEDAC (04 38 78 91 49) Grenoble