[petsc-users] overlap cpu and gpu?

Jed Brown jed at jedbrown.org
Sat Aug 1 14:24:41 CDT 2020


You can use MPI and split the communicator so n-1 ranks create a DMDA for one part of your system and the other rank drives the GPU in the other part.  They can all be part of the same coupled system on the full communicator, but PETSc doesn't currently support some ranks having their Vec arrays on GPU and others on host, so you'd be paying host-device transfer costs on each iteration (and that might swamp any performance benefit you would have gotten).
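A minimal sketch of that communicator split, assuming the last rank drives the GPU and using placeholder grid sizes (untested, PETSc 3.13-era error checking):

#include <petscdmda.h>

int main(int argc, char **argv)
{
  PetscErrorCode ierr;
  MPI_Comm       subcomm;
  PetscMPIInt    rank, size, color;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  ierr = MPI_Comm_rank(PETSC_COMM_WORLD, &rank); CHKERRQ(ierr);
  ierr = MPI_Comm_size(PETSC_COMM_WORLD, &size); CHKERRQ(ierr);

  /* Hypothetical split: the last rank drives the GPU solve; the other
     n-1 ranks share a subcommunicator for the host-side DMDA solve. */
  color = (rank == size - 1) ? 1 : 0;
  ierr = MPI_Comm_split(PETSC_COMM_WORLD, color, rank, &subcomm); CHKERRQ(ierr);

  if (color == 0) {
    DM da;
    /* Host ranks: a 2D DMDA on the subcommunicator (128x128 is a placeholder) */
    ierr = DMDACreate2d(subcomm, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE,
                        DMDA_STENCIL_STAR, 128, 128, PETSC_DECIDE, PETSC_DECIDE,
                        1, 1, NULL, NULL, &da); CHKERRQ(ierr);
    ierr = DMSetUp(da); CHKERRQ(ierr);
    /* ... assemble and solve one equation here on subcomm ... */
    ierr = DMDestroy(&da); CHKERRQ(ierr);
  } else {
    /* GPU rank: set up the other solve with GPU-resident types
       (e.g., -vec_type cuda / -mat_type aijcusparse). */
  }
  ierr = MPI_Comm_free(&subcomm); CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

Coupling the two parts back into one system on the full communicator is where the host-device transfers described above come in.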

In any case, be sure to think about the execution time of each part.  Load balancing with matching time-to-solution for each part can be really hard.

[Attached image: Kronbichler-fig4-crop.png]

Barry Smith <bsmith at petsc.dev> writes:

>   Nicola,
>
>     This is not really viable or practical at this time with PETSc. It is not impossible, but it requires careful coding with threads. Another possibility is to use one half of the virtual GPUs for each solve, but this is also not trivial. I would recommend first seeing what kind of performance you can get on the GPU for each type of solve and revisiting this idea in the future.
>
>    Barry
>
>
>
>
>> On Jul 31, 2020, at 9:23 AM, nicola varini <nicola.varini at gmail.com> wrote:
>> 
>> Hello, I would like to know if it is possible to overlap CPU and GPU with DMDA.
>> I have a machine where each node has 1 P100 + 1 Haswell.
>> I have to solve the Poisson and Ampere equations at each time step.
>> I'm using a 2D DMDA for each of them. Would it be possible to compute the Poisson
>> and Ampere equations at the same time, one on the CPU and the other on the GPU?
>> 
>> Thanks
