[petsc-dev] PetscSF and/or VecScatter with device pointers

Lawrence Mitchell lawrence.mitchell at imperial.ac.uk
Thu Jul 12 05:47:17 CDT 2018


Dear petsc-dev,

we're starting to explore residual assembly on GPUs (with Andreas,
cc'd).  The natural question is how to do GlobalToLocal and
LocalToGlobal transfers.

I have:

A PetscSF describing the communication pattern.

A Vec holding the data to communicate.  This will have an up-to-date
device pointer.

I would like:

PetscSFBcastBegin/End (and ReduceBegin/End, etc.) to (optionally)
work with raw device pointers.  I am led to believe that modern MPI
implementations can read and write device memory directly
(CUDA-aware MPI), so I would like to avoid copying the data to the
host, doing the communication there, and then copying back to the
device.

Given that the window implementation (which just delegates to MPI
for all the packing) is not considered ready for prime time (mostly
due to MPI implementation bugs, I think), I think this means
implementing a version of PetscSF_Basic that can do the pack/unpack
directly on the device, and then just hands off to MPI.

The next question is how to put a higher-level interface on top of
this.  What suggestions, if any, are there for designing a top-level
API that is agnostic to whether the data are on the host or the
device?

We had thought something like:

- Make PetscSF handle device pointers (possibly with a new implementation?)

- Make VecScatter use SF.

Calling VecScatterBegin/End on a Vec whose device pointer is up to
date would then use the SF directly.

Have there been any thoughts about how you want to do multi-GPU
interaction?

Cheers,

Lawrence
