[petsc-dev] PetscSF and/or VecScatter with device pointers
Lawrence Mitchell
lawrence.mitchell at imperial.ac.uk
Thu Jul 12 05:47:17 CDT 2018
Dear petsc-dev,
We're starting to explore (with Andreas, cc'd) residual assembly on
GPUs. The question naturally arises of how to do GlobalToLocal and
LocalToGlobal.
I have:
- A PetscSF describing the communication pattern.
- A Vec holding the data to communicate. This will have an up-to-date
device pointer.
I would like:
PetscSFBcastBegin/End (and ReduceBegin/End, etc.) to (optionally)
work with raw device pointers. I am led to believe that modern MPI
implementations can read from and write to device memory directly, so
I would like to avoid copying data to the host, doing the
communication there, and then copying back to the device.
Given that the window implementation (which just delegates all the
packing to MPI) is, I gather, not considered ready for prime time
(mostly due to MPI implementation bugs), I think this means
implementing a version of PetscSF_Basic that can handle the
pack/unpack directly on the device, and then just hands off to MPI.
The next thing is how to put a higher-level interface on top of this.
What suggestions, if any, are there for doing this so that the
top-level API is agnostic to whether the data are on the host or the
device?
We had thought something like:
- Make PetscSF handle device pointers (possibly with a new implementation?).
- Make VecScatter use SF.
Calling VecScatterBegin/End on a Vec with up-to-date device pointers
would then just use the SF directly.
Have there been any thoughts about how you want to do multi-GPU
interaction?
Cheers,
Lawrence