[petsc-dev] model for parallel ASM

Mark Adams mfadams at lbl.gov
Mon Jan 11 10:35:38 CST 2021


Jacob, I'm not sure I understand this response. I could not find you on the
Kokkos slack channel.

Me: And my colleague in PETSc, Jacob Faibussowitsch, has talked to you
about Kokkos taking a Cuda, Hip, etc., stream. This is something that would
make it easier to deal with asynchronous GPU solvers in PETSc. We just
wanted to check on this.

Trott: Kokkos itself can do it for practically every operation

Maybe you want to talk with him at some point, but we can worry about
getting Cuda to work for now.

On Sun, Jan 10, 2021 at 2:28 PM Jacob Faibussowitsch <jacob.fai at gmail.com>
wrote:

> I would like as much as possible to pass the cuda and hip streams to
> Kokkos, since I can directly handle much of the annoyance with wrangling
> multiple streams and stream objects externally. Last I checked on this
> Kokkos was moving towards allowing association of streams to functions, but
> admittedly this was a while back.
>
> Best regards,
>
> Jacob Faibussowitsch
> (Jacob Fai - booss - oh - vitch)
> Cell: (312) 694-3391
>
> On Jan 10, 2021, at 13:10, Mark Adams <mfadams at lbl.gov> wrote:
>
>
>
> On Sat, Jan 9, 2021 at 7:37 PM Jacob Faibussowitsch <jacob.fai at gmail.com>
> wrote:
>
>> It is a single object that holds a pointer to every stream implementation
>> and a toggleable type, so it can be passed around universally. It currently
>> has a cudaStream and a hipStream, but this is easily extendable to any other
>> stream implementation.
>>
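One way to picture the object described above is a small tagged container: one slot per backend plus a switchable type tag. The sketch below is only an illustration in Python, not the PetscStream implementation. The names `StreamType`, `GenericStream`, `handles`, and `handle` are hypothetical, and plain integers stand in for real cudaStream/hipStream handles.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class StreamType(Enum):
    """Backends the (hypothetical) stream object knows about."""
    CUDA = "cuda"
    HIP = "hip"


@dataclass
class GenericStream:
    """Hypothetical stand-in: one slot per backend plus a toggleable type tag."""
    stream_type: StreamType
    # Integers stand in for opaque backend handles (assumption for illustration).
    handles: Dict[StreamType, int] = field(default_factory=dict)

    def handle(self) -> int:
        """Return the handle for whichever backend is currently selected."""
        try:
            return self.handles[self.stream_type]
        except KeyError:
            raise ValueError(f"no {self.stream_type.value} stream attached") from None


# The same object is passed around; callers toggle the tag to pick a backend.
stream = GenericStream(StreamType.CUDA, {StreamType.CUDA: 0x10, StreamType.HIP: 0x20})
cuda_handle = stream.handle()
stream.stream_type = StreamType.HIP
hip_handle = stream.handle()
```

Supporting another backend in this sketch means adding one enum member and one dictionary entry, which is the kind of extensibility the message describes.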
>
> Do you have any thoughts on how this would work with Kokkos?
>
> Would you want to feed Kokkos your Cuda/Hip, etc, stream or add a Kokkos
> backend to your object?
>
> Junchao might be the person to ask. I would guess that Kokkos View (vector)
> objects carry a stream, because "deep_copy", which moves data to/from the
> GPU, is blocking.
>
> Thanks,
> Mark
>
>
>> Best regards,
>>
>> Jacob Faibussowitsch
>> (Jacob Fai - booss - oh - vitch)
>> Cell: +1 (312) 694-3391
>>
>> On Jan 9, 2021, at 18:19, Mark Adams <mfadams at lbl.gov> wrote:
>>
>> 
>> Is this stream object going to have Cuda, Kokkos, etc., implementations?
>>
>> On Sat, Jan 9, 2021 at 4:09 PM Jacob Faibussowitsch <jacob.fai at gmail.com>
>> wrote:
>>
>>> I’m currently working on an implementation of a general PetscStream
>>> object. Currently it only supports Vector ops and has a proof-of-concept
>>> KSPCG, but should be extensible to other objects when finished. Junchao is
>>> also indirectly working on pipeline support in his NVSHMEM MR. Take a look
>>> at either MR; it would be very useful to get your input, as tailoring
>>> either of these approaches for pipelined algorithms is key.
>>>
>>> Best regards,
>>>
>>> Jacob Faibussowitsch
>>> (Jacob Fai - booss - oh - vitch)
>>> Cell: (312) 694-3391
>>>
>>> On Jan 9, 2021, at 15:01, Mark Adams <mfadams at lbl.gov> wrote:
>>>
>>> I would like to put a non-overlapping ASM solve on the GPU. It's not
>>> clear that we have a model for this.
>>>
>>> PCApply_ASM currently pipelines the scatter with the subdomain solves. I
>>> think we would want to change this and do 1) a scatter-begin loop, 2) a
>>> scatter-end and non-blocking solve loop, 3) a solve-wait and scatter-begin
>>> loop, and 4) a scatter-end loop.
>>>
>>> I'm not sure how to go about doing this.
>>>  * Should we make a new PCApply_ASM_PARALLEL or dump this pipelining
>>> algorithm and rewrite PCApply_ASM?
>>>  * Add a solver-wait method to KSP?
>>>
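The four loops proposed above can be sketched as follows. This is a minimal Python illustration of the control flow only, not PETSc code. The function names are hypothetical, threads stand in for GPU streams, each "subdomain" is a scalar equation a*x = b, and the scatters are reduced to log entries so the phase ordering is visible. `Future.result()` plays the role of the proposed solver-wait.

```python
from concurrent.futures import ThreadPoolExecutor


def solve_block(block):
    """Stand-in subdomain solve: block is (a, b) and we return x with a*x = b."""
    a, b = block
    return b / a


def apply_asm_four_phase(blocks, log):
    """Hypothetical four-loop structure for a non-overlapping ASM apply."""
    n = len(blocks)
    # 1) Start every global-to-subdomain scatter.
    for i in range(n):
        log.append(("scatter_begin_fwd", i))
    with ThreadPoolExecutor() as pool:
        # 2) Finish each scatter and launch its solve without blocking.
        futures = []
        for i in range(n):
            log.append(("scatter_end_fwd", i))
            futures.append(pool.submit(solve_block, blocks[i]))
        # 3) Wait for each solve, then start its subdomain-to-global scatter.
        results = []
        for i, fut in enumerate(futures):
            results.append(fut.result())  # the "solver-wait" step
            log.append(("scatter_begin_rev", i))
    # 4) Finish every reverse scatter.
    for i in range(n):
        log.append(("scatter_end_rev", i))
    return results


log = []
x = apply_asm_four_phase([(2.0, 4.0), (4.0, 2.0)], log)
```

In this sketch all solves are in flight before any of them is waited on, which is the point of separating loops 2 and 3. Whether that belongs in a rewritten PCApply_ASM or a separate variant is the open question in the message.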
>>> Thoughts?
>>>
>>> Mark
>>>
>>>
>>>
>