[petsc-dev] model for parallel ASM
Mark Adams
mfadams at lbl.gov
Sun Jan 10 13:10:12 CST 2021
On Sat, Jan 9, 2021 at 7:37 PM Jacob Faibussowitsch <jacob.fai at gmail.com>
wrote:
> It is a single object that holds a pointer to every stream implementation
> and toggleable type so it can be universally passed around. Currently has a
> cudaStream and a hipStream but this is easily extendable to any other
> stream implementation.
>
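One possible reading of the object described in the quoted text, as a minimal sketch in plain C. All names here (GenericStream, StreamType, generic_stream_set_type, generic_stream_active) are hypothetical and are not taken from the PetscStream MR. The backend handles are stored as opaque void pointers so the sketch compiles without CUDA or HIP headers; a real build would presumably hold a cudaStream_t and a hipStream_t instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tag saying which backend handle is currently active. */
typedef enum { STREAM_NONE = 0, STREAM_CUDA, STREAM_HIP } StreamType;

/* Hypothetical stream object: one slot per backend plus a toggleable type.
   The void pointers stand in for cudaStream_t / hipStream_t (assumption). */
typedef struct {
  StreamType type; /* selects which slot callers should use */
  void      *cuda; /* placeholder for a CUDA stream handle */
  void      *hip;  /* placeholder for a HIP stream handle */
} GenericStream;

/* Switch the active backend without touching the stored handles. */
void generic_stream_set_type(GenericStream *s, StreamType type)
{
  assert(s != NULL);
  s->type = type;
}

/* Return the handle for the active backend, or NULL if none is selected. */
void *generic_stream_active(const GenericStream *s)
{
  assert(s != NULL);
  switch (s->type) {
  case STREAM_CUDA: return s->cuda;
  case STREAM_HIP:  return s->hip;
  default:          return NULL;
  }
}
```

Under this layout, supporting another backend would mean adding one enum value and one slot, which is one way to read "easily extendable". Whether Kokkos fits as another slot or should instead be handed the existing CUDA/HIP handle is the open question below.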
Do you have any thoughts on how this would work with Kokkos?
Would you want to feed Kokkos your Cuda/Hip, etc, stream or add a Kokkos
backend to your object?
Junchao might be the person to ask. I would guess that Kokkos View (vector)
objects carry a stream, since a "deep_copy", which moves data to/from the
GPU, is blocking.
Thanks,
Mark
> Best regards,
>
> Jacob Faibussowitsch
> (Jacob Fai - booss - oh - vitch)
> Cell: +1 (312) 694-3391
>
> On Jan 9, 2021, at 18:19, Mark Adams <mfadams at lbl.gov> wrote:
>
>
> Is this stream object going to have Cuda, Kokkos, etc., implementations?
>
> On Sat, Jan 9, 2021 at 4:09 PM Jacob Faibussowitsch <jacob.fai at gmail.com>
> wrote:
>
>> I’m currently working on an implementation of a general PetscStream
>> object. Currently it only supports Vector ops and has a proof of concept
>> KSPCG, but should be extensible to other objects when finished. Junchao is
>> also indirectly working on pipeline support in his NVSHMEM MR. Take a look
>> at either MR; it would be very useful to get your input, as tailoring
>> either of these approaches for pipelined algorithms is key.
>>
>> Best regards,
>>
>> Jacob Faibussowitsch
>> (Jacob Fai - booss - oh - vitch)
>> Cell: (312) 694-3391
>>
>> On Jan 9, 2021, at 15:01, Mark Adams <mfadams at lbl.gov> wrote:
>>
>> I would like to put a non-overlapping ASM solve on the GPU. It's not
>> clear that we have a model for this.
>>
>> PCApply_ASM currently pipelines the scatter with the subdomain solves. I
>> think we would want to change this and do a 1) scatter begin loop, 2)
>> scatter end and non-blocking solve loop, 3) solve-wait and scatter
>> begin loop and 4) scatter end loop.
>>
>> I'm not sure how to go about doing this.
>> * Should we make a new PCApply_ASM_PARALLEL or dump this pipelining
>> algorithm and rewrite PCApply_ASM?
>> * Add a solver-wait method to KSP?
>>
>> Thoughts?
>>
>> Mark
>>
>>
>>
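The four-loop ordering proposed in the original message can be sketched as follows. This is a mock in plain C, not PETSc code: every function here is a hypothetical stand-in that only records the order in which it is called, and the names (restrict_begin, solve_launch, solve_wait, and so on) are assumptions for illustration. It also assumes that the first scatter is the restriction into each subdomain and the second is the prolongation back to the global vector.

```c
#include <assert.h>
#include <string.h>

#define MAX_LOG 256

/* Records one character per call so the ordering can be inspected. */
typedef struct {
  char log[MAX_LOG];
  int  len;
} EventLog;

void event_log_init(EventLog *ev)
{
  memset(ev, 0, sizeof(*ev));
}

static void record(EventLog *ev, char c)
{
  assert(ev->len < MAX_LOG - 1);
  ev->log[ev->len++] = c;
  ev->log[ev->len]   = '\0';
}

/* Hypothetical stand-ins (NOT PETSc API); block index i is unused here. */
static void restrict_begin(EventLog *ev, int i) { (void)i; record(ev, 'b'); }
static void restrict_end(EventLog *ev, int i)   { (void)i; record(ev, 'e'); }
static void solve_launch(EventLog *ev, int i)   { (void)i; record(ev, 's'); }
static void solve_wait(EventLog *ev, int i)     { (void)i; record(ev, 'w'); }
static void prolong_begin(EventLog *ev, int i)  { (void)i; record(ev, 'B'); }
static void prolong_end(EventLog *ev, int i)    { (void)i; record(ev, 'E'); }

/* The four loops from the message, in order. */
void apply_asm_four_phase(EventLog *ev, int nblocks)
{
  int i;
  /* 1) scatter begin loop */
  for (i = 0; i < nblocks; i++) restrict_begin(ev, i);
  /* 2) scatter end and non-blocking solve loop */
  for (i = 0; i < nblocks; i++) { restrict_end(ev, i); solve_launch(ev, i); }
  /* 3) solve-wait and scatter begin loop */
  for (i = 0; i < nblocks; i++) { solve_wait(ev, i); prolong_begin(ev, i); }
  /* 4) scatter end loop */
  for (i = 0; i < nblocks; i++) prolong_end(ev, i);
}
```

With this ordering every subdomain solve is launched before any of them is waited on, which is what would let the solves overlap on the GPU. The solve_wait step is the piece that would need something like the solver-wait method on KSP asked about in the message.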