[petsc-users] MatCreateSeqAIJWithArrays for GPU / cusparse

Mark Adams mfadams at lbl.gov
Thu Jan 5 04:24:42 CST 2023


Supporting HIP and CUDA hardware together would be crazy, but supporting
(Kokkos) OpenMP alongside a device backend would require something that looks
like two device backends, and in the long term CPUs (e.g., Grace) may not go
away with respect to kernels.
PETSc does not support OpenMP well, of course, but that support could grow if
that is where hardware and applications go.


On Wed, Jan 4, 2023 at 7:38 PM Mark Lohry <mlohry at gmail.com> wrote:

> You have my condolences if you have to support all those things
> simultaneously.
>
> On Wed, Jan 4, 2023, 7:27 PM Matthew Knepley <knepley at gmail.com> wrote:
>
>> On Wed, Jan 4, 2023 at 7:22 PM Junchao Zhang <junchao.zhang at gmail.com>
>> wrote:
>>
>>> We don't have a machine on which we can test with both "--with-cuda --with-hip".
>>>
>>
>> Yes, but your answer suggested that the structure of the code prevented
>> this combination.
>>
>>   Thanks,
>>
>>      Matt
>>
>>
>>> --Junchao Zhang
>>>
>>>
>>> On Wed, Jan 4, 2023 at 6:17 PM Matthew Knepley <knepley at gmail.com>
>>> wrote:
>>>
>>>> On Wed, Jan 4, 2023 at 7:09 PM Junchao Zhang <junchao.zhang at gmail.com>
>>>> wrote:
>>>>
>>>>> On Wed, Jan 4, 2023 at 6:02 PM Matthew Knepley <knepley at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> On Wed, Jan 4, 2023 at 6:49 PM Junchao Zhang <junchao.zhang at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>> On Wed, Jan 4, 2023 at 5:40 PM Mark Lohry <mlohry at gmail.com> wrote:
>>>>>>>
>>>>>>>> Oh, is the device backend not known at compile time?
>>>>>>>>
>>>>>>> Currently it is known at compile time.
>>>>>>>
>>>>>>
>>>>>> Are you sure? I don't think it is known at compile time.
>>>>>>
>>>>> We define either PETSC_HAVE_CUDA or PETSC_HAVE_HIP (or neither), but never
>>>>> both.
>>>>>
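A rough, purely illustrative sketch of the configure-time guard being described above (hypothetical structure, not the actual PETSc source layout; only the macro names come from the thread):

    /* petscconf.h is generated by configure; per the statement above, at most
       one of these device macros is defined in a given build. */
    #include <petscconf.h>

    #if defined(PETSC_HAVE_CUDA)
      /* compile and dispatch to the CUDA/cuSPARSE backend */
    #elif defined(PETSC_HAVE_HIP)
      /* compile and dispatch to the HIP/hipSPARSE backend */
    #else
      /* host-only build: no device backend compiled in */
    #endif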
>>>>
>>>> Where is the logic for that in the code? This seems like a crazy design.
>>>>
>>>>   Thanks,
>>>>
>>>>     Matt
>>>>
>>>>
>>>>>   Thanks,
>>>>>>
>>>>>>      Matt
>>>>>>
>>>>>>
>>>>>>> Or multiple backends can be alive at once?
>>>>>>>>
>>>>>>>
>>>>>>> Some PETSc developers (Jed and Barry) want to support this, but we
>>>>>>> do not have that capability yet.
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Jan 4, 2023, 6:27 PM Junchao Zhang <junchao.zhang at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Jan 4, 2023 at 5:19 PM Mark Lohry <mlohry at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but
>>>>>>>>>>> then we would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on
>>>>>>>>>>> AMD GPUs, ...
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Wouldn't one function suffice? Assuming these are contiguous
>>>>>>>>>> arrays in CSR format, they're just raw device pointers in all cases.
>>>>>>>>>>
>>>>>>>>> But we need to know which device it is (to dispatch to either the
>>>>>>>>> petsc-CUDA or the petsc-HIP backend).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Jan 4, 2023 at 6:02 PM Junchao Zhang <
>>>>>>>>>> junchao.zhang at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> No, we don't have a counterpart of MatCreateSeqAIJWithArrays()
>>>>>>>>>>> for GPUs. Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but
>>>>>>>>>>> then we would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on
>>>>>>>>>>> AMD GPUs, ...
>>>>>>>>>>>
>>>>>>>>>>> The real problem, I think, is dealing with multiple MPI ranks.
>>>>>>>>>>> Providing the split arrays for PETSc's MATMPIAIJ is not easy, so users are
>>>>>>>>>>> discouraged from doing so.
>>>>>>>>>>>
>>>>>>>>>>> A workaround is to let PETSc build the matrix and allocate the
>>>>>>>>>>> memory, then call MatSeqAIJCUSPARSEGetArray() to get the value array and
>>>>>>>>>>> fill it.
>>>>>>>>>>>
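A minimal sketch of the workaround Junchao describes, assuming a MATSEQAIJCUSPARSE matrix and assuming that MatSeqAIJCUSPARSEGetArray()/MatSeqAIJCUSPARSERestoreArray() take (Mat, PetscScalar **), analogous to the host-side MatSeqAIJGetArray() pair (check the manual pages for the exact signatures); n and nnz_per_row are placeholders for the caller's sizes and sparsity pattern:

    #include <petscmat.h>

    /* Sketch only: let PETSc allocate a MATSEQAIJCUSPARSE matrix, then obtain
       the device value array and fill it from user code. */
    static PetscErrorCode FillJacobianOnDevice(PetscInt n, const PetscInt *nnz_per_row)
    {
      Mat          A;
      PetscScalar *d_vals; /* device pointer when the type is MATSEQAIJCUSPARSE */

      PetscFunctionBeginUser;
      PetscCall(MatCreate(PETSC_COMM_SELF, &A));
      PetscCall(MatSetSizes(A, n, n, n, n));
      PetscCall(MatSetType(A, MATSEQAIJCUSPARSE));
      PetscCall(MatSeqAIJSetPreallocation(A, 0, nnz_per_row));
      /* ... insert the nonzero structure once, e.g. with MatSetValues() ... */
      PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
      PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));

      /* Fill the value array on the device, e.g. with a CUDA kernel or a
         cudaMemcpy from an existing device-resident CSR value array. */
      PetscCall(MatSeqAIJCUSPARSEGetArray(A, &d_vals));
      /* ... launch kernel / cudaMemcpy into d_vals ... */
      PetscCall(MatSeqAIJCUSPARSERestoreArray(A, &d_vals));

      PetscCall(MatDestroy(&A));
      PetscFunctionReturn(0);
    }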
>>>>>>>>>>> We recently added routines to support matrix assembly on GPUs; see if
>>>>>>>>>>> MatSetValuesCOO
>>>>>>>>>>> <https://petsc.org/release/docs/manualpages/Mat/MatSetValuesCOO/> helps.
>>>>>>>>>>>
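A rough sketch of the COO path referenced above, following the MatSetPreallocationCOO()/MatSetValuesCOO() pattern from the linked manual page. This tiny 2x2 example uses host arrays for brevity; the COO interface is intended to also accept device-resident value arrays for the GPU matrix types, but check the manual page for the details:

    #include <petscmat.h>

    /* Sketch only: describe the nonzero pattern once in COO form, then supply
       (and later re-supply) the values whenever they change. */
    static PetscErrorCode AssembleCOO(void)
    {
      Mat         A;
      PetscInt    coo_i[] = {0, 0, 1, 1};
      PetscInt    coo_j[] = {0, 1, 0, 1};
      PetscScalar v[]     = {4.0, -1.0, -1.0, 4.0};

      PetscFunctionBeginUser;
      PetscCall(MatCreate(PETSC_COMM_SELF, &A));
      PetscCall(MatSetSizes(A, 2, 2, 2, 2));
      PetscCall(MatSetType(A, MATSEQAIJCUSPARSE));
      PetscCall(MatSetPreallocationCOO(A, 4, coo_i, coo_j));
      PetscCall(MatSetValuesCOO(A, v, INSERT_VALUES)); /* call again each time the values change */
      PetscCall(MatDestroy(&A));
      PetscFunctionReturn(0);
    }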
>>>>>>>>>>> --Junchao Zhang
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jan 4, 2023 at 2:22 PM Mark Lohry <mlohry at gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I have a sparse matrix constructed in non-petsc code using a
>>>>>>>>>>>> standard CSR representation where I compute the Jacobian to be used in an
>>>>>>>>>>>> implicit TS context. In the CPU world I call
>>>>>>>>>>>>
>>>>>>>>>>>> MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD, nrows, ncols,
>>>>>>>>>>>> rowidxptr, colidxptr, valptr, &Jac);
>>>>>>>>>>>>
>>>>>>>>>>>> which, as I understand it, (1) never copies or allocates that
>>>>>>>>>>>> information, so the matrix Jac is just a non-owning view into the already
>>>>>>>>>>>> allocated CSR, and (2) lets me write directly into the original data
>>>>>>>>>>>> structures with the Mat simply "knowing" about it, although it still needs a
>>>>>>>>>>>> call to MatAssemblyBegin/MatAssemblyEnd after the values are modified. So far
>>>>>>>>>>>> this works great with GAMG.
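A short sketch of the update pattern described in (2), assuming Jac aliases the caller-owned CSR arrays as in the call above (this is a fragment; nnz and ComputeJacobianEntry() are hypothetical placeholders for the caller's nonzero count and fill routine):

    /* Write the new Jacobian values directly into the caller-owned value
       array that Jac views, then tell PETSc the values have changed. */
    for (PetscInt k = 0; k < nnz; ++k) valptr[k] = ComputeJacobianEntry(k);
    PetscCall(MatAssemblyBegin(Jac, MAT_FINAL_ASSEMBLY));
    PetscCall(MatAssemblyEnd(Jac, MAT_FINAL_ASSEMBLY));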
>>>>>>>>>>>>
>>>>>>>>>>>> I have the same CSR representation in GPU memory, allocated with
>>>>>>>>>>>> cudaMalloc and filled on-device. Is there an equivalent Mat constructor for
>>>>>>>>>>>> GPU arrays, or some other way to avoid unnecessary copies?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Mark
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>> <http://www.cse.buffalo.edu/~knepley/>
>>
>