[petsc-users] MatCreateSeqAIJWithArrays for GPU / cusparse

Junchao Zhang junchao.zhang at gmail.com
Wed Jan 4 18:09:02 CST 2023


On Wed, Jan 4, 2023 at 6:02 PM Matthew Knepley <knepley at gmail.com> wrote:

> On Wed, Jan 4, 2023 at 6:49 PM Junchao Zhang <junchao.zhang at gmail.com>
> wrote:
>
>>
>> On Wed, Jan 4, 2023 at 5:40 PM Mark Lohry <mlohry at gmail.com> wrote:
>>
>>> Oh, is the device backend not known at compile time?
>>>
>> Currently it is known at compile time.
>>
>
> Are you sure? I don't think it is known at compile time.
>
We define either PETSC_HAVE_CUDA or PETSC_HAVE_HIP (or neither), but never both at once.
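A minimal sketch of what that compile-time dispatch looks like, assuming the PETSC_HAVE_CUDA / PETSC_HAVE_HIP macros described above (the function name MyMatCreateFromDeviceCSR is hypothetical, not a PETSc API):

```c
#include <petscmat.h>

/* Hypothetical single entry point: since exactly one of PETSC_HAVE_CUDA
 * or PETSC_HAVE_HIP is defined per build, dispatch can happen at
 * compile time with the preprocessor. */
PetscErrorCode MyMatCreateFromDeviceCSR(MPI_Comm comm, PetscInt m, PetscInt n,
                                        PetscInt *i, PetscInt *j,
                                        PetscScalar *a, Mat *A)
{
  PetscFunctionBegin;
#if defined(PETSC_HAVE_CUDA)
  /* hand the device pointers to the CUDA/cuSPARSE backend */
#elif defined(PETSC_HAVE_HIP)
  /* hand the device pointers to the HIP/hipSPARSE backend */
#else
  /* host-only build: wrap the arrays as a plain SeqAIJ matrix */
  PetscCall(MatCreateSeqAIJWithArrays(comm, m, n, i, j, a, A));
#endif
  PetscFunctionReturn(PETSC_SUCCESS);
}
```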


>
>   Thanks,
>
>      Matt
>
>
>> Or multiple backends can be alive at once?
>>>
>>
>> Some PETSc developers (Jed and Barry) want to support this, but we are
>> not able to do it yet.
>>
>>
>>>
>>> On Wed, Jan 4, 2023, 6:27 PM Junchao Zhang <junchao.zhang at gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Wed, Jan 4, 2023 at 5:19 PM Mark Lohry <mlohry at gmail.com> wrote:
>>>>
>>>>> Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then we
>>>>>> would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on AMD
>>>>>> GPUs, ...
>>>>>
>>>>>
>>>>> Wouldn't one function suffice? Assuming these are contiguous arrays in
>>>>> CSR format, they're just raw device pointers in all cases.
>>>>>
>>>> But we need to know what device it is (to dispatch to either petsc-CUDA
>>>> or petsc-HIP backend)
>>>>
>>>>
>>>>>
>>>>> On Wed, Jan 4, 2023 at 6:02 PM Junchao Zhang <junchao.zhang at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> No, we don't have a counterpart of MatCreateSeqAIJWithArrays() for
>>>>>> GPUs. Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then we
>>>>>> would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on AMD
>>>>>> GPUs, ...
>>>>>>
>>>>>> The real problem, I think, is dealing with multiple MPI ranks.
>>>>>> Providing the split arrays for a PETSc MATMPIAIJ matrix is not easy,
>>>>>> so users are discouraged from doing it themselves.
>>>>>>
>>>>>> A workaround is to let PETSc build the matrix and allocate the
>>>>>> memory; you then call MatSeqAIJCUSPARSEGetArray() to get the device
>>>>>> value array and fill it up.
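A sketch of that workaround, assuming PETSc owns the matrix and sparsity pattern; nrows, ncols, d_nnz, and the value-filling kernel are placeholders for the caller's own data and code:

```c
#include <petscmat.h>

/* Sketch: PETSc allocates the MATSEQAIJCUSPARSE matrix; the application
 * fills the device value array in place each time the Jacobian changes. */
Mat          A;
PetscScalar *d_vals; /* device pointer into A's value array */

PetscCall(MatCreateSeqAIJ(PETSC_COMM_SELF, nrows, ncols, 0, d_nnz, &A));
PetscCall(MatSetType(A, MATSEQAIJCUSPARSE));
/* ... insert the sparsity pattern once (e.g. with MatSetValues) ... */
PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));

/* On each Jacobian update, write directly into the device array: */
PetscCall(MatSeqAIJCUSPARSEGetArray(A, &d_vals));
/* ... launch your own CUDA kernel here to fill d_vals in CSR order ... */
PetscCall(MatSeqAIJCUSPARSERestoreArray(A, &d_vals));
```

Restoring the array tells PETSc the values changed, so cached factorizations and norms are invalidated.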
>>>>>>
>>>>>> We recently added routines to support matrix assembly on GPUs; see if
>>>>>> MatSetValuesCOO
>>>>>> <https://petsc.org/release/docs/manualpages/Mat/MatSetValuesCOO/>
>>>>>> helps.
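A sketch of the COO assembly path mentioned above, following the linked manual page: the nonzero pattern is set once with MatSetPreallocationCOO, then values are pushed repeatedly with MatSetValuesCOO (m, n, ncoo, coo_i, coo_j, and d_vals are the caller's data; for GPU matrix types d_vals may be a device pointer):

```c
#include <petscmat.h>

/* Sketch: COO-based assembly suitable for GPU matrix types. */
Mat A;
PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
PetscCall(MatSetSizes(A, m, n, PETSC_DECIDE, PETSC_DECIDE));
PetscCall(MatSetType(A, MATAIJCUSPARSE));

/* Describe the nonzero pattern once, as (i,j) coordinate pairs: */
PetscCall(MatSetPreallocationCOO(A, ncoo, coo_i, coo_j));

/* Then, every time the Jacobian is recomputed, supply just the values
 * (in the same order as the coordinate pairs): */
PetscCall(MatSetValuesCOO(A, d_vals, INSERT_VALUES));
```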
>>>>>>
>>>>>> --Junchao Zhang
>>>>>>
>>>>>>
>>>>>> On Wed, Jan 4, 2023 at 2:22 PM Mark Lohry <mlohry at gmail.com> wrote:
>>>>>>
>>>>>>> I have a sparse matrix constructed in non-petsc code using a
>>>>>>> standard CSR representation where I compute the Jacobian to be used in an
>>>>>>> implicit TS context. In the CPU world I call
>>>>>>>
>>>>>>> MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD, nrows, ncols, rowidxptr,
>>>>>>> colidxptr, valptr, &Jac);
>>>>>>>
>>>>>>> which as I understand it -- (1) never copies/allocates that
>>>>>>> information, and the matrix Jac is just a non-owning view into the already
>>>>>>> allocated CSR, (2) I can write directly into the original data structures
>>>>>>> and the Mat just "knows" about it, although it still needs a call to
>>>>>>> MatAssemblyBegin/MatAssemblyEnd after modifying the values. So far this
>>>>>>> works great with GAMG.
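The CPU-side pattern described above, as a short sketch (nrows, ncols, rowidxptr, colidxptr, and valptr are the application's preexisting CSR arrays):

```c
#include <petscmat.h>

/* Sketch: wrap application-owned CSR arrays without copying, then
 * update values in place and re-run the assembly calls. */
Mat Jac;
PetscCall(MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD, nrows, ncols,
                                    rowidxptr, colidxptr, valptr, &Jac));

/* Later, e.g. inside the TS Jacobian callback, after recomputing
 * valptr in place: */
PetscCall(MatAssemblyBegin(Jac, MAT_FINAL_ASSEMBLY));
PetscCall(MatAssemblyEnd(Jac, MAT_FINAL_ASSEMBLY));
```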
>>>>>>>
>>>>>>> I have the same CSR representation filled in GPU data allocated with
>>>>>>> cudaMalloc and filled on-device. Is there an equivalent Mat constructor for
>>>>>>> GPU arrays, or some other way to avoid unnecessary copies?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Mark
>>>>>>>
>>>>>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> <http://www.cse.buffalo.edu/~knepley/>
>