[petsc-users] MatCreateSeqAIJWithArrays for GPU / cusparse

Junchao Zhang junchao.zhang at gmail.com
Wed Jan 4 17:27:08 CST 2023


On Wed, Jan 4, 2023 at 5:19 PM Mark Lohry <mlohry at gmail.com> wrote:

> Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then we
>> would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on AMD
>> GPUs, ...
>
>
> Wouldn't one function suffice? Assuming these are contiguous arrays in CSR
> format, they're just raw device pointers in all cases.
>
But we need to know which device it is (to dispatch to either the petsc-CUDA
or the petsc-HIP backend).


>
> On Wed, Jan 4, 2023 at 6:02 PM Junchao Zhang <junchao.zhang at gmail.com>
> wrote:
>
>> No, we don't have a counterpart of MatCreateSeqAIJWithArrays() for GPUs.
>> Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then we would
>> need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on AMD GPUs, ...
>>
>> The real problem, I think, is dealing with multiple MPI ranks. Providing
>> the split (diagonal/off-diagonal) arrays for a petsc MATMPIAIJ matrix is not
>> easy, so we discourage users from doing it.
>>
>> A workaround is to let petsc build the matrix and allocate the memory,
>> then you call MatSeqAIJCUSPARSEGetArray() to get the array and fill it up.
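A minimal sketch of that workaround (sizes, preallocation counts, and the fill step are hypothetical; assumes a CUDA-enabled PETSc build, roughly 3.18-era API with PetscCall):

```c
#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat          A;
  PetscScalar *a; /* device pointer into the matrix values */

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));

  /* Let PETSc build the matrix and own the (GPU) storage. */
  PetscCall(MatCreate(PETSC_COMM_SELF, &A));
  PetscCall(MatSetSizes(A, 4, 4, 4, 4));
  PetscCall(MatSetType(A, MATSEQAIJCUSPARSE));
  PetscCall(MatSeqAIJSetPreallocation(A, 3, NULL));
  /* ... insert the sparsity pattern once with MatSetValues,
     then MatAssemblyBegin/MatAssemblyEnd ... */

  /* Later, update the values in place on the device: */
  PetscCall(MatSeqAIJCUSPARSEGetArray(A, &a));
  /* ... launch a kernel or cudaMemcpy into 'a' (CSR value ordering) ... */
  PetscCall(MatSeqAIJCUSPARSERestoreArray(A, &a));

  PetscCall(MatDestroy(&A));
  PetscCall(PetscFinalize());
  return 0;
}
```

The Get/Restore pair also marks the device values as modified, so PETSc knows not to overwrite them from a stale host copy.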
>>
>> We recently added routines to support matrix assembly on GPUs, see if
>> MatSetValuesCOO
>> <https://petsc.org/release/docs/manualpages/Mat/MatSetValuesCOO/> helps
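For reference, the COO path looks roughly like this (a sketch with hypothetical indices and sizes; `coo_i`/`coo_j` are host arrays, while for the *CUSPARSE types the value array passed to MatSetValuesCOO may be a device pointer):

```c
#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat         A;
  PetscInt    coo_i[] = {0, 0, 1, 2};          /* row indices of the nonzeros */
  PetscInt    coo_j[] = {0, 1, 1, 2};          /* column indices */
  PetscScalar v[]     = {1.0, 2.0, 3.0, 4.0};  /* values; on GPU builds this can be device memory */

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  PetscCall(MatCreate(PETSC_COMM_SELF, &A));
  PetscCall(MatSetSizes(A, 3, 3, 3, 3));
  PetscCall(MatSetType(A, MATSEQAIJCUSPARSE));

  /* Describe the nonzero pattern once ... */
  PetscCall(MatSetPreallocationCOO(A, 4, coo_i, coo_j));
  /* ... then set (and later re-set) values as often as needed. */
  PetscCall(MatSetValuesCOO(A, v, INSERT_VALUES));

  PetscCall(MatDestroy(&A));
  PetscCall(PetscFinalize());
  return 0;
}
```

For a Jacobian reused across Newton/TS steps, MatSetPreallocationCOO is called once and MatSetValuesCOO on every re-evaluation, with no host/device copies of the values on GPU builds.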
>>
>> --Junchao Zhang
>>
>>
>> On Wed, Jan 4, 2023 at 2:22 PM Mark Lohry <mlohry at gmail.com> wrote:
>>
>>> I have a sparse matrix constructed in non-petsc code using a standard
>>> CSR representation where I compute the Jacobian to be used in an implicit
>>> TS context. In the CPU world I call
>>>
>>> MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD, nrows, ncols, rowidxptr,
>>> colidxptr, valptr, &Jac);
>>>
>>> which, as I understand it: (1) never copies or allocates that information,
>>> so the matrix Jac is just a non-owning view into the already allocated
>>> CSR; (2) lets me write directly into the original data structures and the
>>> Mat just "knows" about it, although it still needs a call to
>>> MatAssemblyBegin/MatAssemblyEnd after the values are modified. So far this
>>> works great with GAMG.
>>>
>>> I have the same CSR representation filled in GPU data allocated with
>>> cudaMalloc and filled on-device. Is there an equivalent Mat constructor for
>>> GPU arrays, or some other way to avoid unnecessary copies?
>>>
>>> Thanks,
>>> Mark
>>>
>>
