[petsc-users] building kokkos matrices on the device

Junchao Zhang junchao.zhang at gmail.com
Wed Feb 26 10:11:16 CST 2025


This fuction *MatCreateMPIAIJWithSeqAIJ(MPI_Comm comm, Mat A, Mat B, const
PetscInt garray[], Mat *mat) *is rarely used. To compute the global
matrix's row/col size M, N, it has to do an MPI_Allreduce(). I think it is
a waste, as the caller usually knows M, N already. So I think we can depart
from it and have a new one:

MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, PetscInt M, PetscInt
N, Mat A, Mat B, const PetscInt garray[], Mat *mat)
* M, N are global row/col size of  mat
* A, B are MATSEQAIJKOKKOS
* M, N could be PETSC_DECIDE, if so, petsc will compute mat's M, N from A,
i.e.,  M = Sum of A's M,  N= Sum of A's N
* if garray is NULL, B uses global column indices (and B's N  should be
equal to the output mat's N)
* if garray is not NULL, B uses local column indices; garray[] was
allocated by PetscMalloc() and after the call,  garray will be owned by mat
(user should not free garray afterwards).

What do you think? If you agree, could you contribute an MR?

BTW, I think we need to create a new header, petscmat_kokkos.hpp to declare
  PetscErrorCode MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm,
KokkosCsrMatrix csr, Mat *A)
but
PetscErrorCode MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm,
PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat)
can be in petscmat.h as it uses only C types.

Barry, what do you think of the two new APIs?

--Junchao Zhang


On Wed, Feb 26, 2025 at 6:26 AM Steven Dargaville <
dargaville.steven at gmail.com> wrote:

> Those two constructors would definitely meet my needs, thanks!
>
> Also I should note that the comment about garray and B in
> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices is correct if garray is
> passed in as NULL, it's just that if you pass in a completed garray it
> doesn't bother creating one or changing the column indices of B. So I would
> suggest the comment be: "if garray is NULL the offdiag matrix B should
> have global col ids; if garray is not NULL the offdiag matrix B should have
> local col ids"
>
> On Wed, 26 Feb 2025 at 03:35, Junchao Zhang <junchao.zhang at gmail.com>
> wrote:
>
>> Mat_SeqAIJKokkos is private because it is in a private header
>> src/mat/impls/aij/seq/kokkos/aijkok.hpp
>>
>> Your observation about the garray in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices()
>> might be right.  The comment
>>
>> - B - the offdiag matrix using global col ids
>>
>> is out of date. Perhaps it should be "the offdiag matrix uses local
>> column indices and garray contains the local to global mapping".  But I
>> need to double check it.
>>
>> Since you use Kokkos, I think we could provide these two constructors for
>> MATSEQAIJKOKKOS and MATMPIAIJKOKKOS respectively
>>
>>    - MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm,
>>    KokkosCsrMatrix csr, Mat *A)
>>
>>
>>    - MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, Mat A, Mat B,
>>    PetscInt *garray, Mat *mat)
>>
>>          // To mimic the existing MatCreateMPIAIJWithSeqAIJ(MPI_Comm
>> comm, Mat A, Mat B, const PetscInt garray[], Mat *mat);
>>          // A and B are MATSEQAIJKOKKOS matrices and use local column
>> indices
>>
>> Do they meet your needs?
>>
>> --Junchao Zhang
>>
>>
>> On Tue, Feb 25, 2025 at 5:35 PM Steven Dargaville <
>> dargaville.steven at gmail.com> wrote:
>>
>>> Thanks for the response!
>>>
>>> Although MatSetValuesCOO happens on the device if the input coo_v
>>> pointer is device memory, I believe MatSetPreallocationCOO requires host
>>> pointers for coo_i and coo_j, and the preallocation (and construction of
>>> the COO structures) happens on the host and is then copied onto the device?
>>> I need to be able to create a matrix object with minimal work on the host
>>> (like many of the routines in aijkok.kokkos.cxx do internally). I
>>> originally used the COO interface to build the matrices I need, but that
>>> was around 5x slower than constructing the aij structures myself on the
>>> device and then just directly using the MatSetSeqAIJKokkosWithCSRMatrix
>>> type methods.
>>>
>>> The reason I thought MatSetSeqAIJKokkosWithCSRMatrix could easily be
>>> made public is that the Mat_SeqAIJKokkos constructors are already publicly
>>> accessible? In particular one of those constructors takes in pointers to
>>> the Kokkos dual views which store a,i,j, and hence one can build a
>>> sequential matrix with nothing (or very little) occuring on the host. The
>>> only change I can see that would be necessary is for
>>> MatSetSeqAIJKokkosWithCSRMatrix (or MatCreateSeqAIJKokkosWithCSRMatrix) to
>>> be public is to change the PETSC_INTERN to PETSC_EXTERN?
>>>
>>> For MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, I believe all that
>>> is required is declaring the method in the .hpp, as it's already defined as
>>> static in mpiaijkok.kokkos.cxx. In particular, the comments
>>> above MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices suggest that the
>>> off-diagonal block B needs to be built with global column ids, with
>>> mpiaij->garray constructed on the host along with the rewriting of the
>>> global column indices in B. This happens in MatSetUpMultiply_MPIAIJ, but
>>> checking the code there shows that if you pass in a non-null garray to
>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, the construction and
>>> compatification is skipped, meaning B can be built with local column ids as
>>> long as garray is provided on the host (which I also build on the device
>>> and then just copy to the host). Again this is what some of the internal
>>> Kokkos routines rely on, like the matrix-product.
>>>
>>> I am happy to try doing this and submitting a request to the petsc
>>> gitlab if this seems sensible, I just wanted to double check that I wasn't
>>> missing something important?
>>> Thanks
>>> Steven
>>>
>>> On Tue, 25 Feb 2025 at 22:16, Junchao Zhang <junchao.zhang at gmail.com>
>>> wrote:
>>>
>>>> Hi, Steven,
>>>>   MatSetSeqAIJKokkosWithCSRMatrix(Mat A, Mat_SeqAIJKokkos *akok) uses
>>>> a private data type Mat_SeqAIJKokkos, so it can not be directly made
>>>> public.
>>>>   If you already use COO, then why not directly make the matrix of type
>>>> MATAIJKOKKOS and call MatSetPreallocationCOO() and MatSetValuesCOO()?
>>>>   So I am confused by your needs.
>>>>
>>>> Thanks!
>>>> --Junchao Zhang
>>>>
>>>>
>>>> On Tue, Feb 25, 2025 at 3:39 PM Steven Dargaville <
>>>> dargaville.steven at gmail.com> wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> I'm just wondering if there is any possibility of making:
>>>>> MatSetSeqAIJKokkosWithCSRMatrix or MatCreateSeqAIJKokkosWithCSRMatrix
>>>>> in src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx
>>>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in
>>>>> src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx
>>>>>
>>>>> publicly accessible outside of petsc, or if there is an interface I
>>>>> have missed for creating Kokkos matrices entirely on the device?
>>>>> MatCreateSeqAIJKokkosWithCSRMatrix for example is marked as PETSC_INTERN so
>>>>> I can't link to it.
>>>>>
>>>>> I've currently just copied the code inside of those methods so that I
>>>>> can build without any preallocation on the host (e.g., through the COO
>>>>> interface) and it works really well.
>>>>>
>>>>> Thanks for your help
>>>>> Steven
>>>>>
>>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20250226/000a681d/attachment.html>


More information about the petsc-users mailing list