[petsc-users] building kokkos matrices on the device

Barry Smith bsmith at petsc.dev
Wed Feb 26 11:02:16 CST 2025


    The new function doesn't seem to have anything to do with Kokkos, so why add any new functions? Just have MatCreateMPIAIJWithSeqAIJ() work properly when the two matrices are Kokkos (or CUDA or HIP). Or, if you want to eliminate the global reduction, maybe make your new function MatCreateMPIWithSeq() and have it work for any type of submatrix; eventually we could deprecate MatCreateMPIAIJWithSeqAIJ().
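
   For illustration, a minimal sketch of the call being suggested here, assuming the diagonal block A and the off-diagonal block B have already been assembled on the device as MATSEQAIJKOKKOS and garray is the column map expected by the existing routine (whose signature is quoted later in the thread):

      Mat       A, B, C;   /* A: diagonal block, B: off-diagonal block, both MATSEQAIJKOKKOS */
      PetscInt *garray;    /* column map for B, built by the caller */
      /* ... assemble A and B on the device and fill garray ... */
      PetscCall(MatCreateMPIAIJWithSeqAIJ(PETSC_COMM_WORLD, A, B, garray, &C));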

   Barry




> On Feb 26, 2025, at 11:54 AM, Steven Dargaville <dargaville.steven at gmail.com> wrote:
> 
> I think that sounds great; I'm happy to put together an MR (likely next week) for review. 
> 
> On Wed, 26 Feb 2025 at 16:11, Junchao Zhang <junchao.zhang at gmail.com> wrote:
>> This function MatCreateMPIAIJWithSeqAIJ(MPI_Comm comm, Mat A, Mat B, const PetscInt garray[], Mat *mat) is rarely used. To compute the global matrix's row/col sizes M, N, it has to do an MPI_Allreduce(). I think that is a waste, as the caller usually knows M, N already. So I think we can depart from it and have a new one:
>> 
>> MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat) 
>> * M, N are the global row/col sizes of mat
>> * A, B are MATSEQAIJKOKKOS matrices
>> * M, N can be PETSC_DECIDE; if so, PETSc will compute mat's M, N from A, i.e., M = sum of A's M, N = sum of A's N
>> * if garray is NULL, B uses global column indices (and B's N should be equal to the output mat's N)
>> * if garray is not NULL, B uses local column indices; garray[] must have been allocated with PetscMalloc(), and after the call garray is owned by mat (the user should not free garray afterwards).
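>> 
>> For illustration, a minimal usage sketch of this proposed (not yet existing) constructor, assuming the diagonal block A and off-diagonal block B have already been built on the device as MATSEQAIJKOKKOS, with B holding local column indices:
>> 
>>   Mat       A, B, C;                     /* A: diag block, B: offdiag block (local col indices) */
>>   PetscInt *garray, nB;
>>   PetscCall(MatGetSize(B, NULL, &nB));   /* B's column count = length of garray */
>>   PetscCall(PetscMalloc1(nB, &garray));
>>   /* ... fill garray[0..nB-1] with the global column id of each local column of B ... */
>>   /* proposed call; ownership of garray passes to the output matrix C */
>>   PetscCall(MatCreateMPIAIJKokkosWithSeqAIJKokkos(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, A, B, garray, &C));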
>>  
>> What do you think? If you agree, could you contribute an MR?
>> 
>> BTW, I think we need to create a new header, petscmat_kokkos.hpp, to declare
>>   PetscErrorCode MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, KokkosCsrMatrix csr, Mat *A)
>> but 
>>   PetscErrorCode MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat) 
>> can be in petscmat.h as it uses only C types.
>> 
>> Barry, what do you think of the two new APIs?
>> 
>> --Junchao Zhang
>> 
>> 
>> On Wed, Feb 26, 2025 at 6:26 AM Steven Dargaville <dargaville.steven at gmail.com> wrote:
>>> Those two constructors would definitely meet my needs, thanks!
>>> 
>>> Also, I should note that the comment about garray and B in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices is correct if garray is passed in as NULL; it's just that if you pass in a completed garray, it doesn't bother creating one or changing the column indices of B. So I would suggest the comment be: "if garray is NULL the offdiag matrix B should have global col ids; if garray is not NULL the offdiag matrix B should have local col ids"
>>> 
>>> On Wed, 26 Feb 2025 at 03:35, Junchao Zhang <junchao.zhang at gmail.com> wrote:
>>>> Mat_SeqAIJKokkos is private because it is in a private header, src/mat/impls/aij/seq/kokkos/aijkok.hpp.
>>>> 
>>>> Your observation about the garray in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices() might be right.  The comment
>>>> 
>>>> -  B     - the offdiag matrix using global col ids
>>>> 
>>>> is out of date. Perhaps it should be "the offdiag matrix uses local column indices and garray contains the local-to-global mapping". But I need to double-check it.
>>>> 
>>>> Since you use Kokkos, I think we could provide these two constructors for MATSEQAIJKOKKOS and MATMPIAIJKOKKOS, respectively:
>>>> MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, KokkosCsrMatrix csr, Mat *A)
>>>> MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, Mat A, Mat B, PetscInt *garray, Mat *mat) 
>>>>          // To mimic the existing MatCreateMPIAIJWithSeqAIJ(MPI_Comm comm, Mat A, Mat B, const PetscInt garray[], Mat *mat);
>>>>          // A and B are MATSEQAIJKOKKOS matrices and use local column indices
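>>>> 
>>>> For illustration, a sketch of how the first proposed constructor might be used, assuming KokkosCsrMatrix is PETSc's alias for a KokkosSparse::CrsMatrix over PetscScalar/PetscInt and that m, n, nnz are local sizes already known to the caller:
>>>> 
>>>>   Kokkos::View<PetscInt *>    i_d("i", m + 1);   // CSR row offsets, device memory
>>>>   Kokkos::View<PetscInt *>    j_d("j", nnz);     // CSR column indices, device memory
>>>>   Kokkos::View<PetscScalar *> a_d("a", nnz);     // CSR values, device memory
>>>>   // ... fill i_d, j_d, a_d with device kernels ...
>>>>   KokkosCsrMatrix csr("csr", m, n, nnz, a_d, i_d, j_d);  // CrsMatrix(label, nrows, ncols, nnz, vals, rowmap, colids)
>>>>   Mat Aseq;
>>>>   PetscCall(MatCreateSeqAIJKokkosWithKokkosCsrMatrix(PETSC_COMM_SELF, csr, &Aseq));  // proposed API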
>>>> 
>>>> Do they meet your needs?
>>>> 
>>>> --Junchao Zhang
>>>> 
>>>> 
>>>> On Tue, Feb 25, 2025 at 5:35 PM Steven Dargaville <dargaville.steven at gmail.com> wrote:
>>>>> Thanks for the response!
>>>>> 
>>>>> Although MatSetValuesCOO happens on the device if the input coo_v pointer is device memory, I believe MatSetPreallocationCOO requires host pointers for coo_i and coo_j, and the preallocation (and construction of the COO structures) happens on the host and is then copied onto the device? I need to be able to create a matrix object with minimal work on the host (like many of the routines in aijkok.kokkos.cxx do internally). I originally used the COO interface to build the matrices I need, but that was around 5x slower than constructing the aij structures myself on the device and then directly using the MatSetSeqAIJKokkosWithCSRMatrix-type methods.
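>>>>> 
>>>>> (For comparison, the COO path referred to above is roughly the following, where m, n, ncoo, coo_i, coo_j are host-side quantities assumed known and coo_v may point to device memory:)
>>>>> 
>>>>>   Mat A;
>>>>>   PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
>>>>>   PetscCall(MatSetSizes(A, m, n, PETSC_DECIDE, PETSC_DECIDE));
>>>>>   PetscCall(MatSetType(A, MATAIJKOKKOS));
>>>>>   PetscCall(MatSetPreallocationCOO(A, ncoo, coo_i, coo_j));  /* index analysis/preallocation, on the host per the observation above */
>>>>>   PetscCall(MatSetValuesCOO(A, coo_v, INSERT_VALUES));       /* values can come from device memory */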
>>>>> 
>>>>> The reason I thought MatSetSeqAIJKokkosWithCSRMatrix could easily be made public is that the Mat_SeqAIJKokkos constructors are already publicly accessible? In particular, one of those constructors takes in pointers to the Kokkos dual views which store a, i, j, and hence one can build a sequential matrix with nothing (or very little) occurring on the host. The only change I can see that would be necessary for MatSetSeqAIJKokkosWithCSRMatrix (or MatCreateSeqAIJKokkosWithCSRMatrix) to be public is to change the PETSC_INTERN to PETSC_EXTERN?
>>>>> 
>>>>> For MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, I believe all that is required is declaring the method in the .hpp, as it's already defined as static in mpiaijkok.kokkos.cxx. In particular, the comments above MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices suggest that the off-diagonal block B needs to be built with global column ids, with mpiaij->garray constructed on the host along with the rewriting of the global column indices in B. This happens in MatSetUpMultiply_MPIAIJ, but checking the code there shows that if you pass in a non-null garray to MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, the construction and compactification is skipped, meaning B can be built with local column ids as long as garray is provided on the host (which I also build on the device and then just copy to the host). Again, this is what some of the internal Kokkos routines, like the matrix product, rely on.
>>>>> 
>>>>> I am happy to try doing this and submitting a request to the PETSc GitLab if this seems sensible; I just wanted to double-check that I wasn't missing something important.
>>>>> Thanks
>>>>> Steven
>>>>> 
>>>>> On Tue, 25 Feb 2025 at 22:16, Junchao Zhang <junchao.zhang at gmail.com> wrote:
>>>>>> Hi, Steven,
>>>>>>   MatSetSeqAIJKokkosWithCSRMatrix(Mat A, Mat_SeqAIJKokkos *akok) uses a private data type, Mat_SeqAIJKokkos, so it cannot be directly made public. 
>>>>>>   If you already use COO, then why not directly make the matrix of type MATAIJKOKKOS and call MatSetPreallocationCOO() and MatSetValuesCOO()?
>>>>>>   So I am confused by your needs. 
>>>>>> 
>>>>>> Thanks!
>>>>>> --Junchao Zhang
>>>>>> 
>>>>>> 
>>>>>> On Tue, Feb 25, 2025 at 3:39 PM Steven Dargaville <dargaville.steven at gmail.com> wrote:
>>>>>>> Hi
>>>>>>> 
>>>>>>> I'm just wondering if there is any possibility of making:
>>>>>>> MatSetSeqAIJKokkosWithCSRMatrix or MatCreateSeqAIJKokkosWithCSRMatrix in src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx 
>>>>>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx
>>>>>>> 
>>>>>>> publicly accessible outside of PETSc, or if there is an interface I have missed for creating Kokkos matrices entirely on the device? MatCreateSeqAIJKokkosWithCSRMatrix, for example, is marked as PETSC_INTERN, so I can't link to it.
>>>>>>> 
>>>>>>> I've currently just copied the code inside those methods so that I can build without any preallocation on the host (e.g., through the COO interface), and it works really well.
>>>>>>> 
>>>>>>> Thanks for your help
>>>>>>> Steven
