[petsc-users] kokkos, matcopy, matdiagonalscale on gpu

Junchao Zhang junchao.zhang at gmail.com
Tue Apr 8 00:04:30 CDT 2025


Hi, Steven
  Thank you for the bug report and test example.  You were right. The
MatCopy(A, B,..) implementation was wrong when B was on the device.
  I have a fix at https://gitlab.com/petsc/petsc/-/merge_requests/8288, and
will add your test to the MR tomorrow.

  Thanks!
--Junchao Zhang


On Mon, Apr 7, 2025 at 4:57 PM Steven Dargaville <
dargaville.steven at gmail.com> wrote:

> Hi
>
> I have some code that computes a MatMatMult, then modifies one of the
> matrices and calls MatMatMult again with MAT_REUSE_MATRIX. This gives
> the correct results with mat_type aij or aijkokkos, run on either a CPU
> or a GPU.
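>
> For reference, the reuse pattern being described is roughly the
> following sketch (A and B stand in for already-assembled matrices and
> C for the product; the fill argument is just a placeholder):
>
>   Mat C;
>   PetscCall(MatMatMult(A, B, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &C));
>   /* ... modify the values of B and reassemble it ... */
>   PetscCall(MatMatMult(A, B, MAT_REUSE_MATRIX, PETSC_DEFAULT, &C));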
>
> In some cases one of the matrices is diagonal, so I have been modifying
> my code to call MatDiagonalScale instead, and have seen some peculiar
> behaviour. With mat type aij everything works correctly on both the CPU
> and the GPU. With mat type aijkokkos everything works correctly on the
> CPU, but when run on a machine with a GPU the results differ.
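>
> That is, instead of forming the product A * D with MatMatMult, the
> diagonal is applied directly, roughly as in this sketch (d is assumed
> to hold the diagonal entries of the right-hand matrix D):
>
>   Mat result;
>   PetscCall(MatDuplicate(A, MAT_COPY_VALUES, &result));
>   /* Scale the columns of result by d, i.e. result = A * diag(d) */
>   PetscCall(MatDiagonalScale(result, NULL, d));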
>
> The example code below shows this: it prints out four matrices, where
> the first two should match and the last two should match. To see the
> failure, run with "-mat_type aijkokkos" on a machine with a GPU (again,
> this gives the correct results with aijkokkos run on a CPU). I get the
> output:
>
> Mat Object: 1 MPI process
>   type: seqaijkokkos
> row 0: (0, 4.)  (1, 6.)
> row 1: (0, 10.)  (1, 14.)
> Mat Object: 1 MPI process
>   type: seqaijkokkos
> row 0: (0, 4.)  (1, 6.)
> row 1: (0, 10.)  (1, 14.)
> Mat Object: 1 MPI process
>   type: seqaijkokkos
> row 0: (0, 6.)  (1, 9.)
> row 1: (0, 15.)  (1, 21.)
> Mat Object: 1 MPI process
>   type: seqaijkokkos
> row 0: (0, 12.)  (1, 18.)
> row 1: (0, 30.)  (1, 42.)
>
> I have narrowed this failure down to a MatCopy call, where the values of
> result_diag should be overwritten with those of A before calling
> MatDiagonalScale. The results with aijkokkos on a GPU suggest the values
> of result_diag are not being changed. If, instead of using MatCopy, I
> destroy result_diag and call MatDuplicate, the results are correct. You
> can trigger the correct behaviour with "-mat_type aijkokkos -correct".
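>
> In sketch form, the failing and working paths below are:
>
>   /* Fails on the GPU with aijkokkos: result_diag keeps its old values */
>   PetscCall(MatCopy(A, result_diag, SAME_NONZERO_PATTERN));
>
>   /* Works: rebuild result_diag from A instead */
>   PetscCall(MatDestroy(&result_diag));
>   PetscCall(MatDuplicate(A, MAT_COPY_VALUES, &result_diag));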
>
> To me this looks like the values are out of sync between the host and
> the device, but I would have thought a call to MatCopy would be safe.
> Any thoughts on what I might be doing wrong?
>
> Thanks for your help!
> Steven
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> static char help[] = "Test matmat products with matdiagonal on gpus \n\n";
>
> #include <iostream>
> #include <petscmat.h>
>
>
> int main(int argc, char **args)
> {
>   const PetscInt    inds[]  = {0, 1};
>   PetscScalar       avals[] = {2, 3, 5, 7};
>   Mat               A, B_diag, B_aij_diag, result, result_diag;
>   Vec               diag;
>
>   PetscCall(PetscInitialize(&argc, &args, NULL, help));
>
>   // Create matrix to start
>   PetscCall(MatCreateFromOptions(PETSC_COMM_WORLD, NULL, 1, 2, 2, 2, 2, &A));
>   PetscCall(MatSetUp(A));
>   PetscCall(MatSetValues(A, 2, inds, 2, inds, avals, INSERT_VALUES));
>   PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
>   PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));
>
>   // Create a matdiagonal matrix
>   // Will be the matching vec type as A
>   PetscCall(MatCreateVecs(A, &diag, NULL));
>   PetscCall(VecSet(diag, 2.0));
>   PetscCall(MatCreateDiagonal(diag, &B_diag));
>
>   // Create the same matrix as the matdiagonal but in aij format
>   PetscCall(MatCreateFromOptions(PETSC_COMM_WORLD, NULL, 1, 2, 2, 2, 2, &B_aij_diag));
>   PetscCall(MatSetUp(B_aij_diag));
>   PetscCall(MatDiagonalSet(B_aij_diag, diag, INSERT_VALUES));
>   PetscCall(MatAssemblyBegin(B_aij_diag, MAT_FINAL_ASSEMBLY));
>   PetscCall(MatAssemblyEnd(B_aij_diag, MAT_FINAL_ASSEMBLY));
>   PetscCall(VecDestroy(&diag));
>
>   // ~~~~~~~~~~~~~
>   // Do an initial matmatmult
>   // A * B_aij_diag
>   // and then
>   // A * B_diag but just using MatDiagonalScale
>   // ~~~~~~~~~~~~~
>
>   // aij * aij
>   PetscCall(MatMatMult(A, B_aij_diag, MAT_INITIAL_MATRIX, 1.5, &result));
>   PetscCall(MatView(result, PETSC_VIEWER_STDOUT_WORLD));
>
>   // aij * diagonal
>   PetscCall(MatDuplicate(A, MAT_COPY_VALUES, &result_diag));
>   PetscCall(MatDiagonalGetDiagonal(B_diag, &diag));
>   PetscCall(MatDiagonalScale(result_diag, NULL, diag));
>   PetscCall(MatDiagonalRestoreDiagonal(B_diag, &diag));
>   PetscCall(MatView(result_diag, PETSC_VIEWER_STDOUT_WORLD));
>
>   // ~~~~~~~~~~~~~
>   // Now let's modify the diagonal and do it again with "reuse"
>   // ~~~~~~~~~~~~~
>   PetscCall(MatDiagonalGetDiagonal(B_diag, &diag));
>   PetscCall(VecSet(diag, 3.0));
>   PetscCall(MatDiagonalSet(B_aij_diag, diag, INSERT_VALUES));
>   PetscCall(MatDiagonalRestoreDiagonal(B_diag, &diag));
>
>   // aij * aij
>   PetscCall(MatMatMult(A, B_aij_diag, MAT_REUSE_MATRIX, 1.5, &result));
>   PetscCall(MatView(result, PETSC_VIEWER_STDOUT_WORLD));
>
>   PetscBool correct = PETSC_FALSE;
>   PetscCall(PetscOptionsGetBool(NULL, NULL, "-correct", &correct, NULL));
>
>   if (!correct)
>   {
>       // This gives the wrong results below when run on gpu
>       // Results suggest this isn't copied
>       PetscCall(MatCopy(A, result_diag, SAME_NONZERO_PATTERN));
>   }
>   else
>   {
>       // This gives the correct results below
>       PetscCall(MatDestroy(&result_diag));
>       PetscCall(MatDuplicate(A, MAT_COPY_VALUES, &result_diag));
>   }
>
>   // aij * diagonal
>   PetscCall(MatDiagonalGetDiagonal(B_diag, &diag));
>   PetscCall(MatDiagonalScale(result_diag, NULL, diag));
>   PetscCall(MatDiagonalRestoreDiagonal(B_diag, &diag));
>   PetscCall(MatView(result_diag, PETSC_VIEWER_STDOUT_WORLD));
>
>   PetscCall(PetscFinalize());
>   return 0;
> }
>