[petsc-users] Kokkos, MatCopy, MatDiagonalScale on GPU
Steven Dargaville
dargaville.steven at gmail.com
Tue Apr 8 13:26:47 CDT 2025
Fantastic, that seems to have fixed it. Thanks for your help!
On Tue, 8 Apr 2025 at 06:04, Junchao Zhang <junchao.zhang at gmail.com> wrote:
> Hi, Steven
> Thank you for the bug report and test example. You were right. The
> MatCopy(A, B, ...) implementation was wrong when B was on the device.
> I have a fix at https://gitlab.com/petsc/petsc/-/merge_requests/8288,
> and will add your test to the MR tomorrow.
>
> Thanks!
> --Junchao Zhang
>
>
> On Mon, Apr 7, 2025 at 4:57 PM Steven Dargaville <
> dargaville.steven at gmail.com> wrote:
>
>> Hi
>>
>> I have some code that computes a MatMatMult, then modifies one of the
>> matrices and calls MatMatMult again with MAT_REUSE_MATRIX. This gives
>> the correct results with mat_type aij or aijkokkos, whether run on a
>> CPU or a GPU.
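>>
>> Roughly, the reuse pattern looks like this (a simplified sketch, not my
>> exact code; A, B and C here are just illustrative names):
>>
>> Mat A, B, C;
>> // ... assemble A and B ...
>> PetscCall(MatMatMult(A, B, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &C));
>> // ... change the values of B, keeping its nonzero pattern ...
>> PetscCall(MatMatMult(A, B, MAT_REUSE_MATRIX, PETSC_DEFAULT, &C));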
>>
>> In some cases one of the matrices is diagonal, so I have been modifying
>> my code to call MatDiagonalScale instead and have seen some peculiar
>> behaviour. If the mat type is aij, everything works correctly on both the
>> CPU and the GPU. If the mat type is aijkokkos, everything works correctly
>> on the CPU, but when run on a machine with a GPU the results differ.
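>>
>> That is, instead of multiplying by an explicit diagonal matrix, the reuse
>> path now does something like this (a simplified sketch; C and d are
>> illustrative names for the product matrix and the diagonal Vec):
>>
>> // C = A * diag(d): refresh C from A, then scale its columns by d
>> PetscCall(MatCopy(A, C, SAME_NONZERO_PATTERN));
>> PetscCall(MatDiagonalScale(C, NULL, d));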
>>
>> I have some example code below that shows this; it prints out four
>> matrices, where the first two should match and the last two should match.
>> To see the failure, run with "-mat_type aijkokkos" on a machine with a GPU
>> (again, this gives the correct results with aijkokkos run on a CPU). I get
>> the output:
>>
>> Mat Object: 1 MPI process
>> type: seqaijkokkos
>> row 0: (0, 4.) (1, 6.)
>> row 1: (0, 10.) (1, 14.)
>> Mat Object: 1 MPI process
>> type: seqaijkokkos
>> row 0: (0, 4.) (1, 6.)
>> row 1: (0, 10.) (1, 14.)
>> Mat Object: 1 MPI process
>> type: seqaijkokkos
>> row 0: (0, 6.) (1, 9.)
>> row 1: (0, 15.) (1, 21.)
>> Mat Object: 1 MPI process
>> type: seqaijkokkos
>> row 0: (0, 12.) (1, 18.)
>> row 1: (0, 30.) (1, 42.)
>>
>> I have narrowed this failure down to a MatCopy call, where the values of
>> result_diag should be overwritten with A before calling MatDiagonalScale.
>> The results with aijkokkos on a GPU suggest the values of result_diag are
>> not being changed: the last matrix printed is exactly the previous
>> result_diag scaled again by 3, rather than A scaled by 3. If instead of
>> using the MatCopy I destroy result_diag and call MatDuplicate, the results
>> are correct. You can trigger the correct behaviour with
>> "-mat_type aijkokkos -correct", which uses the workaround sketched below.
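>>
>> In other words, the workaround is roughly (a sketch of the "-correct"
>> branch in the code below):
>>
>> PetscCall(MatDestroy(&result_diag));
>> PetscCall(MatDuplicate(A, MAT_COPY_VALUES, &result_diag));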
>>
>> To me this looks like the values are out of sync between the device/host,
>> but I would have thought a call to MatCopy would be safe. Any thoughts on
>> what I might be doing wrong?
>>
>> Thanks for your help!
>> Steven
>>
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> static char help[] = "Test matmat products with matdiagonal on gpus \n\n";
>>
>> #include <iostream>
>> #include <petscmat.h>
>>
>>
>> int main(int argc, char **args)
>> {
>>   const PetscInt inds[]  = {0, 1};
>>   PetscScalar    avals[] = {2, 3, 5, 7};
>>   Mat            A, B_diag, B_aij_diag, result, result_diag;
>>   Vec            diag;
>>
>>   PetscCall(PetscInitialize(&argc, &args, NULL, help));
>>
>>   // Create matrix to start
>>   PetscCall(MatCreateFromOptions(PETSC_COMM_WORLD, NULL, 1, 2, 2, 2, 2, &A));
>>   PetscCall(MatSetUp(A));
>>   PetscCall(MatSetValues(A, 2, inds, 2, inds, avals, INSERT_VALUES));
>>   PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
>>   PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));
>>
>>   // Create a matdiagonal matrix
>>   // Will be the matching vec type as A
>>   PetscCall(MatCreateVecs(A, &diag, NULL));
>>   PetscCall(VecSet(diag, 2.0));
>>   PetscCall(MatCreateDiagonal(diag, &B_diag));
>>
>>   // Create the same matrix as the matdiagonal but in aij format
>>   PetscCall(MatCreateFromOptions(PETSC_COMM_WORLD, NULL, 1, 2, 2, 2, 2, &B_aij_diag));
>>   PetscCall(MatSetUp(B_aij_diag));
>>   PetscCall(MatDiagonalSet(B_aij_diag, diag, INSERT_VALUES));
>>   PetscCall(MatAssemblyBegin(B_aij_diag, MAT_FINAL_ASSEMBLY));
>>   PetscCall(MatAssemblyEnd(B_aij_diag, MAT_FINAL_ASSEMBLY));
>>   PetscCall(VecDestroy(&diag));
>>
>>   // ~~~~~~~~~~~~~
>>   // Do an initial matmatmult
>>   //   A * B_aij_diag
>>   // and then
>>   //   A * B_diag but just using MatDiagonalScale
>>   // ~~~~~~~~~~~~~
>>
>>   // aij * aij
>>   PetscCall(MatMatMult(A, B_aij_diag, MAT_INITIAL_MATRIX, 1.5, &result));
>>   PetscCall(MatView(result, PETSC_VIEWER_STDOUT_WORLD));
>>
>>   // aij * diagonal
>>   PetscCall(MatDuplicate(A, MAT_COPY_VALUES, &result_diag));
>>   PetscCall(MatDiagonalGetDiagonal(B_diag, &diag));
>>   PetscCall(MatDiagonalScale(result_diag, NULL, diag));
>>   PetscCall(MatDiagonalRestoreDiagonal(B_diag, &diag));
>>   PetscCall(MatView(result_diag, PETSC_VIEWER_STDOUT_WORLD));
>>
>>   // ~~~~~~~~~~~~~
>>   // Now let's modify the diagonal and do it again with "reuse"
>>   // ~~~~~~~~~~~~~
>>   PetscCall(MatDiagonalGetDiagonal(B_diag, &diag));
>>   PetscCall(VecSet(diag, 3.0));
>>   PetscCall(MatDiagonalSet(B_aij_diag, diag, INSERT_VALUES));
>>   PetscCall(MatDiagonalRestoreDiagonal(B_diag, &diag));
>>
>>   // aij * aij
>>   PetscCall(MatMatMult(A, B_aij_diag, MAT_REUSE_MATRIX, 1.5, &result));
>>   PetscCall(MatView(result, PETSC_VIEWER_STDOUT_WORLD));
>>
>>   PetscBool correct = PETSC_FALSE;
>>   PetscCall(PetscOptionsGetBool(NULL, NULL, "-correct", &correct, NULL));
>>
>>   if (!correct)
>>   {
>>     // This gives the wrong results below when run on gpu
>>     // Results suggest this isn't copied
>>     PetscCall(MatCopy(A, result_diag, SAME_NONZERO_PATTERN));
>>   }
>>   else
>>   {
>>     // This gives the correct results below
>>     PetscCall(MatDestroy(&result_diag));
>>     PetscCall(MatDuplicate(A, MAT_COPY_VALUES, &result_diag));
>>   }
>>
>>   // aij * diagonal
>>   PetscCall(MatDiagonalGetDiagonal(B_diag, &diag));
>>   PetscCall(MatDiagonalScale(result_diag, NULL, diag));
>>   PetscCall(MatDiagonalRestoreDiagonal(B_diag, &diag));
>>   PetscCall(MatView(result_diag, PETSC_VIEWER_STDOUT_WORLD));
>>
>>   PetscCall(PetscFinalize());
>>   return 0;
>> }
>>
>