[petsc-users] Copying PETSc Objects Across MPI Communicators

Damyn Chipman damynchipman at u.boisestate.edu
Wed Oct 25 15:38:44 CDT 2023


More like smaller pieces that need to be combined. Combining (merging) them means sharing the actual data across a sibling communicator and doing some linear algebra to compute the merged matrices; specifically, it involves computing a Schur complement of a combined system built from the sibling matrices.
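
To make this concrete, here is roughly the shape of the merge step I have in mind, written with PETSc's Schur complement helpers. This is just a sketch: the function name and the block labels (A00, A01, A10, A11) are placeholders for the combined sibling system, not my actual code.

```
#include <petscksp.h>

// Sketch of the merge step: form S = A11 - A10 * inv(A00) * A01 for a combined
// sibling system. The block names and the function itself are placeholders.
PetscErrorCode MergeViaSchurComplement(Mat A00, Mat A01, Mat A10, Mat A11, Mat *S_explicit)
{
  Mat S;
  KSP ksp;
  PC  pc;

  PetscFunctionBeginUser;
  // Matrix-free Schur complement; inv(A00) is applied through an inner KSP
  PetscCall(MatCreateSchurComplement(A00, A00, A01, A10, A11, &S));
  // Use a direct solve for inv(A00) since the blocks are dense
  PetscCall(MatSchurComplementGetKSP(S, &ksp));
  PetscCall(KSPSetType(ksp, KSPPREONLY));
  PetscCall(KSPGetPC(ksp, &pc));
  PetscCall(PCSetType(pc, PCLU));
  // Form S explicitly as a dense matrix for the next level of the merge
  PetscCall(MatComputeOperator(S, MATDENSE, S_explicit));
  PetscCall(MatDestroy(&S));
  PetscFunctionReturn(PETSC_SUCCESS);
}
```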

The solver is based on the Hierarchical Poincaré-Steklov (HPS) method, a direct method for solving elliptic PDEs. I had a conversation with Richard at this year’s ATPESC2023 about this idea.

For some more context, here’s the test routine I wrote based on the MatCreateSubMatrix idea. The actual implementation would be part of a recursive merge up a quadtree. Each node's communicator would be a sub-communicator of its parent, and so on. I want to spread the data and compute across any ranks that are involved in that node’s merging. The sizes involved start “small” at each leaf node (say, no more than 256x256), then are essentially doubled up the tree to the root node.
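
For the communicator layout itself, I am picturing something along these lines (just a sketch; the rule of grouping four consecutive ranks as siblings, and the helper name, are made up for illustration):

```
#include <mpi.h>

// Sketch: carve a sibling sub-communicator out of a parent communicator.
// Grouping four consecutive ranks as siblings is purely illustrative.
MPI_Comm make_sibling_comm(MPI_Comm parent_comm)
{
    int parent_rank;
    MPI_Comm sibling_comm;

    MPI_Comm_rank(parent_comm, &parent_rank);
    int color = parent_rank / 4; // which group of four siblings this rank belongs to
    MPI_Comm_split(parent_comm, color, parent_rank, &sibling_comm);

    // Leaf matrices would live on sibling_comm (or MPI_COMM_SELF at the leaves),
    // and the merged matrix would live on parent_comm.
    return sibling_comm;
}
```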

```
void TEST_petsc_matrix_comm() {

    // Get the rank and size on the global communicator (used throughout)
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Create local matrices on local communicator
    int M_local = 4;
    int N_local = 4;
    Mat mat_local;
    MatCreate(MPI_COMM_SELF, &mat_local); // Note the MPI_COMM_SELF as a substitute for a sub-communicator of MPI_COMM_WORLD
    MatSetSizes(mat_local, PETSC_DECIDE, PETSC_DECIDE, M_local, N_local);
    MatSetFromOptions(mat_local);

    // Set values in local matrix
    int* row_indices = (int*) malloc(M_local*sizeof(int));
    for (int i = 0; i < M_local; i++) {
        row_indices[i] = i;
    }

    int* col_indices = (int*) malloc(N_local*sizeof(int));
    for (int j = 0; j < N_local; j++) {
        col_indices[j] = j;
    }

    // Fill values in row-major order (the layout MatSetValues expects);
    // each rank uses a distinct, contiguous range of values
    double* values = (double*) malloc(M_local*N_local*sizeof(double));
    int v = M_local*N_local*rank;
    for (int i = 0; i < M_local; i++) {
        for (int j = 0; j < N_local; j++) {
            values[i*N_local + j] = (double) v;
            v++;
        }
    }
    MatSetValues(mat_local, M_local, row_indices, N_local, col_indices, values, INSERT_VALUES);
    MatAssemblyBegin(mat_local, MAT_FINAL_ASSEMBLY);
    MatAssemblyEnd(mat_local, MAT_FINAL_ASSEMBLY);

    // Place each local matrix on the global communicator via MatCreateSubMatrix
    Mat mat_global;
    IS is_row;
    int idx[4] = {0, 1, 2, 3}; // all local rows (M_local = 4)
    ISCreateGeneral(MPI_COMM_WORLD, M_local, idx, PETSC_COPY_VALUES, &is_row);
    MatCreateSubMatrix(mat_local, is_row, NULL, MAT_INITIAL_MATRIX, &mat_global);
    ISDestroy(&is_row);

    // View each local mat on global communicator (sleep for `rank` seconds so output is ordered)
    sleep(rank);
    MatView(mat_global, 0);

    // Create merged mat on global communicator
    //    For this test, I just put the four locally computed matrices on the diagonal of the merged matrix
    //    In the 4-to-1 merge, this would compute T_merged from T_alpha, T_beta, T_gamma, and T_omega (children)
    int M_merged = M_local*size;
    int N_merged = N_local*size;
    Mat mat_merged;
    MatCreate(MPI_COMM_WORLD, &mat_merged);
    MatSetSizes(mat_merged, PETSC_DECIDE, PETSC_DECIDE, M_merged, N_merged);
    MatSetFromOptions(mat_merged);

    // Get values of local matrix to put on diagonal
    double* values_diag = (double*) malloc(M_local*N_local*sizeof(double));
    MatGetValues(mat_global, M_local, row_indices, N_local, col_indices, values_diag);

    // Put local matrix contributions into merged matrix (placeholder for computing merged matrix)
    for (int i = 0; i < M_local; i++) {
        row_indices[i] = i + M_local*rank;
    }

    for (int j = 0; j < N_local; j++) {
        col_indices[j] = j + N_local*rank;
    }
    MatSetValues(mat_merged, M_local, row_indices, N_local, col_indices, values_diag, INSERT_VALUES);
    MatAssemblyBegin(mat_merged, MAT_FINAL_ASSEMBLY);
    MatAssemblyEnd(mat_merged, MAT_FINAL_ASSEMBLY);

    // View merged mat on global communicator
    sleep(rank);
    MatView(mat_merged, 0);

    // Clean up
    free(row_indices);
    free(col_indices);
    free(values);
    free(values_diag);
    MatDestroy(&mat_local);
    MatDestroy(&mat_global);
    MatDestroy(&mat_merged);

}
```

With the following output:

```
$ mpirun -n 4 ./mpi_matrix
Mat Object: 1 MPI process
  type: seqaij
row 0: (0, 0.)  (1, 1.)  (2, 2.)  (3, 3.) 
row 1: (0, 4.)  (1, 5.)  (2, 6.)  (3, 7.) 
row 2: (0, 8.)  (1, 9.)  (2, 10.)  (3, 11.) 
row 3: (0, 12.)  (1, 13.)  (2, 14.)  (3, 15.) 
Mat Object: 1 MPI process
  type: seqaij
row 0: (0, 16.)  (1, 17.)  (2, 18.)  (3, 19.) 
row 1: (0, 20.)  (1, 21.)  (2, 22.)  (3, 23.) 
row 2: (0, 24.)  (1, 25.)  (2, 26.)  (3, 27.) 
row 3: (0, 28.)  (1, 29.)  (2, 30.)  (3, 31.) 
Mat Object: 1 MPI process
  type: seqaij
row 0: (0, 32.)  (1, 33.)  (2, 34.)  (3, 35.) 
row 1: (0, 36.)  (1, 37.)  (2, 38.)  (3, 39.) 
row 2: (0, 40.)  (1, 41.)  (2, 42.)  (3, 43.) 
row 3: (0, 44.)  (1, 45.)  (2, 46.)  (3, 47.) 
Mat Object: 1 MPI process
  type: seqaij
row 0: (0, 48.)  (1, 49.)  (2, 50.)  (3, 51.) 
row 1: (0, 52.)  (1, 53.)  (2, 54.)  (3, 55.) 
row 2: (0, 56.)  (1, 57.)  (2, 58.)  (3, 59.) 
row 3: (0, 60.)  (1, 61.)  (2, 62.)  (3, 63.) 
Mat Object: 4 MPI processes
  type: mpiaij
row 0: (0, 0.)  (1, 1.)  (2, 2.)  (3, 3.) 
row 1: (0, 4.)  (1, 5.)  (2, 6.)  (3, 7.) 
row 2: (0, 8.)  (1, 9.)  (2, 10.)  (3, 11.) 
row 3: (0, 12.)  (1, 13.)  (2, 14.)  (3, 15.) 
row 4: (4, 16.)  (5, 17.)  (6, 18.)  (7, 19.) 
row 5: (4, 20.)  (5, 21.)  (6, 22.)  (7, 23.) 
row 6: (4, 24.)  (5, 25.)  (6, 26.)  (7, 27.) 
row 7: (4, 28.)  (5, 29.)  (6, 30.)  (7, 31.) 
row 8: (8, 32.)  (9, 33.)  (10, 34.)  (11, 35.) 
row 9: (8, 36.)  (9, 37.)  (10, 38.)  (11, 39.) 
row 10: (8, 40.)  (9, 41.)  (10, 42.)  (11, 43.) 
row 11: (8, 44.)  (9, 45.)  (10, 46.)  (11, 47.) 
row 12: (12, 48.)  (13, 49.)  (14, 50.)  (15, 51.) 
row 13: (12, 52.)  (13, 53.)  (14, 54.)  (15, 55.) 
row 14: (12, 56.)  (13, 57.)  (14, 58.)  (15, 59.) 
row 15: (12, 60.)  (13, 61.)  (14, 62.)  (15, 63.) 
```
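
Related to my earlier question about avoiding entry-by-entry copies: for the dense case I was picturing something like wrapping the local array as this rank's block of a parallel dense Mat. This is only a sketch under the assumption that the local matrix is MATSEQDENSE; the helper name and the row layout (each rank's block becoming its local rows) are made up for illustration.

```
#include <petscmat.h>

// Sketch: expose a local dense block as this rank's rows of a parallel dense Mat,
// without copying individual entries. Assumes mat_local is MATSEQDENSE with
// M_local rows and N_local columns; all names are illustrative.
PetscErrorCode WrapLocalDense(Mat mat_local, PetscInt M_local, PetscInt N_local, Mat *mat_parallel)
{
    PetscScalar *local_array;

    PetscFunctionBeginUser;
    PetscCall(MatDenseGetArray(mat_local, &local_array));
    // MatCreateDense with a user-provided array uses that array in place
    // (column-major, leading dimension = number of local rows), so mat_local,
    // which owns the storage, must outlive *mat_parallel.
    PetscCall(MatCreateDense(MPI_COMM_WORLD, M_local, PETSC_DECIDE, PETSC_DETERMINE, N_local,
                             local_array, mat_parallel));
    PetscCall(MatAssemblyBegin(*mat_parallel, MAT_FINAL_ASSEMBLY));
    PetscCall(MatAssemblyEnd(*mat_parallel, MAT_FINAL_ASSEMBLY));
    PetscCall(MatDenseRestoreArray(mat_local, &local_array));
    PetscFunctionReturn(PETSC_SUCCESS);
}
```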

-Damyn

> On Oct 25, 2023, at 1:47 PM, Barry Smith <bsmith at petsc.dev> wrote:
> 
> 
>  If the matrices are stored as dense it is likely new code is the best way to go. 
> 
>  What pieces live on the sub communicator? Is it an m by N matrix where m is the number of rows (on that rank) and N is the total number of columns in the final matrix? Or are they smaller "chunks" that need to be combined together?
> 
>  Barry
> 
> 
>> On Oct 25, 2023, at 3:38 PM, Damyn Chipman <damynchipman at u.boisestate.edu> wrote:
>> 
>> Great thanks, that seemed to work well. This is something my algorithm will do fairly often (“elevating” a node’s communicator to a communicator that includes siblings). The matrices formed are dense but low rank. With MatCreateSubMatrix, it appears I do a lot of copying from one Mat to another. Is there a way to do it with array copying or pointer movement instead of copying entries?
>> 
>> -Damyn
>> 
>>> On Oct 24, 2023, at 9:51 PM, Jed Brown <jed at jedbrown.org> wrote:
>>> 
>>> You can place it in a parallel Mat (that has rows or columns on only one rank or a subset of ranks) and then MatCreateSubMatrix with all new rows/columns on a different rank or subset of ranks.
>>> 
>>> That said, you usually have a function that assembles the matrix and you can just call that on the new communicator.
>>> 
>>> Damyn Chipman <damynchipman at u.boisestate.edu> writes:
>>> 
>>>> Hi PETSc developers,
>>>> 
>>>> In short, my question is this: Does PETSc provide a way to move or copy an object (say a Mat) from one communicator to another?
>>>> 
>>>> The more detailed scenario is this: I’m working on a linear algebra solver on quadtree meshes (i.e., p4est). I use communicator subsets in order to facilitate communication between siblings or nearby neighbors. When performing linear algebra across siblings (a group of 4), I need to copy a node’s data (i.e., a Mat object) from a sibling’s communicator to the communicator that includes the four siblings. From what I can tell, I can only copy a PETSc object onto the same communicator.
>>>> 
>>>> My current approach will be to copy the raw data from the Mat on one communicator to a new Mat on the new communicator, but I wanted to see if there is a more “elegant” approach within PETSc.
>>>> 
>>>> Thanks in advance,
>>>> 
>>>> Damyn Chipman
>>>> Boise State University
>>>> PhD Candidate
>>>> Computational Sciences and Engineering
>>>> damynchipman at u.boisestate.edu
>> 
> 
