[petsc-users] Regarding help with MatGetSubMatrix parallel use

Matthew Knepley knepley at gmail.com
Mon Sep 28 08:43:39 CDT 2020


On Sat, Sep 26, 2020 at 4:13 PM Krishnan, Gautham <gautham3 at illinois.edu>
wrote:

> In parallel, I have passed the global indices that I want on the PE.
> However, the result I obtain seems to change based on whether I divide the
> initial domain between PEs along the x axis or the y axis. This is what I
> am trying to resolve; for example:
>

You mean that you have a DMDA, and you change the pattern of division
across processes? This will change the global numbers of given vertices.
Remember that these are not "natural numbers", meaning a lexicographic
ordering of vertices, but rather "global numbers", meaning the PETSc
ordering of unknowns, which is contiguous across processes. There is a
discussion of this in the manual.
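
For example, with ngx=ngy=4 and nprocx=2, the vertex with natural
(lexicographic) index 2 is owned by the second process, so its PETSc
global number is 8; with nprocx=1 and nprocy=2 the two orderings happen
to coincide, which is why your Case 1 looks correct. If you want index
sets that do not depend on the decomposition, one option is to build
them with natural indices and map them through the DMDA's application
ordering (AO). A minimal Fortran sketch, assuming the grid_pressure DMDA
and the idx/vecsx variables from your code below:

AO :: ao

! The AO translates between the natural (lexicographic) ordering and the
! PETSc global ordering for this decomposition. It is owned by the DM,
! so do not destroy it.
call DMDAGetAO(grid_pressure, ao, ierr)
! On entry idx holds natural indices; on exit it holds the corresponding
! PETSc global indices for the current decomposition.
call AOApplicationToPetsc(ao, vecsx, idx, ierr)
call ISCreateGeneral(petsc_xyzcom, vecsx, idx, PETSC_COPY_VALUES, irow, ierr)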

  Thanks,

    Matt


> the *Fortran* code I use is basically:
> PetscInt          ::      idx(ngx*ngy*ngz)
> Mat               ::      D2_t, lapl, lapl_t
> IS                ::      irow, irow1, icol1
>
> call DMCreateMatrix(grid_pressure, lapl, ierr)
> call MatSetOption(lapl, MAT_NEW_NONZERO_LOCATIONS, PETSC_TRUE, ierr)
> call MatSetOption(lapl, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE, ierr)
> call DMCreateMatrix(grid_pressure, lapl_t, ierr)
> call MatSetOption(lapl_t, MAT_NEW_NONZERO_LOCATIONS, PETSC_TRUE, ierr)
> call MatSetOption(lapl_t, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE, ierr)
> .
> .
> .    !(Assembling matrix lapl)
>
> call VecCreate(petsc_xyzcom, vec_b, ierr)
> call VecSetSizes(vec_b, PETSC_DECIDE, ngx*ngy*ngz-1, ierr)
> call VecSetFromOptions(vec_b, ierr)
> call VecSetUp(vec_b, ierr)
> call VecGetLocalSize(vec_b, vecsx, ierr)
> call VecGetOwnershipRange(vec_b, veclo, vechi, ierr)
> call ISCreateStride(petsc_xyzcom, vecsx, veclo, 1, icol1, ierr)
> idx(1:vecsx) = (/ (j, j = veclo, vechi-1) /)
> call ISCreateGeneral(petsc_xyzcom, vecsx, idx, PETSC_COPY_VALUES, irow, ierr)
> call MatTranspose(lapl, MAT_INPLACE_MATRIX, lapl_t, ierr)   ! transpose the laplacian
> call MatGetSubMatrix(lapl_t, irow, icol1, MAT_INITIAL_MATRIX, D2_t, ierr)
> call MatView(lapl_t, PETSC_VIEWER_STDOUT_WORLD, ierr)
> call MatView(D2_t, PETSC_VIEWER_STDOUT_WORLD, ierr)
>
> *Output*:
>
> ngx=ngy=4; ngz=1; such that n=4*4*1=16
>
> lapl_t:
> row 0: (0, -3.)
> row 1: (0, 3.)  (1, -3.)  (5, -9.)
> row 2: (2, -3.)  (3, -3.)  (6, -9.)
> row 3: (3, 3.)
> row 4: (4, -3.)  (5, -9.)
> row 5: (1, 3.)  (4, 3.)  (5, 36.)  (6, -9.)  (9, -9.)
> row 6: (2, 3.)  (5, -9.)  (6, 36.)  (7, -3.)  (10, -9.)
> row 7: (6, -9.)  (7, 3.)
> row 8: (8, -3.)  (9, -9.)
> row 9: (5, -9.)  (8, 3.)  (9, 36.)  (10, -9.)  (13, -3.)
> row 10: (6, -9.)  (9, -9.)  (10, 36.)  (11, -3.)  (14, -3.)
> row 11: (10, -9.)  (11, 3.)
> row 12: (12, -3.)
> row 13: (9, -9.)  (12, 3.)  (13, 3.)
> row 14: (10, -9.)  (14, 3.)  (15, -3.)
> row 15: (15, 3.)
>
>
> *Case 1:*
> nprocx = 1; nprocy = 2;   ! number of processors in x and y among which to divide the domain
>
> Here, the (n-1 x n-1) submatrix is extracted correctly as
> D2_t:
> row 0: (0, -3.)
> row 1: (0, 3.)  (1, -3.)  (5, -9.)
> row 2: (2, -3.)  (3, -3.)  (6, -9.)
> row 3: (3, 3.)
> row 4: (4, -3.)  (5, -9.)
> row 5: (1, 3.)  (4, 3.)  (5, 36.)  (6, -9.)  (9, -9.)
> row 6: (2, 3.)  (5, -9.)  (6, 36.)  (7, -3.)  (10, -9.)
> row 7: (6, -9.)  (7, 3.)
> row 8: (8, -3.)  (9, -9.)
> row 9: (5, -9.)  (8, 3.)  (9, 36.)  (10, -9.)  (13, -3.)
> row 10: (6, -9.)  (9, -9.)  (10, 36.)  (11, -3.)  (14, -3.)
> row 11: (10, -9.)  (11, 3.)
> row 12: (12, -3.)
> row 13: (9, -9.)  (12, 3.)  (13, 3.)
> row 14: (10, -9.)  (14, 3.)
>
> However, for
> *Case 2:*
> nprocx =2; nprocy=1;
>
> lapl_t is correctly assembled and transposed but the (n-1 x n-1)
> submatrix is extracted incorrectly as
> D2_t:
> row 0: (0, -3.)
> row 1: (0, 3.)  (1, -3.)  (3, -9.)
> row 2: (2, -3.)  (3, -9.)
> row 3: (1, 3.)  (2, 3.)  (3, 36.)  (5, -9.)  (10, -9.)
> row 4: (4, -3.)  (5, -9.)
> row 5: (3, -9.)  (4, 3.)  (5, 36.)  (7, -3.)  (12, -9.)
> row 6: (6, -3.)
> row 7: (5, -9.)  (6, 3.)  (7, 3.)
> row 8: (8, -3.)  (9, -3.)  (10, -9.)
> row 9: (9, 3.)
> row 10: (3, -9.)  (8, 3.)  (10, 36.)  (11, -3.)  (12, -9.)
> row 11: (10, -9.)  (11, 3.)
> row 12: (5, -9.)  (10, -9.)  (12, 36.)  (13, -3.)  (14, -3.)
> row 13: (12, -9.)  (13, 3.)
> row 14: (12, -9.)  (14, 3.)
>
> I am unable to understand why the extracted submatrix is incorrect when
> nprocx>1 but works when nprocx=1 and nprocy>=1.
>
> P.S. The parallel ISs in cases 1 and 2 are identical, as follows:
> irow:
> [0] Number of indices in set 8
> [0] 0 0
> [0] 1 1
> [0] 2 2
> [0] 3 3
> [0] 4 4
> [0] 5 5
> [0] 6 6
> [0] 7 7
> [1] Number of indices in set 7
> [1] 0 8
> [1] 1 9
> [1] 2 10
> [1] 3 11
> [1] 4 12
> [1] 5 13
> [1] 6 14
>
> icol1:
> [0] Index set is permutation
> [0] Number of indices in (stride) set 8
> [0] 0 0
> [0] 1 1
> [0] 2 2
> [0] 3 3
> [0] 4 4
> [0] 5 5
> [0] 6 6
> [0] 7 7
> [1] Number of indices in (stride) set 7
> [1] 0 8
> [1] 1 9
> [1] 2 10
> [1] 3 11
> [1] 4 12
> [1] 5 13
> [1] 6 14
>
> Could you please help me find out what is going wrong here?
>
> Regards,
> Gautham
>
> ------------------------------
> *From:* Matthew Knepley <knepley at gmail.com>
> *Sent:* Wednesday, September 23, 2020 3:55 PM
> *To:* Krishnan, Gautham <gautham3 at illinois.edu>
> *Cc:* petsc-users at mcs.anl.gov <petsc-users at mcs.anl.gov>
> *Subject:* Re: [petsc-users] Regarding help with MatGetSubMatrix parallel
> use
>
> On Wed, Sep 23, 2020 at 4:12 PM Krishnan, Gautham <gautham3 at illinois.edu>
> wrote:
>
> Hello,
>
> For a CFD code being developed in Fortran with MPI, I am using PETSc
> matrices, and for a particular operation I need to extract an
> (n-1 x n-1) submatrix of an (n x n) matrix. PETSc's MatGetSubMatrix
> works for serial runs but fails when the domain is split up over PEs. I
> suspect the indexing changes for parallel runs, so the global indexing
> that worked in the serial case undesirably shuffles matrix entries
> around in parallel. Could anybody offer some guidance on this? Note
> that the 2D domain is split along both the x and y axes for parallel
> runs on multiple PEs.
>
>
> In parallel, you pass MatGetSubMatrix() the global indices that you want
> on your process.
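>
> As a minimal sketch of that pattern (on a hypothetical communicator comm,
> with a parallel Mat A, and hypothetical arrays rows(nr) and cols(nc)
> holding the global indices this process wants):
>
> IS  :: irow, icol
> Mat :: Asub
>
> ! Each process lists the global rows and columns it wants in Asub.
> call ISCreateGeneral(comm, nr, rows, PETSC_COPY_VALUES, irow, ierr)
> call ISCreateGeneral(comm, nc, cols, PETSC_COPY_VALUES, icol, ierr)
> call MatGetSubMatrix(A, irow, icol, MAT_INITIAL_MATRIX, Asub, ierr)
> call ISDestroy(irow, ierr)
> call ISDestroy(icol, ierr)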
>
>   Thanks,
>
>      Matt
>
>
> Regards,
> Gautham Krishnan,


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/