[petsc-users] Regarding help with MatGetSubMatrix parallel use
Krishnan, Gautham
gautham3 at illinois.edu
Sat Sep 26 15:13:19 CDT 2020
In parallel, I have passed the global indices that I want on the PE. However, the result I obtain seems to change based on whether I divide the initial domain between PEs along the x axis or the y axis. This is what I am trying to resolve- for example:
the FORTRAN code I use is basically:
PetscInt :: idx(ngx*ngy*ngz)
Mat :: D2_t, lapl, , lapl_t
IS :: irow, irow1, icol1
call DMCreateMatrix(grid_pressure, lapl, ierr)
call MatSetOption(lapl_t, MAT_NEW_NONZERO_LOCATIONS,PETSC_TRUE,ierr)
call MatSetOption(lapl_t, MAT_IGNORE_ZERO_ENTRIES,PETSC_TRUE,ierr)
call DMCreateMatrix(grid_pressure, lapl_t, ierr)
call MatSetOption(lapl_t, MAT_NEW_NONZERO_LOCATIONS,PETSC_TRUE,ierr)
call MatSetOption(lapl_t, MAT_IGNORE_ZERO_ENTRIES,PETSC_TRUE,ierr)
.
.
. !(Assembling matrix lapl)
call VecCreate(petsc_xyzcom, vec_b, ierr)
call VecSetSizes(vec_b, PETSC_DECIDE, ngx*ngy*ngz-1, ierr)
call VecSetFromOptions(vec_b, ierr)
call VecSetUp(vec_b, ierr)
call VecGetLocalSize(vec_b,vecsx,ierr)
call VecGetOwnershipRange(vec_b,veclo,vechi,ierr)
call ISCreateStride(petsc_xyzcom, vecsx,veclo,1,icol1,ierr)
idx = (/ (j, j=veclo,vechi-1)/)
call ISCreateGeneral(petsc_xyzcom, vecsx,idx, PETSC_COPY_VALUES,irow,ierr)
call MatTranspose(lapl, MAT_INPLACE_MATRIX, lapl_t,ierr) !transpose the laplacian
call MatGetSubMatrix(lapl_t, irow, icol1, MAT_INITIAL_MATRIX, D2_t,ierr)
call MatView(lapl_t, PETSC_VIEWER_STDOUT_WORLD, ierr)
call MatView(D2_t, PETSC_VIEWER_STDOUT_WORLD, ierr)
Output:
ngx=ngy=4; ngz=1; such that n=4*4*1=16
lapl_t:
row 0: (0, -3.)
row 1: (0, 3.) (1, -3.) (5, -9.)
row 2: (2, -3.) (3, -3.) (6, -9.)
row 3: (3, 3.)
row 4: (4, -3.) (5, -9.)
row 5: (1, 3.) (4, 3.) (5, 36.) (6, -9.) (9, -9.)
row 6: (2, 3.) (5, -9.) (6, 36.) (7, -3.) (10, -9.)
row 7: (6, -9.) (7, 3.)
row 8: (8, -3.) (9, -9.)
row 9: (5, -9.) (8, 3.) (9, 36.) (10, -9.) (13, -3.)
row 10: (6, -9.) (9, -9.) (10, 36.) (11, -3.) (14, -3.)
row 11: (10, -9.) (11, 3.)
row 12: (12, -3.)
row 13: (9, -9.) (12, 3.) (13, 3.)
row 14: (10, -9.) (14, 3.) (15, -3.)
row 15: (15, 3.)
Case 1:
nprocx =1; nprocy=2; ! number of processors in x and y among which to divide the domain
Here, the (n-1 x n-1) submatrix is extracted correctly as
D2_t:
row 0: (0, -3.)
row 1: (0, 3.) (1, -3.) (5, -9.)
row 2: (2, -3.) (3, -3.) (6, -9.)
row 3: (3, 3.)
row 4: (4, -3.) (5, -9.)
row 5: (1, 3.) (4, 3.) (5, 36.) (6, -9.) (9, -9.)
row 6: (2, 3.) (5, -9.) (6, 36.) (7, -3.) (10, -9.)
row 7: (6, -9.) (7, 3.)
row 8: (8, -3.) (9, -9.)
row 9: (5, -9.) (8, 3.) (9, 36.) (10, -9.) (13, -3.)
row 10: (6, -9.) (9, -9.) (10, 36.) (11, -3.) (14, -3.)
row 11: (10, -9.) (11, 3.)
row 12: (12, -3.)
row 13: (9, -9.) (12, 3.) (13, 3.)
row 14: (10, -9.) (14, 3.)
However, for
Case 2:
nprocx =2; nprocy=1;
lapl_t is correctly assembled and transposed but the (n-1 x n-1) submatrix is extracted incorrectly as
D2_t:
row 0: (0, -3.)
row 1: (0, 3.) (1, -3.) (3, -9.)
row 2: (2, -3.) (3, -9.)
row 3: (1, 3.) (2, 3.) (3, 36.) (5, -9.) (10, -9.)
row 4: (4, -3.) (5, -9.)
row 5: (3, -9.) (4, 3.) (5, 36.) (7, -3.) (12, -9.)
row 6: (6, -3.)
row 7: (5, -9.) (6, 3.) (7, 3.)
row 8: (8, -3.) (9, -3.) (10, -9.)
row 9: (9, 3.)
row 10: (3, -9.) (8, 3.) (10, 36.) (11, -3.) (12, -9.)
row 11: (10, -9.) (11, 3.)
row 12: (5, -9.) (10, -9.) (12, 36.) (13, -3.) (14, -3.)
row 13: (12, -9.) (13, 3.)
row 14: (12, -9.) (14, 3.)
I am unable to understand why the extracted submatrix is incorrect when nprocx>1 but works when nprocx=1 and nprocy>=1.
P.S. the parallel IS in cases 1 and 2 are the same and are as follows:
irow:
[0] Number of indices in set 8
[0] 0 0
[0] 1 1
[0] 2 2
[0] 3 3
[0] 4 4
[0] 5 5
[0] 6 6
[0] 7 7
[1] Number of indices in set 7
[1] 0 8
[1] 1 9
[1] 2 10
[1] 3 11
[1] 4 12
[1] 5 13
[1] 6 14
icol1:
[0] Index set is permutation
[0] Number of indices in (stride) set 8
[0] 0 0
[0] 1 1
[0] 2 2
[0] 3 3
[0] 4 4
[0] 5 5
[0] 6 6
[0] 7 7
[1] Number of indices in (stride) set 7
[1] 0 8
[1] 1 9
[1] 2 10
[1] 3 11
[1] 4 12
[1] 5 13
[1] 6 14
Could you please help me find out what is going wrong here?
Regards,
Gautham
________________________________
From: Krishnan, Gautham <gautham3 at illinois.edu>
Sent: Saturday, September 26, 2020 2:01 PM
To: Matthew Knepley <knepley at gmail.com>
Cc: petsc-users at mcs.anl.gov <petsc-users at mcs.anl.gov>
Subject: Re: [petsc-users] Regarding help with MatGetSubMatrix parallel use
In parallel I have passed the global indices that I want on the PE. However, the result I obtain seems to change based on whether I divide the initial domain between PEs along the x axis or the y axis. This is what I am trying to resolve- for example:
the FORTRAN code I use is basically:
PetscInt :: idx(ngx*ngy*ngz)
Mat :: D2_t, lapl, , lapl_t
IS :: irow, irow1, icol1
call DMCreateMatrix(grid_pressure, lapl, ierr)
call MatSetOption(lapl_t, MAT_NEW_NONZERO_LOCATIONS,PETSC_TRUE,ierr)
call MatSetOption(lapl_t, MAT_IGNORE_ZERO_ENTRIES,PETSC_TRUE,ierr)
call DMCreateMatrix(grid_pressure, lapl_t, ierr)
call MatSetOption(lapl_t, MAT_NEW_NONZERO_LOCATIONS,PETSC_TRUE,ierr)
call MatSetOption(lapl_t, MAT_IGNORE_ZERO_ENTRIES,PETSC_TRUE,ierr)
.
.
. !(Assembling matrix lapl)
call VecCreate(petsc_xyzcom, vec_b, ierr)
call VecSetSizes(vec_b, PETSC_DECIDE, ngx*ngy*ngz-1, ierr)
call VecSetFromOptions(vec_b, ierr)
call VecSetUp(vec_b, ierr)
call VecGetLocalSize(vec_b,vecsx,ierr)
call VecGetOwnershipRange(vec_b,veclo,vechi,ierr)
call ISCreateStride(petsc_xyzcom, vecsx,veclo,1,icol1,ierr)
idx = (/ (j, j=veclo,vechi-1)/)
call ISCreateGeneral(petsc_xyzcom, vecsx,idx, PETSC_COPY_VALUES,irow,ierr)
call MatTranspose(lapl, MAT_INPLACE_MATRIX, lapl_t,ierr) !transpose the laplacian
call MatGetSubMatrix(lapl_t, irow, icol1, MAT_INITIAL_MATRIX, D2_t,ierr)
call MatView(lapl_t, PETSC_VIEWER_STDOUT_WORLD, ierr)
call MatView(D2_t, PETSC_VIEWER_STDOUT_WORLD, ierr)
Output:
ngx=ngy=4; ngz=1; such that n=4*4*1=16
lapl_t:
row 0: (0, -3.)
row 1: (0, 3.) (1, -3.) (5, -9.)
row 2: (2, -3.) (3, -3.) (6, -9.)
row 3: (3, 3.)
row 4: (4, -3.) (5, -9.)
row 5: (1, 3.) (4, 3.) (5, 36.) (6, -9.) (9, -9.)
row 6: (2, 3.) (5, -9.) (6, 36.) (7, -3.) (10, -9.)
row 7: (6, -9.) (7, 3.)
row 8: (8, -3.) (9, -9.)
row 9: (5, -9.) (8, 3.) (9, 36.) (10, -9.) (13, -3.)
row 10: (6, -9.) (9, -9.) (10, 36.) (11, -3.) (14, -3.)
row 11: (10, -9.) (11, 3.)
row 12: (12, -3.)
row 13: (9, -9.) (12, 3.) (13, 3.)
row 14: (10, -9.) (14, 3.) (15, -3.)
row 15: (15, 3.)
Case 1:
nprocx =1; nprocy=2; ! number of processors in x and y among which to divide the domain
Here, the (n-1 x n-1) submatrix is extracted correctly as
D2_t:
row 0: (0, -3.)
row 1: (0, 3.) (1, -3.) (5, -9.)
row 2: (2, -3.) (3, -3.) (6, -9.)
row 3: (3, 3.)
row 4: (4, -3.) (5, -9.)
row 5: (1, 3.) (4, 3.) (5, 36.) (6, -9.) (9, -9.)
row 6: (2, 3.) (5, -9.) (6, 36.) (7, -3.) (10, -9.)
row 7: (6, -9.) (7, 3.)
row 8: (8, -3.) (9, -9.)
row 9: (5, -9.) (8, 3.) (9, 36.) (10, -9.) (13, -3.)
row 10: (6, -9.) (9, -9.) (10, 36.) (11, -3.) (14, -3.)
row 11: (10, -9.) (11, 3.)
row 12: (12, -3.)
row 13: (9, -9.) (12, 3.) (13, 3.)
row 14: (10, -9.) (14, 3.)
However, for
Case 2:
nprocx =2; nprocy=1;
lapl_t is correctly assembled and transposed but the (n-1 x n-1) submatrix is extracted incorrectly as
D2_t:
row 0: (0, -3.)
row 1: (0, 3.) (1, -3.) (3, -9.)
row 2: (2, -3.) (3, -9.)
row 3: (1, 3.) (2, 3.) (3, 36.) (5, -9.) (10, -9.)
row 4: (4, -3.) (5, -9.)
row 5: (3, -9.) (4, 3.) (5, 36.) (7, -3.) (12, -9.)
row 6: (6, -3.)
row 7: (5, -9.) (6, 3.) (7, 3.)
row 8: (8, -3.) (9, -3.) (10, -9.)
row 9: (9, 3.)
row 10: (3, -9.) (8, 3.) (10, 36.) (11, -3.) (12, -9.)
row 11: (10, -9.) (11, 3.)
row 12: (5, -9.) (10, -9.) (12, 36.) (13, -3.) (14, -3.)
row 13: (12, -9.) (13, 3.)
row 14: (12, -9.) (14, 3.)
I am unable to understand why the extracted submatrix is incorrect when nprocx>1 but works when nprocx=1 and nprocy>=1.
P.S. the parallel IS in cases 1 and 2 are the same and are as follows:
irow:
[0] Number of indices in set 8
[0] 0 0
[0] 1 1
[0] 2 2
[0] 3 3
[0] 4 4
[0] 5 5
[0] 6 6
[0] 7 7
[1] Number of indices in set 7
[1] 0 8
[1] 1 9
[1] 2 10
[1] 3 11
[1] 4 12
[1] 5 13
[1] 6 14
icol1:
[0] Index set is permutation
[0] Number of indices in (stride) set 8
[0] 0 0
[0] 1 1
[0] 2 2
[0] 3 3
[0] 4 4
[0] 5 5
[0] 6 6
[0] 7 7
[1] Number of indices in (stride) set 7
[1] 0 8
[1] 1 9
[1] 2 10
[1] 3 11
[1] 4 12
[1] 5 13
[1] 6 14
Could you please help me find out what is going wrong here?
Regards,
Gautham
________________________________
From: Matthew Knepley <knepley at gmail.com>
Sent: Wednesday, September 23, 2020 3:55 PM
To: Krishnan, Gautham <gautham3 at illinois.edu>
Cc: petsc-users at mcs.anl.gov <petsc-users at mcs.anl.gov>
Subject: Re: [petsc-users] Regarding help with MatGetSubMatrix parallel use
On Wed, Sep 23, 2020 at 4:12 PM Krishnan, Gautham <gautham3 at illinois.edu<mailto:gautham3 at illinois.edu>> wrote:
Hello,
For a CFD code being developed with FORTRAN and MPI, I am using PETSC matrices and for a particular operation, I require to extract a submatrix(n-1 x n-1) of a matrix created (n x n). However using the petsc MatGetSubMatrix works for serial runs but fails when the domain is split up over PEs- I suspect the indexing changed for parallel runs and hence the global indexing that worked for serial case just shuffles around matrix entries in parallel undesirably. I would like to ask whether anybody could offer some guidance regarding this. I would like to note that the 2D domain is split along both x and y axes for parallel runs on multiple PEs.
In parallel, you pass MatGetSubmatrix() the global indices that you want on your process.
Thanks,
Matt
Regards,
Gautham Krishnan,
--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener
https://www.cse.buffalo.edu/~knepley/<http://www.cse.buffalo.edu/~knepley/>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20200926/bba10439/attachment-0001.html>
More information about the petsc-users
mailing list