[petsc-users] DMPlex partition problem
Danyang Su
danyang.su at gmail.com
Wed Apr 8 19:18:18 CDT 2020
From: Matthew Knepley <knepley at gmail.com>
Date: Wednesday, April 8, 2020 at 4:50 PM
To: Danyang Su <danyang.su at gmail.com>
Cc: PETSc <petsc-users at mcs.anl.gov>
Subject: Re: [petsc-users] DMPlex partition problem
On Wed, Apr 8, 2020 at 7:47 PM Danyang Su <danyang.su at gmail.com> wrote:
From: Matthew Knepley <knepley at gmail.com>
Date: Wednesday, April 8, 2020 at 4:41 PM
To: Danyang Su <danyang.su at gmail.com>
Cc: PETSc <petsc-users at mcs.anl.gov>
Subject: Re: [petsc-users] DMPlex partition problem
On Wed, Apr 8, 2020 at 5:52 PM Danyang Su <danyang.su at gmail.com> wrote:
Hi Matt,
I am one step closer now. When I run the ex1 code with ‘-interpolate’, the partition is good; without it, the partition is weird.
Crap! That did not even occur to me. Yes, the dual graph construction will not work for uninterpolated wedges.
So, do you really need an uninterpolated mesh? If so, I can put it on the buglist.
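A small illustration of why uninterpolated wedges are awkward for a dual graph built from vertex lists alone. The sketch below is a toy, not PETSc code. It assumes, purely for illustration, that two cells count as neighbours when they share at least a fixed number of vertices; whether PETSc's uninterpolated path uses such a rule is not stated in this thread. The vertex numbers and the names WEDGE_A, WEDGE_B, WEDGE_C, shared_vertices and is_neighbor are hypothetical.

```c
#include <assert.h>

/* Hypothetical vertex lists for three 6-vertex wedges (prisms).
   A and B are stacked and share a triangular face {3,4,5}.
   A and C sit side by side and share a quadrilateral face {1,2,4,5}. */
static const int WEDGE_A[6] = {0, 1, 2, 3, 4, 5};
static const int WEDGE_B[6] = {3, 4, 5, 6, 7, 8};
static const int WEDGE_C[6] = {1, 2, 9, 4, 5, 10};

/* Count the vertices that appear in both cells (each list has n entries). */
static int shared_vertices(const int *a, const int *b, int n)
{
  int count = 0;
  assert(n > 0);
  for (int i = 0; i < n; ++i)
    for (int j = 0; j < n; ++j)
      if (a[i] == b[j]) ++count;
  return count;
}

/* Assumed neighbour rule: adjacent if at least minShared vertices coincide. */
static int is_neighbor(const int *a, const int *b, int n, int minShared)
{
  return shared_vertices(a, b, n) >= minShared;
}
```

With minShared = 4 the stacked pair A/B (3 shared vertices) is missed while the side-by-side pair A/C (4 shared vertices) is found, so under this assumed rule the layers would look disconnected from each other. A wedge has both triangular and quadrilateral faces, so a single threshold has to be chosen with care. In an interpolated mesh the faces are explicit points, and the question does not arise.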
For the prism mesh, I am afraid so. For 2D triangle meshes and 3D tetrahedral meshes, the partition is pretty good without interpolation. That’s why I didn’t have any problems in my previous simulations, which used the other cell types.
What I mean is, are you avoiding interpolating the mesh for memory? The amount of memory is usually small compared to
fields on the mesh.
No, not because of memory consumption. When the code was first written several years ago, I just put interpolate = false there. Now, after setting interpolate = true, I need to update the code that sets the cell-node index (array cell). The following code does not work anymore when interpolate = true. Some of the code is not well written and needs to be improved.
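!c add local to global cell id mapping
!c note: this assumes an uninterpolated mesh, where the cone of a cell
!c holds its vertices and vertex points are numbered from istart
do ipoint = 0, istart-1
  icell = ipoint + 1
  call DMPlexGetCone(dmda_flow%da,ipoint,cone,ierr)
  CHKERRQ(ierr)
  !c whole-array assignment copies every vertex of the cone at once
  cell_node_idx(:,icell) = cone - istart + 1
  call DMPlexRestoreCone(dmda_flow%da,ipoint,cone,ierr)
  CHKERRQ(ierr)
end do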
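To make the failure concrete: in an interpolated mesh the cone of a cell holds its faces, not its vertices, so subtracting istart from the cone entries no longer yields node indices. The toy below is a sketch, not PETSc code. It walks cones downward until it reaches points in the vertex range, which works for both mesh kinds. The arrays and the name toy_cell_vertices are hypothetical. In real code I would expect DMPlexGetTransitiveClosure together with DMPlexGetDepthStratum to play this role, but that is my assumption and not something stated in the thread. The vertex order produced this way need not match the order of the original uninterpolated cone.

```c
#include <assert.h>

/* Cone of point p is cones[off[p] .. off[p+1]-1].
   One triangle, uninterpolated: cell 0, vertices 1..3. */
static const int UNINT_OFF[5]   = {0, 3, 3, 3, 3};
static const int UNINT_CONES[3] = {1, 2, 3};
/* Same triangle, interpolated: cell 0, vertices 1..3, edges 4..6. */
static const int INTERP_OFF[8]   = {0, 3, 3, 3, 3, 5, 7, 9};
static const int INTERP_CONES[9] = {4, 5, 6, 1, 2, 2, 3, 3, 1};

/* Collect the distinct vertices (points in [vStart, vEnd)) reachable from
   `cell` by following cones. Returns the number written to out. */
static int toy_cell_vertices(const int *off, const int *cones, int cell,
                             int vStart, int vEnd, int *out, int maxOut)
{
  int queue[128];
  int head = 0, tail = 0, nOut = 0;

  queue[tail++] = cell;
  while (head < tail) {
    int p = queue[head++];
    if (p >= vStart && p < vEnd) {
      /* A vertex: record it once, in order of first visit. */
      int seen = 0;
      for (int i = 0; i < nOut; ++i) if (out[i] == p) seen = 1;
      if (!seen) { assert(nOut < maxOut); out[nOut++] = p; }
      continue;
    }
    /* Not a vertex: descend into its cone (faces, then edges). */
    for (int c = off[p]; c < off[p + 1]; ++c) {
      assert(tail < 128);
      queue[tail++] = cones[c];
    }
  }
  return nOut;
}
```

Both toy meshes return vertices 1, 2, 3 for cell 0. Subtracting vStart and adding 1 then gives the 1-based node index used in cell_node_idx.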
Thanks,
Danyang
Thanks,
Matt
Thanks,
Danyang
Thanks,
Matt
Thanks,
Danyang
From: Danyang Su <danyang.su at gmail.com>
Date: Wednesday, April 8, 2020 at 2:12 PM
To: Matthew Knepley <knepley at gmail.com>
Cc: PETSc <petsc-users at mcs.anl.gov>
Subject: Re: [petsc-users] DMPlex partition problem
Hi Matt,
Attached is another prism mesh using 8 processors. The partition of the lower mesh does not look good.
Thanks,
Danyang
From: Danyang Su <danyang.su at gmail.com>
Date: Wednesday, April 8, 2020 at 1:50 PM
To: Matthew Knepley <knepley at gmail.com>
Cc: PETSc <petsc-users at mcs.anl.gov>
Subject: Re: [petsc-users] DMPlex partition problem
Hi Matt,
Here is what I get using ex1c with stencil 0. There is no change in the source code; I just compile and run the code in different ways. Using ‘make -f ./gmakefile ….’, it works as expected. However, when I use ‘make ex1’ and then run the code with ‘mpiexec -n …’, the partition does not look good. My code has the same problem if I use a prism mesh.
I just wonder what makes this difference, even without overlap.
Thanks,
Danyang
From: Matthew Knepley <knepley at gmail.com>
Date: Wednesday, April 8, 2020 at 1:32 PM
To: Danyang Su <danyang.su at gmail.com>
Cc: PETSc <petsc-users at mcs.anl.gov>
Subject: Re: [petsc-users] DMPlex partition problem
On Wed, Apr 8, 2020 at 4:26 PM Danyang Su <danyang.su at gmail.com> wrote:
From: Matthew Knepley <knepley at gmail.com>
Date: Wednesday, April 8, 2020 at 12:50 PM
To: Danyang Su <danyang.su at gmail.com>
Cc: PETSc <petsc-users at mcs.anl.gov>
Subject: Re: [petsc-users] DMPlex partition problem
On Wed, Apr 8, 2020 at 3:22 PM Danyang Su <danyang.su at gmail.com> wrote:
Hi Matt,
Here is something pretty interesting. I modified the ex1.c file to output the number of nodes and cells (as shown below). I also changed the stencil size to 1.
/* get coordinates and section */
ierr = DMGetCoordinatesLocal(*dm,&gc);CHKERRQ(ierr);
ierr = DMGetCoordinateDM(*dm,&cda);CHKERRQ(ierr);
ierr = DMGetSection(cda,&cs);CHKERRQ(ierr);
ierr = PetscSectionGetChart(cs,&istart,&iend);CHKERRQ(ierr);
num_nodes = iend-istart;
num_cells = istart;
/* Output rank and processor information */
printf("rank %d: of nprcs: %d, num_nodes %d, num_cess %d\n", rank, size, num_nodes, num_cells);
If I compile the code using ‘make ex1’ and then run the test using ‘mpiexec -n 2 ./ex1 -filename basin2layer.exo’, I get the same problem as the modified ex1f90 code I sent.
➜ tests mpiexec -n 2 ./ex1 -filename basin2layer.exo
rank 1: of nprcs: 2, num_nodes 699, num_cess 824
rank 0: of nprcs: 2, num_nodes 699, num_cess 824
Ah, I was not looking closely. You are asking for a cell overlap of 1 in the partition. That is why these numbers sum to more than
the total in the mesh. Do you want a cell overlap of 1?
Yes, I need a cell overlap of 1 in some circumstances. The mesh has two layers of cells with 412 cells per layer and three layers of nodes with 233 nodes per layer. The number of cells looks good to me. I am confused about why the same code generates such different partitions. If I set the stencil to 0, I get the following results. The first method looks good; the second is not a good choice, as it has many more ghost nodes.
➜ petsc-3.13.0 make -f ./gmakefile test globsearch="dm_impls_plex_tests-ex1_cylinder" EXTRA_OPTIONS="-filename ./basin2layer.exo -dm_view hdf5:$PWD/mesh.h5 -dm_partition_view" NP=2
# > rank 1: of nprcs: 2, num_nodes 354, num_cess 392
# > rank 0: of nprcs: 2, num_nodes 384, num_cess 432
➜ tests mpiexec -n 2 ./ex1 -filename basin2layer.exo
rank 0: of nprcs: 2, num_nodes 466, num_cess 412
rank 1: of nprcs: 2, num_nodes 466, num_cess 412
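The stencil-0 numbers above can be checked with simple arithmetic: summing the per-rank counts and subtracting the mesh total gives the number of entries held on more than one rank. The sketch below does only that. The function name duplicated_count and the array names are hypothetical; the numbers are the ones quoted in this message.

```c
#include <assert.h>

/* Sum per-rank counts and subtract the global total: the excess is the
   number of duplicated (shared or ghost) entries across all ranks. */
static int duplicated_count(const int *perRank, int nRanks, int globalTotal)
{
  int sum = 0;
  assert(nRanks > 0);
  for (int r = 0; r < nRanks; ++r) sum += perRank[r];
  return sum - globalTotal;
}

/* Mesh totals quoted in the message: 2 cell layers x 412, 3 node layers x 233. */
enum { TOTAL_CELLS = 2 * 412, TOTAL_NODES = 3 * 233 };

/* Per-rank counts from the two stencil-0 runs on 2 ranks. */
static const int NODES_GMAKEFILE[2] = {354, 384};
static const int NODES_MPIEXEC[2]   = {466, 466};
static const int CELLS_GMAKEFILE[2] = {392, 432};
static const int CELLS_MPIEXEC[2]   = {412, 412};
```

Both runs account for exactly 824 cells, so no cell is duplicated. The gmakefile run duplicates 39 nodes, while the mpiexec run duplicates 233, which equals one full node layer. Together with 412 cells per rank (one cell layer each), this is consistent with the second run cutting the mesh between its two layers, although the numbers alone do not prove it.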
I think this might just be a confusion over interpretation. Here is how partitioning works:
1) We partition the mesh cells using ParMetis, Chaco, etc.
2) We move those cells (and closures) to the correct processes
3) If you ask for overlap, we mark a layer of adjacent cells on remote processes and move them to each process
The original partitions are the same. Then we add extra cells, and their closures, to each partition. This is what you are asking for.
You would get the same answer with Gmsh if it gave you an overlap region.
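The three steps above can be sketched on a toy dual graph. This is not how PETSc implements distribution; it only models step 1 as a given owner array and step 3 as marking one layer of adjacent remote cells, and it ignores closures entirely. The chain of 6 cells, the owner assignment and the name cells_on_rank are all hypothetical.

```c
#include <assert.h>

/* Toy dual graph in CSR form: 6 cells in a chain, 0-1-2-3-4-5. */
enum { N_CELLS = 6 };
static const int ADJ_OFF[N_CELLS + 1] = {0, 1, 3, 5, 7, 9, 10};
static const int ADJ[10]              = {1, 0, 2, 1, 3, 2, 4, 3, 5, 4};
/* Assumed result of step 1: cells 0-2 go to rank 0, cells 3-5 to rank 1. */
static const int OWNER[N_CELLS] = {0, 0, 0, 1, 1, 1};

/* Count the cells held by `rank`: owned cells plus, when overlap is 1,
   one layer of cells adjacent to an owned cell but owned elsewhere. */
static int cells_on_rank(int rank, int overlap)
{
  int held[N_CELLS] = {0};
  int count = 0;

  assert(overlap == 0 || overlap == 1); /* this sketch handles one layer only */
  for (int c = 0; c < N_CELLS; ++c)
    if (OWNER[c] == rank) held[c] = 1;
  if (overlap) {
    for (int c = 0; c < N_CELLS; ++c) {
      if (OWNER[c] != rank) continue;
      for (int k = ADJ_OFF[c]; k < ADJ_OFF[c + 1]; ++k)
        if (OWNER[ADJ[k]] != rank) held[ADJ[k]] = 1;
    }
  }
  for (int c = 0; c < N_CELLS; ++c) count += held[c];
  return count;
}
```

Without overlap each rank holds 3 cells and the counts sum to 6. With overlap 1 each rank holds 4 cells, so the counts sum to 8, more than the mesh total, even though the underlying partition has not changed.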
Thanks,
Matt
Thanks,
Danyang
Thanks,
Matt
➜ tests mpiexec -n 4 ./ex1 -filename basin2layer.exo
rank 1: of nprcs: 4, num_nodes 432, num_cess 486
rank 0: of nprcs: 4, num_nodes 405, num_cess 448
rank 2: of nprcs: 4, num_nodes 411, num_cess 464
rank 3: of nprcs: 4, num_nodes 420, num_cess 466
However, if I compile and run the code using the script you shared, I get reasonable results.
➜ petsc-3.13.0 make -f ./gmakefile test globsearch="dm_impls_plex_tests-ex1_cylinder" EXTRA_OPTIONS="-filename ./basin2layer.exo -dm_view hdf5:$PWD/mesh.h5 -dm_partition_view" NP=2
# > rank 0: of nprcs: 2, num_nodes 429, num_cess 484
# > rank 1: of nprcs: 2, num_nodes 402, num_cess 446
➜ petsc-3.13.0 make -f ./gmakefile test globsearch="dm_impls_plex_tests-ex1_cylinder" EXTRA_OPTIONS="-filename ./basin2layer.exo -dm_view hdf5:$PWD/mesh.h5 -dm_partition_view" NP=4
# > rank 1: of nprcs: 4, num_nodes 246, num_cess 260
# > rank 2: of nprcs: 4, num_nodes 264, num_cess 274
# > rank 3: of nprcs: 4, num_nodes 264, num_cess 280
# > rank 0: of nprcs: 4, num_nodes 273, num_cess 284
Is there some difference in the compile or runtime options that causes the difference? Would you please check whether you can reproduce the same problem using the modified ex1.c?
Thanks,
Danyang
From: Danyang Su <danyang.su at gmail.com>
Date: Wednesday, April 8, 2020 at 9:37 AM
To: Matthew Knepley <knepley at gmail.com>
Cc: PETSc <petsc-users at mcs.anl.gov>
Subject: Re: [petsc-users] DMPlex partition problem
From: Matthew Knepley <knepley at gmail.com>
Date: Wednesday, April 8, 2020 at 9:20 AM
To: Danyang Su <danyang.su at gmail.com>
Cc: PETSc <petsc-users at mcs.anl.gov>
Subject: Re: [petsc-users] DMPlex partition problem
On Wed, Apr 8, 2020 at 12:13 PM Danyang Su <danyang.su at gmail.com> wrote:
From: Matthew Knepley <knepley at gmail.com>
Date: Wednesday, April 8, 2020 at 6:45 AM
To: Danyang Su <danyang.su at gmail.com>
Cc: PETSc <petsc-users at mcs.anl.gov>
Subject: Re: [petsc-users] DMPlex partition problem
On Wed, Apr 8, 2020 at 7:25 AM Matthew Knepley <knepley at gmail.com> wrote:
On Wed, Apr 8, 2020 at 12:48 AM Danyang Su <danyang.su at gmail.com> wrote:
Dear All,
Hope you are safe and healthy.
I have a question regarding very different partition results for a prism mesh. The partition in PETSc generates many more ghost nodes/cells than the partition in Gmsh, even though both use metis as the partitioner. Attached please find the prism mesh in both vtk and exo format, and the test code, modified from the ex1f90 example. Similar problems are observed for larger datasets with more layers.
I will figure this out by next week.
I have run your mesh and do not get those weird partitions. I am running in master. What are you using? Also, here is an easy way
to do this using a PETSc test:
cd $PETSC_DIR
make -f ./gmakefile test globsearch="dm_impls_plex_tests-ex1_cylinder" EXTRA_OPTIONS="-filename ${HOME}/Downloads/basin2layer.exo -dm_view hdf5:$PWD/mesh.h5 -dm_partition_view" NP=5
./lib/petsc/bin/petsc_gen_xdmf.py mesh.h5
and then load mesh.xmf into Paraview. Here is what I see (attached). Is it possible for you to try the master branch?
Hi Matt,
Thanks for your quick response. If I use your script, the partition looks good, as shown in the attached figure. I am working on PETSc 3.13.0 release version on Mac OS.
Does the above script use code /petsc/src/dm/label/tutorials/ex1c.c?
It uses $PETSC_DIR/src/dm/impls/plex/tests/ex1.c
I looked at your code and cannot see any difference. Also, no changes are in master that are not in 3.13. This is very strange.
I guess we will have to go one step at a time between the example and your code.
I will add mesh output to the ex1f90 example and check if the cell/vertex rank is exactly the same. I wrote the mesh output myself based on the partition but there should be no problem in that part. The number of ghost nodes and cells is pretty easy to check. Not sure if there is any difference between the C code and Fortran code that causes the problem. Anyway, I will keep you updated.
Thanks,
Matt
Thanks,
Matt
Thanks,
Matt
For example, in Gmsh, I get partition results using two processors and four processors as shown below, which are pretty reasonable.
However, in PETSc, the partition looks a bit weird. It looks like it partitions by layer first and then within each layer. If the number of nodes per layer is very large, this kind of partitioning results in many more ghost nodes/cells.
Does anybody know how to improve the partitioning in PETSc? I have tried parmetis and chaco. There is no big difference between them.
Thanks,
Danyang
--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener
https://www.cse.buffalo.edu/~knepley/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 404238 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20200408/1f7a8126/attachment-0005.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image002.png
Type: image/png
Size: 281269 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20200408/1f7a8126/attachment-0006.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image003.png
Type: image/png
Size: 457836 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20200408/1f7a8126/attachment-0007.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image004.png
Type: image/png
Size: 542686 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20200408/1f7a8126/attachment-0008.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image005.png
Type: image/png
Size: 349317 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20200408/1f7a8126/attachment-0009.png>