[petsc-users] Strange Partition in PETSc 3.11 version on some computers
Danyang Su
danyang.su at gmail.com
Fri Oct 18 18:24:55 CDT 2019
I use the default partitioner from PETSc. Is there any partitioning option
available on the PETSc side for METIS?
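
For reference, below is a minimal sketch of how the partitioner can be selected
explicitly before distributing the mesh. This is not the code discussed in this
thread; it assumes a DMPlex-based mesh, and the helper name and the choice of
ParMETIS are purely illustrative.

    #include <petscdmplex.h>

    /* Sketch only: pick a specific graph partitioner (ParMETIS here) for a
       DMPlex before distributing it, rather than relying on the default
       chosen by the PETSc version. The helper name is hypothetical. */
    static PetscErrorCode DistributeWithParmetis(DM dm, DM *dmDist)
    {
      PetscPartitioner part;
      PetscErrorCode   ierr;

      PetscFunctionBeginUser;
      ierr = DMPlexGetPartitioner(dm, &part);CHKERRQ(ierr);
      ierr = PetscPartitionerSetType(part, PETSCPARTITIONERPARMETIS);CHKERRQ(ierr);
      /* Allow -petscpartitioner_type <simple|parmetis|ptscotch|chaco>
         to override the choice at run time. */
      ierr = PetscPartitionerSetFromOptions(part);CHKERRQ(ierr);
      ierr = DMPlexDistribute(dm, 0, NULL, dmDist);CHKERRQ(ierr); /* overlap = 0 */
      PetscFunctionReturn(0);
    }

With the SetFromOptions call in place, the run-time option
-petscpartitioner_type parmetis (or ptscotch, chaco, simple) should let you
switch partitioners without recompiling.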
Thanks,
Danyang
On 2019-10-18 3:32 p.m., Mark Adams wrote:
> The 3.11 and 3.12 partitions look like a default, lexicographical
> partitioning of a mesh that I cannot see. Could this be the
> original partitioning (i.e., the "current" partitioning type)?
>
> On Fri, Oct 18, 2019 at 5:54 PM Danyang Su via petsc-users
> <petsc-users at mcs.anl.gov> wrote:
>
> Hi All,
>
> I am now able to reproduce the partition problem using a
> relatively small mesh (attached). The mesh consists of 9087 nodes
> and 15656 prism cells. There are 39 layers with 233 nodes per
> layer. I have tested the partitioning using PETSc as well as Gmsh 3.0.1.
>
> Taking 4 partitions as an example, the partitions from PETSc 3.9
> and 3.10 are reasonable though not perfect, with a ratio of ghost
> nodes to total nodes of 2754 / 9087.
>
> The partitions from PETSc 3.11, PETSc 3.12 and PETSc-dev look
> weird, with a ratio of ghost nodes to total nodes of 12413 / 9087.
> The nodes assigned to the same processor are not well connected.
>
> Note: the z axis is scaled by 25 for better visualization in ParaView.
>
>
> The partition from Gmsh-METIS is a bit different but still quite
> similar to those from PETSc 3.9 and 3.10.
>
>
> Finally, the partition using Gmsh's Chaco Multilevel-KL algorithm is
> the best one, with a ratio of ghost nodes to total nodes of
> 741 / 9087. For most of my simulation cases, which use much larger
> meshes, PETSc 3.9 and 3.10 generate partitions similar to the one
> below, which work pretty well, and the code gets very good speedup.
>
> Thanks,
>
> Danyang
>
> On 2019-09-18 11:44 a.m., Danyang Su wrote:
>>
>> On 2019-09-18 10:56 a.m., Smith, Barry F. via petsc-users wrote:
>>>
>>>> On Sep 18, 2019, at 12:25 PM, Mark Lohry via petsc-users
>>>> <petsc-users at mcs.anl.gov> wrote:
>>>>
>>>> Mark,
>>> Mark,
>>>
>>> Good point. This has been a big headache forever.
>>>
>>> Note that this has been "fixed" in the master version of
>>> PETSc and will be in its next release. If you use
>>> --download-parmetis in the future it will use the same random
>>> numbers on all machines and thus should produce the same
>>> partitions on all machines.
>>>
>>> I think that metis has always used the same random
>>> numbers on all machines and thus has always produced the same results.
>>>
>>> Barry
>> Good to know this. I will use the same configuration that causes the
>> strange partition problem to test the next version.
>>
>> Thanks,
>>
>> Danyang
>>
>>>
>>>
>>>> The machine, compiler and MPI version should not matter.
>>>>
>>>> I might have missed something earlier in the thread, but
>>>> parmetis has a dependency on the machine's glibc srand, and it
>>>> can (and does) create different partitions with different srand
>>>> versions. The same mesh on the same code on the same process
>>>> count can and will give different partitions (possibly bad
>>>> ones) on different machines.
>>>>
>>>> On Tue, Sep 17, 2019 at 1:05 PM Mark Adams via petsc-users
>>>> <petsc-users at mcs.anl.gov> wrote:
>>>>
>>>>
>>>> On Tue, Sep 17, 2019 at 12:53 PM Danyang Su
>>>> <danyang.su at gmail.com> wrote:
>>>> Hi Mark,
>>>>
>>>> Thanks for your follow-up.
>>>>
>>>> The unstructured grid code has been verified and there is no
>>>> problem in the results. The convergence rate is also good. The
>>>> 3D mesh is not good; it is based on the original stratum, which
>>>> I haven't refined, but it is fine for an initial test as it is
>>>> relatively small and the results obtained from this mesh still make sense.
>>>>
>>>> The 2D meshes are just for testing purposes, as I want to
>>>> reproduce the partition problem on a cluster using PETSc 3.11.3
>>>> and Intel 2019. Unfortunately, I didn't find the problem using this
>>>> example.
>>>>
>>>> The code has no problem using different PETSc versions
>>>> (PETSc v3.4 to v3.11)
>>>>
>>>> OK, it is the same code. I thought I saw something about your
>>>> code changing.
>>>>
>>>> Just to be clear, v3.11 never gives you good partitions. It is
>>>> not just a problem on this Intel cluster.
>>>>
>>>> The machine, compiler and MPI version should not matter.
>>>> and MPI distributions (MPICH, OpenMPI, IntelMPI), except for
>>>> one simulation case (the mesh I attached) on a cluster with
>>>> PETSc 3.11.3 and Intel 2019u4, due to the very different partition
>>>> compared to PETSc 3.9.3. Yet the simulation results are the same,
>>>> except for the efficiency problem, because the strange partition
>>>> results in much more communication (ghost nodes).
>>>>
>>>> I am still trying different compilers and MPI implementations with
>>>> PETSc 3.11.3 on that cluster to trace the problem. I will get back
>>>> to you when there is an update.
>>>>
>>>>
>>>> This is very strange. You might want to use 'git bisect'. You
>>>> set a good and a bad SHA1 (we can give you these for 3.9 and
>>>> 3.11, along with the exact commands). Git will check out a version
>>>> in the middle. You then reconfigure, remake, rebuild your code, and
>>>> run your test. Git will ask you, as I recall, if the version is
>>>> good or bad. Once you get this workflow going it is not too
>>>> bad, depending on how hard this loop is, of course.
>>>> Thanks,
>>>>
>>>> danyang
>>>>
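
For reference, here is a minimal sketch of the git bisect loop Mark describes
above. The tags v3.9.4 and v3.11.3 are placeholders; substitute whatever good
and bad commits the PETSc developers suggest.

    cd $PETSC_DIR
    git bisect start
    git bisect bad  v3.11.3    # a release that shows the bad partition
    git bisect good v3.9.4     # a release that shows the good partition
    # git now checks out a commit in the middle; in a loop:
    #   reconfigure and rebuild PETSc, rebuild the application, run the
    #   test case, then report the result:
    git bisect good            # or: git bisect bad
    # repeat until git prints the first bad commit, then clean up:
    git bisect reset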
[Attachments:
basin-3d-dgr20000.png <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20191018/0332abb8/attachment-0003.png>
gmsh-partition-metis.png <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20191018/0332abb8/attachment-0004.png>
gmsh-partition-Chaco.png <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20191018/0332abb8/attachment-0005.png>]