<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>Hi Matt,</p>
<p>The mesh is attached; please let me know if you cannot receive
it. I also tried RCM ordering of the mesh before
distribution, but it makes no difference to the partition.</p>
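<p>In case it is useful, this is roughly the reorder-and-distribute step I
mean, assuming the mesh sits in a DMPlex (a minimal sketch only; the
function and variable names are illustrative, not my actual code):</p>
<pre>
#include <petscdmplex.h>

/* Minimal sketch (illustrative names): apply RCM ordering to the serial
   DMPlex mesh, then distribute it over the communicator. */
static PetscErrorCode ReorderAndDistribute(DM dm, DM *dmDist)
{
  IS             perm;
  DM             dmPerm;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  /* Reverse Cuthill-McKee ordering of the mesh points */
  ierr = DMPlexGetOrdering(dm, MATORDERINGRCM, NULL, &perm);CHKERRQ(ierr);
  ierr = DMPlexPermute(dm, perm, &dmPerm);CHKERRQ(ierr);
  ierr = ISDestroy(&perm);CHKERRQ(ierr);
  /* Distribute with zero overlap; *dmDist is NULL on a single process */
  ierr = DMPlexDistribute(dmPerm, 0, NULL, dmDist);CHKERRQ(ierr);
  ierr = DMDestroy(&dmPerm);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
</pre>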
<p>Thanks,</p>
<p>Danyang<br>
</p>
<div class="moz-cite-prefix">On 2019-10-18 5:20 p.m., Matthew
Knepley wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAMYG4Gk00hHAxspwm9CXPR8Nw+Tjh3QJN3rkwESRyc_dMkjG=A@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">
<div dir="ltr">On Fri, Oct 18, 2019 at 5:53 PM Danyang Su <<a
href="mailto:danyang.su@gmail.com" moz-do-not-send="true">danyang.su@gmail.com</a>>
wrote:<br>
</div>
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p>Hi All,</p>
<p>I am now able to reproduce the partition problem using
a relatively small mesh (attached). The mesh consists of
9087 nodes and 15656 prism cells. There are 39 layers with
233 nodes per layer. I have tested the partitioning
using PETSc as well as Gmsh 3.0.1.</p>
</div>
</blockquote>
<div>Great job finding a good test case. Can you send me that
mesh?<br>
</div>
<div><br>
</div>
<div> Thanks,</div>
<div><br>
</div>
<div> Matt</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p>Taking 4 partitions as an example, the partitions from
PETSc 3.9 and 3.10 are reasonable though not perfect,
with a total-ghost-nodes to total-nodes ratio of
2754 / 9087.<br>
</p>
<p>The partitions from PETSc 3.11, PETSc 3.12, and PETSc-dev
look weird, with a total-ghost-nodes to total-nodes ratio of
12413 / 9087. The nodes assigned to the same processor are
not well connected.<br>
</p>
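<p>For reference, one way to obtain such ghost counts is via the point SF
of the distributed mesh (a rough sketch assuming a DMPlex; it counts all
shared points, not only vertices, so it only approximates the node ratio,
and the names are illustrative):</p>
<pre>
#include <petscdmplex.h>

/* Sketch (illustrative names): leaves of the point SF are the locally
   stored points that are owned by another rank, i.e. the ghosts. */
static PetscErrorCode CountGhostPoints(DM dmDist, PetscInt *nGhostLocal)
{
  PetscSF        pointSF;
  PetscInt       nroots, nleaves;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = DMGetPointSF(dmDist, &pointSF);CHKERRQ(ierr);
  ierr = PetscSFGetGraph(pointSF, &nroots, &nleaves, NULL, NULL);CHKERRQ(ierr);
  *nGhostLocal = nleaves; /* reduce over ranks (MPI_SUM) for the global count */
  PetscFunctionReturn(0);
}
</pre>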
<p>Note: the z axis is scaled by 25 for better
visualization in ParaView.<br>
</p>
<p><img src="cid:part2.F4BB70E2.5CCC1F98@gmail.com" alt=""
class="" width="1164" height="563"></p>
<p><br>
</p>
<p>The partition from Gmsh-Metis is a bit different but
still quite similar to those from PETSc 3.9 and 3.10.<br>
</p>
<img src="cid:part3.6C0FF5A4.54D3711A@gmail.com" alt=""
class="" width="1218" height="683"><br>
<p>Finally, the partition using the Gmsh-Chaco Multilevel-KL
algorithm is the best one, with a total-ghost-nodes to
total-nodes ratio of 741 / 9087. For most of my simulation
cases with much larger meshes, PETSc 3.9 and 3.10 generate
partitions similar to the one below, which work pretty well,
and the code gets very good speedup.<br>
</p>
<p><img src="cid:part4.60A9DCB5.387A7C39@gmail.com" alt=""
class="" width="1188" height="668"></p>
<p>Thanks,<br>
</p>
<p>Danyang<br>
</p>
<div>On 2019-09-18 11:44 a.m., Danyang Su wrote:<br>
</div>
<blockquote type="cite"> <br>
On 2019-09-18 10:56 a.m., Smith, Barry F. via
petsc-users wrote: <br>
<blockquote type="cite"> <br>
<blockquote type="cite">On Sep 18, 2019, at 12:25 PM,
Mark Lohry via petsc-users <a
href="mailto:petsc-users@mcs.anl.gov"
target="_blank" moz-do-not-send="true"><petsc-users@mcs.anl.gov></a>
wrote: <br>
<br>
Mark, <br>
</blockquote>
Mark, <br>
<br>
Good point. This has been a big headache
forever <br>
<br>
Note that this has been "fixed" in the master
version of PETSc and will be in its next release. If
you use --download-parmetis in the future it will use
the same random numbers on all machines and thus
should produce the same partitions on all machines. <br>
<br>
I think that metis has always used the same
random numbers on all machines and thus always
produced the same results. <br>
<br>
Barry <br>
</blockquote>
Good to know this. I will use the same configuration that
causes the strange partition problem to test the next
version. <br>
<br>
Thanks, <br>
<br>
Danyang <br>
<br>
<blockquote type="cite"> <br>
<br>
<blockquote type="cite">The machine, compiler and MPI
version should not matter. <br>
<br>
I might have missed something earlier in the thread,
but parmetis has a dependency on the machine's glibc
srand, and it can (and does) create different
partitions with different srand versions. The same
mesh on the same code on the same process count can
and will give different partitions (possibly bad
ones) on different machines. <br>
<br>
On Tue, Sep 17, 2019 at 1:05 PM Mark Adams via
petsc-users <a
href="mailto:petsc-users@mcs.anl.gov"
target="_blank" moz-do-not-send="true"><petsc-users@mcs.anl.gov></a>
wrote: <br>
<br>
<br>
On Tue, Sep 17, 2019 at 12:53 PM Danyang Su <a
href="mailto:danyang.su@gmail.com" target="_blank"
moz-do-not-send="true"><danyang.su@gmail.com></a>
wrote: <br>
Hi Mark, <br>
<br>
Thanks for your follow-up. <br>
<br>
The unstructured grid code has been verified and
there is no problem in the results. The convergence
rate is also good. The 3D mesh is not good; it is
based on the original stratum, which I haven't
refined, but it is fine for an initial test as it is
relatively small and the results obtained from it
still make sense. <br>
<br>
The 2D meshes are just for testing purposes, as I want
to reproduce the partition problem on a cluster
using PETSc 3.11.3 and Intel 2019. Unfortunately, I
didn't find the problem using this example. <br>
<br>
The code has no problem in using different PETSc
versions (PETSc V3.4 to V3.11) <br>
<br>
OK, it is the same code. I thought I saw something
about your code changing. <br>
<br>
Just to be clear, v3.11 never gives you good
partitions. It is not just a problem on this Intel
cluster. <br>
<br>
The machine, compiler and MPI version should not
matter. <br>
and MPI distributions (MPICH, OpenMPI, Intel MPI),
except for one simulation case (the mesh I attached)
on a cluster with PETSc 3.11.3 and Intel 2019u4, due to
the very different partition compared to PETSc 3.9.3.
Yet the simulation results are the same except for
the efficiency problem, because the strange partition
results in much more communication (ghost nodes).
<br>
<br>
I am still trying different compilers and MPI
implementations with PETSc 3.11.3 on that cluster to trace
the problem. I will get back to you guys when there is an update. <br>
<br>
<br>
This is very strange. You might want to use 'git
bisect'. You set a good and a bad SHA1 (we can give
you these for 3.9 and 3.11, along with the exact commands).
Git will then check out a version in the middle. You
reconfigure, remake, rebuild your code, and run your
test. Git will ask you, as I recall, whether the version
is good or bad. Once you get this workflow going it
is not too bad, depending on how hard this loop is,
of course. <br>
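For concreteness, the loop looks roughly like this (the SHA1s are
placeholders, not the real 3.9/3.11 commits): <br>
<pre>
cd petsc
git bisect start
git bisect bad  BAD_SHA     # a commit known to give the strange partition
git bisect good GOOD_SHA    # a commit known to give a good partition
# git checks out a commit in between; at each step:
#   ./configure (your usual options) && make
#   rebuild your application and run the partition test, then tell git:
git bisect good             # if this version partitions well
git bisect bad              # if it shows the strange partition
# repeat until git reports the first bad commit, then clean up:
git bisect reset
</pre>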
Thanks, <br>
<br>
danyang <br>
<br>
</blockquote>
</blockquote>
</blockquote>
</div>
</blockquote>
</div>
<br clear="all">
<div><br>
</div>
-- <br>
<div dir="ltr" class="gmail_signature">
<div dir="ltr">
<div>
<div dir="ltr">
<div>
<div dir="ltr">
<div>What most experimenters take for granted before
they begin their experiments is infinitely more
interesting than any results to which their
experiments lead.<br>
-- Norbert Wiener</div>
<div><br>
</div>
<div><a href="http://www.cse.buffalo.edu/~knepley/"
target="_blank" moz-do-not-send="true">https://www.cse.buffalo.edu/~knepley/</a><br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</body>
</html>