[petsc-users] Strange Partition in PETSc 3.11 version on some computers

Mark Lohry mlohry at gmail.com
Wed Sep 18 12:25:47 CDT 2019


Mark,


> The machine, compiler and MPI version should not matter.


I might have missed something earlier in the thread, but ParMETIS depends
on the machine's glibc srand, and it can (and does) create different
partitions with different srand implementations. The same mesh with the
same code on the same process count can and will give different partitions
(possibly bad ones) on different machines.
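
As a rough illustration of the point above: the C standard does not fix the
algorithm behind srand()/rand(), so the same seed can produce a different
sequence under a different C library. The sketch below is only a probe of the
local C library, not ParMETIS code. The helper name libc_rand_sequence and the
seed 42 are made up for this example, and it assumes a POSIX system where
ctypes can load the C library.

```python
import ctypes
import ctypes.util


def libc_rand_sequence(seed, count=5):
    """Seed the C library's rand() and return its first `count` values.

    Hypothetical helper for illustration only; it is not part of PETSc
    or ParMETIS.
    """
    # find_library("c") may return None in minimal environments; on POSIX,
    # CDLL(None) then exposes the symbols already loaded into the process.
    libc = ctypes.CDLL(ctypes.util.find_library("c"))
    libc.srand.argtypes = [ctypes.c_uint]
    libc.srand.restype = None
    libc.rand.argtypes = []
    libc.rand.restype = ctypes.c_int

    libc.srand(seed)
    return [libc.rand() for _ in range(count)]


if __name__ == "__main__":
    # Same seed, same machine: the sequence repeats from run to run.
    # Same seed, different C library: the numbers may differ.
    print(libc_rand_sequence(seed=42))
```

Running this on two machines and comparing the printed lists is a quick way to
check whether their C libraries agree for a given seed.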

On Tue, Sep 17, 2019 at 1:05 PM Mark Adams via petsc-users <
petsc-users at mcs.anl.gov> wrote:

>
>
> On Tue, Sep 17, 2019 at 12:53 PM Danyang Su <danyang.su at gmail.com> wrote:
>
>> Hi Mark,
>>
>> Thanks for your follow-up.
>>
>> The unstructured grid code has been verified and there is no problem in
>> the results. The convergence rate is also good. The 3D mesh is not good; it
>> is based on the original stratum, which I haven't refined, but it is good
>> for an initial test as it is relatively small and the results obtained
>> from this mesh still make sense.
>>
>> The 2D meshes are just for testing purposes, as I want to reproduce the
>> partition problem on a cluster using PETSc3.11.3 and Intel2019.
>> Unfortunately, I didn't find the problem using this example.
>>
>> The code has no problem in using different PETSc versions (PETSc V3.4 to
>> V3.11)
>>
> OK, it is the same code. I thought I saw something about your code
> changing.
>
> Just to be clear, v3.11 never gives you good partitions. It is not just a
> problem on this Intel cluster.
>
> The machine, compiler and MPI version should not matter.
>
>
>> and MPI distributions (MPICH, OpenMPI, IntelMPI), except for one
>> simulation case (the mesh I attached) on a cluster with PETSc3.11.3 and
>> Intel2019u4, where the partition is very different from PETSc3.9.3. Yet
>> the simulation results are the same except for the efficiency problem,
>> because the strange partition results in much more communication (ghost
>> nodes).
>>
>> I am still trying different compilers and MPI distributions with
>> PETSc3.11.3 on that cluster to trace the problem. I will get back to you
>> when there is an update.
>>
>
> This is very strange. You might want to use 'git bisect'. You set a good
> and a bad SHA1 (we can give you these for 3.9 and 3.11, and the exact
> commands). Git will then check out a version in the middle. You then
> reconfigure, remake, rebuild your code, and run your test. Git will ask
> you, as I recall, whether the version is good or bad. Once you get this
> workflow going it is not too bad, depending on how hard this loop is, of
> course.
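
The bisect loop described above is a binary search over the commit history.
The sketch below shows only that search logic, not git itself. The commit
names, the culprit "c137", and the is_bad callback are hypothetical stand-ins
for the real "reconfigure, rebuild, run the test" step. With git, the
equivalent verdicts are reported with `git bisect good` and `git bisect bad`
after `git bisect start`.

```python
def first_bad_commit(commits, is_bad):
    """Return (first bad commit, number of tests run).

    Assumes commits[0] is known good, commits[-1] is known bad, and every
    commit after the first bad one is also bad.
    """
    good, bad = 0, len(commits) - 1
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests += 1
        # Stand-in for: check out commits[mid], rebuild, run the test case.
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad], tests


if __name__ == "__main__":
    # Hypothetical linear history of 200 commits between two releases.
    commits = [f"c{i:03d}" for i in range(200)]
    culprit = "c137"  # made-up commit that introduced the regression
    found, tests = first_bad_commit(commits, lambda c: c >= culprit)
    print(found, "found after", tests, "rebuild-and-test cycles")
```

The number of rebuild-and-test cycles grows with the logarithm of the number
of commits, which is why the loop stays manageable even across several
releases.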
>
>
>> Thanks,
>>
>> danyang
>>
>