[petsc-users] Strange Partition in PETSc 3.11 version on some computers
Matthew Knepley
knepley at gmail.com
Sun Sep 15 18:07:15 CDT 2019
On Sun, Sep 15, 2019 at 6:59 PM Danyang Su <danyang.su at gmail.com> wrote:
> Hi Matt,
>
> Thanks for the quick reply. I have made no change to the adjacency. The
> source code and the simulation input files are all the same. I also tried
> the GNU compiler and MPICH with petsc 3.11.3, and it works fine.
>
> It looks like the problem is caused by the difference in configuration.
> However, the configuration is pretty much the same as for petsc 3.9.3,
> except for the compiler and MPI used. I will contact the SciNet staff to
> check if they have any idea on this.
>
Very, very strange, since the partition is handled completely by Metis and
does not use MPI.
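
If it helps to narrow this down: assuming your code calls DMSetFromOptions()
on the Plex, you can check which partitioner each build actually selects, and
force the same one on both machines, from the command line (the executable
name below is just a placeholder, and I am going from memory on the option
names):

  mpiexec -n 160 ./your_app -petscpartitioner_view
  mpiexec -n 160 ./your_app -petscpartitioner_type parmetis

If the two builds report different partitioner types, that alone would
explain the different decompositions.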
Thanks,
Matt
> Thanks,
>
> Danyang
>
> On September 15, 2019 3:20:18 p.m. PDT, Matthew Knepley <knepley at gmail.com>
> wrote:
>>
>> On Sun, Sep 15, 2019 at 5:19 PM Danyang Su via petsc-users <
>> petsc-users at mcs.anl.gov> wrote:
>>
>>> Dear All,
>>>
>>> I have a question regarding a strange partition problem in PETSc 3.11.
>>> The problem does not exist on my local workstation. However, on a
>>> cluster with different PETSc versions, the partition is quite different,
>>> as you can see in the figure below, which was tested with 160 processors.
>>> The color indicates which processor owns that subdomain. In this layered
>>> prism mesh, there are 40 layers from bottom to top and each layer has
>>> around 20k nodes. The natural order of the nodes is also layered from
>>> bottom to top.
>>>
>>> The left partition (PETSc 3.10 and earlier) looks good, with a minimum
>>> number of ghost nodes, while the right one (PETSc 3.11) looks weird,
>>> with a huge number of ghost nodes. It looks like the right one
>>> partitions layer by layer. This problem exists on a cluster but not on
>>> my local workstation for the same PETSc version (with a different
>>> compiler and MPI). Other than the difference in partition and
>>> efficiency, the simulation results are the same.
>>>
>>> [image: partition difference]
>>>
>>> Below is the PETSc configuration on the three machines:
>>>
>>> Local workstation (works fine): ./configure --with-cc=gcc
>>> --with-cxx=g++ --with-fc=gfortran --download-mpich --download-scalapack
>>> --download-parmetis --download-metis --download-ptscotch
>>> --download-fblaslapack --download-hypre --download-superlu_dist
>>> --download-hdf5=yes --download-ctetgen --with-debugging=0 COPTFLAGS=-O3
>>> CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 --with-cxx-dialect=C++11
>>>
>>> Cluster with PETSc 3.9.3 (works fine):
>>> --prefix=/scinet/niagara/software/2018a/opt/intel-2018.2-intelmpi-2018.2/petsc/3.9.3
>>> CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native
>>> -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2"
>>> --download-chaco=1 --download-hypre=1 --download-metis=1 --download-ml=1
>>> --download-mumps=1 --download-parmetis=1 --download-plapack=1
>>> --download-prometheus=1 --download-ptscotch=1 --download-scotch=1
>>> --download-sprng=1 --download-superlu=1 --download-superlu_dist=1
>>> --download-triangle=1 --with-avx512-kernels=1
>>> --with-blaslapack-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl
>>> --with-debugging=0 --with-hdf5=1
>>> --with-mkl_pardiso-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl
>>> --with-scalapack=1
>>> --with-scalapack-lib="[/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]"
>>> --with-x=0
>>>
>>> Cluster with PETSc 3.11.3 (looks weird):
>>> --prefix=/scinet/niagara/software/2019b/opt/intel-2019u4-intelmpi-2019u4/petsc/3.11.3
>>> CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native
>>> -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2"
>>> --download-chaco=1 --download-hdf5=1 --download-hypre=1 --download-metis=1
>>> --download-ml=1 --download-mumps=1 --download-parmetis=1
>>> --download-plapack=1 --download-prometheus=1 --download-ptscotch=1
>>> --download-scotch=1 --download-sprng=1 --download-superlu=1
>>> --download-superlu_dist=1 --download-triangle=1 --with-avx512-kernels=1
>>> --with-blaslapack-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl
>>> --with-cxx-dialect=C++11 --with-debugging=0
>>> --with-mkl_pardiso-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl
>>> --with-scalapack=1
>>> --with-scalapack-lib="[/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]"
>>> --with-x=0
>>>
>>> The partition comes from the default DMPlex distribution:
>>>
>>> !c distribute mesh over processes (stencil_width is the overlap;
>>> !c PETSC_NULL_SF discards the returned point migration SF)
>>> call DMPlexDistribute(dmda_flow%da, stencil_width,              &
>>>                       PETSC_NULL_SF, distributedMesh, ierr)
>>> CHKERRQ(ierr)
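>>>
>>> A sketch of how I could pin the partitioner explicitly before the
>>> DMPlexDistribute call above, to rule out any difference in the defaults
>>> (assuming DMPlexGetPartitioner and PetscPartitionerSetType have Fortran
>>> bindings in this build; 'parmetis' is just an example type name):
>>>
>>> PetscPartitioner :: part
>>>
>>> !c fetch the partitioner attached to the DMPlex and force its type,
>>> !c so both machines use the same algorithm
>>> call DMPlexGetPartitioner(dmda_flow%da, part, ierr)
>>> CHKERRQ(ierr)
>>> call PetscPartitionerSetType(part, 'parmetis', ierr)
>>> CHKERRQ(ierr)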
>>>
>>> Any idea on this strange problem?
>>>
>>
>> I just looked at the code. Your mesh should be partitioned by k-way
>> partitioning using Metis, since it is on 1 proc at partitioning time.
>> This code is the same for 3.9 and 3.11, and you get the same result on
>> your machine. I cannot understand what might be happening on your
>> cluster (MPI plays no role). Is it possible that you changed the
>> adjacency specification in that version?
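>>
>> For reference, a minimal sketch of what I mean by the adjacency
>> specification, in Fortran to match your code (these are the PETSc 3.11
>> names; I am assuming the auto-generated Fortran bindings are present in
>> your build):
>>
>> !c Adjacency used to build the dual graph that Metis partitions.
>> !c useCone = FALSE, useClosure = TRUE is the usual FEM setting;
>> !c flipping these changes the graph, and hence the decomposition.
>> call DMPlexSetAdjacencyUseCone(dmda_flow%da, PETSC_FALSE, ierr)
>> CHKERRQ(ierr)
>> call DMPlexSetAdjacencyUseClosure(dmda_flow%da, PETSC_TRUE, ierr)
>> CHKERRQ(ierr)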
>>
>> Thanks,
>>
>> Matt
>>
>>> Thanks,
>>>
>>> Danyang
>>>
>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>>
>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
https://www.cse.buffalo.edu/~knepley/