[petsc-users] Tutorials test case cannot run in parallel

Mark Adams mfadams at lbl.gov
Sat Oct 30 07:51:27 CDT 2021


Ah, I can reproduce this error with debugging turned on.
This test is not set up as a parallel test, but nothing in it says that it
must be run in serial.
So there is a problem here.
Anyone?
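
A side note on "debugging turned on": the configure options quoted in the
original report further down list --with-debugging twice, once as 0 and once
as yes. The thread does not say which value takes effect. The following is
only a minimal sketch for spotting options that are given more than once;
`dup_options` is a hypothetical helper name and is not part of PETSc or its
configure script.

```shell
#!/bin/sh
# dup_options OPTION...
# Hypothetical helper (not part of PETSc): prints the name of every
# configure option that appears more than once in the argument list.
dup_options() {
    # One option per line, drop the "=value" part, then keep only the
    # names that occur at least twice.
    printf '%s\n' "$@" | sed 's/=.*$//' | sort | uniq -d
}

# Example with a shortened version of the option list from the report.
dup_options --with-cc=mpicc --with-debugging=0 --download-metis=1 \
            --force --with-debugging=yes
```

Options without a value, such as --force, are compared by name only.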

On Sat, Oct 30, 2021 at 8:29 AM Mark Adams <mfadams at lbl.gov> wrote:

> 08:27 adams/pcksp-batch-kokkos *=
> summit:/gpfs/alpine/csc314/scratch/adams/petsc/src/dm/impls/plex/tutorials$
> make PETSC_ARCH=arch-summit-opt-gnu-kokkos-cuda ex3f90
> mpifort -fPIC -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -g -O
>   -fPIC -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -g -O
>  -I/gpfs/alpine/csc314/scratch/adams/petsc/include
> -I/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/include
> -I/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/hdf5-1.10.7-yxvwkhm4nhgezbl2mwzdruwoaiblt6q2/include
> -I/sw/summit/cuda/11.0.3/include     ex3f90.F90
>  -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/lib
> -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/lib
> -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/lib
> -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/lib
> -L/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/netlib-lapack-3.9.1-t2a6tcso5tkezcjmfrqvqi2cpary7kgx/lib64
> -Wl,-rpath,/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/hdf5-1.10.7-yxvwkhm4nhgezbl2mwzdruwoaiblt6q2/lib
> -L/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/hdf5-1.10.7-yxvwkhm4nhgezbl2mwzdruwoaiblt6q2/lib
> -Wl,-rpath,/sw/summit/cuda/11.0.3/lib64 -L/sw/summit/cuda/11.0.3/lib64
> -L/sw/summit/cuda/11.0.3/lib64/stubs
> -Wl,-rpath,/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/spectrum-mpi-10.4.0.3-20210112-6jbupg3thjwhsabgevk6xmwhd2bbyxdc/lib
> -L/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/spectrum-mpi-10.4.0.3-20210112-6jbupg3thjwhsabgevk6xmwhd2bbyxdc/lib
> -Wl,-rpath,/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib/gcc/powerpc64le-unknown-linux-gnu/9.1.0
> -L/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib/gcc/powerpc64le-unknown-linux-gnu/9.1.0
> -Wl,-rpath,/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib/gcc
> -L/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib/gcc
> -Wl,-rpath,/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/netlib-lapack-3.9.1-t2a6tcso5tkezcjmfrqvqi2cpary7kgx/lib64
> -Wl,-rpath,/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib64
> -L/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib64
> -Wl,-rpath,/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-8.3.1/darshan-runtime-3.3.0-mu6tnxlhxfplrq3srkkgi5dvly6wenwy/lib
> -L/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-8.3.1/darshan-runtime-3.3.0-mu6tnxlhxfplrq3srkkgi5dvly6wenwy/lib
> -Wl,-rpath,/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib
> -L/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib -lpetsc
> -lkokkoskernels -lkokkoscontainers -lkokkoscore -lp4est -lsc -lblas
> -llapack -lhdf5_hl -lhdf5 -lm -lz -lcudart -lcufft -lcublas -lcusparse
> -lcusolver -lcurand -lcuda -lstdc++ -ldl -lmpiprofilesupport
> -lmpi_ibm_usempif08 -lmpi_ibm_usempi_ignore_tkr -lmpi_ibm_mpifh -lmpi_ibm
> -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lquadmath
> -lstdc++ -ldl -o ex3f90
> 08:27 adams/pcksp-batch-kokkos *=
> summit:/gpfs/alpine/csc314/scratch/adams/petsc/src/dm/impls/plex/tutorials$
> jsrun -n 2 -g 1 ./ex3f90
> DM Object: testplex 2 MPI processes
>   type: plex
> testplex in 3 dimensions:
>   0-cells: 12 12
>   1-cells: 20 20
>   2-cells: 11 11
>   3-cells: 2 2
> Labels:
>   celltype: 4 strata with value/size (0 (12), 7 (2), 4 (11), 1 (20))
>   depth: 4 strata with value/size (0 (12), 1 (20), 2 (11), 3 (2))
> cell:  0 volume:   0.5000 centroid:  -0.2500   0.5000   0.5000
> cell:  1 volume:   0.5000 centroid:   0.2500   0.5000   0.5000
> cell:  0 volume:   0.5000 centroid:  -0.2500   0.5000   0.5000
> cell:  1 volume:   0.5000 centroid:   0.2500   0.5000   0.5000
> 08:28 adams/pcksp-batch-kokkos *=
> summit:/gpfs/alpine/csc314/scratch/adams/pets
>
> On Fri, Oct 29, 2021 at 10:41 PM 袁煕 <yuanxi at advancesoft.jp> wrote:
>
>> Thanks,  Mark.
>>
>> I did what you suggested, but nothing changed. Also, from your compile
>> log and output:
>>
>> -  you compiled with gfortran and no MPI library, not mpif90
>> -  both processes give exactly the same result
>> -  the first line of the DMView output should be "DM Object: testplex 2
>> MPI processes", not "DM Object: testplex 1 MPI processes", when you use
>> 2 CPUs
>>
>> It seems that MPI was not actually used, and the two CPUs simply ran the
>> same serial program independently.
>>
>> Best regards,
>>
>> Yuan
>>
>>
>> On Fri, Oct 29, 2021 at 20:22 Mark Adams <mfadams at lbl.gov> wrote:
>>
>>> This works for me (appended) using an up to date version of PETSc.
>>>
>>> I would delete the architecture directory, reconfigure, run make all,
>>> and try again.
>>>
>>> Next, you seem to be using git. Use the 'main' branch and try again.
>>>
>>> Mark
>>>
>>> (base) 07:09 adams/swarm-omp-pc *= ~/Codes/petsc$ cd
>>> src/dm/impls/plex/tutorials/
>>> (base) 07:16 adams/swarm-omp-pc *=
>>> ~/Codes/petsc/src/dm/impls/plex/tutorials$    make
>>> PETSC_DIR=/Users/markadams/Codes/petsc PETSC_ARCH=arch-macosx-gnu-g ex3f90
>>> gfortran-11 -Wl,-bind_at_load -Wl,-multiply_defined,suppress
>>> -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs
>>> -Wl,-search_paths_first -Wl,-no_compact_unwind  -fPIC -Wall
>>> -ffree-line-length-0 -Wno-unused-dummy-argument -g -O   -fPIC -Wall
>>> -ffree-line-length-0 -Wno-unused-dummy-argument -g -O
>>>  -I/Users/markadams/Codes/petsc/include
>>> -I/Users/markadams/Codes/petsc/arch-macosx-gnu-g/include     ex3f90.F90
>>>  -Wl,-rpath,/Users/markadams/Codes/petsc/arch-macosx-gnu-g/lib
>>> -L/Users/markadams/Codes/petsc/arch-macosx-gnu-g/lib
>>> -Wl,-rpath,/Users/markadams/Codes/petsc/arch-macosx-gnu-g/lib
>>> -L/Users/markadams/Codes/petsc/arch-macosx-gnu-g/lib
>>> -Wl,-rpath,/usr/local/Cellar/gcc/11.2.0/lib/gcc/11/gcc/x86_64-apple-darwin20/11.2.0
>>> -L/usr/local/Cellar/gcc/11.2.0/lib/gcc/11/gcc/x86_64-apple-darwin20/11.2.0
>>> -Wl,-rpath,/usr/local/Cellar/gcc/11.2.0/lib/gcc/11
>>> -L/usr/local/Cellar/gcc/11.2.0/lib/gcc/11 -lpetsc -lp4est -lsc -llapack
>>> -lblas -lhdf5_hl -lhdf5 -lmetis -lz -lstdc++ -ldl -lgcc_s.1 -lgfortran
>>> -lquadmath -lm -lquadmath -lstdc++ -ldl -lgcc_s.1 -o ex3f90
>>> (base) 07:16 adams/swarm-omp-pc *=
>>> ~/Codes/petsc/src/dm/impls/plex/tutorials$ mpirun -np 2 ./ex3f90
>>> DM Object: testplex 1 MPI processes
>>>   type: plex
>>> testplex in 3 dimensions:
>>>   0-cells: 12
>>>   1-cells: 20
>>>   2-cells: 11
>>>   3-cells: 2
>>> Labels:
>>>   celltype: 4 strata with value/size (0 (12), 7 (2), 4 (11), 1 (20))
>>>   depth: 4 strata with value/size (0 (12), 1 (20), 2 (11), 3 (2))
>>> DM Object: testplex 1 MPI processes
>>>   type: plex
>>> testplex in 3 dimensions:
>>>   0-cells: 12
>>>   1-cells: 20
>>>   2-cells: 11
>>>   3-cells: 2
>>> Labels:
>>>   celltype: 4 strata with value/size (0 (12), 7 (2), 4 (11), 1 (20))
>>>   depth: 4 strata with value/size (0 (12), 1 (20), 2 (11), 3 (2))
>>> cell:  0 volume:   0.5000 centroid:  -0.2500   0.5000   0.5000
>>> cell:  1 volume:   0.5000 centroid:   0.2500   0.5000   0.5000
>>> cell:  0 volume:   0.5000 centroid:  -0.2500   0.5000   0.5000
>>> cell:  1 volume:   0.5000 centroid:   0.2500   0.5000   0.5000
>>>
>>> On Fri, Oct 29, 2021 at 6:11 AM 袁煕 <yuanxi at advancesoft.jp> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have tried to run the test case ex3f90 in the folder
>>>> src/dm/impls/plex/tutorials in parallel, but it fails. When I run it
>>>> on 1 CPU with
>>>>
>>>> -  mpirun -np 1 ./ex3f90
>>>>
>>>> everything seems OK. But when I run it on 2 CPUs with
>>>>
>>>> -  mpirun -np 2 ./ex3f90
>>>>
>>>> I get the following error message:
>>>>
>>>> [0]PETSC ERROR: --------------------- Error Message
>>>> --------------------------------------------------------------
>>>> [0]PETSC ERROR: Object is in wrong state
>>>> [0]PETSC ERROR: This DMPlex is distributed but its PointSF has no graph
>>>> set
>>>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble
>>>> shooting.
>>>> [0]PETSC ERROR: Petsc Development GIT revision: v3.16.0-248-ge617e6467c
>>>>  GIT Date: 2021-10-19 23:11:25 -0500
>>>> [0]PETSC ERROR: ./ex3f90 on a  named pc-010-088 by  Fri Oct 29 18:48:54
>>>> 2021
>>>> [0]PETSC ERROR: Configure options --with-cc=mpicc --with-cxx=mpicxx
>>>> --with-fc=mpiifort --with-fortran-bindings=1 --with-debugging=0
>>>> --with-blaslapack-dir=/opt/intel/oneapi/mkl/2021.4.0
>>>> --with-mkl_pardiso-dir=/opt/intel/oneapi/mkl/2021.4.0 --download-metis=1
>>>> --download-parmetis=1 --download-cmake --force --download-superlu_dist=1
>>>> --download-mumps=1 --download-scalapack=1 --download-hypre=1
>>>> --download-ml=1 --with-debugging=yes --prefix=/home/yuanxi
>>>> [0]PETSC ERROR: #1 DMPlexCheckPointSF() at
>>>> /home/yuanxi/myprograms/petsc/src/dm/impls/plex/plex.c:8626
>>>> [0]PETSC ERROR: #2 DMPlexOrientInterface_Internal() at
>>>> /home/yuanxi/myprograms/petsc/src/dm/impls/plex/plexinterpolate.c:595
>>>> [0]PETSC ERROR: #3 DMPlexInterpolate() at
>>>> /home/yuanxi/myprograms/petsc/src/dm/impls/plex/plexinterpolate.c:1357
>>>> [0]PETSC ERROR: #4 User provided function() at User file:0
>>>> Abort(73) on node 0 (rank 0 in comm 16): application called
>>>> MPI_Abort(MPI_COMM_SELF, 73) - process 0
>>>>
>>>> ------------------------------------------------------------------------------------------------------------------------------------
>>>>
>>>> It fails in the call to DMPlexInterpolate. Maybe this program is not
>>>> intended to be run in parallel. But if I wish to do so, how should I
>>>> modify it so that it runs on multiple CPUs?
>>>>
>>>> Many thanks for your help.
>>>>
>>>> Yuan
>>>>
>>>
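
The check raised in this thread (the DMView header should report as many MPI
processes as were launched) can be done mechanically on saved output. This is
a minimal sketch under the assumption that the program output was redirected
to a file; `check_dm_ranks` is a hypothetical helper name and is not part of
PETSc.

```shell
#!/bin/sh
# check_dm_ranks EXPECTED FILE
# Hypothetical helper (not part of PETSc): succeeds only if every
# "DM Object: ... N MPI processes" header in FILE reports N = EXPECTED.
check_dm_ranks() {
    expected=$1
    file=$2
    # Extract the process count from each DMView header line.
    counts=$(sed -n 's/^DM Object:.* \([0-9][0-9]*\) MPI process.*$/\1/p' "$file")
    if [ -z "$counts" ]; then
        echo "no DM Object header found"
        return 2
    fi
    for n in $counts; do
        if [ "$n" -ne "$expected" ]; then
            echo "mismatch: header reports $n MPI processes, expected $expected"
            return 1
        fi
    done
    echo "ok: header reports $expected MPI processes"
    return 0
}
```

For example, after `mpirun -np 2 ./ex3f90 > out.txt`, running
`check_dm_ranks 2 out.txt` would report a mismatch for the macOS output
quoted above (two headers, each saying 1 MPI processes) and would pass for
the Summit output (one header saying 2 MPI processes).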