[petsc-users] Tutorials test case cannot run in parallel
Mark Adams
mfadams at lbl.gov
Sat Oct 30 07:29:00 CDT 2021
08:27 adams/pcksp-batch-kokkos *=
summit:/gpfs/alpine/csc314/scratch/adams/petsc/src/dm/impls/plex/tutorials$
make PETSC_ARCH=arch-summit-opt-gnu-kokkos-cuda ex3f90
mpifort -fPIC -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -g -O
-fPIC -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -g -O
-I/gpfs/alpine/csc314/scratch/adams/petsc/include
-I/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/include
-I/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/hdf5-1.10.7-yxvwkhm4nhgezbl2mwzdruwoaiblt6q2/include
-I/sw/summit/cuda/11.0.3/include ex3f90.F90
-Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/lib
-L/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/lib
-Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/lib
-L/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/lib
-L/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/netlib-lapack-3.9.1-t2a6tcso5tkezcjmfrqvqi2cpary7kgx/lib64
-Wl,-rpath,/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/hdf5-1.10.7-yxvwkhm4nhgezbl2mwzdruwoaiblt6q2/lib
-L/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/hdf5-1.10.7-yxvwkhm4nhgezbl2mwzdruwoaiblt6q2/lib
-Wl,-rpath,/sw/summit/cuda/11.0.3/lib64 -L/sw/summit/cuda/11.0.3/lib64
-L/sw/summit/cuda/11.0.3/lib64/stubs
-Wl,-rpath,/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/spectrum-mpi-10.4.0.3-20210112-6jbupg3thjwhsabgevk6xmwhd2bbyxdc/lib
-L/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/spectrum-mpi-10.4.0.3-20210112-6jbupg3thjwhsabgevk6xmwhd2bbyxdc/lib
-Wl,-rpath,/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib/gcc/powerpc64le-unknown-linux-gnu/9.1.0
-L/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib/gcc/powerpc64le-unknown-linux-gnu/9.1.0
-Wl,-rpath,/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib/gcc
-L/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib/gcc
-Wl,-rpath,/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/netlib-lapack-3.9.1-t2a6tcso5tkezcjmfrqvqi2cpary7kgx/lib64
-Wl,-rpath,/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib64
-L/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib64
-Wl,-rpath,/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-8.3.1/darshan-runtime-3.3.0-mu6tnxlhxfplrq3srkkgi5dvly6wenwy/lib
-L/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-8.3.1/darshan-runtime-3.3.0-mu6tnxlhxfplrq3srkkgi5dvly6wenwy/lib
-Wl,-rpath,/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib
-L/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib -lpetsc
-lkokkoskernels -lkokkoscontainers -lkokkoscore -lp4est -lsc -lblas
-llapack -lhdf5_hl -lhdf5 -lm -lz -lcudart -lcufft -lcublas -lcusparse
-lcusolver -lcurand -lcuda -lstdc++ -ldl -lmpiprofilesupport
-lmpi_ibm_usempif08 -lmpi_ibm_usempi_ignore_tkr -lmpi_ibm_mpifh -lmpi_ibm
-lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lquadmath
-lstdc++ -ldl -o ex3f90
08:27 adams/pcksp-batch-kokkos *=
summit:/gpfs/alpine/csc314/scratch/adams/petsc/src/dm/impls/plex/tutorials$
jsrun -n 2 -g 1 ./ex3f90
DM Object: testplex 2 MPI processes
type: plex
testplex in 3 dimensions:
0-cells: 12 12
1-cells: 20 20
2-cells: 11 11
3-cells: 2 2
Labels:
celltype: 4 strata with value/size (0 (12), 7 (2), 4 (11), 1 (20))
depth: 4 strata with value/size (0 (12), 1 (20), 2 (11), 3 (2))
cell: 0 volume: 0.5000 centroid: -0.2500 0.5000 0.5000
cell: 1 volume: 0.5000 centroid: 0.2500 0.5000 0.5000
cell: 0 volume: 0.5000 centroid: -0.2500 0.5000 0.5000
cell: 1 volume: 0.5000 centroid: 0.2500 0.5000 0.5000
08:28 adams/pcksp-batch-kokkos *=
summit:/gpfs/alpine/csc314/scratch/adams/pets
On Fri, Oct 29, 2021 at 10:41 PM 袁煕 <yuanxi at advancesoft.jp> wrote:
> Thanks, Mark.
>
> I did what you suggested but nothing changed. Besides, judging from your
> compile history and output,
>
> - you used gfortran with no MPI library, not mpif90
> - the two CPUs give exactly the same result
> - the first line of the DMView output should be "DM Object: testplex 2
> MPI processes", not "DM Object: testplex 1 MPI processes", when you use
> 2 CPUs
>
> It seems like you did not actually use MPI; the two CPUs just did exactly
> the same thing.
>
> Best regards,
>
> Yuan
>
>
> On Fri, Oct 29, 2021 at 8:22 PM Mark Adams <mfadams at lbl.gov> wrote:
>
>> This works for me (appended) using an up to date version of PETSc.
>>
>> I would delete the architecture directory, reconfigure, run make all,
>> and try again.
>>
>> Next, you seem to be using git. Use the 'main' branch and try again.
>>
>> Mark
>>
>> (base) 07:09 adams/swarm-omp-pc *= ~/Codes/petsc$ cd
>> src/dm/impls/plex/tutorials/
>> (base) 07:16 adams/swarm-omp-pc *=
>> ~/Codes/petsc/src/dm/impls/plex/tutorials$ make
>> PETSC_DIR=/Users/markadams/Codes/petsc PETSC_ARCH=arch-macosx-gnu-g ex3f90
>> gfortran-11 -Wl,-bind_at_load -Wl,-multiply_defined,suppress
>> -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs
>> -Wl,-search_paths_first -Wl,-no_compact_unwind -fPIC -Wall
>> -ffree-line-length-0 -Wno-unused-dummy-argument -g -O -fPIC -Wall
>> -ffree-line-length-0 -Wno-unused-dummy-argument -g -O
>> -I/Users/markadams/Codes/petsc/include
>> -I/Users/markadams/Codes/petsc/arch-macosx-gnu-g/include ex3f90.F90
>> -Wl,-rpath,/Users/markadams/Codes/petsc/arch-macosx-gnu-g/lib
>> -L/Users/markadams/Codes/petsc/arch-macosx-gnu-g/lib
>> -Wl,-rpath,/Users/markadams/Codes/petsc/arch-macosx-gnu-g/lib
>> -L/Users/markadams/Codes/petsc/arch-macosx-gnu-g/lib
>> -Wl,-rpath,/usr/local/Cellar/gcc/11.2.0/lib/gcc/11/gcc/x86_64-apple-darwin20/11.2.0
>> -L/usr/local/Cellar/gcc/11.2.0/lib/gcc/11/gcc/x86_64-apple-darwin20/11.2.0
>> -Wl,-rpath,/usr/local/Cellar/gcc/11.2.0/lib/gcc/11
>> -L/usr/local/Cellar/gcc/11.2.0/lib/gcc/11 -lpetsc -lp4est -lsc -llapack
>> -lblas -lhdf5_hl -lhdf5 -lmetis -lz -lstdc++ -ldl -lgcc_s.1 -lgfortran
>> -lquadmath -lm -lquadmath -lstdc++ -ldl -lgcc_s.1 -o ex3f90
>> (base) 07:16 adams/swarm-omp-pc *=
>> ~/Codes/petsc/src/dm/impls/plex/tutorials$ mpirun -np 2 ./ex3f90
>> DM Object: testplex 1 MPI processes
>> type: plex
>> testplex in 3 dimensions:
>> 0-cells: 12
>> 1-cells: 20
>> 2-cells: 11
>> 3-cells: 2
>> Labels:
>> celltype: 4 strata with value/size (0 (12), 7 (2), 4 (11), 1 (20))
>> depth: 4 strata with value/size (0 (12), 1 (20), 2 (11), 3 (2))
>> DM Object: testplex 1 MPI processes
>> type: plex
>> testplex in 3 dimensions:
>> 0-cells: 12
>> 1-cells: 20
>> 2-cells: 11
>> 3-cells: 2
>> Labels:
>> celltype: 4 strata with value/size (0 (12), 7 (2), 4 (11), 1 (20))
>> depth: 4 strata with value/size (0 (12), 1 (20), 2 (11), 3 (2))
>> cell: 0 volume: 0.5000 centroid: -0.2500 0.5000 0.5000
>> cell: 1 volume: 0.5000 centroid: 0.2500 0.5000 0.5000
>> cell: 0 volume: 0.5000 centroid: -0.2500 0.5000 0.5000
>> cell: 1 volume: 0.5000 centroid: 0.2500 0.5000 0.5000
>>
>> On Fri, Oct 29, 2021 at 6:11 AM 袁煕 <yuanxi at advancesoft.jp> wrote:
>>
>>> Hi,
>>>
>>> I have tried to run the test case ex3f90 in the folder
>>> src/dm/impls/plex/tutorials in parallel, but found that it fails. When I
>>> run it on 1 CPU with
>>>
>>> - mpirun -np 1 ./ex3f90
>>>
>>> everything seems OK. But when I run it on 2 CPUs with
>>>
>>> - mpirun -np 2 ./ex3f90
>>>
>>> I get the following error message:
>>>
>>> [0]PETSC ERROR: --------------------- Error Message
>>> --------------------------------------------------------------
>>> [0]PETSC ERROR: Object is in wrong state
>>> [0]PETSC ERROR: This DMPlex is distributed but its PointSF has no graph
>>> set
>>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
>>> [0]PETSC ERROR: Petsc Development GIT revision: v3.16.0-248-ge617e6467c
>>> GIT Date: 2021-10-19 23:11:25 -0500
>>> [0]PETSC ERROR: ./ex3f90 on a named pc-010-088 by Fri Oct 29 18:48:54
>>> 2021
>>> [0]PETSC ERROR: Configure options --with-cc=mpicc --with-cxx=mpicxx
>>> --with-fc=mpiifort --with-fortran-bindings=1 --with-debugging=0
>>> --with-blaslapack-dir=/opt/intel/oneapi/mkl/2021.4.0
>>> --with-mkl_pardiso-dir=/opt/intel/oneapi/mkl/2021.4.0 --download-metis=1
>>> --download-parmetis=1 --download-cmake --force --download-superlu_dist=1
>>> --download-mumps=1 --download-scalapack=1 --download-hypre=1
>>> --download-ml=1 --with-debugging=yes --prefix=/home/yuanxi
>>> [0]PETSC ERROR: #1 DMPlexCheckPointSF() at
>>> /home/yuanxi/myprograms/petsc/src/dm/impls/plex/plex.c:8626
>>> [0]PETSC ERROR: #2 DMPlexOrientInterface_Internal() at
>>> /home/yuanxi/myprograms/petsc/src/dm/impls/plex/plexinterpolate.c:595
>>> [0]PETSC ERROR: #3 DMPlexInterpolate() at
>>> /home/yuanxi/myprograms/petsc/src/dm/impls/plex/plexinterpolate.c:1357
>>> [0]PETSC ERROR: #4 User provided function() at User file:0
>>> Abort(73) on node 0 (rank 0 in comm 16): application called
>>> MPI_Abort(MPI_COMM_SELF, 73) - process 0
>>>
>>> ------------------------------------------------------------------------------------------------------------------------------------
>>>
>>> It fails in the call to DMPlexInterpolate. Maybe this program is not
>>> meant to be run in parallel. But if I wish to do so, how should I
>>> modify it so that it runs on multiple CPUs?
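For what it is worth, the usual pattern for taking a Plex that every rank has
built redundantly and running it in parallel is to distribute the mesh after
interpolation. The fragment below is only a minimal sketch based on my reading
of the PETSc Fortran bindings, not code taken from ex3f90 itself:
DMPlexDistribute, PETSC_NULL_SF, and PETSC_NULL_DM are assumed to be available
in the installed version, and error checking is omitted for brevity.

      ! Hedged sketch, not the ex3f90 source: distribute the interpolated mesh.
      DM             :: dm, dmDist
      PetscInt       :: overlap
      PetscErrorCode :: ierr

      ! ... dm has been created and DMPlexInterpolate() has already run ...

      overlap = 0
      ! Partition the mesh over the ranks of the DM's communicator; the
      ! migration star forest is not needed here, so pass PETSC_NULL_SF.
      call DMPlexDistribute(dm, overlap, PETSC_NULL_SF, dmDist, ierr)
      ! On more than one rank a new, distributed DM is returned; swap it in.
      ! On a single rank dmDist comes back as PETSC_NULL_DM.
      if (dmDist .ne. PETSC_NULL_DM) then
        call DMDestroy(dm, ierr)
        dm = dmDist
      end if

This is only the generic distribution idiom; whether it is appropriate for
this particular tutorial depends on what the example is meant to demonstrate.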
>>>
>>> Many thanks for your help.
>>>
>>> Yuan
>>>
>>