[petsc-users] Tutorials test case cannot run in parallel

袁煕 yuanxi at advancesoft.jp
Sun Oct 31 02:05:00 CDT 2021


Please see the attached file. I hope it will be of some help!

On Sat, Oct 30, 2021 at 11:40 PM Mark Adams <mfadams at lbl.gov> wrote:

> Great. Thank you.
> Could you please send a 'git diff' if that is available? And we can take
> care of it.
>
>
> On Sat, Oct 30, 2021 at 9:38 AM 袁煕 <yuanxi at advancesoft.jp> wrote:
>
>> Thank you for your reply.
>>
>> I have solved the problem by modifying
>> ----------------------------------------------------
>> call DMPlexCreateFromDAG(dm, depth, numPoints, coneSize, &
>>     cones, coneOrientations, vertexCoords, ierr);CHKERRA(ierr)
>> ----------------------------------------------------
>> into
>> -----------------------------------------------------
>> numPoints1 = [0, 0, 0, 0]
>> if (rank == 0) then
>>     ! rank 0 builds the whole mesh
>>     call DMPlexCreateFromDAG(dm, depth, numPoints, coneSize, &
>>         cones, coneOrientations, vertexCoords, ierr);CHKERRA(ierr)
>> else
>>     ! all other ranks create an empty local mesh
>>     call DMPlexCreateFromDAG(dm, 3, numPoints1, PETSC_NULL_INTEGER, &
>>         PETSC_NULL_INTEGER, PETSC_NULL_INTEGER, PETSC_NULL_REAL, ierr);CHKERRA(ierr)
>> endif
>> ----------------------------------------------------
>>
>> The result obtained is as follows:
>>
>> DM Object: testplex 2 MPI processes
>>   type: plex
>> testplex in 3 dimensions:
>>   0-cells: 12 0
>>   1-cells: 20 0
>>   2-cells: 11 0
>>   3-cells: 2 0
>> Labels:
>>   celltype: 4 strata with value/size (0 (12), 7 (2), 4 (11), 1 (20))
>>   depth: 4 strata with value/size (0 (12), 1 (20), 2 (11), 3 (2))
>> cell:  0 volume:   0.5000 centroid:  -0.2500   0.5000   0.5000
>> cell:  1 volume:   0.5000 centroid:   0.2500   0.5000   0.5000
>>
>>
>> ===================================================================================
>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>> =   RANK 0 PID 15428 RUNNING AT DESKTOP-9ITFSBM
>> =   KILLED BY SIGNAL: 9 (Killed)
>>
>> ===================================================================================
>>
>> There is still a problem left, though. I think it is related.
>>
>> On Sat, Oct 30, 2021 at 9:51 PM Mark Adams <mfadams at lbl.gov> wrote:
>>
>>> Ah, I can reproduce this error with debugging turned on.
>>> This test is not a parallel test, but it does not say that serial is a
>>> requirement.
>>> So there is a problem here.
>>> Anyone?
>>>
>>> On Sat, Oct 30, 2021 at 8:29 AM Mark Adams <mfadams at lbl.gov> wrote:
>>>
>>>> 08:27 adams/pcksp-batch-kokkos *=
>>>> summit:/gpfs/alpine/csc314/scratch/adams/petsc/src/dm/impls/plex/tutorials$
>>>> make PETSC_ARCH=arch-summit-opt-gnu-kokkos-cuda ex3f90
>>>> mpifort -fPIC -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -g
>>>> -O   -fPIC -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -g -O
>>>>  -I/gpfs/alpine/csc314/scratch/adams/petsc/include
>>>> -I/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/include
>>>> -I/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/hdf5-1.10.7-yxvwkhm4nhgezbl2mwzdruwoaiblt6q2/include
>>>> -I/sw/summit/cuda/11.0.3/include     ex3f90.F90
>>>>  -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/lib
>>>> -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/lib
>>>> -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/lib
>>>> -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/lib
>>>> -L/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/netlib-lapack-3.9.1-t2a6tcso5tkezcjmfrqvqi2cpary7kgx/lib64
>>>> -Wl,-rpath,/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/hdf5-1.10.7-yxvwkhm4nhgezbl2mwzdruwoaiblt6q2/lib
>>>> -L/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/hdf5-1.10.7-yxvwkhm4nhgezbl2mwzdruwoaiblt6q2/lib
>>>> -Wl,-rpath,/sw/summit/cuda/11.0.3/lib64 -L/sw/summit/cuda/11.0.3/lib64
>>>> -L/sw/summit/cuda/11.0.3/lib64/stubs
>>>> -Wl,-rpath,/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/spectrum-mpi-10.4.0.3-20210112-6jbupg3thjwhsabgevk6xmwhd2bbyxdc/lib
>>>> -L/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/spectrum-mpi-10.4.0.3-20210112-6jbupg3thjwhsabgevk6xmwhd2bbyxdc/lib
>>>> -Wl,-rpath,/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib/gcc/powerpc64le-unknown-linux-gnu/9.1.0
>>>> -L/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib/gcc/powerpc64le-unknown-linux-gnu/9.1.0
>>>> -Wl,-rpath,/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib/gcc
>>>> -L/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib/gcc
>>>> -Wl,-rpath,/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/netlib-lapack-3.9.1-t2a6tcso5tkezcjmfrqvqi2cpary7kgx/lib64
>>>> -Wl,-rpath,/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib64
>>>> -L/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib64
>>>> -Wl,-rpath,/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-8.3.1/darshan-runtime-3.3.0-mu6tnxlhxfplrq3srkkgi5dvly6wenwy/lib
>>>> -L/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-8.3.1/darshan-runtime-3.3.0-mu6tnxlhxfplrq3srkkgi5dvly6wenwy/lib
>>>> -Wl,-rpath,/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib
>>>> -L/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib -lpetsc
>>>> -lkokkoskernels -lkokkoscontainers -lkokkoscore -lp4est -lsc -lblas
>>>> -llapack -lhdf5_hl -lhdf5 -lm -lz -lcudart -lcufft -lcublas -lcusparse
>>>> -lcusolver -lcurand -lcuda -lstdc++ -ldl -lmpiprofilesupport
>>>> -lmpi_ibm_usempif08 -lmpi_ibm_usempi_ignore_tkr -lmpi_ibm_mpifh -lmpi_ibm
>>>> -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lquadmath
>>>> -lstdc++ -ldl -o ex3f90
>>>> 08:27 adams/pcksp-batch-kokkos *=
>>>> summit:/gpfs/alpine/csc314/scratch/adams/petsc/src/dm/impls/plex/tutorials$
>>>> jsrun -n 2 -g 1 ./ex3f90
>>>> DM Object: testplex 2 MPI processes
>>>>   type: plex
>>>> testplex in 3 dimensions:
>>>>   0-cells: 12 12
>>>>   1-cells: 20 20
>>>>   2-cells: 11 11
>>>>   3-cells: 2 2
>>>> Labels:
>>>>   celltype: 4 strata with value/size (0 (12), 7 (2), 4 (11), 1 (20))
>>>>   depth: 4 strata with value/size (0 (12), 1 (20), 2 (11), 3 (2))
>>>> cell:  0 volume:   0.5000 centroid:  -0.2500   0.5000   0.5000
>>>> cell:  1 volume:   0.5000 centroid:   0.2500   0.5000   0.5000
>>>> cell:  0 volume:   0.5000 centroid:  -0.2500   0.5000   0.5000
>>>> cell:  1 volume:   0.5000 centroid:   0.2500   0.5000   0.5000
>>>> 08:28 adams/pcksp-batch-kokkos *=
>>>> summit:/gpfs/alpine/csc314/scratch/adams/pets
>>>>
>>>> On Fri, Oct 29, 2021 at 10:41 PM 袁煕 <yuanxi at advancesoft.jp> wrote:
>>>>
>>>>> Thanks,  Mark.
>>>>>
>>>>> I did what you suggested, but nothing changed. Besides, judging from your
>>>>> compile history and results:
>>>>>
>>>>> -  you use gfortran with no MPI library, not mpif90
>>>>> -  the two CPUs give exactly the same result
>>>>> -  the first line of the DMView output should be "DM Object: testplex
>>>>> 2 MPI processes", not "DM Object: testplex 1 MPI processes", when you use
>>>>> 2 CPUs
>>>>>
>>>>> It seems that you did not actually use MPI; the two CPUs just did exactly
>>>>> the same thing.
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Yuan
>>>>>
>>>>>
>>>>> On Fri, Oct 29, 2021 at 8:22 PM Mark Adams <mfadams at lbl.gov> wrote:
>>>>>
>>>>>> This works for me (appended) using an up to date version of PETSc.
>>>>>>
>>>>>> I would delete the architecture directory, reconfigure, run make
>>>>>> all, and try again.
>>>>>>
>>>>>> Next, you seem to be using git. Use the 'main' branch and try again.
>>>>>>
>>>>>> Mark
>>>>>>
>>>>>> (base) 07:09 adams/swarm-omp-pc *= ~/Codes/petsc$ cd
>>>>>> src/dm/impls/plex/tutorials/
>>>>>> (base) 07:16 adams/swarm-omp-pc *=
>>>>>> ~/Codes/petsc/src/dm/impls/plex/tutorials$    make
>>>>>> PETSC_DIR=/Users/markadams/Codes/petsc PETSC_ARCH=arch-macosx-gnu-g ex3f90
>>>>>> gfortran-11 -Wl,-bind_at_load -Wl,-multiply_defined,suppress
>>>>>> -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs
>>>>>> -Wl,-search_paths_first -Wl,-no_compact_unwind  -fPIC -Wall
>>>>>> -ffree-line-length-0 -Wno-unused-dummy-argument -g -O   -fPIC -Wall
>>>>>> -ffree-line-length-0 -Wno-unused-dummy-argument -g -O
>>>>>>  -I/Users/markadams/Codes/petsc/include
>>>>>> -I/Users/markadams/Codes/petsc/arch-macosx-gnu-g/include     ex3f90.F90
>>>>>>  -Wl,-rpath,/Users/markadams/Codes/petsc/arch-macosx-gnu-g/lib
>>>>>> -L/Users/markadams/Codes/petsc/arch-macosx-gnu-g/lib
>>>>>> -Wl,-rpath,/Users/markadams/Codes/petsc/arch-macosx-gnu-g/lib
>>>>>> -L/Users/markadams/Codes/petsc/arch-macosx-gnu-g/lib
>>>>>> -Wl,-rpath,/usr/local/Cellar/gcc/11.2.0/lib/gcc/11/gcc/x86_64-apple-darwin20/11.2.0
>>>>>> -L/usr/local/Cellar/gcc/11.2.0/lib/gcc/11/gcc/x86_64-apple-darwin20/11.2.0
>>>>>> -Wl,-rpath,/usr/local/Cellar/gcc/11.2.0/lib/gcc/11
>>>>>> -L/usr/local/Cellar/gcc/11.2.0/lib/gcc/11 -lpetsc -lp4est -lsc -llapack
>>>>>> -lblas -lhdf5_hl -lhdf5 -lmetis -lz -lstdc++ -ldl -lgcc_s.1 -lgfortran
>>>>>> -lquadmath -lm -lquadmath -lstdc++ -ldl -lgcc_s.1 -o ex3f90
>>>>>> (base) 07:16 adams/swarm-omp-pc *=
>>>>>> ~/Codes/petsc/src/dm/impls/plex/tutorials$ mpirun -np 2 ./ex3f90
>>>>>> DM Object: testplex 1 MPI processes
>>>>>>   type: plex
>>>>>> testplex in 3 dimensions:
>>>>>>   0-cells: 12
>>>>>>   1-cells: 20
>>>>>>   2-cells: 11
>>>>>>   3-cells: 2
>>>>>> Labels:
>>>>>>   celltype: 4 strata with value/size (0 (12), 7 (2), 4 (11), 1 (20))
>>>>>>   depth: 4 strata with value/size (0 (12), 1 (20), 2 (11), 3 (2))
>>>>>> DM Object: testplex 1 MPI processes
>>>>>>   type: plex
>>>>>> testplex in 3 dimensions:
>>>>>>   0-cells: 12
>>>>>>   1-cells: 20
>>>>>>   2-cells: 11
>>>>>>   3-cells: 2
>>>>>> Labels:
>>>>>>   celltype: 4 strata with value/size (0 (12), 7 (2), 4 (11), 1 (20))
>>>>>>   depth: 4 strata with value/size (0 (12), 1 (20), 2 (11), 3 (2))
>>>>>> cell:  0 volume:   0.5000 centroid:  -0.2500   0.5000   0.5000
>>>>>> cell:  1 volume:   0.5000 centroid:   0.2500   0.5000   0.5000
>>>>>> cell:  0 volume:   0.5000 centroid:  -0.2500   0.5000   0.5000
>>>>>> cell:  1 volume:   0.5000 centroid:   0.2500   0.5000   0.5000
>>>>>>
>>>>>> On Fri, Oct 29, 2021 at 6:11 AM 袁煕 <yuanxi at advancesoft.jp> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I have tried to run the test case ex3f90 in the folder
>>>>>>> src/dm/impls/plex/tutorials in parallel, but it fails. When I
>>>>>>> run it on 1 CPU with
>>>>>>>
>>>>>>> -  mpirun -np 1 ./ex3f90
>>>>>>>
>>>>>>> everything seems OK. But when I run it on 2 CPUs with
>>>>>>>
>>>>>>> -  mpirun -np 2 ./ex3f90
>>>>>>>
>>>>>>> I get the following error message:
>>>>>>>
>>>>>>> [0]PETSC ERROR: --------------------- Error Message
>>>>>>> --------------------------------------------------------------
>>>>>>> [0]PETSC ERROR: Object is in wrong state
>>>>>>> [0]PETSC ERROR: This DMPlex is distributed but its PointSF has no
>>>>>>> graph set
>>>>>>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble
>>>>>>> shooting.
>>>>>>> [0]PETSC ERROR: Petsc Development GIT revision:
>>>>>>> v3.16.0-248-ge617e6467c  GIT Date: 2021-10-19 23:11:25 -0500
>>>>>>> [0]PETSC ERROR: ./ex3f90 on a  named pc-010-088 by  Fri Oct 29
>>>>>>> 18:48:54 2021
>>>>>>> [0]PETSC ERROR: Configure options --with-cc=mpicc --with-cxx=mpicxx
>>>>>>> --with-fc=mpiifort --with-fortran-bindings=1 --with-debugging=0
>>>>>>> --with-blaslapack-dir=/opt/intel/oneapi/mkl/2021.4.0
>>>>>>> --with-mkl_pardiso-dir=/opt/intel/oneapi/mkl/2021.4.0 --download-metis=1
>>>>>>> --download-parmetis=1 --download-cmake --force --download-superlu_dist=1
>>>>>>> --download-mumps=1 --download-scalapack=1 --download-hypre=1
>>>>>>> --download-ml=1 --with-debugging=yes --prefix=/home/yuanxi
>>>>>>> [0]PETSC ERROR: #1 DMPlexCheckPointSF() at
>>>>>>> /home/yuanxi/myprograms/petsc/src/dm/impls/plex/plex.c:8626
>>>>>>> [0]PETSC ERROR: #2 DMPlexOrientInterface_Internal() at
>>>>>>> /home/yuanxi/myprograms/petsc/src/dm/impls/plex/plexinterpolate.c:595
>>>>>>> [0]PETSC ERROR: #3 DMPlexInterpolate() at
>>>>>>> /home/yuanxi/myprograms/petsc/src/dm/impls/plex/plexinterpolate.c:1357
>>>>>>> [0]PETSC ERROR: #4 User provided function() at User file:0
>>>>>>> Abort(73) on node 0 (rank 0 in comm 16): application called
>>>>>>> MPI_Abort(MPI_COMM_SELF, 73) - process 0
>>>>>>>
>>>>>>> ------------------------------------------------------------------------------------------------------------------------------------
>>>>>>>
>>>>>>> It fails in the call to DMPlexInterpolate. Maybe this program is not
>>>>>>> intended to be run in parallel. But if I wish to do so, how should I
>>>>>>> modify it so that it runs on multiple CPUs?
>>>>>>>
>>>>>>> Many thanks for your help.
>>>>>>>
>>>>>>> Yuan
>>>>>>>
>>>>>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ex3f90.diff
Type: application/octet-stream
Size: 1904 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20211031/3b25d282/attachment-0001.obj>
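
For readers comparing the two DMView listings quoted in this thread, here is a small sketch of why the per-rank counts change from "12 12" to "12 0" once only rank 0 builds the mesh. It is a plain-Python toy model, not PETSc code: the names FULL_MESH, points_on_rank and view are hypothetical and exist only for illustration, and the counts are copied from the quoted output.

```python
# Toy model only: no PETSc or MPI calls are made here.
from typing import List

# Point counts of the two-cell test mesh, indexed by dimension 0..3,
# copied from the DMView output quoted in this thread.
FULL_MESH = [12, 20, 11, 2]


def points_on_rank(rank: int, rank0_only: bool) -> List[int]:
    """Return the per-dimension point counts held by one rank."""
    if rank0_only and rank != 0:
        # Empty local mesh, analogous to passing an all-zero numPoints1.
        return [0] * len(FULL_MESH)
    return list(FULL_MESH)


def view(nranks: int, rank0_only: bool) -> List[str]:
    """Format the counts one line per dimension, one column per rank."""
    per_rank = [points_on_rank(r, rank0_only) for r in range(nranks)]
    return [
        f"{dim}-cells: " + " ".join(str(counts[dim]) for counts in per_rank)
        for dim in range(len(FULL_MESH))
    ]


if __name__ == "__main__":
    print("every rank builds the mesh:")
    print("\n".join(view(2, rank0_only=False)))
    print("only rank 0 builds the mesh:")
    print("\n".join(view(2, rank0_only=True)))
```

With two ranks, the first listing repeats the full counts in both columns, as in the jsrun output quoted above; the second shows zeros in the second column, as in the output after the modification.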