[petsc-users] Tutorials test case cannot run in parallel

Matthew Knepley knepley at gmail.com
Sat Oct 30 11:17:24 CDT 2021


Yes, it is a serial test.
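
The per-rank counts in the DMView output quoted below are consistent with that. Each "k-cells" line appears to list one count per MPI rank, so "12 12" would mean both ranks built the whole mesh independently, while "12 0" would mean rank 0 holds all of it. A minimal Python sketch for pulling those counts out of the text follows. It assumes the ASCII layout shown in this thread, and the helper name is hypothetical, not part of PETSc.

```python
import re


def per_rank_cell_counts(dmview_text):
    """Map each cell dimension to its list of per-rank point counts.

    Hypothetical helper for illustration only. It assumes lines of the
    form '  0-cells: 12 0', as in the DMView output quoted in this thread.
    """
    counts = {}
    for line in dmview_text.splitlines():
        # Drop leading email quote markers ('>') and whitespace.
        line = line.lstrip("> \t")
        match = re.match(r"(\d+)-cells:((?:\s+\d+)+)\s*$", line)
        if match:
            dim = int(match.group(1))
            counts[dim] = [int(n) for n in match.group(2).split()]
    return counts


# Counts copied from the two-rank run quoted in this thread.
replicated = """\
  0-cells: 12 12
  1-cells: 20 20
  2-cells: 11 11
  3-cells: 2 2
"""
print(per_rank_cell_counts(replicated))
```

A list with the same nonzero count for every rank suggests the mesh was simply built once per process, which is what a serial example does when launched on several ranks.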

  Thanks,

    Matt

On Sat, Oct 30, 2021 at 9:38 AM 袁煕 <yuanxi at advancesoft.jp> wrote:

> Thank you for your reply.
>
> I have solved the problem by modifying
> ----------------------------------------------------
> call DMPlexCreateFromDAG(dm, depth, numPoints, coneSize, &
>      cones, coneOrientations, vertexCoords, ierr);CHKERRA(ierr)
> ----------------------------------------------------
> into
> -----------------------------------------------------
> ! Ranks other than 0 pass zero points per stratum, i.e. an empty mesh
> numPoints1 = [0, 0, 0, 0]
> if (rank == 0) then
>     ! Rank 0 builds the whole mesh
>     call DMPlexCreateFromDAG(dm, depth, numPoints, coneSize, &
>          cones, coneOrientations, vertexCoords, ierr);CHKERRA(ierr)
> else
>     call DMPlexCreateFromDAG(dm, 3, numPoints1, PETSC_NULL_INTEGER, &
>          PETSC_NULL_INTEGER, PETSC_NULL_INTEGER, PETSC_NULL_REAL, ierr);CHKERRA(ierr)
> endif
> ----------------------------------------------------
>
> The result obtained is as follows:
>
> DM Object: testplex 2 MPI processes
>   type: plex
> testplex in 3 dimensions:
>   0-cells: 12 0
>   1-cells: 20 0
>   2-cells: 11 0
>   3-cells: 2 0
> Labels:
>   celltype: 4 strata with value/size (0 (12), 7 (2), 4 (11), 1 (20))
>   depth: 4 strata with value/size (0 (12), 1 (20), 2 (11), 3 (2))
> cell:  0 volume:   0.5000 centroid:  -0.2500   0.5000   0.5000
> cell:  1 volume:   0.5000 centroid:   0.2500   0.5000   0.5000
>
>
> ===================================================================================
> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> =   RANK 0 PID 15428 RUNNING AT DESKTOP-9ITFSBM
> =   KILLED BY SIGNAL: 9 (Killed)
>
> ===================================================================================
>
> There is still a problem left, as shown above. I would like to know
> whether it is relevant.
>
> On Sat, Oct 30, 2021 at 9:51 PM Mark Adams <mfadams at lbl.gov> wrote:
>
>> Ah, I can reproduce this error with debugging turned on.
>> This test is not a parallel test, but it does not say that serial is a
>> requirement.
>> So there is a problem here.
>> Anyone?
>>
>> On Sat, Oct 30, 2021 at 8:29 AM Mark Adams <mfadams at lbl.gov> wrote:
>>
>>> 08:27 adams/pcksp-batch-kokkos *=
>>> summit:/gpfs/alpine/csc314/scratch/adams/petsc/src/dm/impls/plex/tutorials$
>>> make PETSC_ARCH=arch-summit-opt-gnu-kokkos-cuda ex3f90
>>> mpifort -fPIC -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -g
>>> -O   -fPIC -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -g -O
>>>  -I/gpfs/alpine/csc314/scratch/adams/petsc/include
>>> -I/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/include
>>> -I/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/hdf5-1.10.7-yxvwkhm4nhgezbl2mwzdruwoaiblt6q2/include
>>> -I/sw/summit/cuda/11.0.3/include     ex3f90.F90
>>>  -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/lib
>>> -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/lib
>>> -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/lib
>>> -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/lib
>>> -L/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/netlib-lapack-3.9.1-t2a6tcso5tkezcjmfrqvqi2cpary7kgx/lib64
>>> -Wl,-rpath,/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/hdf5-1.10.7-yxvwkhm4nhgezbl2mwzdruwoaiblt6q2/lib
>>> -L/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/hdf5-1.10.7-yxvwkhm4nhgezbl2mwzdruwoaiblt6q2/lib
>>> -Wl,-rpath,/sw/summit/cuda/11.0.3/lib64 -L/sw/summit/cuda/11.0.3/lib64
>>> -L/sw/summit/cuda/11.0.3/lib64/stubs
>>> -Wl,-rpath,/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/spectrum-mpi-10.4.0.3-20210112-6jbupg3thjwhsabgevk6xmwhd2bbyxdc/lib
>>> -L/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/spectrum-mpi-10.4.0.3-20210112-6jbupg3thjwhsabgevk6xmwhd2bbyxdc/lib
>>> -Wl,-rpath,/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib/gcc/powerpc64le-unknown-linux-gnu/9.1.0
>>> -L/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib/gcc/powerpc64le-unknown-linux-gnu/9.1.0
>>> -Wl,-rpath,/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib/gcc
>>> -L/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib/gcc
>>> -Wl,-rpath,/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/netlib-lapack-3.9.1-t2a6tcso5tkezcjmfrqvqi2cpary7kgx/lib64
>>> -Wl,-rpath,/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib64
>>> -L/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib64
>>> -Wl,-rpath,/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-8.3.1/darshan-runtime-3.3.0-mu6tnxlhxfplrq3srkkgi5dvly6wenwy/lib
>>> -L/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-8.3.1/darshan-runtime-3.3.0-mu6tnxlhxfplrq3srkkgi5dvly6wenwy/lib
>>> -Wl,-rpath,/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib
>>> -L/autofs/nccs-svm1_sw/summit/gcc/9.1.0-alpha+20190716/lib -lpetsc
>>> -lkokkoskernels -lkokkoscontainers -lkokkoscore -lp4est -lsc -lblas
>>> -llapack -lhdf5_hl -lhdf5 -lm -lz -lcudart -lcufft -lcublas -lcusparse
>>> -lcusolver -lcurand -lcuda -lstdc++ -ldl -lmpiprofilesupport
>>> -lmpi_ibm_usempif08 -lmpi_ibm_usempi_ignore_tkr -lmpi_ibm_mpifh -lmpi_ibm
>>> -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lquadmath
>>> -lstdc++ -ldl -o ex3f90
>>> 08:27 adams/pcksp-batch-kokkos *=
>>> summit:/gpfs/alpine/csc314/scratch/adams/petsc/src/dm/impls/plex/tutorials$
>>> jsrun -n 2 -g 1 ./ex3f90
>>> DM Object: testplex 2 MPI processes
>>>   type: plex
>>> testplex in 3 dimensions:
>>>   0-cells: 12 12
>>>   1-cells: 20 20
>>>   2-cells: 11 11
>>>   3-cells: 2 2
>>> Labels:
>>>   celltype: 4 strata with value/size (0 (12), 7 (2), 4 (11), 1 (20))
>>>   depth: 4 strata with value/size (0 (12), 1 (20), 2 (11), 3 (2))
>>> cell:  0 volume:   0.5000 centroid:  -0.2500   0.5000   0.5000
>>> cell:  1 volume:   0.5000 centroid:   0.2500   0.5000   0.5000
>>> cell:  0 volume:   0.5000 centroid:  -0.2500   0.5000   0.5000
>>> cell:  1 volume:   0.5000 centroid:   0.2500   0.5000   0.5000
>>> 08:28 adams/pcksp-batch-kokkos *=
>>> summit:/gpfs/alpine/csc314/scratch/adams/pets
>>>
>>> On Fri, Oct 29, 2021 at 10:41 PM 袁煕 <yuanxi at advancesoft.jp> wrote:
>>>
>>>> Thanks,  Mark.
>>>>
>>>> I did what you suggested, but nothing changed. Besides, from your compile
>>>> history and result:
>>>>
>>>> -  you used gfortran with no MPI library, not mpif90
>>>> -  the two CPUs give exactly the same result
>>>> -  the first line of the DMView output should be "DM Object: testplex 2
>>>> MPI processes", not "DM Object: testplex 1 MPI processes", when you use
>>>> 2 CPUs
>>>>
>>>> It seems that you did not actually use MPI, and the two CPUs just did
>>>> exactly the same thing independently.
>>>>
>>>> Best regards,
>>>>
>>>> Yuan
>>>>
>>>>
>>>> On Fri, Oct 29, 2021 at 8:22 PM Mark Adams <mfadams at lbl.gov> wrote:
>>>>
>>>>> This works for me (appended) using an up-to-date version of PETSc.
>>>>>
>>>>> I would delete the architecture directory, reconfigure, run make
>>>>> all, and try again.
>>>>>
>>>>> Next, you seem to be using git. Use the 'main' branch and try again.
>>>>>
>>>>> Mark
>>>>>
>>>>> (base) 07:09 adams/swarm-omp-pc *= ~/Codes/petsc$ cd
>>>>> src/dm/impls/plex/tutorials/
>>>>> (base) 07:16 adams/swarm-omp-pc *=
>>>>> ~/Codes/petsc/src/dm/impls/plex/tutorials$    make
>>>>> PETSC_DIR=/Users/markadams/Codes/petsc PETSC_ARCH=arch-macosx-gnu-g ex3f90
>>>>> gfortran-11 -Wl,-bind_at_load -Wl,-multiply_defined,suppress
>>>>> -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs
>>>>> -Wl,-search_paths_first -Wl,-no_compact_unwind  -fPIC -Wall
>>>>> -ffree-line-length-0 -Wno-unused-dummy-argument -g -O   -fPIC -Wall
>>>>> -ffree-line-length-0 -Wno-unused-dummy-argument -g -O
>>>>>  -I/Users/markadams/Codes/petsc/include
>>>>> -I/Users/markadams/Codes/petsc/arch-macosx-gnu-g/include     ex3f90.F90
>>>>>  -Wl,-rpath,/Users/markadams/Codes/petsc/arch-macosx-gnu-g/lib
>>>>> -L/Users/markadams/Codes/petsc/arch-macosx-gnu-g/lib
>>>>> -Wl,-rpath,/Users/markadams/Codes/petsc/arch-macosx-gnu-g/lib
>>>>> -L/Users/markadams/Codes/petsc/arch-macosx-gnu-g/lib
>>>>> -Wl,-rpath,/usr/local/Cellar/gcc/11.2.0/lib/gcc/11/gcc/x86_64-apple-darwin20/11.2.0
>>>>> -L/usr/local/Cellar/gcc/11.2.0/lib/gcc/11/gcc/x86_64-apple-darwin20/11.2.0
>>>>> -Wl,-rpath,/usr/local/Cellar/gcc/11.2.0/lib/gcc/11
>>>>> -L/usr/local/Cellar/gcc/11.2.0/lib/gcc/11 -lpetsc -lp4est -lsc -llapack
>>>>> -lblas -lhdf5_hl -lhdf5 -lmetis -lz -lstdc++ -ldl -lgcc_s.1 -lgfortran
>>>>> -lquadmath -lm -lquadmath -lstdc++ -ldl -lgcc_s.1 -o ex3f90
>>>>> (base) 07:16 adams/swarm-omp-pc *=
>>>>> ~/Codes/petsc/src/dm/impls/plex/tutorials$ mpirun -np 2 ./ex3f90
>>>>> DM Object: testplex 1 MPI processes
>>>>>   type: plex
>>>>> testplex in 3 dimensions:
>>>>>   0-cells: 12
>>>>>   1-cells: 20
>>>>>   2-cells: 11
>>>>>   3-cells: 2
>>>>> Labels:
>>>>>   celltype: 4 strata with value/size (0 (12), 7 (2), 4 (11), 1 (20))
>>>>>   depth: 4 strata with value/size (0 (12), 1 (20), 2 (11), 3 (2))
>>>>> DM Object: testplex 1 MPI processes
>>>>>   type: plex
>>>>> testplex in 3 dimensions:
>>>>>   0-cells: 12
>>>>>   1-cells: 20
>>>>>   2-cells: 11
>>>>>   3-cells: 2
>>>>> Labels:
>>>>>   celltype: 4 strata with value/size (0 (12), 7 (2), 4 (11), 1 (20))
>>>>>   depth: 4 strata with value/size (0 (12), 1 (20), 2 (11), 3 (2))
>>>>> cell:  0 volume:   0.5000 centroid:  -0.2500   0.5000   0.5000
>>>>> cell:  1 volume:   0.5000 centroid:   0.2500   0.5000   0.5000
>>>>> cell:  0 volume:   0.5000 centroid:  -0.2500   0.5000   0.5000
>>>>> cell:  1 volume:   0.5000 centroid:   0.2500   0.5000   0.5000
>>>>>
>>>>> On Fri, Oct 29, 2021 at 6:11 AM 袁煕 <yuanxi at advancesoft.jp> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have tried to run the test case ex3f90 in the folder
>>>>>> \src\dm\impls\plex\tutorials in parallel, but it fails. When I
>>>>>> run it on 1 CPU with
>>>>>>
>>>>>> -  mpirun -np 1 ./ex3f90
>>>>>>
>>>>>> everything seems OK. But when I run it on 2 CPUs with
>>>>>>
>>>>>> -  mpirun -np 2 ./ex3f90
>>>>>>
>>>>>> I get the following error message:
>>>>>>
>>>>>> [0]PETSC ERROR: --------------------- Error Message
>>>>>> --------------------------------------------------------------
>>>>>> [0]PETSC ERROR: Object is in wrong state
>>>>>> [0]PETSC ERROR: This DMPlex is distributed but its PointSF has no
>>>>>> graph set
>>>>>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble
>>>>>> shooting.
>>>>>> [0]PETSC ERROR: Petsc Development GIT revision:
>>>>>> v3.16.0-248-ge617e6467c  GIT Date: 2021-10-19 23:11:25 -0500
>>>>>> [0]PETSC ERROR: ./ex3f90 on a  named pc-010-088 by  Fri Oct 29
>>>>>> 18:48:54 2021
>>>>>> [0]PETSC ERROR: Configure options --with-cc=mpicc --with-cxx=mpicxx
>>>>>> --with-fc=mpiifort --with-fortran-bindings=1 --with-debugging=0
>>>>>> --with-blaslapack-dir=/opt/intel/oneapi/mkl/2021.4.0
>>>>>> --with-mkl_pardiso-dir=/opt/intel/oneapi/mkl/2021.4.0 --download-metis=1
>>>>>> --download-parmetis=1 --download-cmake --force --download-superlu_dist=1
>>>>>> --download-mumps=1 --download-scalapack=1 --download-hypre=1
>>>>>> --download-ml=1 --with-debugging=yes --prefix=/home/yuanxi
>>>>>> [0]PETSC ERROR: #1 DMPlexCheckPointSF() at
>>>>>> /home/yuanxi/myprograms/petsc/src/dm/impls/plex/plex.c:8626
>>>>>> [0]PETSC ERROR: #2 DMPlexOrientInterface_Internal() at
>>>>>> /home/yuanxi/myprograms/petsc/src/dm/impls/plex/plexinterpolate.c:595
>>>>>> [0]PETSC ERROR: #3 DMPlexInterpolate() at
>>>>>> /home/yuanxi/myprograms/petsc/src/dm/impls/plex/plexinterpolate.c:1357
>>>>>> [0]PETSC ERROR: #4 User provided function() at User file:0
>>>>>> Abort(73) on node 0 (rank 0 in comm 16): application called
>>>>>> MPI_Abort(MPI_COMM_SELF, 73) - process 0
>>>>>>
>>>>>> ------------------------------------------------------------------------------------------------------------------------------------
>>>>>>
>>>>>> It fails when calling DMPlexInterpolate. Maybe this program is not
>>>>>> intended to be run in parallel. But if I wish to do so, how should I
>>>>>> modify it so that it runs on multiple CPUs?
>>>>>>
>>>>>> Many thanks for your help.
>>>>>>
>>>>>> Yuan
>>>>>>
>>>>>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ <http://www.cse.buffalo.edu/~knepley/>