[petsc-users] Problem running geoclaw example with petsc 3.22

Barry Smith bsmith at petsc.dev
Thu Oct 24 09:25:01 CDT 2024


   OK, super strange; I've run this many times on my Mac.

   Can you please try to remove that leftover allocated memory with ipcrm and then run:

   cd $PETSC_DIR/src/ksp/ksp/tutorials
   make ex89f
   mpiexec -n 3 ./ex89f -n 20 -mpi_linear_solver_server -mpi_linear_solver_server_ksp_view -ksp_monitor -ksp_converged_reason -ksp_view -mpi_linear_solver_server_minimum_count_per_rank 5

  This does the same thing as the GeoClaw code but is much simpler.
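
   If more than one stale segment is left behind, a small loop like the one below removes every segment listed by ipcs. This is only a sketch: it assumes the two "ipcs -m" layouts shown later in this thread, where the segment id is the second column, and ipcrm will simply refuse segments you do not own.

   # Sketch: remove all shared memory segments listed by ipcs.
   # BSD/macOS data lines start with "m"; Linux data lines start with a 0x key.
   # In both layouts the segment id is the second column.
   for id in $(ipcs -m | awk '$1 == "m" || /^0x/ {print $2}'); do
     ipcrm -m "$id"
   done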

   Barry


   

> On Oct 23, 2024, at 10:55 PM, Praveen C <cpraveen at gmail.com> wrote:
> 
> I get a very similar error on my Mac with
> 
> $ gfortran -v
> Using built-in specs.
> COLLECT_GCC=gfortran
> COLLECT_LTO_WRAPPER=/opt/homebrew/Caskroom/miniforge/base/envs/claw/libexec/gcc/arm64-apple-darwin20.0.0/13.2.0/lto-wrapper
> Target: arm64-apple-darwin20.0.0
> Configured with: ../configure --prefix=/opt/homebrew/Caskroom/miniforge/base/envs/claw --build=x86_64-apple-darwin13.4.0 --host=arm64-apple-darwin20.0.0 --target=arm64-apple-darwin20.0.0 --with-libiconv-prefix=/opt/homebrew/Caskroom/miniforge/base/envs/claw --enable-languages=fortran --disable-multilib --enable-checking=release --disable-bootstrap --disable-libssp --with-gmp=/opt/homebrew/Caskroom/miniforge/base/envs/claw --with-mpfr=/opt/homebrew/Caskroom/miniforge/base/envs/claw --with-mpc=/opt/homebrew/Caskroom/miniforge/base/envs/claw --with-isl=/opt/homebrew/Caskroom/miniforge/base/envs/claw --enable-darwin-at-rpath
> Thread model: posix
> Supported LTO compression algorithms: zlib
> gcc version 13.2.0 (GCC) 
> 
> Before starting
> 
> $ ipcs -m
> IPC status from <running system> as of Thu Oct 24 08:02:11 IST 2024
> T     ID     KEY        MODE       OWNER    GROUP
> Shared Memory:
> 
> and when I run the code
> 
>  Using a PETSc solver
>  Using Bouss equations from the start
>  rnode allocated...
>  node allocated...
>  listOfGrids allocated...
>  Storage allocated...
>  bndList allocated...
> Gridding level   1 at t =  0.000000E+00:     4 grids with       10000 cells
>    Setting initial dt to    2.9999999999999999E-002
>   max threads set to            1
>   
>  Done reading data, starting computation ...  
>   
>  Total zeta at initial time:    39269.907650665169     
> GEOCLAW: Frame    0 output files done at time t =  0.000000D+00
> 
> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [0]PETSC ERROR: Petsc has generated inconsistent data
> [0]PETSC ERROR: Unable to locate PCMPI allocated shared address 0x130698000
> [0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the program crashed before usage or a spelling mistake, etc!
> [0]PETSC ERROR:   Option left: name:-ksp_type value: preonly source: file
> [0]PETSC ERROR:   Option left: name:-mpi_ksp_max_it value: 200 source: file
> [0]PETSC ERROR:   Option left: name:-mpi_ksp_reuse_preconditioner (no value) source: file
> [0]PETSC ERROR:   Option left: name:-mpi_ksp_rtol value: 1.e-9 source: file
> [0]PETSC ERROR:   Option left: name:-mpi_ksp_type value: gmres source: file
> [0]PETSC ERROR:   Option left: name:-mpi_linear_solver_server_view (no value) source: file
> [0]PETSC ERROR:   Option left: name:-mpi_pc_gamg_sym_graph value: true source: file
> [0]PETSC ERROR:   Option left: name:-mpi_pc_gamg_symmetrize_graph value: true source: file
> [0]PETSC ERROR:   Option left: name:-mpi_pc_type value: gamg source: file
> [0]PETSC ERROR:   Option left: name:-pc_mpi_minimum_count_per_rank value: 5000 source: file
> [0]PETSC ERROR:   Option left: name:-pc_type value: mpi source: file
> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.22.0, Sep 28, 2024 
> [0]PETSC ERROR: /Users/praveen/work/bouss/radial_flat/xgeoclaw with 6 MPI process(es) and PETSC_ARCH  on MacMiniHome.local by praveen Thu Oct 24 08:04:27 2024
> [0]PETSC ERROR: Configure options: AR=arm64-apple-darwin20.0.0-ar CC=mpicc CXX=mpicxx FC=mpifort CFLAGS="-ftree-vectorize -fPIC -fstack-protector-strong -O2 -pipe -isystem /opt/homebrew/Caskroom/miniforge/base/envs/claw/include  " CPPFLAGS="-D_FORTIFY_SOURCE=2 -isystem /opt/homebrew/Caskroom/miniforge/base/envs/claw/include -mmacosx-version-min=11.0 -mmacosx-version-min=11.0" CXXFLAGS="-ftree-vectorize -fPIC -fstack-protector-strong -O2 -pipe -stdlib=libc++ -fvisibility-inlines-hidden -fmessage-length=0 -isystem /opt/homebrew/Caskroom/miniforge/base/envs/claw/include  " FFLAGS="-march=armv8.3-a -ftree-vectorize -fPIC -fno-stack-protector -O2 -pipe -isystem /opt/homebrew/Caskroom/miniforge/base/envs/claw/include  " LDFLAGS="-Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs -Wl,-rpath,/opt/homebrew/Caskroom/miniforge/base/envs/claw/lib -L/opt/homebrew/Caskroom/miniforge/base/envs/claw/lib" LIBS="-Wl,-rpath,/opt/homebrew/Caskroom/miniforge/base/envs/claw/lib -lmpi_mpifh -lgfortran" --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 --FOPTFLAGS=-O3 --with-clib-autodetect=0 --with-cxxlib-autodetect=0 --with-fortranlib-autodetect=0 --with-debugging=0 --with-blas-lib=libblas.dylib --with-lapack-lib=liblapack.dylib --with-yaml=1 --with-hdf5=1 --with-fftw=1 --with-hwloc=0 --with-hypre=1 --with-metis=1 --with-mpi=1 --with-mumps=1 --with-parmetis=1 --with-pthread=1 --with-ptscotch=1 --with-shared-libraries --with-ssl=0 --with-scalapack=1 --with-superlu=1 --with-superlu_dist=1 --with-superlu_dist-include=/opt/homebrew/Caskroom/miniforge/base/envs/claw/include/superlu-dist --with-superlu_dist-lib=-lsuperlu_dist --with-suitesparse=1 --with-suitesparse-dir=/opt/homebrew/Caskroom/miniforge/base/envs/claw --with-x=0 --with-scalar-type=real   --with-cuda=0 --with-batch --prefix=/opt/homebrew/Caskroom/miniforge/base/envs/claw
> [0]PETSC ERROR: #1 PetscShmgetMapAddresses() at /Users/runner/miniforge3/conda-bld/petsc_1728030427805/work/src/sys/utils/server.c:114
> [0]PETSC ERROR: #2 PCMPISetMat() at /Users/runner/miniforge3/conda-bld/petsc_1728030427805/work/src/ksp/pc/impls/mpi/pcmpi.c:269
> [0]PETSC ERROR: #3 PCSetUp_MPI() at /Users/runner/miniforge3/conda-bld/petsc_1728030427805/work/src/ksp/pc/impls/mpi/pcmpi.c:853
> [0]PETSC ERROR: #4 PCSetUp() at /Users/runner/miniforge3/conda-bld/petsc_1728030427805/work/src/ksp/pc/interface/precon.c:1071
> [0]PETSC ERROR: #5 KSPSetUp() at /Users/runner/miniforge3/conda-bld/petsc_1728030427805/work/src/ksp/ksp/interface/itfunc.c:415
> [0]PETSC ERROR: #6 KSPSolve_Private() at /Users/runner/miniforge3/conda-bld/petsc_1728030427805/work/src/ksp/ksp/interface/itfunc.c:826
> [0]PETSC ERROR: #7 KSPSolve() at /Users/runner/miniforge3/conda-bld/petsc_1728030427805/work/src/ksp/ksp/interface/itfunc.c:1075
> 
> The code does not progress, so I kill it:
> 
> ^CTraceback (most recent call last):
>   File "/Users/praveen/Applications/clawpack/clawutil/src/python/clawutil/runclaw.py", line 341, in <module>
>     runclaw(*args)
>   File "/Users/praveen/Applications/clawpack/clawutil/src/python/clawutil/runclaw.py", line 242, in runclaw
>     proc = subprocess.check_call(cmd_split,
>            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/opt/homebrew/Caskroom/miniforge/base/envs/claw/lib/python3.12/subprocess.py", line 408, in check_call
>     retcode = call(*popenargs, **kwargs)
>               ^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/opt/homebrew/Caskroom/miniforge/base/envs/claw/lib/python3.12/subprocess.py", line 391, in call
>     return p.wait(timeout=timeout)
>            ^^^^^^^^^^^^^^^^^^^^^^^
>   File "/opt/homebrew/Caskroom/miniforge/base/envs/claw/lib/python3.12/subprocess.py", line 1264, in wait
>     return self._wait(timeout=timeout)
>            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/opt/homebrew/Caskroom/miniforge/base/envs/claw/lib/python3.12/subprocess.py", line 2053, in _wait
>     (pid, sts) = self._try_wait(0)
>                  ^^^^^^^^^^^^^^^^^
>   File "/opt/homebrew/Caskroom/miniforge/base/envs/claw/lib/python3.12/subprocess.py", line 2011, in _try_wait
>     (pid, sts) = os.waitpid(self.pid, wait_flags)
>                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> KeyboardInterrupt
> make[1]: *** [output] Interrupt: 2
> make: *** [.output] Interrupt: 2
> 
> Now it says
> 
> $ ipcs -m        
> IPC status from <running system> as of Thu Oct 24 08:05:06 IST 2024
> T     ID     KEY        MODE       OWNER    GROUP
> Shared Memory:
> m 720896 0x0000000a --rw-rw-rw-  praveen    staff
> 
> Thanks
> praveen
> 
>> On 23 Oct 2024, at 8:26 PM, Barry Smith <bsmith at petsc.dev> wrote:
>> 
>> 
>>    Hmm, so it is creating the first shared memory region with key 10 (0xA in hex) and putting it in a linked list in PETSc, but then when it tries to find it in that linked list, it cannot.
>> 
>>    I don't know how to reproduce this or debug it remotely.
>> 
>>     Can you build on a completely different machine or with completely different compilers?
>> 
>>    Barry
>> 
>> 
>> 
>> 
>>> On Oct 23, 2024, at 10:31 AM, Praveen C <cpraveen at gmail.com> wrote:
>>> 
>>> I get the same error, and now it shows
>>> 
>>> $ ipcs -m
>>> 
>>> ------ Shared Memory Segments --------
>>> key        shmid      owner      perms      bytes      nattch     status      
>>> 0x0000000a 32788      praveen    666        240        6 
>>> 
>>> Note that the code seems to still be running after printing those error messages, but it is not printing any progress, which it should.
>>> 
>>> Thanks
>>> praveen
>>> 
>>>> On 23 Oct 2024, at 7:56 PM, Barry Smith <bsmith at petsc.dev> wrote:
>>>> 
>>>> 
>>>>    Try 
>>>> 
>>>> ipcrm -m 11
>>>> 
>>>> ipcs -m
>>>> 
>>>> Try running the program again.
>>>> 
>>>> If it fails, check 
>>>> 
>>>> ipcs -m
>>>> 
>>>> again.
>>>> 
>>>> 
>>>> 
>>>>> On Oct 23, 2024, at 10:20 AM, Praveen C <cpraveen at gmail.com> wrote:
>>>>> 
>>>>> Hello Barry
>>>>> 
>>>>> I see this
>>>>> 
>>>>> $ ipcs -m
>>>>> 
>>>>> ------ Shared Memory Segments --------
>>>>> key        shmid      owner      perms      bytes      nattch     status      
>>>>> 0x0000000a 11         praveen    666        240        6  
>>>>> 
>>>>> and I am observing the same error as below.
>>>>> 
>>>>> Thanks
>>>>> praveen
>>>>> 
>>>>>> On 23 Oct 2024, at 7:08 PM, Barry Smith <bsmith at petsc.dev> wrote:
>>>>>> 
>>>>>> 
>>>>>>   Please take a look at the notes in https://petsc.org/release/manualpages/Sys/PetscShmgetAllocateArray/. For some reason your program is not able to access/use the Unix shared memory; check whether the shared memory is already in use (so it is not available for a new run) or whether the system limits are too low to allocate enough memory.
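>>>>>> 
>>>>>>    As a quick way to inspect those limits (a sketch; these sysctl names are the standard System V ones, but defaults and availability differ between systems):
>>>>>> 
>>>>>> # Linux: max bytes per segment, total pages, max number of segments
>>>>>> sysctl kernel.shmmax kernel.shmall kernel.shmmni
>>>>>> # macOS: the same limits live under the kern.sysv namespace
>>>>>> sysctl kern.sysv.shmmax kern.sysv.shmall kern.sysv.shmmni kern.sysv.shmseg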
>>>>>> 
>>>>>>    Barry
>>>>>> 
>>>>>>> On Oct 23, 2024, at 8:23 AM, Praveen C <cpraveen at gmail.com> wrote:
>>>>>>> 
>>>>>>> Dear all
>>>>>>> 
>>>>>>> I am not able to run the Boussinesq example from GeoClaw using petsc@3.22.0
>>>>>>> 
>>>>>>> https://github.com/clawpack/geoclaw/tree/3303883f46572c58130d161986b8a87a57ca7816/examples/bouss
>>>>>>> 
>>>>>>> It runs with petsc@3.21.6
>>>>>>> 
>>>>>>> The error I get is given below. After printing this, the code does not progress.
>>>>>>> 
>>>>>>> I use the following petsc options
>>>>>>> 
>>>>>>> # set minimum number of matrix rows per MPI rank (default is 10000)
>>>>>>> -mpi_linear_solve_minimum_count_per_rank 5000
>>>>>>> 
>>>>>>> 
>>>>>>> # Krylov linear solver:
>>>>>>> -mpi_linear_solver_server
>>>>>>> -mpi_linear_solver_server_view
>>>>>>> -ksp_type gmres
>>>>>>> -ksp_max_it 200
>>>>>>> -ksp_reuse_preconditioner
>>>>>>> -ksp_rtol 1.e-9
>>>>>>> 
>>>>>>> # preconditioner:
>>>>>>> -pc_type gamg
>>>>>>> 
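>>>>>>> The "source: file" tags in the error output below confirm these are read from a PETSc options file. As a sketch (the file name here is hypothetical, and the GeoClaw example presumably wires this up in its own build/run scripts), such a file can be handed to any PETSc program through the standard -options_file option or the PETSC_OPTIONS environment variable:
>>>>>>> 
>>>>>>> PETSC_OPTIONS="-options_file petsc_options.txt" ./xgeoclaw
>>>>>>> 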
>>>>>>> I installed petsc and other dependencies for clawpack using miniforge.
>>>>>>> 
>>>>>>> Thanks
>>>>>>> pc
>>>>>>> 
>>>>>>> ==> Use Bouss. in water deeper than    1.0000000000000000
>>>>>>>  Using a PETSc solver
>>>>>>> Using Bouss equations from the start
>>>>>>> rnode allocated...
>>>>>>> node allocated...
>>>>>>> listOfGrids allocated...
>>>>>>> Storage allocated...
>>>>>>> bndList allocated...
>>>>>>> Gridding level   1 at t =  0.000000E+00:     4 grids with       10000 cells
>>>>>>>  Setting initial dt to    2.9999999999999999E-002
>>>>>>> max threads set to            6
>>>>>>>  Done reading data, starting computation ...
>>>>>>>  Total zeta at initial time:    39269.907650665169
>>>>>>> GEOCLAW: Frame    0 output files done at time t =  0.000000D+00
>>>>>>> 
>>>>>>> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
>>>>>>> [0]PETSC ERROR: Petsc has generated inconsistent data
>>>>>>> [0]PETSC ERROR: Unable to locate PCMPI allocated shared address 0x55e6d750ae20
>>>>>>> [0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the program crashed before usage or a spelling mistake, etc!
>>>>>>> [0]PETSC ERROR:   Option left: name:-ksp_max_it value: 200 source: file
>>>>>>> [0]PETSC ERROR:   Option left: name:-ksp_reuse_preconditioner (no value) source: file
>>>>>>> [0]PETSC ERROR:   Option left: name:-ksp_rtol value: 1.e-9 source: file
>>>>>>> [0]PETSC ERROR:   Option left: name:-ksp_type value: gmres source: file
>>>>>>> [0]PETSC ERROR:   Option left: name:-mpi_linear_solve_minimum_count_per_rank value: 5000 source: file
>>>>>>> [0]PETSC ERROR:   Option left: name:-mpi_linear_solver_server_view (no value) source: file
>>>>>>> [0]PETSC ERROR:   Option left: name:-pc_type value: gamg source: file
>>>>>>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
>>>>>>> [0]PETSC ERROR: Petsc Release Version 3.22.0, Sep 28, 2024
>>>>>>> [0]PETSC ERROR: /home/praveen/bouss/radial_flat/xgeoclaw with 6 MPI process(es) and PETSC_ARCH  on euler by praveen Thu Oct 17 21:49:54 2024
>>>>>>> [0]PETSC ERROR: Configure options: AR=${PREFIX}/bin/x86_64-conda-linux-gnu-ar CC=mpicc CXX=mpicxx FC=mpifort CFLAGS="-march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /opt/miniforge/envs/claw/include  " CPPFLAGS="-DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /opt/miniforge/envs/claw/include" CXXFLAGS="-fvisibility-inlines-hidden -fmessage-length=0 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /opt/miniforge/envs/claw/include  " FFLAGS="-march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /opt/miniforge/envs/claw/include   -Wl,--no-as-needed" LDFLAGS="-pthread -fopenmp -Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,--disable-new-dtags -Wl,--gc-sections -Wl,--allow-shlib-undefined -Wl,-rpath,/opt/miniforge/envs/claw/lib -Wl,-rpath-link,/opt/miniforge/envs/claw/lib -L/opt/miniforge/envs/claw/lib -Wl,-rpath-link,/opt/miniforge/envs/claw/lib" LIBS="-Wl,-rpath,/opt/miniforge/envs/claw/lib -lmpi_mpifh -lgfortran" --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 --FOPTFLAGS=-O3 --with-clib-autodetect=0 --with-cxxlib-autodetect=0 --with-fortranlib-autodetect=0 --with-debugging=0 --with-blas-lib=libblas.so --with-lapack-lib=liblapack.so --with-yaml=1 --with-hdf5=1 --with-fftw=1 --with-hwloc=0 --with-hypre=1 --with-metis=1 --with-mpi=1 --with-mumps=1 --with-parmetis=1 --with-pthread=1 --with-ptscotch=1 --with-shared-libraries --with-ssl=0 --with-scalapack=1 --with-superlu=1 --with-superlu_dist=1 --with-superlu_dist-include=/opt/miniforge/envs/claw/include/superlu-dist --with-superlu_dist-lib=-lsuperlu_dist --with-suitesparse=1 --with-suitesparse-dir=/opt/miniforge/envs/claw --with-x=0 --with-scalar-type=real   --with-cuda=0 --prefix=/opt/miniforge/envs/claw
>>>>>>> [0]PETSC ERROR: #1 PetscShmgetMapAddresses() at /home/conda/feedstock_root/build_artifacts/petsc_1728030599661/work/src/sys/utils/server.c:114
>>>>>>> [0]PETSC ERROR: #2 PCMPISetMat() at /home/conda/feedstock_root/build_artifacts/petsc_1728030599661/work/src/ksp/pc/impls/mpi/pcmpi.c:269
>>>>>>> [0]PETSC ERROR: #3 PCSetUp_MPI() at /home/conda/feedstock_root/build_artifacts/petsc_1728030599661/work/src/ksp/pc/impls/mpi/pcmpi.c:853
>>>>>>> [0]PETSC ERROR: #4 PCSetUp() at /home/conda/feedstock_root/build_artifacts/petsc_1728030599661/work/src/ksp/pc/interface/precon.c:1071
>>>>>>> [0]PETSC ERROR: #5 KSPSetUp() at /home/conda/feedstock_root/build_artifacts/petsc_1728030599661/work/src/ksp/ksp/interface/itfunc.c:415
>>>>>>> [0]PETSC ERROR: #6 KSPSolve_Private() at /home/conda/feedstock_root/build_artifacts/petsc_1728030599661/work/src/ksp/ksp/interface/itfunc.c:826
>>>>>>> [0]PETSC ERROR: #7 KSPSolve() at /home/conda/feedstock_root/build_artifacts/petsc_1728030599661/work/src/ksp/ksp/interface/itfunc.c:1075
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
> 
