[petsc-users] Problem about optimized version
Barry Smith
bsmith at petsc.dev
Wed Aug 31 09:00:40 CDT 2022
Please send configure.log and make.log to petsc-maint at mcs.anl.gov (too large for petsc-users).
Did you use the exact same configure options for the optimized and debug versions of PETSc, except for the option --with-debugging=no?
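(Judging from the configure summary below, the second configure run for hypre seems to have dropped --with-debugging=no: the compiler lines show -g/-O0 and the PETSC_ARCH is arch-linux-c-debug.) One way to keep everything in a single optimized build is a combined configure invocation along these lines; this is only a sketch, and the PETSC_ARCH name is just a suggestion:

    ./configure PETSC_ARCH=arch-linux-c-opt --with-debugging=no \
        --with-mpich-dir=/public3/soft/mpich/mpich-3.4.2 \
        --download-hypre=/public3/home/scg6368/packages_dir/hypre-2.24.0.tar.gz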
The "is processed by two cores serially, not in parallel" behavior usually comes from using a different mpiexec than the one associated with the MPI library that you linked against.
Verify that you use the mpiexec in /public3/soft/mpich/mpich-3.4.2/bin. Also verify that executables linked against that MPI library can actually be run on the front-end; it may be that such executables can only be run on the compute nodes of your system.
cd src/snes/tutorials; make ex19, and try running that example with mpiexec.
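For example, something along these lines (a sketch only; the mpiexec path is taken from your configure output, and PETSC_DIR/PETSC_ARCH are assumed to be set for the build you want to test):

    which mpiexec    # should report /public3/soft/mpich/mpich-3.4.2/bin/mpiexec
    cd $PETSC_DIR/src/snes/tutorials
    make ex19
    /public3/soft/mpich/mpich-3.4.2/bin/mpiexec -n 2 ./ex19 -snes_monitor

If each rank prints its own complete copy of the output, the processes are not communicating, and the wrong mpiexec (or running on the wrong kind of node) is the likely cause.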
Barry
> On Aug 31, 2022, at 9:39 AM, wangxq2020--- via petsc-users <petsc-users at mcs.anl.gov> wrote:
>
> Hi
> I ran the program successfully with the debug version of PETSc. Now I want to use the optimized version, so I configured with ./configure --with-debugging=no --with-packages-download-dir=~/packages_dir --with-mpich-dir="/public3/soft/mpich/mpich-3.4.2". Then I installed the hypre package with ./configure --download-hypre=/public3/home/scg6368/packages_dir/hypre-2.24.0.tar.gz --with-mpich-dir="/public3/soft/mpich/mpich-3.4.2". The result is below:
>
> =============================================================================================
> Configuring PETSc to compile on your system
> =============================================================================================
> =============================================================================================
> ***** WARNING: You have a version of GNU make older than 4.0. It will work, but may not
>       support all the parallel testing options. You can install the latest GNU make with
>       your package manager, such as brew or macports, or use the --download-make option
>       to get the latest GNU make *****
> =============================================================================================
> =============================================================================================
>                  Running configure on HYPRE; this may take several minutes
> =============================================================================================
> =============================================================================================
>                    Running make on HYPRE; this may take several minutes
> =============================================================================================
> =============================================================================================
>                 Running make install on HYPRE; this may take several minutes
> =============================================================================================
> Compilers:
> C Compiler: /public3/soft/mpich/mpich-3.4.2/bin/mpicc -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch -fstack-protector -fvisibility=hidden -g3 -O0
> Version: gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)
> C++ Compiler: /public3/soft/mpich/mpich-3.4.2/bin/mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Wno-lto-type-mismatch -fstack-protector -fvisibility=hidden -g -O0 -std=gnu++11 -fPIC
> Version: g++ (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)
> Fortran Compiler: /public3/soft/mpich/mpich-3.4.2/bin/mpif90 -fPIC -Wall -ffree-line-length-0 -Wno-lto-type-mismatch -Wno-unused-dummy-argument -g -O0
> Version: GNU Fortran (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)
> Linkers:
> Shared linker: /public3/soft/mpich/mpich-3.4.2/bin/mpicc -shared -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch -fstack-protector -fvisibility=hidden -g3 -O0
> Dynamic linker: /public3/soft/mpich/mpich-3.4.2/bin/mpicc -shared -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch -fstack-protector -fvisibility=hidden -g3 -O0
> Libraries linked against: -lquadmath -lstdc++ -ldl
> BlasLapack:
> Library: -llapack -lblas
> Unknown if this uses OpenMP (try export OMP_NUM_THREADS=<1-4> yourprogram -log_view)
> uses 4 byte integers
> MPI:
> Version: 3
> Includes: -I/public3/soft/mpich/mpich-3.4.2/include
> mpiexec: /public3/soft/mpich/mpich-3.4.2/bin/mpiexec
> Implementation: mpich3
> MPICH_NUMVERSION: 30402300
> X:
> Library: -lX11
> pthread:
> Library: -lpthread
> cmake:
> Version: 2.8.12
> /usr/bin/cmake
> hypre:
> Version: 2.24.0
> Includes: -I/public3/home/scg6368/petsc-3.17.2-opt/arch-linux-c-debug/include
> Library: -Wl,-rpath,/public3/home/scg6368/petsc-3.17.2-opt/arch-linux-c-debug/lib -L/public3/home/scg6368/petsc-3.17.2-opt/arch-linux-c-debug/lib -lHYPRE
> regex:
> Language used to compile PETSc: C
> PETSc:
> PETSC_ARCH: arch-linux-c-debug
> PETSC_DIR: /public3/home/scg6368/petsc-3.17.2-opt
> Prefix: <inplace installation>
> Scalar type: real
> Precision: double
> Support for __float128
> Integer size: 4 bytes
> Single library: yes
> Shared libraries: yes
> Memory alignment from malloc(): 16 bytes
> Using GNU make: /usr/bin/gmake
> xxx=========================================================================xxx
> Configure stage complete. Now build PETSc libraries with:
> make PETSC_DIR=/public3/home/scg6368/petsc-3.17.2-opt PETSC_ARCH=arch-linux-c-debug all
> xxx=========================================================================xxx
> [scg6368 at ln1:~/petsc-3.17.2-opt]$ make PETSC_DIR=/public3/home/scg6368/petsc-3.17.2-opt PETSC_ARCH=arch-linux-c-debug all
>
> But after I compiled the C file and ran the program on the login node with mpiexec -n 2 ./test37, it appears that the program is processed by two cores serially, not in parallel. The result is:
> INITIALIZATION TO COMPUTE INITIAL PRESSURE TO CREATE FRACTURE WITH INITIAL VOLUME AT INJECTION RATE = 0.000000e+00 ......
> snesU converged in 0 iterations 2.
> snesV converged in 0 iterations 2.
> V min / max: 1.000000e+00 1.000000e+00
> [0]PETSC ERROR: ------------------------------------------------------------------------
> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [0]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind
> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple MacOS to find memory corruption errors
> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
> [0]PETSC ERROR: to get more information on the crash.
> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [0]PETSC ERROR: Signal received
> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.17.2, Jun 02, 2022
> [0]PETSC ERROR: ./test37 on a arch-linux-c-opt named ln1.para.bscc by scg6368 Wed Aug 31 21:27:26 2022
> [0]PETSC ERROR: Configure options --with-debugging=no
> [0]PETSC ERROR: #1 User provided function() at unknown file:0
> [0]PETSC ERROR: Run with -malloc_debug to check if memory corruption is causing the crash.
> Abort(59) on node 0 (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
> snesU converged in 0 iterations 2.
> --------------------------------------------------------------------------
> Primary job terminated normally, but 1 process returned
> a non-zero exit code. Per user-direction, the job has been aborted.
> --------------------------------------------------------------------------
> snesV converged in 0 iterations 2.
> V min / max: 1.000000e+00 1.000000e+00
> [0]PETSC ERROR: ------------------------------------------------------------------------
> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [0]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind
> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple MacOS to find memory corruption errors
> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
> [0]PETSC ERROR: to get more information on the crash.
> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [0]PETSC ERROR: Signal received
> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.17.2, Jun 02, 2022
>
> Then I ran the program on the compute nodes via an sbatch script, and it appears that there is no hypre. The content of the script is as follows, and the job error file is attached. I don't know where the problem is and would appreciate your help. Thank you very much.
>
> #!/bin/sh
> #An example for MPI job.
> #SBATCH -J PFM_frac53
> #SBATCH -o job_37-1-%j.log
> #SBATCH -e job_37-1-%j.err
> #SBATCH -N 1
> #SBATCH -n 20
> #SBATCH --partition=amd_512
> echo Time is `date`
> echo Directory is $PWD
> echo This job runs on the following nodes:
> echo $SLURM_JOB_NODELIST
> echo This job has allocated $SLURM_JOB_CPUS_PER_NODE cpu cores.
> #module load intelmpi/2018.update4
> #module load intelmpi/2019.update5
> #module load intelmpi/2020
> #module load hpcx/2.9.0/hpcx-intel-2019.update5
> module load mpi/openmpi/4.1.1-gcc7.3.0
> module load mpich/3.4.2
> #module load openmpi/4.0.2/gcc/4.8.5
> #module load openmpi/3.0.5/gcc/9.2.0
> #module load gcc/9.2.0
> #MPIRUN=mpirun #Intel mpi and Open MPI
> MPIRUN=mpiexec #MPICH
> #MPIOPT="-env I_MPI_FABRICS shm:ofi" #Intel MPI 2018 ofa, 2019 and 2020 ofi
> #MPIOPT="--mca btl self,sm,openib --mca btl_openib_cpc_include rdmacm --mca btl_openib_if_include mlx5_0:1" #Open MPI
> MPIOPT="-iface ib0" #MPICH3
> #MPIOPT=
> #$MPIRUN $MPIOPT ./mpiring
> $MPIRUN ./test37 -options_file options/test37.opts -p runtest37-1
>
> <job_37-1-1233983.err>