[petsc-users] petsc 3.4.5 on CentOS 7.2.511

Satish Balay balay at mcs.anl.gov
Fri Nov 18 14:38:20 CST 2016


The stack trace says the crash is in OpenMPI's orterun [aka mpiexec].

Perhaps it's broken?

You can run PETSc examples without mpiexec as:

cd src/ksp/ksp/examples/tutorials
make ex2
./ex2
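
Since the backtrace ends inside orterun itself rather than in ex19,
one way to confirm this is to run the launcher on a trivial non-MPI
command. A minimal sketch, assuming a POSIX shell: check_launcher and
the MPIEXEC variable are hypothetical names introduced here, the
mpiexec path is the one from the report below, and the
lstopo-no-graphics step assumes the hwloc command-line tools are
installed (the trace shows the crash under hwloc_topology_load).

```shell
#!/bin/sh
# Hypothetical helper: run a command quietly and report how it ended.
# An exit status above 128 usually means the process was killed by a
# signal (for example 139 = 128 + 11, SIGSEGV).
check_launcher() {
    "$@" >/dev/null 2>&1
    status=$?
    if [ "$status" -eq 0 ]; then
        echo "ok: $*"
    elif [ "$status" -gt 128 ]; then
        echo "killed by signal $((status - 128)): $*"
    else
        echo "failed with status $status: $*"
    fi
    return "$status"
}

# Path taken from the report; override MPIEXEC for other installs.
MPIEXEC=${MPIEXEC:-/usr/lib64/compat-openmpi16/bin/mpiexec}

# Launch a plain non-MPI program so that PETSc is not involved at all.
if [ -x "$MPIEXEC" ]; then
    check_launcher "$MPIEXEC" -n 1 hostname
else
    echo "mpiexec not found at $MPIEXEC"
fi

# Assumption: hwloc's command-line tool is installed. It exercises
# hwloc topology discovery without going through OpenMPI.
if command -v lstopo-no-graphics >/dev/null 2>&1; then
    check_launcher lstopo-no-graphics
fi
```

If mpiexec segfaults even on hostname, the problem is in the
OpenMPI/hwloc installation and not in the PETSc build.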

I don't understand the tweak. Usually 'compat' packages are used by
precompiled binaries that were compiled with old compilers, and
PETSc shouldn't need them.

Also PETSc can use system blas/lapack - instead of
--download-f2cblaslapack=1 [but that shouldn't cause issues]

Also gcc, openmpi are in the base CentOS 7.2 repo - so I'm not sure I
understand the reference to EPEL.

Satish

 On Fri, 18 Nov 2016, Park, Joseph wrote:

> I'm having difficulty configuring petsc 3.4.5 on a CentOS 7.2 machine. I'm
> forced to use petsc 3.4.5 until the application (C++) can be upgraded.  We
> are restricted to EPEL packages for gcc, OpenMPI and all libraries.
> 
> The only 'tweak' is that EPEL package compat-openmpi16 is used instead
> of openmpi, as the latter seems incompatible with petsc 3.4.2.
> 
> Petsc configures and builds fine via:
> 
> ./configure --download-f2cblaslapack=1
> --with-mpi-dir=/usr/lib64/compat-openmpi16 --with-debugging=1
> --with-clanguage=cxx --with-fc=0
> 
> make PETSC_DIR=/opt/sfwmd_rsm/apps/petsc-3.4.5
> PETSC_ARCH=arch-linux2-cxx-debug all
> 
> However, at test time:
> 
> make PETSC_DIR=/opt/sfwmd_rsm/apps/petsc-3.4.5
> PETSC_ARCH=arch-linux2-cxx-debug test
> 
> Running test examples to verify correct installation
> Using PETSC_DIR=/opt/sfwmd_rsm/apps/petsc-3.4.5 and
> PETSC_ARCH=arch-linux2-cxx-debug
> /usr/bin/sh: line 20: 24136 Segmentation fault
>  /usr/lib64/compat-openmpi16/bin/mpiexec -n 1 ./ex19 -da_refine 3 -pc_type
> mg -ksp_type fgmres > ex19_1.tmp 2>&1
> Possible error running C/C++ src/snes/examples/tutorials/ex19 with 1 MPI
> process
> See http://www.mcs.anl.gov/petsc/documentation/faq.html
> [snailkite:24136] *** Process received signal ***
> [snailkite:24136] Signal: Segmentation fault (11)
> [snailkite:24136] Signal code: Address not mapped (1)
> [snailkite:24136] Failing at address: (nil)
> [snailkite:24136] [ 0] /lib64/libpthread.so.0(+0xf100) [0x7ff020de8100]
> [snailkite:24136] [ 1] /lib64/libc.so.6(+0x85346) [0x7ff020a9c346]
> [snailkite:24136] [ 2] /lib64/libhwloc.so.5(+0x7ccb) [0x7ff021206ccb]
> [snailkite:24136] [ 3]
> /lib64/libhwloc.so.5(hwloc__insert_object_by_cpuset+0xa7) [0x7ff021206e87]
> [snailkite:24136] [ 4] /lib64/libhwloc.so.5(+0x2196e) [0x7ff02122096e]
> [snailkite:24136] [ 5] /lib64/libhwloc.so.5(+0x22828) [0x7ff021221828]
> [snailkite:24136] [ 6] /lib64/libhwloc.so.5(+0x228a3) [0x7ff0212218a3]
> [snailkite:24136] [ 7] /lib64/libhwloc.so.5(hwloc_topology_load+0x13b)
> [0x7ff0212098bb]
> [snailkite:24136] [ 8]
> /usr/lib64/compat-openmpi16/lib/libopen-rte.so.4(orte_odls_base_open+0x7ab)
> [0x7ff021fa6dbb]
> [snailkite:24136] [ 9]
> /usr/lib64/compat-openmpi16/lib/openmpi/mca_ess_hnp.so(+0x2e54)
> [0x7ff01f233e54]
> [snailkite:24136] [10]
> /usr/lib64/compat-openmpi16/lib/libopen-rte.so.4(orte_init+0x193)
> [0x7ff021f7dd83]
> [snailkite:24136] [11] /usr/lib64/compat-openmpi16/bin/mpiexec() [0x403dd5]
> [snailkite:24136] [12] /usr/lib64/compat-openmpi16/bin/mpiexec() [0x403430]
> [snailkite:24136] [13] /lib64/libc.so.6(__libc_start_main+0xf5)
> [0x7ff020a38b15]
> [snailkite:24136] [14] /usr/lib64/compat-openmpi16/bin/mpiexec() [0x403349]
> [snailkite:24136] *** End of error message ***
> /usr/bin/sh: line 20: 24139 Segmentation fault
> 
> Under gdb:
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x00007ffff6647346 in __strcmp_sse2 () from /lib64/libc.so.6
> (gdb)
> (gdb) where
> #0  0x00007ffff6647346 in __strcmp_sse2 () from /lib64/libc.so.6
> #1  0x00007ffff6db1ccb in hwloc_obj_cmp () from /lib64/libhwloc.so.5
> #2  0x00007ffff6db1e87 in hwloc__insert_object_by_cpuset () from
> /lib64/libhwloc.so.5
> #3  0x00007ffff6dcb96e in summarize () from /lib64/libhwloc.so.5
> #4  0x00007ffff6dcc828 in hwloc_look_x86 () from /lib64/libhwloc.so.5
> #5  0x00007ffff6dcc8a3 in hwloc_x86_discover () from /lib64/libhwloc.so.5
> #6  0x00007ffff6db48bb in hwloc_topology_load () from /lib64/libhwloc.so.5
> #7  0x00007ffff7b51dbb in orte_odls_base_open ()
>    from /usr/lib64/compat-openmpi16/lib/libopen-rte.so.4
> #8  0x00007ffff4ddee54 in rte_init () from
> /usr/lib64/compat-openmpi16/lib/openmpi/mca_ess_hnp.so
> #9  0x00007ffff7b28d83 in orte_init () from
> /usr/lib64/compat-openmpi16/lib/libopen-rte.so.4
> #10 0x0000000000403dd5 in orterun ()
> #11 0x0000000000403430 in main ()
> 
> Any suggestions are most welcome!
> 


