[mpich-discuss] runtime segfault: mpich2-1.3.2 with pgi v11.5 on rhel5.6 system
Limin Gu
lgu at penguincomputing.com
Thu May 26 12:46:34 CDT 2011
Hi Everybody,
I need some help with mpich2-1.3.2 built with pgi v11.5 on a rhel5.6
system. Previously I compiled and ran mpich2-1.3.2 successfully with
pgi v10.0 on rhel5.6. Recently I upgraded the pgi compiler from v10.0
to v11.5 and followed exactly the same steps to compile mpich2-1.3.2.
The compilation was fine, but I get a segfault when I try to run
mpiexec or mpirun.
The steps are below:
[root at flatline mpich2-1.3.2]# env CC=pgcc FC=pgf90 F77=pgf77 CXX=pgCC
./configure --prefix=/home/lgu/mpich2_install --enable-shared
[root at flatline mpich2-1.3.2]# make
[root at flatline mpich2-1.3.2]# make install
[root at flatline mpich2-1.3.2]#
[root at flatline mpich2-1.3.2]# which mpiexec
/home/lgu/mpich2_install/bin/mpiexec
[root at flatline mpich2-1.3.2]# which mpicc
/home/lgu/mpich2_install/bin/mpicc
[root at flatline mpich2-1.3.2]# which pgcc
/usr/pgi/linux86-64/11.5/bin/pgcc
[root at flatline mpich2-1.3.2]#
[root at flatline mpich2-1.3.2]# cd examples/
[root at flatline examples]# mpicc -o cpi cpi.c
[root at flatline examples]# mpiexec -hosts master -np 1 ./cpi
Segmentation fault
[root at flatline examples]# mpiexec
Segmentation fault
[root at flatline examples]#
When I ran "strace mpiexec", it showed mmap trying to allocate a huge
amount of memory and failing:
open("/sys/devices/system/node", O_RDONLY|O_NONBLOCK|O_DIRECTORY) = 3
fcntl(3, F_SETFD, FD_CLOEXEC) = 0
brk(0) = 0x19125000
brk(0x1914e000) = 0x1914e000
getdents(3, /* 4 entries */, 32768) = 112
getdents(3, /* 0 entries */, 32768) = 0
close(3) = 0
mmap(NULL, 18446744073223036928, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 18446744073223168000, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 134217728, PROT_NONE,
MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x2b4e88607000
munmap(0x2b4e88607000, 60788736) = 0
munmap(0x2b4e90000000, 6320128) = 0
mprotect(0x2b4e8c000000, 135168, PROT_READ|PROT_WRITE) = 0
mmap(NULL, 18446744073223036928, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
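One observation about the sizes in the trace above: both failing mmap
lengths are just under 2^64, which is what a small negative number looks
like when it is reinterpreted as an unsigned 64-bit size. Below is a
minimal Python sketch of that arithmetic. The helper name as_signed64 is
made up for illustration, and the idea that the caller computed a
negative size is only a guess; the trace alone does not prove it.

```python
# Reinterpret an unsigned 64-bit value as a signed (two's complement) one.
# as_signed64 is a hypothetical helper, not part of any MPICH or PGI tool.

def as_signed64(value):
    """Return value interpreted as a signed 64-bit integer."""
    if value >= 2**63:
        return value - 2**64
    return value

# The two lengths passed to the failing mmap calls in the strace output.
failing_lengths = [18446744073223036928, 18446744073223168000]

for length in failing_lengths:
    print(length, "->", as_signed64(length))
# 18446744073223036928 -> -486514688
# 18446744073223168000 -> -486383616
```

If that reading is right, the request was never a genuine multi-exabyte
allocation but a size calculation that went negative before being passed
to the allocator.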
I have also tried to configure with "CC=pgcc FC=pgfortran
F77=pgfortran CXX=pgcpp CFLAGS=-fast FCFLAGS=-fast FFLAGS=-fast
CXXFLAGS=-fast" as the pgi guide suggests, and I got the same segfault.
Does anybody know what might be causing this problem?
BTW, this happens on rhel5.6; I have successfully built and run
mpich2-1.3.2 with the new pgi v11.5 on rhel4.9.
Thank you very much for any help on this!
Limin