[mpich-discuss] Errors: In direct memory block for handle type REQUEST, 2 handles are still allocated
Borrelli, Steven
steven.borrelli at citi.com
Wed Oct 26 16:12:36 CDT 2011
I'm seeing the following issues when I attempt to run a program that used to run under 1.2.1p1/mpd. What should my next step be in debugging this issue?
Results of launching mpiexec with various -n values:
N=1, run completes without any issues
N=2,3, run completes, but has the "In direct memory block for handle type REQUEST, 2 handles are still allocated" warning
N=4+, run fails with a signal 11.
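For context, a common cause of the "handles are still allocated" warning (an assumption on my part -- myapp's source isn't shown here) is a nonblocking operation whose request is never completed with MPI_Wait/MPI_Test before MPI_Finalize. A minimal sketch of that pattern:

```c
/* Hypothetical sketch, not taken from myapp.
 * Build: mpicc leak.c -o leak    Run: mpiexec -n 2 ./leak
 * Rank 0 posts an MPI_Isend but never waits on the request; with a
 * --enable-g=all build, MPICH reports the leaked REQUEST handle at
 * MPI_Finalize. */
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, buf = 42;
    MPI_Request req;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        MPI_Isend(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
        /* BUG: missing MPI_Wait(&req, MPI_STATUS_IGNORE); */
    } else if (rank == 1) {
        MPI_Recv(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    MPI_Finalize();  /* debug build warns about the unfreed request */
    return 0;
}
```

The allocation sites in the report below (ch3u_handle_recv_pkt.c, ch3u_eager.c) are receive-side internals, so an eager message that arrives but is never matched by a receive could also leave requests behind.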
Platform: Linux x86_64, g++44 (Red Hat 5.6, gcc44-c++-4.4.4-13.el5)
MPICH built with the following options: ./configure --prefix=/app/mpich2-1.4.1p1-64-DEBUG/ --enable-debuginfo --enable-g=all --with-device=ch3:nemesis
--------------------------------------------------------------------------------
Run results for n = 2, 3 and 4:
# /app/mpich2-1.4.1p1-64-DEBUG/bin/mpiexec -n 2 ./myapp
In direct memory block for handle type REQUEST, 2 handles are still allocated
[1] 48 at [0x00002aaac0813408], ch3u_handle_recv_pkt.c[248]
[1] 8 at [0x00002aaac0005a98], ch3u_eager.c[439]
# /app/mpich2-1.4.1p1-64-DEBUG/bin/mpiexec -n 3 ./myapp
In direct memory block for handle type REQUEST, 2 handles are still allocated
[2] 48 at [0x00002aaab4d62688], ch3u_handle_recv_pkt.c[248]
[2] 8 at [0x00002aaab40018c8], ch3u_eager.c[439]
In direct memory block for handle type REQUEST, 2 handles are still allocated
[1] 48 at [0x00002aaab1082218], ch3u_handle_recv_pkt.c[248]
[1] 8 at [0x00002aaab1083c28], ch3u_eager.c[439]
# /app/mpich2-1.4.1p1-64-DEBUG/bin/mpiexec -n 4 ./myapp
=====================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 11
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
=====================================================================================
APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
--------------------------------------------------------------------------------
Tail of debug.log:
0 0 2b6fc2769760[24377] 32 1.941493 bcast.c 1190 Leaving MPID_STATE_MPIR_BCAST
0 0 2b6fc2769760[24377] 32 1.941496 bcast.c 1469 Leaving MPID_STATE_MPI_BCAST
0 0 2b6fc2769760[24377] 16 1.941513 recv.c 76 Entering MPID_STATE_MPI_RECV
0 0 2b6fc2769760[24377] 16 1.941519 mpid_recv.c 29 Entering MPID_STATE_MPID_RECV
0 0 2b6fc2769760[24377] 16 1.941525 ch3u_recvq.c 273 Entering MPID_STATE_MPIDI_CH3U_RECVQ_FDU_OR_AEP
0 0 2b6fc2769760[24377] 16 1.941530 ch3_progress.c 1148 Entering MPID_STATE_MPIDI_CH3I_POSTED_RECV_ENQUEUED
0 0 2b6fc2769760[24377] 32 1.941534 ch3_progress.c 1202 Leaving MPID_STATE_MPIDI_CH3I_POSTED_RECV_ENQUEUED
0 0 2b6fc2769760[24377] 32 1.941538 ch3u_recvq.c 407 Leaving MPID_STATE_MPIDI_CH3U_RECVQ_FDU_OR_AEP
0 0 2b6fc2769760[24377] 32 1.941542 mpid_recv.c 190 Leaving MPID_STATE_MPID_RECV
0 0 2b6fc2769760[24377] 16 1.941545 ch3_progress.c 296 Entering MPID_STATE_MPIDI_CH3I_PROGRESS