[mpich2-dev] [mvapich-discuss] Need a hint in debugging a problem that only affects a few machines in our cluster.

Mike Heinz michael.heinz at qlogic.com
Wed Jul 15 14:02:03 CDT 2009


Krishna, thanks for the suggestion - but setting MV2_USE_SHMEM_COLL to zero did not seem to change the stack trace much:

Node 0:

0x00002aaaaab5d8b7 in MPIDI_CH3I_MRAILI_Cq_poll (vbuf_handle=0x7fffcb46d698,
    vc_req=0x0, receiving=0, is_blocking=1) at ibv_channel_manager.c:529
529         for (; i < rdma_num_hcas; ++i) {
(gdb) where
#0  0x00002aaaaab5d8b7 in MPIDI_CH3I_MRAILI_Cq_poll (
    vbuf_handle=0x7fffcb46d698, vc_req=0x0, receiving=0, is_blocking=1)
    at ibv_channel_manager.c:529
#1  0x00002aaaaab177fa in MPIDI_CH3I_read_progress (vc_pptr=0x7fffcb46d6a0,
    v_ptr=0x7fffcb46d698, is_blocking=1) at ch3_read_progress.c:143
#2  0x00002aaaaab17464 in MPIDI_CH3I_Progress (is_blocking=1,
    state=<value optimized out>) at ch3_progress.c:202
#3  0x00002aaaaab5bc4e in MPIC_Wait (request_ptr=0x2aaaaae19800)
    at helper_fns.c:269
#4  0x00002aaaaab5c043 in MPIC_Sendrecv (sendbuf=0x10993a80, sendcount=2,
    sendtype=1275069445, dest=1, sendtag=7, recvbuf=0x10993a88, recvcount=2,
    recvtype=1275069445, source=1, recvtag=7, comm=1140850688,
    status=0x7fffcb46d820) at helper_fns.c:125
#5  0x00002aaaaaafe387 in MPIR_Allgather (sendbuf=<value optimized out>,
    sendcount=<value optimized out>, sendtype=1275069445, recvbuf=0x10993a80,
    recvcount=2, recvtype=1275069445, comm_ptr=0x2aaaaae1c1e0)
    at allgather.c:192
#6  0x00002aaaaaafeff9 in PMPI_Allgather (sendbuf=0xffffffffffffffff,
    sendcount=2, sendtype=1275069445, recvbuf=0x10993a80, recvcount=2,
    recvtype=1275069445, comm=1140850688) at allgather.c:866
#7  0x00002aaaaab3b00b in PMPI_Comm_split (comm=1140850688, color=0, key=0,
    newcomm=0x2aaaaae1c2f4) at comm_split.c:196
#8  0x00002aaaaab3cd84 in create_2level_comm (comm=1140850688, size=2,
    my_rank=<value optimized out>) at create_2level_comm.c:142
#9  0x00002aaaaab6877d in PMPI_Init (argc=0x7fffcb46db3c, argv=0x7fffcb46db30)
    at init.c:146
#10 0x0000000000400b2f in main (argc=3, argv=0x7fffcb46dc78) at bw.c:27
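
For context: the top frame is the MRAIL progress engine busy-polling the HCA completion queues (the loop over rdma_num_hcas at ibv_channel_manager.c:529 above). At the verbs level, the pattern that loop ultimately wraps is roughly the sketch below - a simplified illustration that assumes a single CQ obtained earlier from ibv_create_cq(), not the actual MVAPICH2 code - which is why a rank whose expected message never arrives simply spins here forever:

#include <infiniband/verbs.h>

/* Simplified illustration of a blocking progress poll over one completion
 * queue; 'cq' is assumed to have been created earlier with ibv_create_cq().
 * Not MVAPICH2 source - just the shape of the loop seen in the trace. */
static int wait_for_completion(struct ibv_cq *cq, struct ibv_wc *wc)
{
    int n;

    /* Spin until the HCA reports a completed work request.  If the peer
     * never sends the message we are waiting for, this never returns -
     * matching the stack traces on both nodes. */
    do {
        n = ibv_poll_cq(cq, 1, wc);
    } while (n == 0);

    if (n < 0 || wc->status != IBV_WC_SUCCESS)
        return -1;    /* real code would report the failed work request */

    return 0;
}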

Node 1:

MPIDI_CH3I_read_progress (vc_pptr=0x7fff0b10bb50, v_ptr=0x7fff0b10bb48,
    is_blocking=1) at ch3_read_progress.c:143
143         type = MPIDI_CH3I_MRAILI_Cq_poll(v_ptr, NULL, 0, is_blocking);
(gdb) where
#0  MPIDI_CH3I_read_progress (vc_pptr=0x7fff0b10bb50, v_ptr=0x7fff0b10bb48,
    is_blocking=1) at ch3_read_progress.c:143
#1  0x00002afc9fb21f44 in MPIDI_CH3I_Progress (is_blocking=1,
    state=<value optimized out>) at ch3_progress.c:202
#2  0x00002afc9fb6660e in MPIC_Wait (request_ptr=0x2afc9fd242a0)
    at helper_fns.c:269
#3  0x00002afc9fb66a03 in MPIC_Sendrecv (sendbuf=0xf77028, sendcount=2,
    sendtype=1275069445, dest=0, sendtag=7, recvbuf=0xf77020, recvcount=4,
    recvtype=1275069445, source=0, recvtag=7, comm=1140850688,
    status=0x7fff0b10bcd0) at helper_fns.c:125
#4  0x00002afc9fb08ddb in MPIR_Allgather (sendbuf=<value optimized out>,
    sendcount=<value optimized out>, sendtype=1275069445, recvbuf=0xf77020,
    recvcount=2, recvtype=1275069445, comm_ptr=0x2afc9fd26c80)
    at allgather.c:192
#5  0x00002afc9fb09a45 in PMPI_Allgather (sendbuf=0xffffffffffffffff,
    sendcount=2, sendtype=1275069445, recvbuf=0xf77020, recvcount=2,
    recvtype=1275069445, comm=1140850688) at allgather.c:866
#6  0x00002afc9fb4591b in PMPI_Comm_split (comm=1140850688, color=1, key=0,
    newcomm=0x2afc9fd26d94) at comm_split.c:196
#7  0x00002afc9fb478f4 in create_2level_comm (comm=1140850688, size=2,
    my_rank=<value optimized out>) at create_2level_comm.c:142
#8  0x00002afc9fb730a5 in PMPI_Init (argc=0x7fff0b10bfec, argv=0x7fff0b10bfe0)
    at init.c:146
#9  0x0000000000400bcf in main (argc=3, argv=0x7fff0b10c128) at bw.c:27
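
Since both ranks are stuck inside PMPI_Init itself (in the Comm_split/Allgather performed by create_2level_comm), even a trivial MPI program should reproduce the hang on an affected pair of hosts. A minimal test along these lines (just a sketch - the bw.c benchmark in the traces is assumed to be a separate, larger program) would be:

#include <stdio.h>
#include <mpi.h>

/* Minimal init-only test: if the problem is in the communicator setup
 * performed inside MPI_Init, this should hang the same way on a bad
 * pair of machines, before the first printf ever runs. */
int main(int argc, char **argv)
{
    int rank = -1;

    MPI_Init(&argc, &argv);               /* hangs here on the bad pairs */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("rank %d: MPI_Init completed\n", rank);

    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Finalize();
    return 0;
}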

Any suggestions would be appreciated.
--
Michael Heinz
Principal Engineer, Qlogic Corporation
King of Prussia, Pennsylvania
From: kris.c1986 at gmail.com [mailto:kris.c1986 at gmail.com] On Behalf Of Krishna Chaitanya
Sent: Tuesday, July 14, 2009 6:39 PM
To: Mike Heinz
Cc: Todd Rimmer; mvapich-discuss at cse.ohio-state.edu; mpich2-dev at mcs.anl.gov
Subject: Re: [mvapich-discuss] [mpich2-dev] Need a hint in debugging a problem that only affects a few machines in our cluster.

Mike,
         The hang seems to be occurring when the MPI library is trying to create the 2-level communicator during the init phase. Can you try running the test with MV2_USE_SHMEM_COLL=0 (see <http://mvapich.cse.ohio-state.edu/support/user_guide_mvapich2-1.4rc1.html#x1-16000011.74>)? This will ensure that a flat communicator is used for the subsequent MPI calls, which might help us isolate the problem.
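
For reference, the "2-level communicator" is conceptually a node-level split of the communicator followed by a leader communicator, along the lines of the sketch below. This is only an illustration of the Comm_split pattern visible in the stack traces - it assumes a hypothetical node_id() helper and is not the actual create_2level_comm implementation:

#include <mpi.h>

/* Hypothetical helper: returns the same value for every rank running on
 * the same host (e.g. a hash of gethostname()).  Assumed for illustration. */
int node_id(void);

/* Conceptual sketch of a 2-level (intra-node + leader) communicator split;
 * not MVAPICH2's create_2level_comm, just the Comm_split pattern from the
 * stack traces. */
void make_2level(MPI_Comm comm, MPI_Comm *node_comm, MPI_Comm *leader_comm)
{
    int rank, node_rank;

    MPI_Comm_rank(comm, &rank);

    /* Ranks on the same host end up in the same intra-node communicator. */
    MPI_Comm_split(comm, node_id(), rank, node_comm);
    MPI_Comm_rank(*node_comm, &node_rank);

    /* The lowest rank on each node joins the leader communicator; the
     * other ranks get MPI_COMM_NULL. */
    MPI_Comm_split(comm, node_rank == 0 ? 0 : MPI_UNDEFINED, rank, leader_comm);
}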

Thanks,
Krishna

On Tue, Jul 14, 2009 at 5:04 PM, Mike Heinz <michael.heinz at qlogic.com> wrote:

We're having a very odd problem with our fabric where, out of the entire cluster, machine "A" can't run mvapich2 programs with machine "B", and machine "C" can't run programs with machine "D" - even though "A" can run with "D" and "B" can run with "C" - and the rest of the fabric works fine.

1)      There are no IB errors anywhere on the fabric that I can find, and the machines in question all work correctly with mvapich1 and low-level IB tests.

2)      The problem occurs whether using mpd or rsh.

3)      If I attach to the running processes, both machines appear to be waiting for a read operation to complete. (See below.)

Can anyone make a suggestion on how to debug this?

Stack trace for node 0:

#0  0x000000361160abb5 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x00002aaaab08fb6c in mthca_poll_cq (ibcq=0x2060980, ne=1,
    wc=0x7fff9d835900) at src/cq.c:468
#2  0x00002aaaaab5d8d8 in MPIDI_CH3I_MRAILI_Cq_poll (
    vbuf_handle=0x7fff9d8359d8, vc_req=0x0, receiving=0, is_blocking=1)
    at /usr/include/infiniband/verbs.h:934
#3  0x00002aaaaab177fa in MPIDI_CH3I_read_progress (vc_pptr=0x7fff9d8359e0,
    v_ptr=0x7fff9d8359d8, is_blocking=1) at ch3_read_progress.c:143
#4  0x00002aaaaab17464 in MPIDI_CH3I_Progress (is_blocking=1,
    state=<value optimized out>) at ch3_progress.c:202
#5  0x00002aaaaab5bc4e in MPIC_Wait (request_ptr=0x2aaaaae19800)
    at helper_fns.c:269
#6  0x00002aaaaab5c043 in MPIC_Sendrecv (sendbuf=0x217fc50, sendcount=2,
    sendtype=1275069445, dest=1, sendtag=7, recvbuf=0x217fc58, recvcount=2,
    recvtype=1275069445, source=1, recvtag=7, comm=1140850688,
    status=0x7fff9d835b60) at helper_fns.c:125
#7  0x00002aaaaaafe387 in MPIR_Allgather (sendbuf=<value optimized out>,
    sendcount=<value optimized out>, sendtype=1275069445, recvbuf=0x217fc50,
    recvcount=2, recvtype=1275069445, comm_ptr=0x2aaaaae1c1e0)
    at allgather.c:192
#8  0x00002aaaaaafeff9 in PMPI_Allgather (sendbuf=0xffffffffffffffff,
    sendcount=2, sendtype=1275069445, recvbuf=0x217fc50, recvcount=2,
    recvtype=1275069445, comm=1140850688) at allgather.c:866
#9  0x00002aaaaab3b00b in PMPI_Comm_split (comm=1140850688, color=0, key=0,
    newcomm=0x2aaaaae1c2f4) at comm_split.c:196
#10 0x00002aaaaab3cd84 in create_2level_comm (comm=1140850688, size=2,
    my_rank=<value optimized out>) at create_2level_comm.c:142
#11 0x00002aaaaab6877d in PMPI_Init (argc=0x7fff9d835e7c, argv=0x7fff9d835e70)
    at init.c:146
#12 0x0000000000400b2f in main (argc=3, argv=0x7fff9d835fb8) at bw.c:27

Stack trace for node 1:

#0  0x00002ac3cbdac2d2 in MPIDI_CH3I_read_progress (vc_pptr=0x7fffdee81020,
    v_ptr=0x7fffdee81018, is_blocking=1) at ch3_read_progress.c:143
#1  0x00002ac3cbdabf44 in MPIDI_CH3I_Progress (is_blocking=1,
    state=<value optimized out>) at ch3_progress.c:202
#2  0x00002ac3cbdf060e in MPIC_Wait (request_ptr=0x2ac3cbfae2a0)
    at helper_fns.c:269
#3  0x00002ac3cbdf0a03 in MPIC_Sendrecv (sendbuf=0xf79028, sendcount=2,
    sendtype=1275069445, dest=0, sendtag=7, recvbuf=0xf79020, recvcount=4,
    recvtype=1275069445, source=0, recvtag=7, comm=1140850688,
    status=0x7fffdee811a0) at helper_fns.c:125
#4  0x00002ac3cbd92ddb in MPIR_Allgather (sendbuf=<value optimized out>,
    sendcount=<value optimized out>, sendtype=1275069445, recvbuf=0xf79020,
    recvcount=2, recvtype=1275069445, comm_ptr=0x2ac3cbfb0c80)
    at allgather.c:192
#5  0x00002ac3cbd93a45 in PMPI_Allgather (sendbuf=0xffffffffffffffff,
    sendcount=2, sendtype=1275069445, recvbuf=0xf79020, recvcount=2,
    recvtype=1275069445, comm=1140850688) at allgather.c:866
#6  0x00002ac3cbdcf91b in PMPI_Comm_split (comm=1140850688, color=1, key=0,
    newcomm=0x2ac3cbfb0d94) at comm_split.c:196
#7  0x00002ac3cbdd18f4 in create_2level_comm (comm=1140850688, size=2,
    my_rank=<value optimized out>) at create_2level_comm.c:142
#8  0x00002ac3cbdfd0a5 in PMPI_Init (argc=0x7fffdee814bc, argv=0x7fffdee814b0)
    at init.c:146
#9  0x0000000000400bcf in main (argc=3, argv=0x7fffdee815f8) at bw.c:27

--
Michael Heinz
Principal Engineer, Qlogic Corporation
King of Prussia, Pennsylvania




--
In the middle of difficulty, lies opportunity