[MPICH] RE: Send + Send on a 3 node

Calin Iaru calin at dolphinics.no
Tue Jan 8 06:15:04 CST 2008


I would like to give you an update on what happened with Reduce_scatter:
  - the unexpected receive queue has been modified to grow up to a 
limit. However, the problem may be that all the requests in the queue 
come from the same node while the MPI library needs an MPIC_Recv from a 
different node, which would lead to a deadlock.
  - we opted instead for growing the allocation of indirect blocks to 
the maximum allowed. For reference, see MPIU_Handle_obj_alloc() and 
MPIU_Handle_indirect_init(). If there are too many unexpected receives, 
the job fails, which makes for an easier case to debug; a rough sketch 
of this behavior follows below.
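
To make this concrete, here is a minimal, self-contained sketch of the 
behavior just described, under a much simplified layout: a fixed table of 
indirect blocks feeds a free list of request objects, and allocation fails 
once the table is full instead of growing without bound. The constants and 
names are made up for illustration; the real MPIU_Handle_obj_alloc() and 
MPIU_Handle_indirect_init() differ in layout, locking and error handling.

    /* Illustrative bounded allocator, not the actual MPICH code. */
    #include <stdio.h>
    #include <stdlib.h>

    #define HANDLE_BLOCK_SIZE   256    /* objects per indirect block (assumed) */
    #define HANDLE_NUM_INDIRECT 1024   /* hard cap on indirect blocks (assumed) */

    typedef struct request {
        struct request *next;          /* free-list link; real payload omitted */
    } request_t;

    static request_t *avail = NULL;                    /* free list */
    static request_t *indirect[HANDLE_NUM_INDIRECT];   /* indirect block table */
    static int        num_indirect = 0;

    /* Carve one more indirect block into the free list, or fail at the cap. */
    static int grow_pool(void)
    {
        request_t *block;
        int i;

        if (num_indirect >= HANDLE_NUM_INDIRECT)
            return -1;                 /* out of handles: fail instead of deadlocking */
        block = malloc(HANDLE_BLOCK_SIZE * sizeof(*block));
        if (block == NULL)
            return -1;
        indirect[num_indirect++] = block;
        for (i = 0; i < HANDLE_BLOCK_SIZE; i++) {
            block[i].next = avail;
            avail = &block[i];
        }
        return 0;
    }

    /* Allocate one request object; NULL maps to "failure occurred while
     * allocating memory for a request object" in the error stacks below. */
    static request_t *request_alloc(void)
    {
        request_t *r;

        if (avail == NULL && grow_pool() != 0)
            return NULL;
        r = avail;
        avail = r->next;
        return r;
    }

    int main(void)
    {
        long n = 0;

        while (request_alloc() != NULL)
            n++;
        printf("allocated %ld request objects before hitting the cap\n", n);
        return 0;
    }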

Let me know if you find a way to fix this reliably. This problem can 
still occur with the Reduce_scatter implementation in the original 
distribution, although it may be harder to reproduce over ch3_sock.
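
For reference, the 3-node test described in the quoted thread below boils 
down to the pattern sketched here. This is not the blocked SendOn3 
attachment, just a minimal reconstruction with an assumed loop count: 
rank 0 floods rank 1 with eager 256-byte sends, rank 1 blocks sending to 
rank 2, and rank 2 posts no receives, so rank 1's unexpected-request pool 
eventually overflows with the MPIDI_EagerContigIsend error shown below.

    /* Minimal reconstruction of the 3-rank Send + Send pattern (assumed
     * loop count; not the original SendOn3 source, which the list blocked). */
    #include <mpi.h>

    #define MSG_SIZE 256
    #define NUM_MSGS 10000000   /* large enough to exhaust the request pool */

    int main(int argc, char **argv)
    {
        char buf[MSG_SIZE] = {0};
        int rank, i;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            /* Flood rank 1 with eager-sized messages it never receives. */
            for (i = 0; i < NUM_MSGS; i++)
                MPI_Send(buf, MSG_SIZE, MPI_BYTE, 1, 1, MPI_COMM_WORLD);
        } else if (rank == 1) {
            /* Eventually blocks once rank 2's receive buffers are full.
             * While blocked, the progress engine keeps queuing rank 0's
             * messages as unexpected requests until allocation fails. */
            for (i = 0; i < NUM_MSGS; i++)
                MPI_Send(buf, MSG_SIZE, MPI_BYTE, 2, 1, MPI_COMM_WORLD);
        }
        /* rank 2 intentionally posts no receives */

        MPI_Finalize();
        return 0;
    }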

Calin Iaru wrote:
> Is this flow control going to be implemented in the next patch? I 
> think I will make some changes to the old 1.2p1 release on my machine 
> as an alternative.
>
> --------------------------------------------------
> From: "Rajeev Thakur" <thakur at mcs.anl.gov>
> Sent: Thursday, January 03, 2008 8:19 AM
> To: "'Calin Iaru'" <calin at dolphinics.no>
> Cc: <mpich-discuss at mcs.anl.gov>
> Subject: [MPICH] RE: Send + Send on a 3 node
>
>> This kind of problem will require flow control within the MPI
>> implementation.
>>
>> Rajeev
>>
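
To illustrate the flow control Rajeev mentions above: one common approach 
is a credit scheme, where the sender stops injecting eager messages once 
its credits are spent and waits for the receiver to return them. The 
two-rank sketch below only illustrates the idea at the application level, 
with made-up constants; it is not how MPICH implements flow control 
internally.

    /* A two-rank sketch of credit-based flow control at the application
     * level: the sender stops after CREDITS in-flight messages and waits
     * for the receiver to hand the credits back.  Constants are made up. */
    #include <mpi.h>

    #define MSG_SIZE 256
    #define CREDITS  16        /* eager messages allowed in flight */
    #define NUM_MSGS 1024
    #define DATA_TAG 1
    #define ACK_TAG  2

    int main(int argc, char **argv)
    {
        char buf[MSG_SIZE] = {0};
        int rank, i, credits = CREDITS, dummy = 0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {                       /* sender */
            for (i = 0; i < NUM_MSGS; i++) {
                if (credits == 0) {            /* wait for credits to come back */
                    MPI_Recv(&dummy, 1, MPI_INT, 1, ACK_TAG, MPI_COMM_WORLD,
                             MPI_STATUS_IGNORE);
                    credits = CREDITS;
                }
                MPI_Send(buf, MSG_SIZE, MPI_BYTE, 1, DATA_TAG, MPI_COMM_WORLD);
                credits--;
            }
        } else if (rank == 1) {                /* receiver */
            for (i = 0; i < NUM_MSGS; i++) {
                MPI_Recv(buf, MSG_SIZE, MPI_BYTE, 0, DATA_TAG, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                /* return a batch of credits, except after the last message */
                if ((i + 1) % CREDITS == 0 && i + 1 < NUM_MSGS)
                    MPI_Send(&dummy, 1, MPI_INT, 0, ACK_TAG, MPI_COMM_WORLD);
            }
        }

        MPI_Finalize();
        return 0;
    }
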
>>> -----Original Message-----
>>> From: Calin Iaru [mailto:calin at dolphinics.no]
>>> Sent: Wednesday, January 02, 2008 8:35 AM
>>> To: Rajeev Thakur
>>> Cc: mpich-discuss at mcs.anl.gov
>>> Subject: Send + Send on a 3 node
>>>
>>> <resend>The list has blocked zip attachments</resend>
>>>
>>> I have reduced the MPI_Reduce_scatter test to a 3-node test where:
>>>      rank 0 sends to rank 1
>>>      rank 1 sends to rank 2
>>>      rank 2 does nothing.
>>>
>>> As you can see, rank 1 will block on send because rank 2 has
>>> full receive buffers, while rank 0 will continue sending.
>>>
>>> The problem is that on rank 1, the sender also polls for incoming
>>> messages, which are gathered into an unexpected queue. This cannot
>>> go on forever, because only a limited amount of memory is allocated
>>> for the unexpected requests.
>>> The error returned over sockets in 1.0.6p1 is this:
>>>
>>> mpiexec -n 3 -machinefile machines.txt \\linden-4\h$\SendOn3.exe
>>>
>>> job aborted:
>>> rank: node: exit code[: error message]
>>> 0: linden-2: 1
>>> 1: linden-3: 1: Fatal error in MPI_Send: Other MPI error, error stack:
>>> MPI_Send(173).............................:
>>> MPI_Send(buf=008C2418, count=256, MPI_BYTE, dest=2, tag=1,
>>> MPI_COMM_WORLD) failed
>>> MPIDI_CH3i_Progress_wait(215).............: an error occurred
>>> while handling an event returned by MPIDU_Sock_Wait()
>>> MPIDI_CH3I_Progress_handle_sock_event(436):
>>> MPIDI_EagerContigIsend(567)...............: failure occurred
>>> while allocating memory for a request object
>>> 2: linden-4: 1
>>>
>>> I have attached the source code to this mail. It builds only on
>>> 32-bit Windows.
>>>
>>> Calin Iaru wrote:
>>> > It's not so easy, because this is a third-party RDMA integration
>>> > which is now expected to break.
>>> >
>>> > Rajeev Thakur wrote:
>>> >> 1.0.2p1 is a very old version of MPICH2. Some memory leaks have been
>>> >> fixed since then. Please try with the latest release, 1.0.6p1.
>>> >>
>>> >> Rajeev
>>> >>
>>> >>> -----Original Message-----
>>> >>> From: owner-mpich-discuss at mcs.anl.gov
>>> >>> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Calin Iaru
>>> >>> Sent: Friday, December 21, 2007 9:32 AM
>>> >>> To: mpich-discuss at mcs.anl.gov
>>> >>> Subject: [MPICH] MPI_Reduce_scatter
>>> >>>
>>> >>> I am using PALLAS to stress MPI_Reduce_scatter. The error reported
>>> >>> after millions of inner loops is:
>>> >>>
>>> >>> 3: MPI error  875666319 occurred
>>> >>> 3: Other MPI error, error stack:
>>> >>> 3: MPI_Reduce_scatter(1201): MPI_Reduce_scatter(sbuf=0x2aaaabdfb010,
>>> >>> rbuf=0x2aaaac1fc010, rcnts=0x176e1850, MPI_INT, MPI_SUM,
>>> >>> comm=0x84000000) failed
>>> >>> 3: MPIR_Reduce_scatter(372):
>>> >>> 3: MPIC_Send(48):
>>> >>> 3: MPIC_Wait(321):
>>> >>> 3: MPIDI_CH3_Progress(115): Unable to make message passing progress
>>> >>> 3: handle_read(280):
>>> >>> 3: MPIDI_CH3U_Handle_recv_pkt(250): failure occurred while
>>> >>> allocating memory for a request object
>>> >>> 3: aborting job:
>>> >>> 3: application called MPI_Abort(MPI_COMM_WORLD, 875666319) - process 3
>>> >>>
>>> >>>
>>> >>> The library is 1.0.2p1 and I would like to know if there are some
>>> >>> changes that would fix this issue.
>>> >>>
>>> >>> Best regards,
>>> >>>     Calin
>>> >>>
>>> >>>
>>> >>>
>>> >>
>>> >>
>>> >
>>>
>>>
>>>
>>
>>
>



