[mpich-discuss] Strange MPI_Recv error

Xiao Li shinelee.thewise at gmail.com
Thu Feb 10 23:59:26 CST 2011


Hi,

I am running a small MPI program and get the following error.

Fatal error in MPI_Recv: Other MPI error, error stack:
MPI_Recv(186)........................: MPI_Recv(buf=0012FA20, count=1, MPI_INT, src=MPI_ANY_SOURCE, tag=5, MPI_COMM_WORLD, status=0012FA80) failed
MPIDI_CH3I_Progress(335).............:
MPID_nem_mpich2_blocking_recv(906)...:
MPID_nem_newtcp_module_poll(37)......:
MPID_nem_newtcp_module_connpoll(2655):
gen_read_fail_handler(1145)..........: read from socket failed - The specified network name is no longer available.


The structure of the code is roughly as follows.

if rank == 0
{
    for iter = 1 to N
        MPI_Recv from any source
        get the sender's rank from status
        MPI_Send to that rank
    end
}
else
{
    for iter = 1 to N
        MPI_Send to 0
        MPI_Recv from 0
        do some computation here
    end
}

I have checked my code carefully. I even rewrote the core computation code
as a serial version, and that runs without error. Stranger still, the
parallel code crashes at a different iteration of the for loop each time. I
suspect that MPI cannot work in my network environment, which consists of
four Windows XP machines connected by 100 Mbps Ethernet. Could you help me
with this issue?

cheers
Xiao