[mpich-discuss] Socket closed

Tim Kroeger tim.kroeger at cevis.uni-bremen.de
Wed Nov 4 02:52:21 CST 2009


Dear all,

In my application, I get the following error message:

================================================================

Fatal error in MPI_Allgatherv: Other MPI error, error stack:
MPI_Allgatherv(1143)..............: MPI_Allgatherv(sbuf=0x70cdbbf0, scount=898392, MPI_DOUBLE, rbuf=0x75cc6cf0, rcounts=0x5c247280, displs=0x5c2471c0, MPI_DOUBLE, comm=0xc4000021) failed
MPIR_Allgatherv(789)..............:
MPIC_Sendrecv(161)................:
MPIC_Wait(513)....................:
MPIDI_CH3I_Progress(150)..........:
MPID_nem_mpich2_blocking_recv(948):
MPID_nem_tcp_connpoll(1670).......:
state_commrdy_handler(1520).......:
MPID_nem_tcp_recv_handler(1412)...: socket closed

================================================================
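
For context, the failing call is an ordinary variable-count all-gather of doubles, roughly 900,000 of them (about 7 MB) per rank. Below is a minimal sketch of the same pattern; all buffer sizes and names are illustrative, not taken from my application:

    #include <mpi.h>
    #include <stdlib.h>

    /* Minimal sketch of the failing call pattern: every rank
       contributes a block of doubles, gathered on all ranks.
       All sizes and names here are illustrative. */
    int main(int argc, char **argv)
    {
        int rank, size, i;
        int scount = 898392;   /* per-rank send count, as in the stack above */
        double *sbuf, *rbuf;
        int *rcounts, *displs;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        sbuf    = malloc((size_t)scount * sizeof(double));  /* contents don't matter here */
        rcounts = malloc(size * sizeof(int));
        displs  = malloc(size * sizeof(int));
        for (i = 0; i < size; i++) {
            rcounts[i] = scount;   /* equal counts here; in my code they differ per rank */
            displs[i]  = i * scount;
        }
        rbuf = malloc((size_t)size * scount * sizeof(double));

        MPI_Allgatherv(sbuf, scount, MPI_DOUBLE,
                       rbuf, rcounts, displs, MPI_DOUBLE, MPI_COMM_WORLD);

        free(sbuf); free(rbuf); free(rcounts); free(displs);
        MPI_Finalize();
        return 0;
    }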

This is using mpich2-1.1.1p1.  The problem is reproducible, but it 
appears inside a complex application, and the program runs 
successfully for over two hours before the crash occurs.

Can anybody tell me what exactly this message means, what its 
possible causes are, and how I can track it down efficiently?
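
The only instrumentation I can think of so far is switching the 
communicator to MPI_ERRORS_RETURN, so that the failing call hands back 
an error code I can log myself before aborting.  A sketch, reusing the 
illustrative buffers from above and assuming the failure is actually 
reported through the return code rather than killing the process 
outright (needs <stdio.h>):

    /* Make MPI return error codes instead of aborting the job,
       then log the error text at the call site. */
    char msg[MPI_MAX_ERROR_STRING];
    int err, len;

    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);
    err = MPI_Allgatherv(sbuf, scount, MPI_DOUBLE,
                         rbuf, rcounts, displs, MPI_DOUBLE, MPI_COMM_WORLD);
    if (err != MPI_SUCCESS) {
        MPI_Error_string(err, msg, &len);
        fprintf(stderr, "MPI_Allgatherv failed: %s\n", msg);
        MPI_Abort(MPI_COMM_WORLD, err);
    }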

Best Regards,

Tim

-- 
Dr. Tim Kroeger
tim.kroeger at mevis.fraunhofer.de            Phone +49-421-218-7710
tim.kroeger at cevis.uni-bremen.de            Fax   +49-421-218-4236

Fraunhofer MEVIS, Institute for Medical Image Computing
Universitaetsallee 29, 28359 Bremen, Germany


