[MPICH] max number of isend/irecv allowed?
Rajeev Thakur
thakur at mcs.anl.gov
Fri Feb 15 21:53:07 CST 2008
The error message shows a connection failure. Can you try with the new 1.0.7
rc1?
Rajeev
> -----Original Message-----
> From: owner-mpich-discuss at mcs.anl.gov
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Wei-keng Liao
> Sent: Friday, February 15, 2008 3:41 PM
> To: mpich-discuss at mcs.anl.gov
> Subject: [MPICH] max number of isend/irecv allowed?
>
>
> Is there a maximum number of MPI isend/irecv calls allowed per
> process before MPI_Waitall is called?
>
> I am seeing the error message below when a large number of
> isend/irecv calls are used (e.g., with 512 processes):
>
> [cli_53]: aborting job:
> Fatal error in MPI_Waitall: Other MPI error, error stack:
> MPI_Waitall(258)............................: MPI_Waitall(count=1024, req_array=0x5f7730, status_array=0x8176c0) failed
> MPIDI_CH3i_Progress_wait(215)...............: an error occurred while handling an event returned by MPIDU_Sock_Wait()
> MPIDI_CH3I_Progress_handle_sock_event(779)..:
> MPIDI_CH3_Sockconn_handle_connect_event(608): [ch3:sock] failed to connnect to remote process
> MPIDU_Socki_handle_connect(791).............: connection failure (set=0,sock=18,errno=110:(strerror() not found))
>
> INTERNAL ERROR: Invalid error class (66) encountered while returning from MPI_Waitall. Please file a bug report. No error stack is available.
> [cli_29]: aborting job:
>
> The attached program reproduces the error. The error occurs
> only when running more than 512 processes. (I tested 8
> processes per node; each node has 2 CPUs.) This program is
> extracted from ADIOI_Calc_others_req(). I found that the
> collective I/O crash is due to this error. I think this may
> also be related to the hanging problem I posted earlier, which
> is not yet solved.
>
> I am using mpich2-1.0.6p1.
>
> Wei-keng
>
>
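
The error stack above reports errno=110 without a message ("strerror() not found"). As a minimal sketch for decoding that number, the snippet below looks up an errno value on the local machine. The helper name describe_errno is hypothetical and not part of MPICH. Note that errno values are platform-specific: on Linux, 110 is ETIMEDOUT ("Connection timed out"), which would be consistent with a connection attempt that timed out, but the mapping should be checked on the system that produced the log.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for an errno value on this platform.

    Hypothetical helper for illustration; not part of MPICH.
    """
    # errno.errorcode maps numeric values to names such as "ETIMEDOUT".
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror returns the platform's message text for the code.
    message = os.strerror(code)
    return name, message


if __name__ == "__main__":
    # 110 is the value from the error stack; its meaning depends on the OS.
    name, message = describe_errno(110)
    print(f"errno 110 -> {name}: {message}")
```

Running this on one of the compute nodes, rather than on a workstation with a different operating system, gives the interpretation that matches the log.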