[mpich-discuss] MPI_Barrier(MPI_COMM_WORLD) failed
Xiao Bo Lu
xiao.lu at auckland.ac.nz
Tue Apr 21 00:49:21 CDT 2009
Hi Gauri,
I think we are using an NFS file system on Linux. I also noticed
that when I ran "make testing", a few tests failed, mainly the I/O
ones, for example:
Looking in ./f77/io/testlist
Unexpected output in iwriteatf: mpiexec_hpc2 (handle_sig_occurred 1144):
job ending due to env var MPIEXEC_TIMEOUT=180
Program iwriteatf exited without No Errors
Unexpected output in iwritef: mpiexec_hpc2 (handle_sig_occurred 1144):
job ending due to env var MPIEXEC_TIMEOUT=180
Program iwritef exited without No Errors
Looking in ./cxx/io/testlist
Unexpected output in iwriteatx: mpiexec_hpc2 (handle_sig_occurred 1144):
job ending due to env var MPIEXEC_TIMEOUT=180
Program iwriteatx exited without No Errors
Unexpected output in iwritex: mpiexec_hpc2 (handle_sig_occurred 1144):
job ending due to env var MPIEXEC_TIMEOUT=180
Program iwritex exited without No Errors
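Since all four failures are MPI-IO write tests that hit the timeout, it may help to confirm which file system the test directory is on. The following is a minimal sketch, assuming a Linux machine where the mount table can be read from /proc/mounts. The function names are made up for illustration. The "noac" check reflects the ROMIO documentation's note that MPI-IO on NFS needs working fcntl locks and attribute caching turned off; whether that is the cause here is only a guess.

```python
import os

def parse_mounts(mounts_text):
    """Parse /proc/mounts-style text into (mount_point, fs_type, options) tuples."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4:
            # Fields are: device, mount point, file system type, options, ...
            entries.append((fields[1], fields[2], fields[3].split(",")))
    return entries

def mount_for(path, entries):
    """Return the entry with the longest mount point that contains path."""
    path = os.path.abspath(path)
    best = None
    for entry in entries:
        prefix = entry[0].rstrip("/") + "/"
        if (path + "/").startswith(prefix):
            # On ties, prefer the later entry, since later mounts shadow earlier ones.
            if best is None or len(entry[0]) >= len(best[0]):
                best = entry
    return best

def describe(path, entries):
    """Summarize whether path is on NFS and whether 'noac' is among the options."""
    entry = mount_for(path, entries)
    if entry is None:
        return "no matching mount found"
    mount_point, fs_type, options = entry
    if not fs_type.startswith("nfs"):
        return "%s (%s): not NFS" % (mount_point, fs_type)
    noac = "noac" in options
    return "%s (%s): NFS, noac %s" % (mount_point, fs_type, "set" if noac else "not set")
```

To use it, read /proc/mounts into a string, pass it to parse_mounts(), and call describe() with the directory where the tests write their files. Mount points containing spaces are escaped in /proc/mounts and are not handled by this sketch.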
Another odd thing is that my mpiexec seems to work fine with small
problems but not with large ones. I installed MUMPS (a parallel
numerical solver) and its MPI examples passed.
Regards
Xiao
Gauri Kulkarni wrote:
> Hi,
>
> I have no experience with MPICH, but just want to chip in. Recently, I
> got errors like these as well (they may not be related, but still). The
> solution - or rather resolution - that I found from discussion here is
> that my version of MPICH2 is configured for use with SLURM, meaning
> the process manager is SLURM. If I start an mpd and run a program
> that is compiled with my version of MPICH2, then I get these errors.
> What process manager are you using?
>
> mpiexec -np 2 ./helloworld.mympi
>
>
> Fatal error in MPI_Finalize: Other MPI error, error stack:
> MPI_Finalize(255)...................: MPI_Finalize failed
> MPI_Finalize(154)...................:
> MPID_Finalize(94)...................:
> MPI_Barrier(406)....................: MPI_Barrier(comm=0x44000002) failed
> MPIR_Barrier(77)....................:
> MPIC_Sendrecv(120)..................:
> MPID_Isend(103).....................: failure occurred while
> attempting to send an eager message
> MPIDI_CH3_iSend(172)................:
> MPIDI_CH3I_VC_post_sockconnect(1090):
> MPIDI_PG_SetConnInfo(615)...........: PMI_KVS_Get failed
> Status of MPI_Init = 0 Status of MPI_Comm_Rank = 0 Status of MPI_Comm_Size = 0
> Hello world! I'm 1 of 2 on n53
> Fatal error in MPI_Finalize: Other MPI error, error stack:
> MPI_Finalize(255)...................: MPI_Finalize failed
> MPI_Finalize(154)...................:
> MPID_Finalize(94)...................:
> MPI_Barrier(406)....................: MPI_Barrier(comm=0x44000002) failed
> MPIR_Barrier(77)....................:
> MPIC_Sendrecv(120)..................:
> MPID_Isend(103).....................: failure occurred while
> attempting to send an eager message
> MPIDI_CH3_iSend(172)................:
> MPIDI_CH3I_VC_post_sockconnect(1090):
> MPIDI_PG_SetConnInfo(615)...........: PMI_KVS_Get failed
> Status of MPI_Init = 0 Status of MPI_Comm_Rank = 0 Status of MPI_Comm_Size = 0
> Hello world! I'm 0 of 2 on n53
>
> Gauri.
> ---------
>
>
> On Mon, Apr 20, 2009 at 11:20 AM, Xiao Bo Lu
> <xiao.lu at auckland.ac.nz> wrote:
>
> Hi all,
>
> I've recently installed MPICH2-1.0.8 on my local machine (x86_64
> Linux, gfortran 4.1.2) and I am now experiencing errors with my
> MPI code. The error messages are:
>
> Fatal error in MPI_Barrier: Other MPI error, error stack:
> MPI_Barrier(406)..........................:
> MPI_Barrier(MPI_COMM_WORLD) failed
> MPIR_Barrier(77)..........................:
> MPIC_Sendrecv(126)........................:
> MPIC_Wait(270)............................:
> MPIDI_CH3i_Progress_wait(215).............: an error occurred
> while handling an event returned by MPIDU_Sock_Wait()
> MPIDI_CH3I_Progress_handle_sock_event(420):
> MPIDU_Socki_handle_read(637)..............: connection failure
> (set=0,sock=1,errno=104:Connection reset by peer)
> [cli_0]: aborting job:
> Fatal error in MPI_Barrier: Other MPI error, error stack:
> MPI_Barrier(406)..........................:
> MPI_Barrier(MPI_COMM_WORLD) failed
> MPIR_Barrier(77)..........................:
> MPIC_Sendrecv(126)........................:
> MPIC_Wait(270)............................:
> MPIDI_CH3i_Progress_wait(215).............: an error occurred
> while handling an event returned by MPIDU_Sock_Wait()
> MPIDI_CH3I_Progress_handle_sock_event(420):
> MPIDU_Socki_handle_read size of processor is: 4
>
> I searched the net and found quite a few links about this error,
> but none of the posts gave a definitive fix. Does anyone
> know what could cause it (e.g. an incorrect installation or
> environment settings) and how to fix it?
>
> Regards
> Xiao
>
>
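The error stack in the original message reports errno=104, which on Linux is ECONNRESET ("Connection reset by peer"). In an MPI job this often means that another rank or the process manager closed its socket first, so the rank that printed the stack may not be the one that failed. When there are many such logs to go through, a small script can pull out the errno values. This is a minimal sketch with an illustrative function name, assuming the text follows the "errno=N:message" pattern shown above.

```python
import re

# Matches e.g. "errno=104:Connection reset by peer" up to the closing bracket.
ERRNO_RE = re.compile(r"errno=(\d+):([^)\]]+)")

def extract_errnos(log_text):
    """Return (errno_number, message) pairs found in MPICH-style error text."""
    # Mail clients wrap long lines and add "> " quote markers, so join
    # wrapped lines into one before searching.
    flat = re.sub(r"\s*\n>?\s*", " ", log_text)
    return [(int(num), msg.strip()) for num, msg in ERRNO_RE.findall(flat)]
```

Calling extract_errnos() on the quoted stack returns a list with one pair, the number 104 and its message, which can then be compared across ranks to see which process reported a failure first.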