[mpich-discuss] [MPICH2]Fatal error in PMPI_Bcast: A process has failed, error stack

ugiwgh ugiwgh at gmail.com
Thu Nov 8 08:59:00 CST 2012


I have set up two machines with mpich2-1.5. One is named "console", the other is named "node".
My command is "/usr/local/mpich2-1.5/bin/mpirun -f /etc/hydra/hosts  /usr/local/mpich2-1.5/share/examples/logging/cpilog"
When I run it on "node", it runs fine, but it fails when I run it on "console".
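
Since the same command behaves differently depending on which machine launches it, one check worth running on both "console" and "node" is whether every name in the host file resolves, and to which address. The sketch below is only an illustration, not a confirmed diagnosis of this failure. It assumes /etc/hydra/hosts lists one host per line, optionally followed by ":N" for the process count, with "#" starting a comment. The helper names parse_hosts, check_resolution and report are hypothetical and are not part of MPICH or Hydra.

```python
import socket


def parse_hosts(text):
    """Extract host names from host-file text (assumed format: 'name[:N]' per line)."""
    hosts = []
    for line in text.splitlines():
        # Drop anything after '#' and skip blank lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # Keep the first token and strip an optional ':N' process count.
        hosts.append(line.split()[0].split(":", 1)[0])
    return hosts


def check_resolution(hosts, resolver=socket.gethostbyname):
    """Map each host name to its resolved address, or None if lookup fails."""
    results = {}
    for host in hosts:
        try:
            results[host] = resolver(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            results[host] = None
    return results


def report(path):
    """Print one line per host; flag names that fail or map to a loopback address."""
    with open(path) as handle:
        hosts = parse_hosts(handle.read())
    for host, address in check_resolution(hosts).items():
        if address is None:
            note = "does not resolve"
        elif address.startswith("127."):
            note = address + " (loopback, other machines cannot reach this)"
        else:
            note = address
        print(host + ": " + note)
```

Calling report("/etc/hydra/hosts") on each machine and comparing the two outputs shows whether "console" and "node" agree on each other's addresses. A name that resolves to a 127.x.x.x address on one machine is a possible, but here unverified, reason for a "Communication error with rank 1".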

The following is the error message:
--------------
 /usr/local/mpich2-1.5/bin/mpirun -f /etc/hydra/hosts  /usr/local/mpich2-1.5/share/examples/logging/cpilog
Fatal error in PMPI_Bcast: A process has failed, error stack:
PMPI_Bcast(1525)...............: MPI_Bcast(buf=0x170e188, count=1, MPI_INT, root=0, MPI_COMM_WORLD) failed
MPIR_Bcast_impl(1369)..........:
MPIR_Bcast_intra(1199).........:
MPIR_Bcast_binomial(195).......:
MPIC_Send(63)..................:
MPIDI_EagerContigShortSend(261): failure occurred while attempting to send an eager message
MPIDI_CH3_iStartMsg(36)........: Communication error with rank 1

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   EXIT CODE: 1
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
[proxy:0:1@node] HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:883): assert (!closed) failed
[proxy:0:1@node] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[proxy:0:1@node] main (./pm/pmiserv/pmip.c:210): demux engine error waiting for event
[mpiexec@console.paratera.com] HYDT_bscu_wait_for_completion (./tools/bootstrap/utils/bscu_wait.c:76): one of the processes terminated badly; aborting
[mpiexec@console.paratera.com] HYDT_bsci_wait_for_completion (./tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting for completion
[mpiexec@console.paratera.com] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:216): launcher returned error waiting for completion
[mpiexec@console.paratera.com] main (./ui/mpich/mpiexec.c:325): process manager error waiting for completion


Any help will be appreciated.
GHui

