[mpich-discuss] Fatal error in PMPI_Bcast:

Fujun Liu liufujun07 at gmail.com
Fri May 27 10:49:21 CDT 2011


The cpi example does not work either. There is no error message, but it never finishes:

xxxx at query:~/MPI$ mpiexec -n 2 -f machinefile /home/netlab/MPI/mpich2-build/examples/cpi
Process 1 of 2 is on query
Process 0 of 2 is on trigger

I think my two hosts are still trying to communicate with each other. Any
suggestions?
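
A minimal sketch of one check that may help narrow down a hang like this. It
assumes (nothing in this thread confirms it) that the cause is a host name
resolving to a loopback address such as 127.0.1.1, in which case the other
host is handed an address it cannot reach. The function names and the script
name check_hosts.py are hypothetical; "query" and "trigger" are the host names
from the output above.

```python
import ipaddress
import socket
import sys


def resolved_addresses(hostname):
    """Return the sorted set of IP addresses that `hostname` resolves to."""
    infos = socket.getaddrinfo(hostname, None)
    return sorted({info[4][0] for info in infos})


def is_loopback(address):
    """True if `address` is a loopback address such as 127.0.1.1 or ::1."""
    # Strip an IPv6 zone suffix like "%eth0" before parsing.
    return ipaddress.ip_address(address.split("%")[0]).is_loopback


def check_host(hostname):
    """Map each resolved address of `hostname` to whether it is loopback."""
    return {addr: is_loopback(addr) for addr in resolved_addresses(hostname)}


if __name__ == "__main__":
    # Host names come from the command line; nothing is looked up without them.
    for host in sys.argv[1:]:
        try:
            print(host, check_host(host))
        except socket.gaierror as exc:
            print(host, "does not resolve:", exc)
```

Run on each host in turn, for example as "python3 check_hosts.py query trigger".
If a host reports its own name as loopback, that would be worth looking at,
though it is only one of several possible causes (a firewall between the hosts
is another).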

Best wishes,


On Fri, May 27, 2011 at 9:42 AM, Dave Goodell <goodell at mcs.anl.gov> wrote:

> Does the "examples/cpi" program from the MPICH2 build directory work
> correctly for you when you run it on multiple nodes?
>
> -Dave
>
> On May 26, 2011, at 5:49 PM CDT, Fujun Liu wrote:
>
> > Hi everyone,
> >
> > When I try one of the examples from
> > http://beige.ucs.indiana.edu/I590/node62.html, I get the error message
> > shown below. The MPI cluster has two hosts. If I run both processes on
> > a single host, everything works fine, but if I run them across the two
> > hosts, the error below occurs. I think the two hosts just can't
> > send/receive messages to each other, but I don't know how to resolve
> > this.
> >
> > Thanks in advance!
> >
> > xxxx at query:~/MPI$ mpiexec -n 2 -f machinefile ./GreetMaster
> > Fatal error in PMPI_Bcast: Other MPI error, error stack:
> > PMPI_Bcast(1430).......................: MPI_Bcast(buf=0x7fff13114cb0, count=8192, MPI_CHAR, root=0, MPI_COMM_WORLD) failed
> > MPIR_Bcast_impl(1273)..................:
> > MPIR_Bcast_intra(1107).................:
> > MPIR_Bcast_binomial(143)...............:
> > MPIC_Recv(110).........................:
> > MPIC_Wait(540).........................:
> > MPIDI_CH3I_Progress(353)...............:
> > MPID_nem_mpich2_blocking_recv(905).....:
> > MPID_nem_tcp_connpoll(1823)............:
> > state_commrdy_handler(1665)............:
> > MPID_nem_tcp_recv_handler(1559)........:
> > MPID_nem_handle_pkt(587)...............:
> > MPIDI_CH3_PktHandler_EagerSend(632)....: failure occurred while posting a receive for message data (MPIDI_CH3_PKT_EAGER_SEND)
> > MPIDI_CH3U_Receive_data_unexpected(251): Out of memory (unable to allocate -1216907051 bytes)
> > [mpiexec at query] ONE OF THE PROCESSES TERMINATED BADLY: CLEANING UP
> > APPLICATION TERMINATED WITH THE EXIT STRING: Hangup (signal 1)
> >
> > --
> > Fujun Liu
> > Department of Computer Science, University of Kentucky, 2010.08-
> > fujun.liu at uky.edu, (859)229-3659
> >
> >
> >
> > _______________________________________________
> > mpich-discuss mailing list
> > mpich-discuss at mcs.anl.gov
> > https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>



-- 
Fujun Liu
Department of Computer Science, University of Kentucky, 2010.08-
fujun.liu at uky.edu, (859)229-3659