[mpich-discuss] Not able to run MPI program in parallel...

Pavan Balaji balaji at mcs.anl.gov
Tue May 1 12:45:21 CDT 2012


On 05/01/2012 12:39 PM, Albert Spade wrote:
> [root at beowulf ~]# mpiexec -f hosts -n 4 /opt/mpich2-1.4.1p1/examples/./cpi
> Process 0 of 4 is on beowulf.master
> Process 3 of 4 is on beowulf.master
> Process 1 of 4 is on beowulf.master
> Process 2 of 4 is on beowulf.master
> Fatal error in PMPI_Reduce: Other MPI error, error stack:
> PMPI_Reduce(1270)...............: MPI_Reduce(sbuf=0xbff0fd08,
> rbuf=0xbff0fd00, count=1, MPI_DOUBLE, MPI_SUM, root=0, MPI_COMM_WORLD)
> failed
> MPIR_Reduce_impl(1087)..........:
> MPIR_Reduce_intra(895)..........:
> MPIR_Reduce_binomial(144).......:
> MPIDI_CH3U_Recvq_FDU_or_AEP(380): Communication error with rank 2
> MPIR_Reduce_binomial(144).......:
> MPIDI_CH3U_Recvq_FDU_or_AEP(380): Communication error with rank 1
> ^CCtrl-C caught... cleaning up processes

In your previous email you said that your host file contains this:

beowulf.master
beowulf.node1
beowulf.node2
beowulf.node3
beowulf.node4

The above output does not match this host file: with that file, process 1
should have been scheduled on beowulf.node1, yet all four processes report
beowulf.master. So something is not correct here. Are you sure the
information you gave us is right?
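
For reference, hydra also accepts an explicit per-host process count in
the host file ("hostname:n"), which makes the intended mapping easier to
verify; a sketch, assuming one process per machine:

   beowulf.master:1
   beowulf.node1:1
   beowulf.node2:1
   beowulf.node3:1
   beowulf.node4:1

With that file, "mpiexec -f hosts -n 4" should place ranks 0 through 3 on
the first four hosts, one each.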
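
If you want to double-check the mapping independently of cpi, a minimal
program along these lines prints where each rank actually lands (a
sketch, not taken from the examples directory):

   #include <stdio.h>
   #include <mpi.h>

   int main(int argc, char **argv)
   {
       int rank, size, len;
       char name[MPI_MAX_PROCESSOR_NAME];

       MPI_Init(&argc, &argv);
       MPI_Comm_rank(MPI_COMM_WORLD, &rank);
       MPI_Comm_size(MPI_COMM_WORLD, &size);

       /* Report the host this rank is running on; if all ranks
        * print beowulf.master, the host file is not being honored. */
       MPI_Get_processor_name(name, &len);
       printf("Rank %d of %d running on %s\n", rank, size, name);

       MPI_Finalize();
       return 0;
   }

Build it with mpicc and launch it with the same mpiexec command line.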

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji
