[mpich-discuss] Not able to run MPI program in parallel...
Pavan Balaji
balaji at mcs.anl.gov
Tue May 1 12:45:21 CDT 2012
On 05/01/2012 12:39 PM, Albert Spade wrote:
> [root at beowulf ~]# mpiexec -f hosts -n 4 /opt/mpich2-1.4.1p1/examples/./cpi
> Process 0 of 4 is on beowulf.master
> Process 3 of 4 is on beowulf.master
> Process 1 of 4 is on beowulf.master
> Process 2 of 4 is on beowulf.master
> Fatal error in PMPI_Reduce: Other MPI error, error stack:
> PMPI_Reduce(1270)...............: MPI_Reduce(sbuf=0xbff0fd08,
> rbuf=0xbff0fd00, count=1, MPI_DOUBLE, MPI_SUM, root=0, MPI_COMM_WORLD)
> failed
> MPIR_Reduce_impl(1087)..........:
> MPIR_Reduce_intra(895)..........:
> MPIR_Reduce_binomial(144).......:
> MPIDI_CH3U_Recvq_FDU_or_AEP(380): Communication error with rank 2
> MPIR_Reduce_binomial(144).......:
> MPIDI_CH3U_Recvq_FDU_or_AEP(380): Communication error with rank 1
> ^CCtrl-C caught... cleaning up processes
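For context, the call at the top of that error stack is the final
reduction in the cpi example. A minimal sketch of the same pattern
(illustrative only, not the actual examples/cpi.c source) looks like
this:

    #include <mpi.h>
    #include <stdio.h>

    /* Each rank computes a partial value; rank 0 collects the MPI_SUM.
     * The arguments mirror the error stack above: count=1, MPI_DOUBLE,
     * MPI_SUM, root=0, MPI_COMM_WORLD. */
    int main(int argc, char *argv[])
    {
        int rank, size;
        double mypart, total;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        mypart = 1.0 / size;   /* stand-in for cpi's partial pi sum */

        MPI_Reduce(&mypart, &total, 1, MPI_DOUBLE, MPI_SUM, 0,
                   MPI_COMM_WORLD);

        if (rank == 0)
            printf("total = %f\n", total);

        MPI_Finalize();
        return 0;
    }

Nothing is wrong with the reduce call itself; the "Communication error
with rank N" lines typically mean the connections between the processes
broke down once they started talking to each other.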
In your previous email you said that your host file contains this:
beowulf.master
beowulf.node1
beowulf.node2
beowulf.node3
beowulf.node4
The above output does not match that host file. With the default
round-robin placement, process 1 should have been scheduled on
beowulf.node1, not on beowulf.master. So something is not correct here.
Are you sure the information you gave us is right?
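
A quick way to check where Hydra is actually placing processes,
independent of cpi, is to launch a trivial command with the same host
file:

    mpiexec -f hosts -n 4 hostname

If the host file is being honored, this should print beowulf.master
plus node1 through node3, one hostname per line, rather than
beowulf.master four times.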
--
Pavan Balaji
http://www.mcs.anl.gov/~balaji