[mpich-discuss] Not able to run MPI program in parallel...

Ju JiaJia jujj603 at gmail.com
Mon Apr 30 23:27:26 CDT 2012


Which process manager are you using? If you are using mpd, make sure mpd
is running and that all the nodes are in the ring. Use mpdtrace -l to check.
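
For example, assuming a host file named mpd.hosts that lists the four compute
nodes (the file name here is just a placeholder for your own setup), the ring
can be brought up and checked roughly like this:

    mpdboot -n 5 -f mpd.hosts   # start mpd on the master plus the 4 hosts listed in mpd.hosts
    mpdtrace -l                 # every node in the ring should be listed with its host name and port
    mpiexec -n 4 ./cpi          # then launch across the ring

If mpdtrace lists only the master, the compute nodes never joined the ring,
which would explain why all four processes ended up on beowulf.master in the
output below.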

On Tue, May 1, 2012 at 5:27 AM, Albert Spade <albert.spade at gmail.com> wrote:

> Hi, I want to run my program in parallel on my small cluster. It has 5
> nodes: one master and 4 compute nodes.
> When I run the program below on an individual machine it works fine and
> gives proper output, but if I run it on the cluster it gives the error
> below. I disabled the firewall.
>
> OUTPUT....
> -----------------
> [root at beowulf ~]# mpiexec -n 4 ./cpi
> Process 2 of 4 is on beowulf.master
> Process 3 of 4 is on beowulf.master
> Process 1 of 4 is on beowulf.master
> Process 0 of 4 is on beowulf.master
> Fatal error in PMPI_Reduce: Other MPI error, error stack:
> PMPI_Reduce(1270)...............: MPI_Reduce(sbuf=0xbfa66ba8,
> rbuf=0xbfa66ba0, count=1, MPI_DOUBLE, MPI_SUM, root=0, MPI_COMM_WORLD)
> failed
> MPIR_Reduce_impl(1087)..........:
> MPIR_Reduce_intra(895)..........:
> MPIR_Reduce_binomial(144).......:
> MPIDI_CH3U_Recvq_FDU_or_AEP(380): Communication error with rank 2
> MPIR_Reduce_binomial(144).......:
> MPIDI_CH3U_Recvq_FDU_or_AEP(380): Communication error with rank 1
>

