[mpich-discuss] Not able to run MPI program in parallel...

Albert Spade albert.spade at gmail.com
Tue May 1 12:39:34 CDT 2012


Hi Pavan,
Thanks for your timely reply.

I have copied the output that I see when I pass the host file manually and
also after setting HYDRA_HOST_FILE.
Am I making a mistake in how I set HYDRA_HOST_FILE?
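
For reference, the host file and the .bashrc entry I am aiming for would look
roughly like the sketch below (beowulf.node1 and beowulf.node2 are placeholders
for my slave nodes, not the actual hostnames; each host-file line can also take
an optional :N process count):

/root/hosts:
beowulf.master
beowulf.node1
beowulf.node2

/root/.bashrc:
export HYDRA_HOST_FILE=/root/hosts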

Output
----------

[root at beowulf ~]# vi .bashrc
[root at beowulf ~]# mpiexec -n 4 /opt/mpich2-1.4.1p1/examples/./cpi
Process 0 of 4 is on beowulf.master
Process 3 of 4 is on beowulf.master
Process 1 of 4 is on beowulf.master
Process 2 of 4 is on beowulf.master
pi is approximately 3.1415926544231239, Error is 0.0000000008333307
wall clock time = 0.000202
[root at beowulf ~]# mpiexec -f hosts -n 4 /opt/mpich2-1.4.1p1/examples/./cpi
Process 0 of 4 is on beowulf.master
Process 3 of 4 is on beowulf.master
Process 1 of 4 is on beowulf.master
Process 2 of 4 is on beowulf.master
Fatal error in PMPI_Reduce: Other MPI error, error stack:
PMPI_Reduce(1270)...............: MPI_Reduce(sbuf=0xbff0fd08,
rbuf=0xbff0fd00, count=1, MPI_DOUBLE, MPI_SUM, root=0, MPI_COMM_WORLD)
failed
MPIR_Reduce_impl(1087)..........:
MPIR_Reduce_intra(895)..........:
MPIR_Reduce_binomial(144).......:
MPIDI_CH3U_Recvq_FDU_or_AEP(380): Communication error with rank 2
MPIR_Reduce_binomial(144).......:
MPIDI_CH3U_Recvq_FDU_or_AEP(380): Communication error with rank 1
^CCtrl-C caught... cleaning up processes
[root at beowulf ~]#

-------------------------------------------

This is the output after setting HYDRA_HOST_FILE:

[root at beowulf ~]# mpiexec -n 4 /opt/mpich2-1.4.1p1/examples/./cpi
Process 2 of 4 is on beowulf.master
Process 3 of 4 is on beowulf.master
Process 1 of 4 is on beowulf.master
Process 0 of 4 is on beowulf.master
Fatal error in PMPI_Reduce: Other MPI error, error stack:
PMPI_Reduce(1270)...............: MPI_Reduce(sbuf=0xbfd69028,
rbuf=0xbfd69020, count=1, MPI_DOUBLE, MPI_SUM, root=0, MPI_COMM_WORLD)
failed
MPIR_Reduce_impl(1087)..........:
MPIR_Reduce_intra(895)..........:
MPIR_Reduce_binomial(144).......:
MPIDI_CH3U_Recvq_FDU_or_AEP(380): Communication error with rank 2
MPIR_Reduce_binomial(144).......:
MPIDI_CH3U_Recvq_FDU_or_AEP(380): Communication error with rank 1
^CCtrl-C caught... cleaning up processes
[root at beowulf ~]#
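
One check that might help narrow this down (assuming the same /root/hosts file
as above): launching a plain command through the host file should print one
hostname per rank, so if all four lines come back as beowulf.master the remote
nodes are not actually being used.

mpiexec -f /root/hosts -n 4 hostname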



On Tue, May 1, 2012 at 7:23 PM, Pavan Balaji <balaji at mcs.anl.gov> wrote:

>
> In the previous error message you sent, all processes were started on
> beowulf.master as well, which means you didn't set the HYDRA_HOST_FILE
> correctly.  What exactly is the error that you are seeing *after* setting
> the HYDRA_HOST_FILE variable?
>
>  -- Pavan
>
>
> On 05/01/2012 08:51 AM, Albert Spade wrote:
>
>> Yes, it is HYDRA_HOST_FILE; sorry for the typo...
>> Also, when I run ./cpi without setting the host file, it works fine on a
>> single machine.
>> Here is a sample output:
>>
>> [root at beowulf ~]# mpiexec -n 4 /opt/mpich2-1.4.1p1/examples/./cpi
>> Process 0 of 4 is on beowulf.master
>> Process 3 of 4 is on beowulf.master
>> Process 2 of 4 is on beowulf.master
>> Process 1 of 4 is on beowulf.master
>> pi is approximately 3.1415926544231239, Error is 0.0000000008333307
>> wall clock time = 0.000333
>> [root at beowulf ~]#
>>
>> On Tue, May 1, 2012 at 7:13 PM, Pavan Balaji <balaji at mcs.anl.gov> wrote:
>>
>>
>>    On 05/01/2012 05:30 AM, Albert Spade wrote:
>>
>>        I also set the environment for the Hydra process manager by adding
>>        export HYDRA_FILE=/root/hosts
>>        to the .bashrc file in /root
>>
>>
>>    Did you mean to set HYDRA_HOST_FILE?
>>
>>    Can you try running ./cpi without setting the HYDRA_HOST_FILE first?
>>
>>      -- Pavan
>>
>>    --
>>    Pavan Balaji
>>    http://www.mcs.anl.gov/~balaji
>>
>>
>>
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji
>