[mpich-discuss] machinefile error

SULLIVAN David (AREVA) David.Sullivan at areva.com
Tue Jul 20 10:56:07 CDT 2010


Pavan,

I assumed that it was from Hydra because one of the documentation
files I read said it was the default for the latest version of MPICH2.
No, it is not Windows; that program call was just copied as an example
of what I was trying.

If I explicitly call mpiexec.hydra I get all sorts of problems, but I
think it has to do with how the nodes are set up. On advice from
debianclusters.org I entered the command mpd --daemon --ncpus=4. Next,
to see where this is running, I entered mpdtrace -l. The response is
node1_12345 (127.0.0.1). That IP address doesn't look like it will work.
So be gentle: how do I set this up properly, given that the real IP is
170.55.51.23? I think I need to do this on all the nodes, since if I
follow the instructions for testing a ring, each node replies only with
its own identity when I query mpdtrace.

Again, thanks for the guidance,

Dave 

-----Original Message-----
From: mpich-discuss-bounces at mcs.anl.gov
[mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Pavan Balaji
Sent: Tuesday, July 20, 2010 11:12 AM
To: mpich-discuss at mcs.anl.gov
Subject: Re: [mpich-discuss] machinefile error


The error you are reporting doesn't seem to be coming from Hydra's
mpiexec. Also, is this on Windows (since your executable is called
cpi.exe)?

On Windows the default process manager is SMPD, and you'll need to
follow its instructions from the user guide. On UNIX, the default
process manager is MPD for up to MPICH2-1.2.1p1. For MPICH2-1.3, we are
changing the default to Hydra.
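
For reference, here is a minimal sketch of a host file for the setup
described in the original question (three nodes, 4 cores each). The
host names node1, node2 and node3 and the file location are
hypothetical placeholders, not taken from the actual cluster. The
mpiexec lines are left as comments because they need a working
multi-node setup to run.

```shell
#!/bin/sh
# Sketch only: node1..node3 are hypothetical host names, and 4 is the
# per-node core count mentioned in the original question.
hostfile="${TMPDIR:-/tmp}/hosts.example"

: > "$hostfile"                              # start with an empty file
for node in node1 node2 node3; do
    printf '%s:4\n' "$node" >> "$hostfile"   # one "host:nprocs" line per node
done

cat "$hostfile"

# With Hydra, the file is passed with -f (not run here, needs a cluster):
#   mpiexec.hydra -f "$hostfile" -n 12 ./cpi
# With MPD, the same "host:nprocs" lines should work with -machinefile,
# but only after an MPD ring spanning all three nodes is up:
#   mpiexec -machinefile "$hostfile" -n 12 ./cpi
```

Which of the two commented commands applies depends on which process
manager the mpiexec on your PATH belongs to.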

  -- Pavan

On 07/20/2010 07:13 AM, SULLIVAN David (AREVA) wrote:
> Hello all,
>  
> I am having some issues executing a program through mpiexec. I can run
> locally on a 4-core processor with no problem using the -host switch.
> When I define a machinefile I get the following error:
> "Too few entries in machinefile"
> This error is not covered in the manual or FAQs and Google has been of
> little assistance. I have tried the following to no avail:
>  
> 1. An old entry in the list-serve archives recommended a series of
> commands like the following:
>     mpiexec -machinefile mf1.txt -n 1 cpi.exe : -machinefile mf2.txt -n 2
> This yields the same error for too few entries.
>  
> 2. The wiki recommends two courses of action:
>     a) using the -f switch instead of -machinefile. This gets an
>        error that -f needs to be used on its own.
>     b) setting the HYDRA_HOST_FILE variable with no need for any
>        additional switch. This just runs all processes on the local
>        machine.
>  
> I am attempting to run a parallel program written by Los Alamos across
> three nodes with 4 cores each. Any assistance would be greatly
> appreciated.
>
> David Sullivan
>
> *AREVA NP INC*
> 400 Donald Lynch Boulevard
> Marlborough, MA, 01752
> Phone: (508) 573-6721
> Fax: (434) 382-5597
> David.Sullivan at AREVA.com <mailto:David.Sullivan at AREVA.com>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> mpich-discuss mailing list
> mpich-discuss at mcs.anl.gov
> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss

--
Pavan Balaji
http://www.mcs.anl.gov/~balaji