[mpich-discuss] SMPD, Problem launching when using -host

Jayesh Krishna jayesh at mcs.anl.gov
Fri Oct 10 09:26:27 CDT 2008


 Hi,
  It is recommended that you use the "-path" option available with mpiexec
to specify the path to the executable.
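
As a rough illustration only, the Python sketch below assembles an mpiexec command line that passes "-path" for each executable. The directory C:\myapp\bin is a made-up placeholder, and placing "-path" in every executable section (alongside -n and -host) is an assumption, not something confirmed in this thread.

```python
def mpiexec_with_path(exe_dir, sections):
    """Build an mpiexec argument list, adding -path to every section.

    sections is a list of argument lists, one per executable,
    e.g. ["-n", "1", "master"]. Sections are joined with ":".
    """
    args = ["mpiexec"]
    for i, section in enumerate(sections):
        if i > 0:
            args.append(":")
        # Assumption: -path is given per executable, like -n and -host.
        args += ["-path", exe_dir] + list(section)
    return args


cmd = mpiexec_with_path(
    r"C:\myapp\bin",  # hypothetical install directory
    [["-n", "1", "master"], ["-host", "roobarb", "-n", "1", "slave"]],
)
print(" ".join(cmd))
```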

Regards,
Jayesh

-----Original Message-----
From: James S Perrin [mailto:james.s.perrin at manchester.ac.uk] 
Sent: Friday, October 10, 2008 4:57 AM
To: Jayesh Krishna
Cc: mpich-discuss at mcs.anl.gov
Subject: Re: [mpich-discuss] SMPD, Problem launching when using -host

Hi,

I have found the reason why my executable is failing to start; however, I
think -host is not behaving as it should, or at least the documentation
needs clarifying.

I guessed that using -host was somehow changing the executable's
environment, so that it failed to start correctly because it couldn't
find a DLL.
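
The exit code -1073741515 reported for the second process in the earlier error output (quoted further down) fits this guess. Reinterpreted as an unsigned 32-bit value it is 0xC0000135, which Windows defines as STATUS_DLL_NOT_FOUND. A minimal Python sketch of the conversion, nothing MPICH-specific:

```python
def as_ntstatus(exit_code):
    """Reinterpret a signed 32-bit process exit code as unsigned hex."""
    return "0x{:08X}".format(exit_code & 0xFFFFFFFF)


# Exit code reported for rank 1 in the "job aborted" output.
print(as_ntstatus(-1073741515))  # prints 0xC0000135
```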

On Windows the PATH variable should be made up of the system-wide settings
followed by the user-specific additions:

i.e. echo %PATH% => <system settings>;<user settings>

The user settings are required to launch the process. When I launch as
follows:

mpiexec -localroot -n 1 master : -n 1 slave

both processes get the PATH setting shown above. However, if I use

mpiexec -localroot -n 1 master : -host roobarb -n 1 slave

process 1 has PATH=<system settings>;<user settings> but process 2 has
PATH=<system settings> only

I have no idea why the following works, but it does: if I add -host roobarb
to process 1, process 2 now gets the full PATH variable:

mpiexec -localroot -host roobarb -n 1 master : -host roobarb -n 1 slave

Final permutation: if I now don't specify -localroot, both processes get
only the system settings for PATH:

mpiexec -host roobarb -n 1 master : -host roobarb -n 1 slave

In summary, when -host is used only the system PATH settings are applied
and not the user-specific settings. Is this a security feature or a
non-interactive login issue, cf. bash under Linux, where .bashrc is not
executed for processes started remotely?

A little extra testing confirmed that when a process gets both the system
and user PATH settings, it is inheriting them from the current cmd shell.

The solution is either to make sure the paths are added to the system PATH
variable or to launch via a script that sets up the environment for each
process, though I would have liked to avoid this if possible. The former is
a pain for development and the latter a pain for user installations.
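
A minimal sketch of the second option (a launcher that sets up the environment) in Python. It assumes the real MPI executable can be started as a child of the wrapper and inherits the wrapper's environment; the function names and the directory C:\myapp\dlls are hypothetical, and this has not been tested under smpd.

```python
import os
import subprocess


def env_with_extra_path(extra_dirs, base_env=None):
    """Return a copy of the environment with extra_dirs prepended to PATH."""
    env = dict(os.environ if base_env is None else base_env)
    current = env.get("PATH", "")
    parts = list(extra_dirs) + ([current] if current else [])
    env["PATH"] = os.pathsep.join(parts)
    return env


def launch(command, extra_dirs):
    """Run command (a list of arguments) with the extended PATH.

    Returns the child's exit code so the wrapper can pass it on.
    """
    return subprocess.call(command, env=env_with_extra_path(extra_dirs))


# Demonstration with a fake base environment; nothing is launched here.
demo = env_with_extra_path([r"C:\myapp\dlls"], {"PATH": r"C:\Windows\system32"})
print(demo["PATH"])
```

A wrapper script would then call something like launch(["slave.exe"], [r"C:\myapp\dlls"]) and exit with the returned code, and mpiexec would be pointed at the wrapper instead of the executable.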

FYI I was examining the PATH variable using:

mpiexec -l -host roobarb -n 1 env : -host roobarb -n 1 env | grep \]PATH=

I have the UNIX commands env and grep in my PATH.
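
The same comparison can be sketched in Python for anyone without grep to hand. The line shape "[<rank>]PATH=<value>" is inferred from the grep pattern above, and the sample text is made up for illustration rather than captured from a real run.

```python
import re

# Assumed line shape, inferred from the grep pattern: "[<rank>]PATH=<value>".
LINE = re.compile(r"^\[(\d+)\]\s*PATH=(.*)$")


def path_by_rank(output, sep=";"):
    """Map each rank to the list of PATH entries found in labelled env output."""
    result = {}
    for line in output.splitlines():
        m = LINE.match(line.strip())
        if m:
            result[int(m.group(1))] = [p for p in m.group(2).split(sep) if p]
    return result


def missing_entries(paths, reference_rank, other_rank):
    """Entries present for reference_rank but absent for other_rank."""
    # Windows paths are case-insensitive, so compare in lower case.
    other = {p.lower() for p in paths[other_rank]}
    return [p for p in paths[reference_rank] if p.lower() not in other]


sample = (  # made-up output, for illustration only
    "[0]PATH=C:\\Windows\\system32;C:\\tools\\bin\n"
    "[1]PATH=C:\\Windows\\system32\n"
)
paths = path_by_rank(sample)
print(missing_entries(paths, 0, 1))
```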

Regards
James

Jayesh Krishna wrote:
>  Hi,
>   Can you send us the debug output of mpiexec and smpd ? Please follow 
> the instructions below to send us the debug output,
> 
> # Stop any instances of smpd using the command, smpd -stop
> # Start smpd in the debug mode using the command, smpd -d
> # Run a non-MPI program with mpiexec in the verbose mode using the command,
> mpiexec -verbose -n 1 hostname : -host IPADDRESS_OF_roobarb -n 1 hostname
> # Run an MPI program (cpi.exe provided with MPICH2) with mpiexec in the
> verbose mode using the command,
> mpiexec -verbose -n 1 cpi.exe : -host IPADDRESS_OF_roobarb -n 1 cpi.exe
> 
> # Send us the debug/verbose outputs of mpiexec and smpd.
> 
>   Let us know the results.
> 
> Regards,
> Jayesh
> 
> -----Original Message-----
> From: owner-mpich-discuss at mcs.anl.gov 
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of James S Perrin
> Sent: Tuesday, October 07, 2008 5:25 AM
> Cc: mpich-discuss at mcs.anl.gov
> Subject: Re: [mpich-discuss] SMPD, Problem launching when using -host
> 
> Hi,
> 
>      No I get the same error if I use the ipaddress.
> 
> Regards
> James
> 
> 
> Jayesh Krishna wrote:
>  >  Hi,
>  >   Does it work if you specify the ipaddress of the machine instead of
>  > hostname (mpiexec -n 1 master : -host IPADDRESS_OF_roobarb -n 1 slave) ?
>  >
>  > Regards,
>  > Jayesh
>  >
>  > -----Original Message-----
>  > From: James S Perrin [mailto:james.s.perrin at manchester.ac.uk]
>  > Sent: Monday, October 06, 2008 5:18 AM
>  > To: Jayesh Krishna
>  > Cc: mpich-discuss at mcs.anl.gov
>  > Subject: Re: [mpich-discuss] SMPD, Problem launching when using -host
>  >
>  > Hi,
>  >
>  > Jayesh Krishna wrote:
>  >  >  Hi,
>  >  >
>  >  >  >> mpiexec -n 1 -host roobarb master : -n 1 slave
>  >  >         The command above("-host" option specified for only one
>  >  > executable) works for me. What is the error message that you get
>  >  > (Provide us with the snapshot of your command and the error output. It
>  >  > would also help us if you provide more details - Is roobarb a remote
>  >  > machine ? etc) ?
>  >
>  > The error is:
>  >
>  > [0] PMI_Init failed: FAIL - init called when another process has exited without calling init
>  > Fatal error in MPI_Init_thread: Other MPI error, error stack:
>  > MPIR_Init_thread(294): Initialization failed
>  > MPID_Init(82)........: channel initialization failed
>  > MPID_Init(333).......: PMI_Init returned -1unable to read the cmd header on the pmi context, generic socket failure, error stack:
>  > MPIDU_Sock_wait(2603): The specified network name is no longer available. (errno 64).
>  >
>  > job aborted:
>  > rank: node: exit code[: error message]
>  > 0: ROOBARB: 3: Fatal error in MPI_Init_thread: Other MPI error, error stack:
>  > MPIR_Init_thread(294): Initialization failed
>  > MPID_Init(82)........: channel initialization failed
>  > MPID_Init(333).......: PMI_Init returned -1
>  > 1: roobarb: -1073741515
>  >
>  > The second process is not starting for some reason.
>  >
>  > roobarb happens to be the local machine in this case but the problem
>  > also occurs on a cluster.
>  >
>  > It will launch correctly if I use:
>  >
>  > mpiexec -n 1 master : -n 1 slave - SUCCESS
>  >
>  > which should be no different from:
>  >
>  > mpiexec -n 1 master : -host roobarb -n 1 slave - FAILS
>  >
>  > when everything is running on roobarb.
>  >
>  >  >  >> mpiexec -localroot -n 1 roobarb master : -host roobarb -n 1 slave
>  >  >         When using the "-localroot" option you should not specify the
>  >  > hostname for the 1st executable. The command should be,
>  >  >  >> mpiexec -localroot -n 1 master : -host roobarb -n 1 slave
>  >
>  > Sorry, typo: I meant it would work if I used:
>  >
>  > mpiexec -localroot -host roobarb -n 1  master : -host roobarb -n 1 slave
>  >
>  > Regards
>  > James
>  >
>  >  > -----Original Message-----
>  >  > From: owner-mpich-discuss at mcs.anl.gov
>  >  > [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of James S Perrin
>  >  > Sent: Friday, October 03, 2008 12:13 PM
>  >  > To: mpich
>  >  > Subject: [mpich-discuss] SMPD, Problem launching when using -host
>  >  >
>  >  > Hi,
>  >  >      Processes fail to start if -host is used for only some but not
>  >  > all processes when launching. ie the machines that some processes
>  >  > launch on is left up to the smpd to allocate.
>  >  >
>  >  > eg
>  >  >
>  >  > mpiexec -n 1 -host roobarb master : -n 1 slave
>  >  >
>  >  > when -localroot is used the following fails unless -host is also
>  >  > specified for the master.
>  >  >
>  >  > mpiexc -localroot -n 1 roobarb master : -host roobarb -n 1 slave
>  >  >
>  >  > Using MPICH2 1.0.7 on WinXP ia32.
>  >  >
>  >  > Regards
>  >  > James
>  >  > --
>  >  >
> ------------------------------------------------------------------------
>  >  >    James S. Perrin
>  >  >    Visualization
>  >  >
>  >  >    Research Computing Services
>  >  >    The University of Manchester
>  >  >    Kilburn Building, Oxford Road
>  >  >    Manchester, M13 9PL
>  >  >
>  >  >    t: +44 (0) 161 275 6945
>  >  >    e: james.perrin at manchester.ac.uk
>  >  >    w: www.manchester.ac.uk/researchcomputing
>  >  >
> ------------------------------------------------------------------------
>  >  >   "The test of intellect is the refusal to belabour the obvious"
>  >  >   - Alfred Bester
>  >  >
>  > 
> ----------------------------------------------------------------------
>  >  > --
>  >  >
>  >
> 
> 

--
------------------------------------------------------------------------
   James S. Perrin
   Visualization

   Research Computing Services
   The University of Manchester
   Kilburn Building, Oxford Road
   Manchester, M13 9PL

   t: +44 (0) 161 275 6945
   e: james.perrin at manchester.ac.uk
   w: www.manchester.ac.uk/researchcomputing
------------------------------------------------------------------------
  "The test of intellect is the refusal to belabour the obvious"
  - Alfred Bester
------------------------------------------------------------------------
