[mpich-discuss] SMPD, Problem launching when using -host

Jayesh Krishna jayesh at mcs.anl.gov
Tue Oct 7 09:45:16 CDT 2008


 Hi,
  Can you send us the debug output of mpiexec and smpd? Please follow the
steps below to collect it:

# Stop any running instances of smpd using the command: smpd -stop
# Start smpd in debug mode using the command: smpd -d
# Run a non-MPI program with mpiexec in verbose mode using the command:
  mpiexec -verbose -n 1 hostname : -host IPADDRESS_OF_roobarb -n 1 hostname

# Run an MPI program (cpi.exe, provided with MPICH2) with mpiexec in
verbose mode using the command:
  mpiexec -verbose -n 1 cpi.exe : -host IPADDRESS_OF_roobarb -n 1 cpi.exe

# Send us the debug/verbose outputs of mpiexec and smpd.
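
If it helps to avoid typos, the steps above can be sketched as a small
Python helper that only assembles the command lines; it does not run
anything. The function name debug_commands and the address 192.0.2.10 are
placeholders made up for this illustration, so substitute the real IP
address of roobarb. The smpd and mpiexec options are just the ones listed
in the steps above.

```python
def debug_commands(remote_ip, mpi_program="cpi.exe"):
    """Return the command lines from the steps above, in order, as argument lists.

    remote_ip is the address passed to -host (hypothetical placeholder in the
    example below); mpi_program defaults to the cpi.exe example from MPICH2.
    """
    def two_part_run(program):
        # One process with no -host, then one process pinned with -host.
        return ["mpiexec", "-verbose", "-n", "1", program,
                ":", "-host", remote_ip, "-n", "1", program]

    return [
        ["smpd", "-stop"],           # stop any running smpd instance
        ["smpd", "-d"],              # start smpd again in debug mode
        two_part_run("hostname"),    # non-MPI program
        two_part_run(mpi_program),   # MPI program
    ]


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only address, not a real host.
    for argv in debug_commands("192.0.2.10"):
        print(" ".join(argv))
```

Run each printed command by hand, with smpd -d in its own console window so
that its debug output can be captured separately from the mpiexec output.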

  Let us know the results.
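
One more thing that may be worth checking: the exit code -1073741515
reported for rank 1 in the error output quoted below looks like a Windows
NTSTATUS value printed as a signed 32-bit integer. The sketch below is only
a guess at how to read it under that assumption; describe_exit_code and the
small lookup table are made up for this illustration, and the table lists
just a few well-known codes.

```python
# A hand-picked subset of well-known NTSTATUS values (not exhaustive).
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000135: "STATUS_DLL_NOT_FOUND",
    0xC0000139: "STATUS_ENTRYPOINT_NOT_FOUND",
}


def describe_exit_code(code):
    """Reinterpret a signed 32-bit exit code as unsigned and look it up."""
    unsigned = code & 0xFFFFFFFF  # two's-complement view of the same 32 bits
    name = KNOWN_NTSTATUS.get(unsigned, "unknown")
    return "0x%08X (%s)" % (unsigned, name)


if __name__ == "__main__":
    print(describe_exit_code(-1073741515))
```

If that reading is right, the second process may be failing to load a DLL
when it is started through -host, so the debug output from smpd should show
whether it is launched with a different path or working directory.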

Regards,
Jayesh

-----Original Message-----
From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of James S Perrin
Sent: Tuesday, October 07, 2008 5:25 AM
Cc: mpich-discuss at mcs.anl.gov
Subject: Re: [mpich-discuss] SMPD, Problem launching when using -host

Hi,

     No, I get the same error if I use the IP address.

Regards
James


Jayesh Krishna wrote:
>  Hi,
>   Does it work if you specify the IP address of the machine instead of
> the hostname (mpiexec -n 1 master : -host IPADDRESS_OF_roobarb -n 1 slave)?
> 
> Regards,
> Jayesh
> 
> -----Original Message-----
> From: James S Perrin [mailto:james.s.perrin at manchester.ac.uk]
> Sent: Monday, October 06, 2008 5:18 AM
> To: Jayesh Krishna
> Cc: mpich-discuss at mcs.anl.gov
> Subject: Re: [mpich-discuss] SMPD, Problem launching when using -host
> 
> Hi,
> 
> Jayesh Krishna wrote:
>  >  Hi,
>  >
>  >  >> mpiexec -n 1 -host roobarb master : -n 1 slave
>  >         The command above ("-host" option specified for only one
>  > executable) works for me. What is the error message that you get
>  > (Provide us with the snapshot of your command and the error output. It
>  > would also help us if you provide more details - Is roobarb a remote
>  > machine? etc)?
> 
> The error is:
> 
> [0] PMI_Init failed: FAIL - init called when another process has
> exited without calling init
> Fatal error in MPI_Init_thread: Other MPI error, error stack:
> MPIR_Init_thread(294): Initialization failed
> MPID_Init(82)........: channel initialization failed
> MPID_Init(333).......: PMI_Init returned -1
> unable to read the cmd header on the pmi context, generic socket
> failure, error stack:
> MPIDU_Sock_wait(2603): The specified network name is no longer
> available. (errno 64).
> 
> job aborted:
> rank: node: exit code[: error message]
> 0: ROOBARB: 3: Fatal error in MPI_Init_thread: Other MPI error, error stack:
> MPIR_Init_thread(294): Initialization failed
> MPID_Init(82)........: channel initialization failed
> MPID_Init(333).......: PMI_Init returned -1
> 1: roobarb: -1073741515
> 
> The second process is not starting for some reason.
> 
> roobarb happens to be the local machine in this case but the problem 
> also occurs on a cluster.
> 
> It will launch correctly if I use:
> 
> mpiexec -n 1 master : -n 1 slave - SUCCESS
> 
> which should be no different from:
> 
> mpiexec -n 1 master : -host roobarb -n 1 slave - FAILS
> 
> when everything is running on roobarb.
> 
>  >  >> mpiexec -localroot -n 1 roobarb master : -host roobarb -n 1 slave
>  >
>  >         When using the "-localroot" option you should not specify the
>  > hostname for the 1st executable. The command should be,
>  >
>  >  >> mpiexec -localroot -n 1 master : -host roobarb -n 1 slave
> 
> Sorry, that was a typo. I meant that it would work if I used:
> 
> mpiexec -localroot -host roobarb -n 1 master : -host roobarb -n 1 slave
> 
> Regards
> James
> 
>  >
>  > -----Original Message-----
>  > From: owner-mpich-discuss at mcs.anl.gov
>  > [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of James S Perrin
>  > Sent: Friday, October 03, 2008 12:13 PM
>  > To: mpich
>  > Subject: [mpich-discuss] SMPD, Problem launching when using -host
>  >
>  > Hi,
>  >      Processes fail to start if -host is used for only some but not
>  > all processes when launching, i.e. when the machines that some processes
>  > launch on are left up to smpd to allocate.
>  >
>  > e.g.
>  >
>  > mpiexec -n 1 -host roobarb master : -n 1 slave
>  >
>  > When -localroot is used the following fails unless -host is also
>  > specified for the master.
>  >
>  > mpiexec -localroot -n 1 roobarb master : -host roobarb -n 1 slave
>  >
>  > Using MPICH2 1.0.7 on WinXP ia32.
>  >
>  > Regards
>  > James
>  > --
>  > ------------------------------------------------------------------------
>  >    James S. Perrin
>  >    Visualization
>  >
>  >    Research Computing Services
>  >    The University of Manchester
>  >    Kilburn Building, Oxford Road
>  >    Manchester, M13 9PL
>  >
>  >    t: +44 (0) 161 275 6945
>  >    e: james.perrin at manchester.ac.uk
>  >    w: www.manchester.ac.uk/researchcomputing
>  > ------------------------------------------------------------------------
>  >   "The test of intellect is the refusal to belabour the obvious"
>  >   - Alfred Bester
>  > ------------------------------------------------------------------------
>  >
> 

