<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii">
<META NAME="Generator" CONTENT="MS Exchange Server version 6.5.7036.0">
<TITLE>RE: [mpich-discuss] SMPD, Problem launching when using -host</TITLE>
</HEAD>
<BODY>
<!-- Converted from text/plain format -->
<P><FONT SIZE=2> Hi,<BR>
Does it work if you specify the IP address of the machine instead of the hostname (mpiexec -n 1 master : -host IPADDRESS_OF_roobarb -n 1 slave)?<BR>
<BR>
Regards,<BR>
Jayesh<BR>
<BR>
-----Original Message-----<BR>
From: James S Perrin [<A HREF="mailto:james.s.perrin@manchester.ac.uk">mailto:james.s.perrin@manchester.ac.uk</A>]<BR>
Sent: Monday, October 06, 2008 5:18 AM<BR>
To: Jayesh Krishna<BR>
Cc: mpich-discuss@mcs.anl.gov<BR>
Subject: Re: [mpich-discuss] SMPD, Problem launching when using -host<BR>
<BR>
Hi,<BR>
<BR>
Jayesh Krishna wrote:<BR>
> Hi,<BR>
><BR>
> >> mpiexec -n 1 -host roobarb master : -n 1 slave<BR>
> The command above (with the "-host" option specified for only one<BR>
> executable) works for me. What error message do you get? Please send<BR>
> us your exact command and the error output. More details would also<BR>
> help, e.g. is roobarb a remote machine?<BR>
<BR>
The error is:<BR>
<BR>
[0] PMI_Init failed: FAIL - init called when another process has exited without calling init<BR>
Fatal error in MPI_Init_thread: Other MPI error, error stack:<BR>
MPIR_Init_thread(294): Initialization failed<BR>
MPID_Init(82)........: channel initialization failed<BR>
MPID_Init(333).......: PMI_Init returned -1<BR>
unable to read the cmd header on the pmi context, generic socket failure, error stack:<BR>
MPIDU_Sock_wait(2603): The specified network name is no longer available. (errno 64).<BR>
<BR>
job aborted:<BR>
rank: node: exit code[: error message]<BR>
0: ROOBARB: 3: Fatal error in MPI_Init_thread: Other MPI error, error stack:<BR>
MPIR_Init_thread(294): Initialization failed<BR>
MPID_Init(82)........: channel initialization failed<BR>
MPID_Init(333).......: PMI_Init returned -1<BR>
1: roobarb: -1073741515<BR>
<BR>
The second process is not starting for some reason.<BR>
<BR>
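One detail in the output above may help narrow this down: the exit code reported for rank 1, -1073741515, is a signed 32-bit value. Read as unsigned it is 0xC0000135, which is the Windows NTSTATUS code STATUS_DLL_NOT_FOUND. A minimal Python sketch of that conversion (the helper name to_ntstatus is made up for illustration and is not part of MPICH2):<BR>
<BR>

```python
def to_ntstatus(exit_code: int) -> str:
    """Render a signed 32-bit process exit code as an unsigned hex NTSTATUS value."""
    # Masking with 0xFFFFFFFF reinterprets the negative value as unsigned 32-bit.
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


# Exit code reported for rank 1 in the mpiexec output above.
print(to_ntstatus(-1073741515))  # 0xC0000135
```

<BR>
If that reading is right, the slave process may be failing to load a DLL it depends on before it ever reaches MPI_Init, which would also fit the PMI_Init message about a process exiting without calling init. This is only a guess from the exit code.<BR>
<BR>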
roobarb happens to be the local machine in this case but the problem also occurs on a cluster.<BR>
<BR>
It will launch correctly if I use:<BR>
<BR>
mpiexec -n 1 master : -n 1 slave - SUCCESS<BR>
<BR>
which should be no different from:<BR>
<BR>
mpiexec -n 1 master : -host roobarb -n 1 slave - FAILS<BR>
<BR>
when everything is running on roobarb.<BR>
<BR>
> >> mpiexec -localroot -n 1 roobarb master : -host roobarb -n 1 slave<BR>
><BR>
> When using the "-localroot" option you should not specify the<BR>
> hostname for the 1st executable. The command should be,<BR>
><BR>
> >> mpiexec -localroot -n 1 master : -host roobarb -n 1 slave<BR>
<BR>
Sorry, typo. I meant it would work if I used:<BR>
<BR>
mpiexec -localroot -host roobarb -n 1 master : -host roobarb -n 1 slave<BR>
<BR>
Regards<BR>
James<BR>
<BR>
><BR>
> -----Original Message-----<BR>
> From: owner-mpich-discuss@mcs.anl.gov<BR>
> [<A HREF="mailto:owner-mpich-discuss@mcs.anl.gov">mailto:owner-mpich-discuss@mcs.anl.gov</A>] On Behalf Of James S Perrin<BR>
> Sent: Friday, October 03, 2008 12:13 PM<BR>
> To: mpich<BR>
> Subject: [mpich-discuss] SMPD, Problem launching when using -host<BR>
><BR>
> Hi,<BR>
> Processes fail to start if -host is used for some but not all of the<BR>
> processes when launching, i.e. when the choice of machine for some<BR>
> processes is left up to smpd.<BR>
><BR>
> e.g.<BR>
><BR>
> mpiexec -n 1 -host roobarb master : -n 1 slave<BR>
><BR>
> When -localroot is used, the following fails unless -host is also<BR>
> specified for the master.<BR>
><BR>
> mpiexec -localroot -n 1 roobarb master : -host roobarb -n 1 slave<BR>
><BR>
> Using MPICH2 1.0.7 on WinXP ia32.<BR>
><BR>
> Regards<BR>
> James<BR>
> --<BR>
> ------------------------------------------------------------------------<BR>
> James S. Perrin<BR>
> Visualization<BR>
><BR>
> Research Computing Services<BR>
> The University of Manchester<BR>
> Kilburn Building, Oxford Road<BR>
> Manchester, M13 9PL<BR>
><BR>
> t: +44 (0) 161 275 6945<BR>
> e: james.perrin@manchester.ac.uk<BR>
> w: www.manchester.ac.uk/researchcomputing<BR>
> ------------------------------------------------------------------------<BR>
> "The test of intellect is the refusal to belabour the obvious"<BR>
> - Alfred Bester<BR>
> ----------------------------------------------------------------------<BR>
> --<BR>
><BR>
<BR>
</FONT>
</P>
</BODY>
</HTML>