<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD><TITLE>RE: [mpich-discuss] mpiexec kills the remote login shell</TITLE>
<META http-equiv=Content-Type content="text/html; charset=us-ascii">
<META content="MSHTML 6.00.6000.16735" name=GENERATOR></HEAD>
<BODY>
<DIV dir=ltr align=left><SPAN class=192584614-04022009><FONT face=Arial
color=#0000ff size=2>Hi,</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=192584614-04022009><FONT face=Arial
color=#0000ff size=2> The mpiexec output shows the following error when
running hellow,</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=192584614-04022009><FONT face=Arial
color=#0000ff size=2>==================</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=192584614-04022009><SPAN lang=EN>
<P><FONT face=Arial><FONT color=#0000ff><FONT size=2>Unable to exec 'hello' on
korebot<SPAN class=192584614-04022009> <SPAN lang=EN></P>
<P>Error 2 - No such file or directory</P>
<P><SPAN class=192584614-04022009><FONT face=Arial color=#0000ff
size=2>==================</FONT></SPAN></P></SPAN></SPAN></FONT></FONT></FONT></SPAN></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=192584614-04022009><FONT face=Arial
color=#0000ff size=2> Please provide the debug output of smpd (smpd
-d 2>&1 | tee smpd.out) along with mpiexec (mpiexec
-verbose -n 2 ./hellow 2>&1 |
tee mpiexec.out).</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=192584614-04022009><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=192584614-04022009><FONT face=Arial
color=#0000ff size=2># Can you run simple C programs (without using
mpiexec) on Korebot?</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=192584614-04022009><FONT face=Arial
color=#0000ff size=2># Is the ssh connection aborted when you run non-MPI
programs (mpiexec -n 2 hostname) ?</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=192584614-04022009><FONT face=Arial
color=#0000ff size=2># Can you send us your ".smpd" config file
?</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=192584614-04022009><FONT face=Arial
color=#0000ff size=2># Did you modify the MPICH2 code to run on
Korebot? (Please send us your configure command & any environment variables set
when configuring/building MPICH2.) </FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=192584614-04022009> </SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=192584614-04022009><FONT face=Arial
color=#0000ff size=2>Regards,</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=192584614-04022009><FONT face=Arial
color=#0000ff size=2>Jayesh</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=192584614-04022009> </SPAN></DIV>
<DIV dir=ltr align=left>
<HR tabIndex=-1>
</DIV>
<DIV dir=ltr align=left><FONT face=Tahoma size=2><B>From:</B>
mpich-discuss-bounces@mcs.anl.gov [mailto:mpich-discuss-bounces@mcs.anl.gov]
<B>On Behalf Of </B>Jayesh Krishna<BR><B>Sent:</B> Wednesday, February 04, 2009
8:41 AM<BR><B>To:</B> 'Yu-Cheng Chou'<BR><B>Cc:</B>
mpich-discuss@mcs.anl.gov<BR><B>Subject:</B> Re: [mpich-discuss] mpiexec kills
the remote login shell<BR></FONT><BR></DIV>
<DIV></DIV><!-- Converted from text/plain format -->
<P><FONT size=2> Hi,<BR> I will take a look at the debug logs and get
back to you. Meanwhile, can you run simple C programs without using mpiexec on
Korebot?<BR> MPICH2 currently does not support heterogeneous systems, so
you won't be able to run your MPI job across ARM & other
architectures.<BR><BR>Regards,<BR>Jayesh<BR><BR>-----Original
Message-----<BR>From: Yu-Cheng Chou [<A
href="mailto:cycchou@ucdavis.edu">mailto:cycchou@ucdavis.edu</A>]<BR>Sent:
Tuesday, February 03, 2009 7:52 PM<BR>To: Jayesh Krishna<BR>Cc:
mpich-discuss@mcs.anl.gov<BR>Subject: Re: [mpich-discuss] mpiexec kills the
remote login shell<BR><BR>> # Can you run non-MPI programs using mpiexec
(mpiexec -n 2 hostname) ?<BR>Yes.<BR><BR>> # Can you compile and run the
hello world program (examples/hellow.c)<BR>> provided with MPICH2 (mpiexec -n
2 ./hellow)?<BR>Yes.<BR><BR>> # How did you start smpd (the command used to
start smpd) ? How did<BR>> you run your MPI job (the command used to run your
job)?<BR>I have a ".smpd" file containing one line of information, which is
"phrase=123".<BR>Thus, I started smpd using "smpd -s".<BR>Then I used "mpiexec
-n 1 hellow" to run hellow on Korebot.<BR><BR>> # How did you find that
mpiexec kills the sshd process (We typically<BR>> ssh to unix machines and
run mpiexec without any problems) ?<BR>I logged in to Korebot with two
terminals.<BR>From terminal #1, I checked all the processes running on
Korebot.<BR>From terminal #2, I started smpd and ran hellow using the
commands mentioned above.<BR>After hellow finished, the connection to
Korebot via terminal #2 was closed.<BR>From terminal #1, I saw that the
sshd process associated with terminal #2 was gone.<BR><BR>> Can you run
smpd/mpiexec in debug mode and provide us with the<BR>> outputs (smpd -d /
mpiexec -n 2 -verbose hostname) ?<BR>The first attached text file is the output
from running hellow in mpiexec's verbose mode.<BR><BR><BR>There is another
issue.<BR>This time, I used two machines. One is Korebot as mentioned above, and
the other is a laptop running Ubuntu Linux OS.<BR>I started smpd with the same
".smpd" file and command as mentioned above on both Korebot and the
laptop.<BR>There is a machine file called "hostfile" on Korebot. The file contains
the names of the two
machines:<BR><BR>korebot<BR>shrimp<BR><BR>Then from Korebot, I ran cpi using the
following command.<BR><BR>mpiexec -machinefile ./hostfile -verbose -n 2
cpi<BR><BR><BR>But the computed value of pi is a huge number. I think it is related to
double-type variables being transferred between processes running on an
ARM-based Linux machine and a general Linux machine.<BR><BR>The second attached text
file is the output from running cpi in mpiexec's verbose
mode.<BR><BR><BR>><BR>> I am cross-compiling mpich2-1.0.8 with smpd for
Khepera III mobile robot.<BR>><BR>> This mobile robot has a Korebot board
which is an ARM-based computer<BR>> with a Linux operating
system.<BR>><BR>> The cross-compilation was fine.<BR>><BR>> Firstly,
I logged in to Korebot through ssh.<BR>> Secondly, I started smpd.<BR>>
Thirdly, I ran mpiexec to execute an MPI program (cpi) that comes with<BR>>
the package.<BR>><BR>> The result was correct, but when mpiexec was
finished, the ssh<BR>> connection to the Korebot was closed.<BR>> I found
that mpiexec kills the sshd process through which I was<BR>> remotely
connected to Korebot.<BR>><BR>> I've been looking for the cause, but still
have not found any clues.<BR>><BR>> Could you give me any ideas to solve
this problem?<BR>><BR>> Thank you,<BR>><BR>>
Yu-Cheng<BR>><BR></FONT></P></BODY></HTML>