<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii">
<META NAME="Generator" CONTENT="MS Exchange Server version 6.5.7036.0">
<TITLE>RE: [mpich-discuss] mpiexec kills the remote login shell</TITLE>
</HEAD>
<BODY>
<P><FONT SIZE=2> Hi,<BR>
I will take a look at the debug logs and get back to you. Meanwhile, can you run simple C programs on Korebot without using mpiexec?<BR>
MPICH2 currently does not support heterogeneous systems (so you won't be able to run your MPI job across ARM and other architectures).<BR>
<BR>
Regards,<BR>
Jayesh<BR>
<BR>
-----Original Message-----<BR>
From: Yu-Cheng Chou [<A HREF="mailto:cycchou@ucdavis.edu">mailto:cycchou@ucdavis.edu</A>]<BR>
Sent: Tuesday, February 03, 2009 7:52 PM<BR>
To: Jayesh Krishna<BR>
Cc: mpich-discuss@mcs.anl.gov<BR>
Subject: Re: [mpich-discuss] mpiexec kills the remote login shell<BR>
<BR>
> # Can you run non-MPI programs using mpiexec (mpiexec -n 2 hostname) ?<BR>
Yes.<BR>
<BR>
> # Can you compile and run the hello world program (examples/hellow.c)<BR>
> provided with MPICH2 (mpiexec -n 2 ./hellow)?<BR>
Yes.<BR>
<BR>
> # How did you start smpd (the command used to start smpd) ? How did<BR>
> you run your MPI job (the command used to run your job)?<BR>
I have a ".smpd" file containing one line of information, which is "phrase=123".<BR>
Thus, I started smpd using "smpd -s".<BR>
Then I used "mpiexec -n 1 hellow" to run hellow on Korebot.<BR>
<BR>
> # How did you find that mpiexec kills the sshd process (We typically<BR>
> ssh to unix machines and run mpiexec without any problems) ?<BR>
I logged in to Korebot from two terminals.<BR>
From terminal #1, I checked all the processes running on Korebot.<BR>
From terminal #2, I started smpd and ran hellow using the commands mentioned above.<BR>
After hellow finished, the connection to Korebot from terminal #2 was closed.<BR>
From terminal #1, I saw that the sshd process associated with terminal #2 was gone.<BR>
<BR>
> Can you run smpd/mpiexec in debug mode and provide us with the<BR>
> outputs (smpd -d / mpiexec -n 2 -verbose hostname) ?<BR>
The first attached text file is the output from running hellow in mpiexec's verbose mode.<BR>
<BR>
<BR>
There is another issue.<BR>
This time, I used two machines. One is Korebot as mentioned above, and the other is a laptop running Ubuntu Linux.<BR>
I started smpd with the same ".smpd" file and command as mentioned above on both Korebot and the laptop.<BR>
There is a machine file called "hostfile" on Korebot. The file lists the names of the two machines.<BR>
<BR>
korebot<BR>
shrimp<BR>
<BR>
Then from Korebot, I ran cpi using the following command.<BR>
<BR>
mpiexec -machinefile ./hostfile -verbose -n 2 cpi<BR>
<BR>
<BR>
But the computed value of pi is a huge number. I think it is related to variables of type double being transferred between processes running on an ARM-based Linux machine and an ordinary Linux machine.<BR>
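<BR>
One way such a value could arise is sketched below in Python. It assumes, purely for illustration, that the ARM toolchain stores a double with its two 32-bit words in the opposite order from the laptop (as the old ARM FPA floating-point format does); nothing in this thread confirms that Korebot uses that layout. The names swap_words and hex_bytes are hypothetical helpers, not part of MPICH2.<BR>
<BR>
```python
import struct


def swap_words(value):
    """Return the double obtained by exchanging the two 32-bit halves of value."""
    raw = struct.pack(">d", value)  # 8 bytes, IEEE 754, big-endian byte order
    return struct.unpack(">d", raw[4:] + raw[:4])[0]


def hex_bytes(value):
    """Hex dump of the 8 bytes of value in this machine's native layout."""
    return struct.pack("@d", value).hex()


if __name__ == "__main__":
    pi = 3.141592653589793
    # Run on both machines and compare this line to see whether layouts differ.
    print("native bytes:", hex_bytes(pi))
    # What the receiver would read if the two words arrived in swapped order
    # (an assumption about the ARM side, see the note above).
    print("word-swapped:", swap_words(pi))
```
<BR>
Under that assumption, a value near 3.14 turns into a number with a very large exponent once its words are swapped. If Python is available on both machines, comparing the "native bytes" line printed on each would show whether the two layouts actually differ.<BR>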
<BR>
The second attached text file is the output from running cpi in mpiexec's verbose mode.<BR>
<BR>
<BR>
><BR>
> I am cross-compiling mpich2-1.0.8 with smpd for the Khepera III mobile robot.<BR>
><BR>
> This mobile robot has a Korebot board which is an ARM-based computer<BR>
> with a Linux operating system.<BR>
><BR>
> The cross-compilation was fine.<BR>
><BR>
> Firstly, I logged in to Korebot through ssh.<BR>
> Secondly, I started smpd.<BR>
> Thirdly, I ran mpiexec to execute an MPI program (cpi) that comes with<BR>
> the package.<BR>
><BR>
> The result was correct, but when mpiexec was finished, the ssh<BR>
> connection to the Korebot was closed.<BR>
> I found that mpiexec kills the sshd process through which I was<BR>
> remotely connected to Korebot.<BR>
><BR>
> I've been looking for the cause, but still have not found any clues.<BR>
><BR>
> Could you give me any ideas to solve this problem?<BR>
><BR>
> Thank you,<BR>
><BR>
> Yu-Cheng<BR>
><BR>
</FONT>
</P>
</BODY>
</HTML>