<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii">
<meta name=Generator content="Microsoft Word 11 (filtered medium)">
<style>
<!--
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman";}
a:link, span.MsoHyperlink
        {color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {color:purple;
        text-decoration:underline;}
span.EmailStyle17
        {mso-style-type:personal-compose;
        font-family:Arial;
        color:windowtext;}
@page Section1
        {size:8.5in 11.0in;
        margin:1.0in 1.25in 1.0in 1.25in;}
div.Section1
        {page:Section1;}
-->
</style>
</head>
<body lang=EN-US link=blue vlink=purple>
<div class=Section1>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>Hi all,<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>I’m a newbie to the list (and MPICH2 for that matter),
and hope someone can help with a problem I’ve recently encountered and
don’t know why:<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>I used to have an MPICH2 (smpd) app (mpiblast) run phenomenally
on my ROCKS cluster. I also got it integrated w/ Sun Grid Engine well, and it
was running great for a while. Every user has a .smpd file in their home directory
containing a different passphrase with only RW access to that particular user,
as indicated in the MPICH2 documentation. Since I am having problems as of
late, I now have two test users I’ve been using to try to narrow things
down with.<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>As I start running a few parallel jobs between the users,
eventually a job hangs—I found out that the passphrase inside that user’s
.smpd file disappeared! The file has of course, the nodes that it contacted
for the parallel job, but the passphrase that I explicitly put in a while ago
is now gone. I add back a passphrase and the whole process works again. Eventually,
the passphrase of one of the users disappears again, and the cycle starts over
again.<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>Is there any reason in the MPICH2 implementation where a
passphrase would get deleted, or maybe a particular line gets cut out in the
.smpd file that would coincidentally always be my phrase?<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>Hope someone has an idea. <o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>Thank you all for your time,<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>-Ray C.<o:p></o:p></span></font></p>
</div>
</body>
</html>