<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii">
<META NAME="Generator" CONTENT="MS Exchange Server version 5.5.2658.34">
<TITLE>MPICH2 ring breaking; three times in two days</TITLE>
</HEAD>
<BODY>
<P><FONT SIZE=2 FACE="Arial">Help! :)</FONT>
</P>
<P><FONT SIZE=2 FACE="Arial">Most of our nodes keep dropping out of the MPICH2 ring; this has happened three times in the last two days, and it is not always the same nodes :(</FONT></P>
<P><FONT SIZE=2 FACE="Arial">The syslog file on our head node shows the following error:</FONT>
</P>
<P><FONT SIZE=2 FACE="Arial">(handle_rhs_challenge_response 1010): INVALID msg for rhs response msg=:{}: from host=xxxxx</FONT>
</P>
<P><FONT SIZE=2 FACE="Arial">Here xxxxx stands for whichever host dropped out of the ring; the message appears for each affected host.</FONT>
</P>
<P><FONT SIZE=2 FACE="Arial">Could this be a misbehaving job?</FONT>
</P>
<P><FONT SIZE=2 FACE="Arial">I'm using mpich2-1.0.1 on RHEL3.</FONT>
</P>
<P><FONT SIZE=2 FACE="Arial">Simon</FONT>
</P>
</BODY>
</HTML>