<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD><TITLE>Message</TITLE>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<META content="MSHTML 6.00.2900.3492" name=GENERATOR></HEAD>
<BODY>
<DIV><FONT face=Arial color=#0000ff size=2><SPAN
class=395415007-28012009>Hello,</SPAN></FONT></DIV>
<DIV><FONT face=Arial color=#0000ff size=2><SPAN
class=395415007-28012009></SPAN></FONT> </DIV>
<DIV><FONT face=Arial color=#0000ff size=2><SPAN class=395415007-28012009>I am
not sure if I need multiple rings. At the moment I am using one
ring.</SPAN></FONT></DIV>
<DIV><FONT face=Arial color=#0000ff size=2><SPAN class=395415007-28012009>But
</SPAN></FONT><FONT face=Arial color=#0000ff size=2><SPAN
class=395415007-28012009>the problem with a single ring on a big cluster (144
nodes) is that I have to start the ring independently of the job. That is, I
first start the ring across the whole cluster, and when a job starts it
gets its machine list from the queuing system. This works fine
until one machine in the cluster stops working. Then the ring is broken and
the job crashes. </SPAN></FONT></DIV>
<DIV><FONT face=Arial color=#0000ff size=2><SPAN class=395415007-28012009>Is
there a better way to solve this problem? Is it possible to remove machines
from a running ring, or add machines to it, in order to repair it?</SPAN></FONT></DIV>
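<DIV><FONT face=Arial color=#0000ff size=2><SPAN class=395415007-28012009>One
workaround that might avoid the cluster-wide ring, as a rough sketch only: boot
a small ring per job from the machine list the queuing system hands over, and
shut it down with mpdallexit when the job ends, so that a dead node outside the
job cannot break it. The function names and the DRY_RUN switch below are made
up for illustration. The sketch assumes the machine file has one host name per
line, that the job script runs on one of the listed hosts, and that no node
runs two of my jobs at the same time.</SPAN></FONT></DIV>
<PRE>
```shell
#!/bin/sh
# Sketch: boot a private MPD ring for one job and tear it down afterwards.
# Assumptions (not from the original thread): the helper names, the DRY_RUN
# switch, and a machine file with one host name per line.

# Print the number of distinct, non-empty host names in a machine file.
count_hosts() {
    sort -u "$1" | grep -c .
}

# Usage: run_with_private_ring MACHINEFILE NPROCS COMMAND [ARGS...]
# Set DRY_RUN=echo to print the MPD commands instead of running them.
run_with_private_ring() {
    machinefile=$1
    np=$2
    shift 2

    # mpdboot gets a de-duplicated host list; mpiexec keeps the original
    # machine file so that repeated entries can still mean several ranks.
    hosts_file=$(mktemp) || return 1
    sort -u "$machinefile" | grep . > "$hosts_file"
    nhosts=$(count_hosts "$machinefile")

    # Boot one mpd per distinct host; give up if the ring does not come up.
    if ! ${DRY_RUN:-} mpdboot -n "$nhosts" -f "$hosts_file"; then
        rm -f "$hosts_file"
        return 1
    fi

    # Run the job, remember its exit status, then always stop the ring.
    ${DRY_RUN:-} mpiexec -machinefile "$machinefile" -n "$np" "$@"
    status=$?
    ${DRY_RUN:-} mpdallexit
    rm -f "$hosts_file"
    return $status
}
```
</PRE>
<DIV><FONT face=Arial color=#0000ff size=2><SPAN class=395415007-28012009>In
the job script this would be called as something like
run_with_private_ring $WORKDIR/mpd.txt $np ./jobscript. I have not verified
that this coexists with another ring of the same user on a shared node, which
is the case where MPD_CON_EXT did not help me.</SPAN></FONT></DIV>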
<DIV><FONT face=Arial color=#0000ff size=2><SPAN
class=395415007-28012009></SPAN></FONT> </DIV>
<DIV><FONT face=Arial color=#0000ff size=2><SPAN
class=395415007-28012009>Thanks,</SPAN></FONT></DIV>
<DIV><FONT face=Arial color=#0000ff size=2><SPAN
class=395415007-28012009></SPAN></FONT> </DIV>
<DIV><FONT face=Arial color=#0000ff size=2><SPAN
class=395415007-28012009>Oliver</SPAN></FONT></DIV>
<DIV><FONT face=Arial color=#0000ff size=2></FONT> </DIV>
<DIV><FONT color=#0000ff size=2><BR></FONT> </DIV>
<BLOCKQUOTE dir=ltr style="MARGIN-RIGHT: 0px">
<DIV></DIV>
<DIV class=OutlookMessageHeader lang=de dir=ltr align=left><FONT face=Tahoma
size=2>-----Original Message-----<BR><B>From:</B>
mpich-discuss-bounces@mcs.anl.gov [mailto:mpich-discuss-bounces@mcs.anl.gov]
<B>On Behalf Of </B>Rajeev Thakur<BR><B>Sent:</B> Tuesday, January 27,
2009 18:00<BR><B>To:</B> mpich-discuss@mcs.anl.gov<BR><B>Subject:</B> Re:
[mpich-discuss] multiple mpd rings as one user<BR><BR></FONT></DIV>
<DIV dir=ltr align=left><SPAN class=140105916-27012009><FONT face=Arial
color=#0000ff size=2>Do you really need multiple MPD rings? You can run
multiple jobs with one ring.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=140105916-27012009><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=140105916-27012009><FONT face=Arial
color=#0000ff size=2>Rajeev</FONT></SPAN></DIV><BR>
<BLOCKQUOTE dir=ltr
style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #0000ff 2px solid; MARGIN-RIGHT: 0px">
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B> mpich-discuss-bounces@mcs.anl.gov
[mailto:mpich-discuss-bounces@mcs.anl.gov] <B>On Behalf Of
</B>oliver.wissdorf@boehringer-ingelheim.com<BR><B>Sent:</B> Tuesday,
January 27, 2009 10:04 AM<BR><B>To:</B>
mpich-discuss@mcs.anl.gov<BR><B>Subject:</B> [mpich-discuss] multiple mpd
rings as one user<BR></FONT><BR></DIV>
<DIV></DIV>
<P><FONT face=Arial size=2>Hello,</FONT> </P>
<P><FONT face=Arial size=2>I want to start multiple rings as one user on a
Linux cluster so that I can submit more than one job at a time. To do this I
use mpdboot and mpiexec:</FONT></P><BR><BR>
<P><SPAN lang=de><FONT face=Arial
size=2>/usr/mpi/gcc/mvapich2-1.0.2/bin/mpdboot -n 17 -f $PWD/mpd.txt
--verbose --ifhn=172.17.30.101</FONT></SPAN> </P>
<P><SPAN lang=de><FONT face=Arial
size=2>/usr/mpi/gcc/mvapich2-1.0.2/bin/mpiexec -machinefile $WORKDIR/mpd.txt
-n $np &lt;jobscript&gt;</FONT></SPAN> </P>
<P><SPAN lang=de><FONT face=Arial size=2>I tried setting MPD_CON_EXT before
running mpdboot. This allows me to start multiple mpds on the host
where the ring starts, but not on the other hosts of the
ring.</FONT></SPAN></P>
<P><SPAN lang=de><FONT face=Arial size=2>I also tried the -1 option of
mpdboot, but this did not work either.</FONT></SPAN> </P>
<P><SPAN lang=de><FONT face=Arial size=2>Is there a way to solve this issue?
Did I make any mistakes?</FONT></SPAN> </P>
<P><SPAN lang=de><FONT face=Arial size=2>Oliver</FONT></SPAN>
</P></BLOCKQUOTE></BLOCKQUOTE></BODY></HTML>