<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META content="text/html; charset=us-ascii" http-equiv=Content-Type>
<META name=GENERATOR content="MSHTML 8.00.6001.18812"></HEAD>
<BODY>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=181060320-13102009>Hi,</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=181060320-13102009> With MPI you can spawn processes dynamically
using MPI_Comm_spawn() (is that what you mean?). How many processes are
you trying to launch on your cluster? The number of processes that can be
launched depends on how many the OS can handle, and performance may suffer
if you launch too many processes on a single node.</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=181060320-13102009></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=181060320-13102009>Regards,</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=181060320-13102009>Jayesh</SPAN></FONT></DIV><BR>
<DIV dir=ltr lang=en-us class=OutlookMessageHeader align=left>
<HR tabIndex=-1>
<FONT size=2 face=Tahoma><B>From:</B> abhishek pandey
[mailto:hipandey@gmail.com] <BR><B>Sent:</B> Tuesday, October 13, 2009 12:42
PM<BR><B>To:</B> Jayesh Krishna<BR><B>Subject:</B> Re: [mpich-discuss] If one
process of Cluster crashes<BR></FONT><BR></DIV>
<DIV></DIV>Hi Jayesh,<BR><BR>I haven't, but I'll try it. Thanks.<BR><BR>BTW, can
processes be dynamically added to or removed from a cluster? Is there an upper
limit on the number of processes in a cluster on
Windows?<BR><BR>Thanks,<BR>Abhishek<BR><BR>
<DIV class=gmail_quote>On Wed, Oct 14, 2009 at 2:37 AM, Jayesh Krishna <SPAN
dir=ltr>&lt;<A
href="mailto:jayesh@mcs.anl.gov">jayesh@mcs.anl.gov</A>&gt;</SPAN> wrote:<BR>
<BLOCKQUOTE
style="BORDER-LEFT: rgb(204,204,204) 1px solid; MARGIN: 0pt 0pt 0pt 0.8ex; PADDING-LEFT: 1ex"
class=gmail_quote>
<DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2
face=Arial><SPAN>Hi,</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN> Did
you try using the MPI error handlers (MPI_Comm_create_errhandler() /
MPI_ERRORS_RETURN)?</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2
face=Arial><SPAN></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2
face=Arial><SPAN>Regards,</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2
face=Arial><SPAN>Jayesh</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2
face=Arial><SPAN></SPAN></FONT> </DIV>
<DIV dir=ltr align=left>
<HR>
</DIV>
<DIV dir=ltr align=left><FONT size=2 face=Tahoma><B>From:</B> abhishek pandey
[mailto:<A href="mailto:hipandey@gmail.com"
target=_blank>hipandey@gmail.com</A>] <BR><B>Sent:</B> Tuesday, October 13,
2009 11:02 AM<BR><B>To:</B> Jayesh Krishna<BR><B>Subject:</B> Re:
[mpich-discuss] If one process of Cluster crashes<BR></FONT><BR></DIV>
<DIV>
<DIV></DIV>
<DIV class=h5>
<DIV></DIV>Hi Jayesh,<BR><BR>Thanks for the reply.<BR><BR>This is an
application/network error. I am running several instances of my application on
different machines for a very long time, so one process may crash or a machine
may lose network connectivity. At present, when that happens the whole cluster
goes down. I want the other processes to keep running even if one or more
processes fail.<BR><BR>Is there any way I can handle this situation?
<BR><BR>Thanks,<BR>Abhishek<BR><BR>
<DIV class=gmail_quote>On Tue, Oct 13, 2009 at 8:20 PM, Jayesh Krishna <SPAN
dir=ltr>&lt;<A href="mailto:jayesh@mcs.anl.gov"
target=_blank>jayesh@mcs.anl.gov</A>&gt;</SPAN> wrote:<BR>
<BLOCKQUOTE
style="BORDER-LEFT: rgb(204,204,204) 1px solid; MARGIN: 0pt 0pt 0pt 0.8ex; PADDING-LEFT: 1ex"
class=gmail_quote>
<DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2
face=Arial><SPAN>Hi,</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN> We
are currently working on adding fault tolerance to MPICH2, so in a couple of
months we might have something that you can work with.</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN> On
a side note, what kind of process crash do you see? Is it an application
error (which you should fix anyway), or is it due to an internal MPICH2 error?
Please give us more details.</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2
face=Arial><SPAN></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2
face=Arial><SPAN>Regards,</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2
face=Arial><SPAN>Jayesh</SPAN></FONT></DIV><BR>
<DIV dir=ltr lang=en-us align=left>
<HR>
<FONT size=2 face=Tahoma><B>From:</B> <A
href="mailto:mpich-discuss-bounces@mcs.anl.gov"
target=_blank>mpich-discuss-bounces@mcs.anl.gov</A> [mailto:<A
href="mailto:mpich-discuss-bounces@mcs.anl.gov"
target=_blank>mpich-discuss-bounces@mcs.anl.gov</A>] <B>On Behalf Of
</B>abhishek pandey<BR><B>Sent:</B> Tuesday, October 13, 2009 7:23
AM<BR><B>To:</B> <A href="mailto:mpich-discuss@mcs.anl.gov"
target=_blank>mpich-discuss@mcs.anl.gov</A><BR><B>Subject:</B>
[mpich-discuss] If one process of Cluster crashes<BR></FONT><BR></DIV>
<DIV>
<DIV></DIV>
<DIV>
<DIV></DIV>Hi,<BR><BR>I am using MPICH2 on Windows, and sometimes one process
in the cluster crashes. Is there any way to handle
this? I do not want to start the cluster all over again.<BR>As far as I
know, if one process in the cluster goes down for any reason, the whole cluster
goes down with it.
<BR><BR><BR>Thanks,<BR>Abhishek.<BR></DIV></DIV></DIV></BLOCKQUOTE></DIV><BR></DIV></DIV></DIV></BLOCKQUOTE></DIV><BR></BODY></HTML>