<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<META content="MSHTML 6.00.6000.16788" name=GENERATOR></HEAD>
<BODY>
<DIV dir=ltr align=left><SPAN class=387375820-09012009><FONT face=Arial
color=#0000ff size=2>If you call MPI_Comm_disconnect on the intercommunicator
prior to the node failure, it might work (not sure). In any case, in the coming
year, we plan to make MPICH2 more resilient to such
failures.</FONT></SPAN></DIV>
<DIV> </DIV>
<DIV><SPAN class=387375820-09012009></SPAN><FONT face=Arial><FONT
color=#0000ff><FONT size=2>R<SPAN
class=387375820-09012009>ajeev</SPAN></FONT></FONT></FONT><BR></DIV>
<BLOCKQUOTE
style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #0000ff 2px solid; MARGIN-RIGHT: 0px">
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B> mpich-discuss-bounces@mcs.anl.gov
[mailto:mpich-discuss-bounces@mcs.anl.gov] <B>On Behalf Of </B>Federico Golfr&egrave;
Andreasi<BR><B>Sent:</B> Thursday, January 08, 2009 7:54 AM<BR><B>To:</B>
mpich-discuss@mcs.anl.gov<BR><B>Subject:</B> [mpich-discuss] MPI errors with
more than one communicator<BR></FONT><BR></DIV>
<DIV></DIV>The architecture is made up of two MPI programs (one is launched by
the other via MPI_Comm_spawn_multiple), and therefore two different communicators.<BR>I
noticed that if an error occurs (e.g. a node crash), only the tasks
belonging to that communicator (an intra-communicator) are aborted, while the
tasks in the other communicator keep waiting for messages that will never
arrive.<BR>How can I shut down all the tasks in both communicators when
an error occurs in one of them?<BR><BR>Thank you,<BR><FONT
color=#888888>Federico</FONT> </BLOCKQUOTE></BODY></HTML>