<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=us-ascii">
<META content="MSHTML 6.00.2900.3132" name=GENERATOR></HEAD>
<BODY>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=494130319-20072007>I have isolated the problem to the TCP offload
engine on the new machines' HP NC373i Multifunction Gigabit Server Adapters.
I'm not sure exactly which offload feature was failing, but the cards
reported TCP Offload Errors during the MPI_Bcast calls (150MB
buffers!).</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=494130319-20072007></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=494130319-20072007>Disabling TCP offloading fixed the problem (granted, it
will probably hose performance; hopefully a driver/firmware fix
exists).</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=494130319-20072007></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=494130319-20072007>Thank you.</SPAN></FONT></DIV>
<DIV> </DIV>
<DIV align=left><FONT face=Arial size=2><STRONG>--</STRONG></FONT></DIV>
<DIV align=left><FONT face=Arial
size=2><STRONG>Christopher</STRONG></FONT></DIV>
<DIV align=left><FONT face=Arial size=2>9106755743</FONT></DIV>
<DIV> </DIV><BR>
<BLOCKQUOTE dir=ltr style="MARGIN-RIGHT: 0px">
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B> owner-mpich-discuss@mcs.anl.gov
[mailto:owner-mpich-discuss@mcs.anl.gov] <B>On Behalf Of </B>Watford,
Christopher A (GE Infra, Energy)<BR><B>Sent:</B> Thursday, July 19, 2007 1:57
PM<BR><B>To:</B> Jayesh Krishna<BR><B>Cc:</B>
mpich-discuss@mcs.anl.gov<BR><B>Subject:</B> RE: [MPICH] MPICH.NT and MPICH2
issue on Windows 2003 R2<BR></FONT><BR></DIV>
<DIV></DIV>
<DIV dir=ltr align=left><SPAN class=296035417-19072007><FONT face=Arial
color=#0000ff size=2>We use MPICH2 as well, and applications using MPICH2 are
also having problems on Win2k3r2. However, we still have some machines on
Win2ksvr and some on Win2k3x64, and neither has this issue.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=296035417-19072007><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=296035417-19072007><FONT face=Arial
color=#0000ff size=2>MPICH.NT/MPICH2 on Win2k3r2 is where we see
it.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=296035417-19072007><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=296035417-19072007><FONT face=Arial
color=#0000ff size=2>cpi.c does not exhibit this problem. I'm not sure
exactly how cpi.c's usage differs from that of our applications, but I'll have a
look.</FONT></SPAN></DIV>
<DIV> </DIV>
<DIV align=left><FONT face=Arial size=2><STRONG>--</STRONG></FONT></DIV>
<DIV align=left><FONT face=Arial
size=2><STRONG>Christopher</STRONG></FONT></DIV>
<DIV align=left><FONT face=Arial size=2>9106755743</FONT></DIV>
<DIV> </DIV><BR>
<BLOCKQUOTE dir=ltr style="MARGIN-RIGHT: 0px">
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B> Jayesh Krishna
[mailto:jayesh@mcs.anl.gov] <BR><B>Sent:</B> Thursday, July 19, 2007 12:45
PM<BR><B>To:</B> Watford, Christopher A (GE Infra, Energy)<BR><B>Cc:</B>
mpich-discuss@mcs.anl.gov<BR><B>Subject:</B> RE: [MPICH] MPICH.NT and MPICH2
issue on Windows 2003 R2<BR></FONT><BR></DIV>
<DIV></DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=807173416-19072007>Hi,</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=807173416-19072007> We would recommend that you migrate to
MPICH2. All of our current development efforts are going into
MPICH2.</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=807173416-19072007> Meanwhile, do you have the same problem with
OS flavors other than Win 2003 R2? Do you see the same problem with sample
MPI applications like cpi.c?</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=807173416-19072007></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=807173416-19072007>Regards,</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=807173416-19072007>Jayesh</SPAN></FONT><BR></DIV>
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B> owner-mpich-discuss@mcs.anl.gov
[mailto:owner-mpich-discuss@mcs.anl.gov] <B>On Behalf Of </B>Watford,
Christopher A (GE Infra, Energy)<BR><B>Sent:</B> Thursday, July 19, 2007
10:47 AM<BR><B>To:</B> mpich-discuss@mcs.anl.gov<BR><B>Subject:</B> RE:
[MPICH] MPICH.NT and MPICH2 issue on Windows 2003 R2<BR></FONT><BR></DIV>
<DIV></DIV>
<DIV dir=ltr align=left><SPAN class=369414315-19072007><FONT face=Arial
color=#0000ff size=2>(Apologies for the double post, our email system is not
the best.)</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=369414315-19072007><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=369414315-19072007><FONT face=Arial
color=#0000ff size=2>It also appears that I have no problems submitting a job
from host A to run solely on host B; I can watch it work fine on host
B via the command prompt on host A (where mpiexec was called). Yet as soon
as a job spans two Win2k3r2 hosts, I run into a
problem.</FONT></SPAN></DIV>
<DIV> </DIV>
<DIV align=left><FONT face=Arial size=2><STRONG>--</STRONG></FONT></DIV>
<DIV align=left><FONT face=Arial
size=2><STRONG>Christopher</STRONG></FONT></DIV>
<DIV align=left><FONT face=Arial size=2>9106755743</FONT></DIV>
<DIV> </DIV><BR>
<BLOCKQUOTE dir=ltr style="MARGIN-RIGHT: 0px">
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B> owner-mpich-discuss@mcs.anl.gov
[mailto:owner-mpich-discuss@mcs.anl.gov] <B>On Behalf Of </B>Watford,
Christopher A (GE Infra, Energy)<BR><B>Sent:</B> Thursday, July 19, 2007
10:32 AM<BR><B>To:</B> mpich-discuss@mcs.anl.gov<BR><B>Subject:</B>
[MPICH] MPICH.NT and MPICH2 issue on Windows 2003 R2<BR></FONT><BR></DIV>
<DIV></DIV>
<DIV><SPAN class=052263014-19072007><FONT face=Arial size=2>I've run into
an interesting problem with both MPICH.NT and MPICH2 on Windows 2003 R2.
With MPICH.NT, our application, given N processors across M machines, will
have at most 2 processes going at once (across all machines). With MPICH2,
if the N processors span more than one machine, nothing happens: process 0
can make forward progress, but no other machines talk to one another. This
is similar to the problem we are seeing with MPICH.NT, but not quite
the same.</FONT></SPAN></DIV>
<DIV> </DIV>
<DIV align=left>
<P align=left><FONT face=Arial size=2><STRONG>Christopher
Watford<BR></STRONG>GE-Hitachi Nuclear Energy<BR></FONT><FONT face=Arial
size=2>E </FONT><A href="mailto:Christopher.Watford@ge.com"><FONT
face=Arial size=2>Christopher.Watford@ge.com</FONT></A><FONT face=Arial
size=2><BR><A
href="http://www.ge-energy.com/nuclear">http://www.ge-energy.com/nuclear</A></FONT><FONT
size=2></P></FONT></DIV></BLOCKQUOTE></BLOCKQUOTE></BLOCKQUOTE></BODY></HTML>