[MPICH] MPICH.NT and MPICH2 issue on Windows 2003 R2
Jayesh Krishna
jayesh at mcs.anl.gov
Wed Jul 25 09:12:02 CDT 2007
Hi,
Let us know what you find out...
Regards,
Jayesh
_____
From: Watford, Christopher A (GE Infra, Energy)
[mailto:christopher.watford at ge.com]
Sent: Wednesday, July 25, 2007 8:48 AM
To: mpich-discuss at mcs.anl.gov
Cc: Jayesh Krishna
Subject: RE: [MPICH] MPICH.NT and MPICH2 issue on Windows 2003 R2
An interesting thing to note is that this problem is not specific to the HP
NC373i Multifunction Gigabit Server Adapter's TOE. The machines also have an
HP NC380T PCIe DP Multifunc Gig Server Adapter, which has the same TOE
problem.
It appears the Scalable Networking Pack (SNP) that comes with Windows 2003
R2 does not cope with offloading the traffic our application generates
(buffers of 150MiB+). I believe Windows 2000 Server is not affected because
it does not use the new 'Chimney' system that comes with SNP for Win2k3.
I'm now engaging Microsoft to get to the bottom of this issue.
--
Christopher
9106755743
_____
From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Jayesh Krishna
Sent: Friday, July 20, 2007 3:49 PM
To: Watford, Christopher A (GE Infra, Energy)
Cc: mpich-discuss at mcs.anl.gov
Subject: RE: [MPICH] MPICH.NT and MPICH2 issue on Windows 2003 R2
Hi,
Good to hear that you solved the problem. Let us know if you need any
further assistance.
Regards,
Jayesh
_____
From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Watford, Christopher A
(GE Infra, Energy)
Sent: Friday, July 20, 2007 2:05 PM
To: Jayesh Krishna
Cc: mpich-discuss at mcs.anl.gov
Subject: RE: [MPICH] MPICH.NT and MPICH2 issue on Windows 2003 R2
I have isolated the problem to the TCP offload engine (TOE) in the new
machines' HP NC373i Multifunction Gigabit Server Adapter. I'm not sure
exactly which offload feature was failing, but the cards reported TCP
Offload Errors during the MPI_Bcast calls (150MB buffers!).
Disabling TCP offloading fixed the problem (granted, it will probably hose
performance; hopefully a driver/firmware fix exists).
Thank you.
--
Christopher
9106755743
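
For anyone who hits the same offload errors with very large MPI_Bcast
buffers, one thing that could be tried in addition to disabling TCP
offloading is broadcasting the buffer in smaller pieces. The sketch below is
only an illustration and was not tested in this thread; whether smaller
pieces actually avoid the TOE errors is unknown. The names for_each_chunk,
send_piece_fn and count_bytes are hypothetical helpers, not part of MPICH.
The MPI call is kept behind a callback so the chunking logic can be checked
without an MPI installation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: handles one piece of the buffer.
 * In an MPI program the callback would wrap something like
 *   MPI_Bcast(piece, (int)len, MPI_BYTE, root, comm)
 * and return 0 on success, non-zero on failure. */
typedef int (*send_piece_fn)(char *piece, size_t len, void *ctx);

/* Walk a buffer of `total` bytes in pieces of at most `chunk` bytes,
 * calling `send` once per piece. Returns the number of pieces handled,
 * or -1 if `chunk` is zero or a callback reports failure. */
long for_each_chunk(char *buf, size_t total, size_t chunk,
                    send_piece_fn send, void *ctx)
{
    long pieces = 0;
    size_t offset = 0;

    assert(send != NULL);
    if (chunk == 0)
        return -1;

    while (offset < total) {
        size_t len = total - offset;   /* bytes still to go */
        if (len > chunk)
            len = chunk;               /* cap this piece at `chunk` */
        if (send(buf + offset, len, ctx) != 0)
            return -1;
        offset += len;
        pieces++;
    }
    return pieces;
}

/* Example callback for dry runs: adds up the bytes it is handed.
 * `ctx` must point to a size_t accumulator. */
int count_bytes(char *piece, size_t len, void *ctx)
{
    (void)piece;
    *(size_t *)ctx += len;
    return 0;
}
```

Because MPI_Bcast is a collective call, every rank would have to walk the
buffer with the same total and chunk size so that the calls match up on all
processes.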
_____
From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Watford, Christopher A
(GE Infra, Energy)
Sent: Thursday, July 19, 2007 1:57 PM
To: Jayesh Krishna
Cc: mpich-discuss at mcs.anl.gov
Subject: RE: [MPICH] MPICH.NT and MPICH2 issue on Windows 2003 R2
We use MPICH2 as well, and our MPICH2 applications are also having problems
on Win2k3r2. However, we still have some machines on Win2ksvr and some on
Win2k3x64, and neither has this issue.
MPICH.NT/MPICH2 on Win2k3r2 is where we see it.
cpi.c does not exhibit this problem. I'm not sure exactly how cpi.c's usage
differs from that of our applications, but I'll have a look.
--
Christopher
9106755743
_____
From: Jayesh Krishna [mailto:jayesh at mcs.anl.gov]
Sent: Thursday, July 19, 2007 12:45 PM
To: Watford, Christopher A (GE Infra, Energy)
Cc: mpich-discuss at mcs.anl.gov
Subject: RE: [MPICH] MPICH.NT and MPICH2 issue on Windows 2003 R2
Hi,
We would recommend that you migrate to MPICH2. All of our current
development efforts are going into MPICH2.
Meanwhile, do you have the same problem with OS flavors other than Win 2003
R2? Do you see the same problem with sample MPI applications like cpi.c?
Regards,
Jayesh
_____
From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Watford, Christopher A
(GE Infra, Energy)
Sent: Thursday, July 19, 2007 10:47 AM
To: mpich-discuss at mcs.anl.gov
Subject: RE: [MPICH] MPICH.NT and MPICH2 issue on Windows 2003 R2
(Apologies for the double post, our email system is not the best.)
It also appears that I have no problems submitting a job from host A to run
solely on host B. I can watch it work fine on host B via the command prompt
on host A (where mpiexec was called). Yet as soon as I have a job that spans
two Win2k3r2 hosts, I run into a problem.
--
Christopher
9106755743
_____
From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Watford, Christopher A
(GE Infra, Energy)
Sent: Thursday, July 19, 2007 10:32 AM
To: mpich-discuss at mcs.anl.gov
Subject: [MPICH] MPICH.NT and MPICH2 issue on Windows 2003 R2
I've run into an interesting problem with both MPICH.NT and MPICH2 on
Windows 2003 R2. With MPICH.NT, our application, given N processors across M
machines, will have at most 2 processes going at once (across all machines).
On MPICH2, if the N processors span more than one machine, nothing happens:
process 0 can make forward progress, but no other machines talk to one
another. This is similar to the problem we are seeing with MPICH.NT, but
not quite the same.
Christopher Watford
GE-Hitachi Nuclear Energy
E: Christopher.Watford at ge.com
http://www.ge-energy.com/nuclear