[MPICH] MPICH.NT and MPICH2 issue on Windows 2003 R2

Jayesh Krishna jayesh at mcs.anl.gov
Fri Jul 20 14:48:45 CDT 2007


Hi,
 Good to hear that you solved the problem. Let us know if you need any
further assistance.
 
Regards,
Jayesh

  _____  

From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Watford, Christopher A
(GE Infra, Energy)
Sent: Friday, July 20, 2007 2:05 PM
To: Jayesh Krishna
Cc: mpich-discuss at mcs.anl.gov
Subject: RE: [MPICH] MPICH.NT and MPICH2 issue on Windows 2003 R2


I have isolated the problem to the TCP offload engine in the new machines'
HP NC373i Multifunction Gigabit Server Adapters. I'm not sure exactly which
offload feature was failing, but the cards reported TCP Offload Errors
during the MPI_Bcast calls (150MB buffers!).
 
Disabling TCP offloading fixed the problem (granted, it will probably hose
performance; hopefully a driver/firmware fix exists).
 
Thank you.
 
--
Christopher
9106755743
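
A small standalone check can help confirm whether large MPI_Bcast calls
arrive intact once offloading is toggled. The following is only a sketch
under stated assumptions, not the application code from this thread: the
150 MB size is taken from the message above, while the file name, the
USE_MPI build macro, the byte pattern, and the helper names fill_pattern
and count_mismatches are all hypothetical choices made for illustration.

```c
/* bcast_check.c (hypothetical name): broadcast one large buffer from rank 0
 * and have every rank verify the bytes it received.
 * The MPI part is compiled only when USE_MPI is defined (an assumed macro),
 * for example: mpicc -DUSE_MPI bcast_check.c -o bcast_check
 */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef USE_MPI
#include <mpi.h>
#endif

/* Fill buf with a deterministic byte pattern derived from each index. */
void fill_pattern(unsigned char *buf, size_t n)
{
    assert(buf != NULL);
    for (size_t i = 0; i < n; i++)
        buf[i] = (unsigned char)((i * 31u + 7u) & 0xFFu);
}

/* Return the number of bytes in buf that differ from the pattern. */
size_t count_mismatches(const unsigned char *buf, size_t n)
{
    size_t bad = 0;
    assert(buf != NULL);
    for (size_t i = 0; i < n; i++)
        if (buf[i] != (unsigned char)((i * 31u + 7u) & 0xFFu))
            bad++;
    return bad;
}

#ifdef USE_MPI
int main(int argc, char **argv)
{
    /* 150 MB, the buffer size mentioned in the thread; fits in an int count. */
    const int nbytes = 150 * 1024 * 1024;
    int rank = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* calloc zeroes the buffer, so non-root ranks start without the pattern. */
    unsigned char *buf = calloc((size_t)nbytes, 1);
    if (buf == NULL)
        MPI_Abort(MPI_COMM_WORLD, 1);

    if (rank == 0)
        fill_pattern(buf, (size_t)nbytes);

    MPI_Bcast(buf, nbytes, MPI_UNSIGNED_CHAR, 0, MPI_COMM_WORLD);

    printf("rank %d: %lu mismatched bytes\n", rank,
           (unsigned long)count_mismatches(buf, (size_t)nbytes));

    free(buf);
    MPI_Finalize();
    return 0;
}
#endif
```

Launching this across two hosts, once with offloading enabled and once with
it disabled, should show whether the broadcast completes and whether any
rank reports mismatched bytes. A hang rather than a mismatch would still
point at the transport and not at the application logic.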
 


  _____  

From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Watford, Christopher A
(GE Infra, Energy)
Sent: Thursday, July 19, 2007 1:57 PM
To: Jayesh Krishna
Cc: mpich-discuss at mcs.anl.gov
Subject: RE: [MPICH] MPICH.NT and MPICH2 issue on Windows 2003 R2


We use MPICH2 as well, and applications using MPICH2 are also having
problems on Win2k3r2. However, we still have some machines on Win2ksvr and
some on Win2k3x64, and neither group has this issue.
 
MPICH.NT/MPICH2 on Win2k3r2 is where we see it.
 
cpi.c does not exhibit this problem. I'm not sure exactly how cpi.c's usage
differs from that of our applications, but I'll have a look.
 
--
Christopher
9106755743
 


  _____  

From: Jayesh Krishna [mailto:jayesh at mcs.anl.gov] 
Sent: Thursday, July 19, 2007 12:45 PM
To: Watford, Christopher A (GE Infra, Energy)
Cc: mpich-discuss at mcs.anl.gov
Subject: RE: [MPICH] MPICH.NT and MPICH2 issue on Windows 2003 R2


Hi,
 We would recommend that you migrate to MPICH2. All of our current
development efforts are going into MPICH2.
 Meanwhile, do you have the same problem with OS flavors other than Win 2003
R2? Do you see the same problem with sample MPI applications like cpi.c?
 
Regards,
Jayesh

  _____  

From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Watford, Christopher A
(GE Infra, Energy)
Sent: Thursday, July 19, 2007 10:47 AM
To: mpich-discuss at mcs.anl.gov
Subject: RE: [MPICH] MPICH.NT and MPICH2 issue on Windows 2003 R2


(Apologies for the double post; our email system is not the best.)
 
It also appears that I have no problems submitting a job from host A that
runs solely on host B. I can watch it work fine on host B via the command
prompt on host A (where mpiexec was called). Yet as soon as I have a job
that spans two Win2k3r2 hosts, I run into a problem.
 
--
Christopher
9106755743
 


  _____  

From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Watford, Christopher A
(GE Infra, Energy)
Sent: Thursday, July 19, 2007 10:32 AM
To: mpich-discuss at mcs.anl.gov
Subject: [MPICH] MPICH.NT and MPICH2 issue on Windows 2003 R2


I've run into an interesting problem with both MPICH.NT and MPICH2 on
Windows 2003 R2. With MPICH.NT, our application, given N processors across
M machines, will have at most 2 processes going at once (across all
machines). With MPICH2, if the N processors span more than one machine, the
job stalls: process 0 can make forward progress, but the other machines do
not talk to one another. This is similar to the problem we are seeing with
MPICH.NT, but not quite the same.
 

Christopher Watford
GE-Hitachi Nuclear Energy
E: Christopher.Watford at ge.com
http://www.ge-energy.com/nuclear
