[MPICH] MPICH.NT and MPICH2 issue on Windows 2003 R2

Watford, Christopher A (GE Infra, Energy) christopher.watford at ge.com
Fri Jul 20 14:05:00 CDT 2007


I have isolated the problem to the TCP offload engine in the new
machines' HP NC373i Multifunction Gigabit Server Adapters. I'm not sure
exactly which offload feature was failing, but the cards reported TCP
Offload Errors during the MPI_Bcast calls (150MB buffers!).
 
Disabling TCP offloading fixed the problem (granted, it will probably
hose performance; hopefully a driver/firmware fix exists).
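
One way to check whether message size is what trips the offload engine
would be to split the large broadcast into smaller pieces and see
whether the errors go away. The following is a minimal sketch in C of
the chunking arithmetic only, not code from the application discussed
here: for_each_chunk, chunk_fn and count_bytes are hypothetical names,
the chunk size is an arbitrary assumption, and in a real test the
callback would wrap MPI_Bcast on the given sub-range.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback standing in for the real transfer. In an MPI
 * program it might wrap something like
 *   MPI_Bcast(buf + offset, (int)len, MPI_BYTE, root, comm);
 * Returns 0 on success, non-zero on failure. */
typedef int (*chunk_fn)(size_t offset, size_t len, void *ctx);

/* Walk a buffer of total_bytes in pieces of at most chunk_bytes and
 * call send_chunk once per piece. Returns the number of chunks issued,
 * or -1 as soon as the callback reports a failure. */
long for_each_chunk(size_t total_bytes, size_t chunk_bytes,
                    chunk_fn send_chunk, void *ctx)
{
    long chunks = 0;
    size_t offset = 0;

    assert(chunk_bytes > 0);
    while (offset < total_bytes) {
        size_t remaining = total_bytes - offset;
        size_t len = remaining < chunk_bytes ? remaining : chunk_bytes;

        if (send_chunk(offset, len, ctx) != 0)
            return -1;
        offset += len;
        chunks++;
    }
    return chunks;
}

/* Stand-in callback that only adds up the bytes it was asked to send,
 * so the chunking can be checked without a network. */
int count_bytes(size_t offset, size_t len, void *ctx)
{
    (void)offset;
    *(size_t *)ctx += len;
    return 0;
}
```

Assuming "150MB" means 150 MiB and picking a 4 MiB chunk size, calling
for_each_chunk with count_bytes issues 38 callbacks that together cover
the whole buffer. Every rank would have to use the same chunk size,
since each MPI_Bcast call must be matched on all ranks.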
 
Thank you.
 
--
Christopher
9106755743
 


________________________________

	From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Watford,
Christopher A (GE Infra, Energy)
	Sent: Thursday, July 19, 2007 1:57 PM
	To: Jayesh Krishna
	Cc: mpich-discuss at mcs.anl.gov
	Subject: RE: [MPICH] MPICH.NT and MPICH2 issue on Windows 2003
R2
	
	
	We use MPICH2 as well, and applications built with MPICH2 also
have problems on Win2k3r2. However, we still have some machines on
Win2ksvr and some on Win2k3x64, and neither has this issue.
	 
	MPICH.NT/MPICH2 on Win2k3r2 is where we see it.
	 
	Now, cpi.c does not exhibit this problem. I'm not sure exactly
how cpi.c's usage differs from our applications, but I'll have a
look.
	 
	--
	Christopher
	9106755743
	 


________________________________

		From: Jayesh Krishna [mailto:jayesh at mcs.anl.gov] 
		Sent: Thursday, July 19, 2007 12:45 PM
		To: Watford, Christopher A (GE Infra, Energy)
		Cc: mpich-discuss at mcs.anl.gov
		Subject: RE: [MPICH] MPICH.NT and MPICH2 issue on
Windows 2003 R2
		
		
		Hi,
		 We would recommend that you migrate to MPICH2. All of
our current development efforts are going into MPICH2.
		 Meanwhile, do you have the same problem with OS flavors
other than Win 2003 R2? Do you see the same problem with sample MPI
applications like cpi.c?
		 
		Regards,
		Jayesh
		
________________________________

		From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Watford,
Christopher A (GE Infra, Energy)
		Sent: Thursday, July 19, 2007 10:47 AM
		To: mpich-discuss at mcs.anl.gov
		Subject: RE: [MPICH] MPICH.NT and MPICH2 issue on
Windows 2003 R2
		
		
		(Apologies for the double post, our email system is not
the best.)
		 
		It also appears that I have no problems submitting a job
from host A to run solely on host B. I can watch it work fine on host B
via the command prompt on host A (where mpiexec was called). Yet as soon
as I have a job that spans two Win2k3r2 hosts, I run into a problem.
		 
		--
		Christopher
		9106755743
		 


________________________________

			From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Watford,
Christopher A (GE Infra, Energy)
			Sent: Thursday, July 19, 2007 10:32 AM
			To: mpich-discuss at mcs.anl.gov
			Subject: [MPICH] MPICH.NT and MPICH2 issue on
Windows 2003 R2
			
			
			I've run into an interesting problem with both
MPICH.NT and MPICH2 on Windows 2003 R2. With MPICH.NT, our application,
given N processors across M machines, will have at most 2 processes
going at once (across all machines). On MPICH2, if the N processors span
more than one machine, nothing happens. Process 0 can make forward
progress, but no other machines talk to one another, which is similar to
the problem we are seeing with MPICH.NT, but not quite the same.
			 

			Christopher Watford
			GE-Hitachi Nuclear Energy
			E Christopher.Watford at ge.com
<mailto:Christopher.Watford at ge.com> 
			http://www.ge-energy.com/nuclear
