[mpich-discuss] General Scalability Question
Robertson, Andrew
andrew.robertson at atk.com
Mon Oct 26 10:47:01 CDT 2009
Dave,
So does that imply you wrote the app from square one to use shared
memory, or is that part of how MPI gets invoked? One of my applications
(GASP) uses LAM/MPI, and it appears that I get only one MPI process per
node with multiple application instances.
Is this accomplished via the use of the "nemesis" or "mt" channel?
Thanks
- Andy
Andrew Robertson P.E.
CFD Analyst
GASL Operations
Tactical Propulsion and Controls
ATK
77 Raynor Avenue
Ronkonkoma NY 11779
631-737-6100 Ext 120
Fax: 631-588-7023
www.atk.com
!! Knowledge and Thoroughness Baby !!
________________________________
From: mpich-discuss-bounces at mcs.anl.gov
[mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Hiatt, Dave M
Sent: Monday, October 26, 2009 11:40 AM
To: mpich-discuss at mcs.anl.gov
Subject: Re: [mpich-discuss] General Scalability Question
So far my experience has been that the in-core message transfer rate is
far better than a gigabit switch and backbone. InfiniBand would be a
dramatic improvement, but it's hard to believe that it could keep up
with in-memory transfer. What has worked out best for our app is a
single message thread, with the app then using shared memory directly
to distribute data. That dramatically lowers the number of open sockets
and the communication overhead. It may not work best in every case, but
for us it worked better regardless of whether we had a very high
core/process count per node or a lower count. So we ran only one MPI
process per physical node. It also lowers the number of sockets you
have to support on node 0 if you have point-to-point communication.
Linux at least defaults to 1024 sockets and files, and it's nice for
node 0 performance to stay under that. You can raise it with ulimit,
but when you've got 15000 cores, it's pretty expensive to have one MPI
process per core.
-----Original Message-----
From: mpich-discuss-bounces at mcs.anl.gov
[mailto:mpich-discuss-bounces at mcs.anl.gov]On Behalf Of Robertson, Andrew
Sent: Monday, October 26, 2009 10:30 AM
To: mpich-discuss at mcs.anl.gov
Subject: [mpich-discuss] General Scalability Question
Folks,
Our IT staff is not particularly knowledgeable about parallel
computing. Their current upgrade plan centers around quad/quad or
dual/hex boxes, which would have 16 or 12 cores respectively. I have no
doubt that such a machine would run a parallel job efficiently. My
question is how well I can harness multiple boxes together.
The applications are all CFD (FLUENT, GASP, STAR, VULCAN). I am
talking to the various software vendors about this but would like some
info from the programming community.
Assuming the same memory per core, am I better off with:
- high core count (12-16) boxes on a gigabit switch, or
- lower core count (2-4) boxes on an InfiniBand switch?
I understand that if I configure MPICH correctly it will use
shared memory on the multi-core, multi-processor boxes. If I end up
with the high core count boxes, should I spec the front-side bus (or
whatever it is called now) as high as possible?
I also have concerns that a single power supply failure takes
out more cores, though perhaps that is not such a problem.
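For what it's worth, a sketch of how that shared-memory path is typically enabled in MPICH2 of this era (the paths, host file, rank counts, and solver name are placeholders, not from this thread): the nemesis channel uses shared memory for on-node messages and falls back to the network only between nodes.

```shell
# Build MPICH2 with the nemesis channel; intra-node traffic then
# goes through shared memory automatically. Paths are placeholders.
./configure --with-device=ch3:nemesis --prefix=$HOME/mpich2
make && make install

# Place 12 ranks on each dual/hex box; only inter-node messages
# touch the gigabit or InfiniBand fabric. Host file and counts
# are illustrative.
mpiexec -f hosts.txt -ppn 12 -n 48 ./solver
```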
Any information is greatly appreciated
Thanks
Andy
--------------------
Andrew Robertson P.E.
CFD Analyst
GASL Operations
Tactical Propulsion and Controls
ATK
77 Raynor Avenue
Ronkonkoma NY 11779
631-737-6100 Ext 120
Fax: 631-588-7023
www.atk.com
!! Knowledge and Thoroughness Baby !!