[mpich-discuss] General Scalability Question
Robertson, Andrew
andrew.robertson at atk.com
Mon Oct 26 10:47:01 CDT 2009
Dave,
So does that imply you wrote the app from square one to use shared
memory, or is that part of how MPI gets invoked? One of my applications
(GASP) uses LAM/MPI, and it appears that I get only one MPI process per
node with multiple application instances.
Is this accomplished via the use of the "nemesis" or "mt" channel?
Thanks
- Andy
Andrew Robertson P.E.
CFD Analyst
GASL Operations
Tactical Propulsion and Controls
ATK
77 Raynor Avenue
Ronkonkoma NY 11779
631-737-6100 Ext 120
Fax: 631-588-7023
www.atk.com
!! Knowledge and Thoroughness Baby !!
________________________________
From: mpich-discuss-bounces at mcs.anl.gov
[mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Hiatt, Dave M
Sent: Monday, October 26, 2009 11:40 AM
To: mpich-discuss at mcs.anl.gov
Subject: Re: [mpich-discuss] General Scalability Question
So far my experience has been that the in-core message transfer rate is
far better than a gigabit switch and backbone. InfiniBand would be a
dramatic improvement, but it's hard to believe that it could keep up
with in-memory transfer. What has worked out best for our app is a
single message thread, with the app then using shared memory directly
to distribute data. That dramatically lowers the number of open sockets
and the communication overhead. It may not work best in every case, but
for us it worked better regardless of whether we had a very high
core/process count per node or a lower count. So we ran only one MPI
process per physical node. It also lowers the number of sockets you
have to support on node 0 if you have point-to-point communication.
Linux at least defaults to 1024 sockets and files, and it's nice for
node 0 performance to stay under that. You can raise it with ulimit,
but when you've got 15000 cores, it's pretty expensive to have one MPI
process per core.
-----Original Message-----
From: mpich-discuss-bounces at mcs.anl.gov
[mailto:mpich-discuss-bounces at mcs.anl.gov]On Behalf Of Robertson, Andrew
Sent: Monday, October 26, 2009 10:30 AM
To: mpich-discuss at mcs.anl.gov
Subject: [mpich-discuss] General Scalability Question
Folks,
Our IT staff is not particularly knowledgeable about parallel
computing. Their current upgrade plan centers around quad/quad or
dual/hex boxes, which would have 16 or 12 cores respectively. I have no
doubt that such a machine would run a parallel job efficiently. My
question is how well I can harness multiple boxes together.
The applications are all CFD (FLUENT, GASP, STAR, VULCAN). I am
talking to the various software vendors about this but would like some
info from the programming community.
Assuming the same memory per core, am I better off with:
- high core count (12-16) boxes on a gigabit switch, or
- lower core count (2-4) boxes on an InfiniBand switch?
I understand that if I configure MPICH correctly it will use
shared memory on the multi-core, multi-processor boxes. If I end up
with the high core count boxes, should I spec the front-side bus (or
whatever it is called now) as high as possible?
I also have concerns that a single power supply failure takes
out more cores, though perhaps that is not such a problem.
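For what it's worth, a sketch of how that shared-memory path is typically enabled in MPICH2 of this era (the paths, host file, rank counts, and solver name are placeholders, not from this thread): the nemesis channel uses shared memory for on-node messages and falls back to the network only between nodes.

```shell
# Build MPICH2 with the nemesis channel; intra-node traffic then
# goes through shared memory automatically. Paths are placeholders.
./configure --with-device=ch3:nemesis --prefix=$HOME/mpich2
make && make install

# Place 12 ranks on each dual/hex box; only inter-node messages
# touch the gigabit or InfiniBand fabric. Host file and counts
# are illustrative.
mpiexec -f hosts.txt -ppn 12 -n 48 ./solver
```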
Any information is greatly appreciated
Thanks
Andy
--------------------
Andrew Robertson P.E.
CFD Analyst
GASL Operations
Tactical Propulsion and Controls
ATK
77 Raynor Avenue
Ronkonkoma NY 11779
631-737-6100 Ext 120
Fax: 631-588-7023
www.atk.com
!! Knowledge and Thoroughness Baby !!