[mpich-discuss] Scalability of Intel quad core (Harpertown) cluster

Codner, Clay codner.cg at pg.com
Mon Mar 31 08:44:22 CDT 2008


Looking at your pong data, latency seems to be your bottleneck from a
network perspective; you appear to be getting really good bandwidth.
One thing you can check is that your TCP windows are set to a large
enough value.  Newer versions of Linux tune this automatically, but it
may be worth verifying.
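
On Linux those limits are governed by the net.ipv4.tcp_rmem / tcp_wmem
and net.core.rmem_max / wmem_max sysctls.  As a quick sanity check, a
small (hedged) C snippet like the following can show what buffer sizes
the kernel actually hands a freshly created TCP socket:

    /* Sketch: print the default send/receive buffer sizes a new TCP
       socket gets.  Small values here would cap the effective TCP
       window.  Error checking omitted for brevity. */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/socket.h>

    int main(void)
    {
        int s = socket(AF_INET, SOCK_STREAM, 0);
        int rcv = 0, snd = 0;
        socklen_t len = sizeof(rcv);

        getsockopt(s, SOL_SOCKET, SO_RCVBUF, &rcv, &len);
        len = sizeof(snd);
        getsockopt(s, SOL_SOCKET, SO_SNDBUF, &snd, &len);

        printf("SO_RCVBUF = %d bytes, SO_SNDBUF = %d bytes\n", rcv, snd);
        close(s);
        return 0;
    }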


Another thought: since you have great bandwidth but not great latency,
try grouping your communications into fewer, larger messages rather
than many small ones.
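
To put numbers on it: at ~115 MB/s and ~50 usec latency, the crossover
point is roughly 115 MB/s * 50 usec = ~5.8 KB, so any message much
smaller than that is latency-dominated.  A hedged sketch of the idea
(the buffer and rank names are made up):

    /* Sketch: pay the ~50 usec per-message latency once instead of
       n times by sending one large message. */
    #include <mpi.h>

    void send_block(double *buf, int n, int peer, MPI_Comm comm)
    {
        /* Latency-bound: n tiny messages, roughly n * 50 usec overhead.
        for (int i = 0; i < n; i++)
            MPI_Send(&buf[i], 1, MPI_DOUBLE, peer, 0, comm);
        */

        /* Bandwidth-bound: one message, one latency hit. */
        MPI_Send(buf, n, MPI_DOUBLE, peer, 0, comm);
    }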


The real issue, though, seems to be the boundary conditions.  My guess
is that enforcing them creates a synchronization point where every
process has to wait until the others catch up.  Could you give us an
idea of what new boundary conditions you are imposing?
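
In the meantime, if the boundary update really is acting as a barrier,
one hedged workaround is to post the exchange with nonblocking calls
and overlap it with interior work.  A sketch, assuming a 1-D halo
exchange (the neighbor ranks and buffer names are hypothetical):

    /* Sketch: nonblocking halo exchange so ranks don't serialize on
       the boundary update. */
    #include <mpi.h>

    void halo_exchange(double *send_l, double *send_r,
                       double *recv_l, double *recv_r,
                       int n, int left, int right, MPI_Comm comm)
    {
        MPI_Request req[4];

        MPI_Irecv(recv_l, n, MPI_DOUBLE, left,  0, comm, &req[0]);
        MPI_Irecv(recv_r, n, MPI_DOUBLE, right, 1, comm, &req[1]);
        MPI_Isend(send_l, n, MPI_DOUBLE, left,  1, comm, &req[2]);
        MPI_Isend(send_r, n, MPI_DOUBLE, right, 0, comm, &req[3]);

        /* ... update interior points that don't need halo data ... */

        MPI_Waitall(4, req, MPI_STATUSES_IGNORE);
    }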


From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Hee Il Kim
Sent: Saturday, March 29, 2008 5:15 AM
To: mpich-discuss at mcs.anl.gov
Subject: Re: [mpich-discuss] Scalability of Intel quad core (Harpertown)
cluster


Hi,

The mpi_pong test results show that our gigabit network has:

Max rate = 115.335908 MB/sec  Min latency = 49.948692 usec

I ran it several times and got almost the same results. Unfortunately I
could not analyse it further; the full result is attached to this mail.

Hee Il

2008/3/29, Elvedin Trnjanin <trnja001 at umn.edu>:

You would do that within the code. If you pass every element of an int
array one at a time (message size is sizeof(int)), it will perform much
worse than sending the entire array at once (message size is
sizeof(int)*arraydimensions).

Example -
http://www.scl.ameslab.gov/Projects/mpi_introduction/figs/mpi_pong.c
This is a good starting point for approximating network bandwidth and
latency for a given message size. Although not perfectly accurate, it
will certainly give you an idea of your network's performance with
various message sizes and transfer types. It only runs between two
nodes at a time, however, so collective communication patterns such as
MPI_Alltoall are not tested.
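
If you want something even smaller to experiment with, the core of
such a test is just a timed send/receive loop. A stripped-down sketch
of the same technique (not the Ames Lab code; run with at least two
ranks):

    /* Sketch: time REPS round trips of an N-byte message between
       ranks 0 and 1 and report the one-way time and rate. */
    #include <mpi.h>
    #include <stdio.h>

    #define N    (1 << 20)   /* 1 MB message */
    #define REPS 100

    static char buf[N];

    int main(int argc, char **argv)
    {
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < REPS; i++) {
            if (rank == 0) {
                MPI_Send(buf, N, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, N, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, N, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, N, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double oneway = (MPI_Wtime() - t0) / (2.0 * REPS);

        if (rank == 0)
            printf("one-way time %g usec, rate %g MB/s\n",
                   oneway * 1e6, N / oneway / 1e6);

        MPI_Finalize();
        return 0;
    }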


Hee Il Kim wrote:
>
> I checked the bandwidth behavior mentioned by Elvedin. Can I change
> or set the message size and frequency at runtime, or does that take
> other steps?
>
