[MPICH] MPICH 105, SUN NIAGARA dead in the water

chong tan chong_guan_tan at yahoo.com
Thu Apr 26 15:10:20 CDT 2007


Your assessment of 20 us is right.  Your guesses on the other points were not.  The master has to wait for
the others to return, then prepare the data to be exchanged.  The data are mostly small: the smallest messages are 16 bytes, and a few went up to about 8 KB in the test.

I don't see a performance problem in the app.  If there were one, it would not be achieving the gains we are getting on SMP systems.  However, this app was not written for sockets.  In other words, the socket channel is the problem, not the app, and I have no intention of using socket communication on an SMP/SMT system like Sun's Niagara; that is why I asked for help in the first place.
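
For reference, the exchange pattern looks roughly like this.  This is only a bare-bones sketch, not the actual code; the message size, step count, and tags are placeholders:

/* Bare-bones sketch of the master/slave sync described above.
 * MSG_BYTES and NUM_STEPS are placeholders, not values from the app. */
#include <mpi.h>

#define MSG_BYTES 16      /* smallest message size mentioned above */
#define NUM_STEPS 1000    /* placeholder number of sync rounds */

int main(int argc, char **argv)
{
    int rank, size, step, src, dst;
    char buf[MSG_BYTES] = {0};

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    for (step = 0; step < NUM_STEPS; step++) {
        if (rank == 0) {
            /* master: wait for every slave, then send the prepared data back */
            for (src = 1; src < size; src++)
                MPI_Recv(buf, MSG_BYTES, MPI_BYTE, src, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            for (dst = 1; dst < size; dst++)
                MPI_Send(buf, MSG_BYTES, MPI_BYTE, dst, 1, MPI_COMM_WORLD);
        } else {
            /* slave: report its result, then wait for the exchanged data */
            MPI_Send(buf, MSG_BYTES, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, MSG_BYTES, MPI_BYTE, 0, 1,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }
    }

    MPI_Finalize();
    return 0;
}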

tan



----- Original Message ----
From: Anthony Chan <chan at mcs.anl.gov>
To: chong tan <chong_guan_tan at yahoo.com>
Cc: mpich-discuss at mcs.anl.gov
Sent: Thursday, April 26, 2007 12:47:28 PM
Subject: Re: [MPICH] MPICH 105, SUN NIAGARA dead in the water


You are saying that 1 process takes 15 hours.  The 6-process job takes 10
times as long, 150 hours, and the master does 26*10^9 send/recv operations
during that time.  The average send/recv time is therefore
   150*3600 sec / (26*10^9) ~ 20 usec.
The latency of TCP over Ethernet is a bit less than 10 usec.  This means
your app either communicates with zero-byte messages and spends almost half
of its time in communication, or communicates in small messages with almost
no computation in between.  Shared-memory communication latency is well
under a microsecond, so it is obvious that your app does better with a
shared-memory device.  In any case, the numbers suggest your app may have a
performance problem.  Hope this helps.
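
One way to check the per-message cost directly on the Niagara box is a plain
ping-pong timing over whichever device you configured.  A minimal sketch (not
taken from your app; the message size and iteration count are placeholders):

/* Ping-pong timing sketch: run with 2 processes and compare the
 * reported latency across the sock and shared-memory devices. */
#include <mpi.h>
#include <stdio.h>

#define MSG_BYTES 16
#define ITERS     100000

int main(int argc, char **argv)
{
    int rank, i;
    char buf[MSG_BYTES] = {0};
    double t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < ITERS; i++) {
        if (rank == 0) {
            MPI_Send(buf, MSG_BYTES, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, MSG_BYTES, MPI_BYTE, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, MSG_BYTES, MPI_BYTE, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, MSG_BYTES, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("average one-way latency: %g usec\n",
               1e6 * (t1 - t0) / (2.0 * ITERS));

    MPI_Finalize();
    return 0;
}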

A.Chan



On Thu, 26 Apr 2007, chong tan wrote:

> Yes and no.  The same code achieved almost a 4X performance improvement using nemesis on x86 running Linux, so the communication cost is not as high as one would think.  26 billion is the combined count of sends and receives on the master, whose job is to keep all the slaves in sync.  The messages are small; packed or not, the overhead is there, and packing is not going to speed things up by more than 4X.
>
> I don't have numbers for what the socket communication cost would be on Linux.  On Intel SMP, MPI nemesis can handle more than 1 million sends or receives per second.
>
> tan
>
> ----- Original Message ----
> From: Anthony Chan <chan at mcs.anl.gov>
> To: chong tan <chong_guan_tan at yahoo.com>
> Cc: mpich-discuss at mcs.anl.gov
> Sent: Thursday, April 26, 2007 11:43:25 AM
> Subject: Re: [MPICH] MPICH 105, SUN NIAGARA dead in the water
>
>
> On Thu, 26 Apr 2007, chong tan wrote:
>
> > shm/ssm  : dropped packets
> > nemesis  : not supported
> > socket   : works, but very, very slow because of the amount of communication.  In one test with 26 billion sends+receives, 6 processes with MPI were more than 10X slower than a single process (the single-process run takes 15 hours)
>
> This is off from the original topic, but your stated performance
> suggests your app's ratio of communication to computation may be too
> high to achieve good performance.  Do your messages tend to be small
> and frequent?  If so, is it possible to combine them using MPI_Pack?
> (I believe there are tools that can do that for you.)
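>
> A minimal sketch of what combining could look like (the function name and
> fields here are made up, not from your app):
>
> /* Pack an int and a small double array into one buffer so a single
>  * send/recv replaces several small ones.  Everything here is a
>  * stand-in for whatever the app actually exchanges. */
> #include <mpi.h>
>
> void exchange_packed(int rank, int peer, MPI_Comm comm)
> {
>     int    a = 0;
>     double b[4] = {0};
>     char   packbuf[256];
>     int    pos = 0;
>
>     if (rank == 0) {                      /* sender side */
>         MPI_Pack(&a, 1, MPI_INT,    packbuf, sizeof(packbuf), &pos, comm);
>         MPI_Pack(b,  4, MPI_DOUBLE, packbuf, sizeof(packbuf), &pos, comm);
>         MPI_Send(packbuf, pos, MPI_PACKED, peer, 0, comm);
>     } else {                              /* receiver side */
>         MPI_Recv(packbuf, sizeof(packbuf), MPI_PACKED, peer, 0, comm,
>                  MPI_STATUS_IGNORE);
>         MPI_Unpack(packbuf, sizeof(packbuf), &pos, &a, 1, MPI_INT,    comm);
>         MPI_Unpack(packbuf, sizeof(packbuf), &pos, b,  4, MPI_DOUBLE, comm);
>     }
> }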
>
> A.Chan
>
> >
> >
> > any suggestion ?
> >
> > thanks
> >
> > tan
> >
>


