[MPICH] MPICH2 performance issue with dual core

Tony Ladd tladd at che.ufl.edu
Wed Dec 26 12:00:03 CST 2007


I am using MPICH2 over Gigabit Ethernet (Intel PRO 1000 + Extreme 
Networks x450a-s48t switches). With a single process per node MPICH2 is 
very fast; typical throughput on edge exchange is ~100 MBytes/sec both 
ways. MPICH2 has more uniform throughput than LAM, is much faster than 
OpenMPI, and has almost as good throughput as MPIGAMMA (using 1 MB TCP 
buffers). Latency is 24 microsecs with tuned NIC drivers. So far so 
(very) good.

Collective communications are excellent with 1 process per node as 
well, but terrible with 2 processes per node. For example, an Alltoall 
with 16 processes has an average 1-way throughput of 56 MBytes/sec when 
distributed over 16 nodes but only 6 MBytes/sec when using 8 nodes and 
2 processes per node. This is of course the reverse of what one would 
expect. I also see that latency goes up with 2 processes per node: a 
4-process Barrier call takes about 58 microsecs on 4 nodes and 68 
microsecs on 2 nodes. I checked with a single node and two processes and 
that was very fast (over 400 MBytes/sec), so perhaps the issue is the 
interaction of shared memory and TCP. I compiled with ch3:ssm and with 
ch3:nemesis, with the same result. Building with and without 
--enable-fast also made little difference.
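
To make the figures above concrete, here is a minimal sketch of how a
per-process Alltoall throughput could be derived from a timing, and of
how many peers of each rank sit on the same node in the two layouts.
The function names, the 1 MByte = 1e6 bytes convention, the exclusion
of the self-copy, and the even spread of ranks over nodes are all
assumptions for illustration; the benchmark that produced the numbers
above may define its average differently.

```python
def one_way_throughput_mb_s(nprocs, bytes_per_peer, elapsed_s):
    """Average MBytes/sec sent by one rank during a single Alltoall.

    Assumption: each rank sends bytes_per_peer to every other rank,
    the copy to itself is not counted, and 1 MByte = 1e6 bytes.
    """
    bytes_sent = (nprocs - 1) * bytes_per_peer
    return bytes_sent / elapsed_s / 1e6


def peers_by_location(nprocs, procs_per_node):
    """Return (on_node_peers, off_node_peers) for one rank.

    Assumption: ranks are spread evenly over the nodes.
    """
    on_node = procs_per_node - 1
    off_node = nprocs - procs_per_node
    return on_node, off_node


if __name__ == "__main__":
    # Throughputs reported above for the 16-process Alltoall (MBytes/sec).
    spread_over_16_nodes = 56.0
    two_per_node = 6.0
    print(f"slowdown: {spread_over_16_nodes / two_per_node:.1f}x")

    # Peer counts per rank for the two layouts.
    print("16 nodes x 1 process:  ", peers_by_location(16, 1))
    print("8 nodes x 2 processes: ", peers_by_location(16, 2))
```

With 2 processes per node only one of the 15 peers of each rank is
reachable through shared memory, so nearly all of the Alltoall traffic
still crosses the network in both layouts.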

Finally, I notice that the CPU utilization is 100%; could this be part 
of the problem?

I apologize if this has been gone over before, but I am new to MPICH2.

Thanks

Tony

-- 
Tony Ladd

Chemical Engineering Department
University of Florida
Gainesville, Florida 32611-6005
USA

Email: tladd-"(AT)"-che.ufl.edu
Web:   http://ladd.che.ufl.edu

Tel:   (352)-392-6509
FAX:   (352)-392-9514



