[MPICH] MPICH2 performance issue with dual core
Tony Ladd
tladd at che.ufl.edu
Wed Dec 26 12:00:03 CST 2007
I am using MPICH2 over Gigabit Ethernet (Intel PRO 1000 + Extreme
Networks x450a-s48t switches). With a single process per node MPICH2 is
very fast; typical throughput on edge exchange is ~100MBytes/sec both
ways. MPICH2 has more uniform throughput than LAM, is much faster than
OpenMPI, and has almost as good throughput as MPIGAMMA (using 1MB TCP
buffers). Latency is 24 microsecs with tuned NIC drivers. So far so
(very) good.
Collective communications are excellent with 1 process per node as
well, but terrible with 2 processes per node. For example, an AlltoAll
with 16 processes has an average 1-way throughput of 56MBytes/sec when
distributed over 16 nodes but only 6MBytes/sec when using 8 nodes and 2
processes per node. This is of course the reverse of what one would
expect. I also see that latency goes up with 2 processes per node: a
4 process Barrier call takes about 58 microsecs on 4 nodes and 68
microsecs on 2 nodes. I checked with a single node and two processes and
that was very fast (over 400MBytes/sec), so perhaps the issue is the
interaction of shared memory and TCP. I compiled with ch3:ssm and with
ch3:nemesis and got the same result. I also tried building with and
without --enable-fast, which made little difference.
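
As a rough sanity check on the AlltoAll numbers above, the sketch below
counts how many messages each node's NIC has to send in the two layouts.
It is a counting model only, not an MPI program and not a description of
how MPICH2 implements AlltoAll. The function name and the assumptions
(equal-size messages, a full-duplex link per node, a non-blocking switch,
on-node messages costing nothing) are illustrative and not from the
measurements themselves.

```python
# Counting model for an all-to-all exchange (illustrative, not MPI code).
# Assumptions: every rank sends one equal-size message to every other rank,
# each node has one full-duplex NIC, and on-node messages are free.

def alltoall_link_load(nprocs, ppn):
    """Return (messages leaving each NIC, messages staying on each node)."""
    if nprocs % ppn != 0:
        raise ValueError("nprocs must be a multiple of ppn")
    off_node_per_rank = nprocs - ppn   # peers that live on other nodes
    on_node_per_rank = ppn - 1         # peers that share this node
    return ppn * off_node_per_rank, ppn * on_node_per_rank


# Throughputs quoted in the text, in MBytes/sec.
MEASURED_1PPN = 56.0   # 16 processes on 16 nodes
MEASURED_2PPN = 6.0    # 16 processes on 8 nodes

one_ppn = alltoall_link_load(16, 1)
two_ppn = alltoall_link_load(16, 2)

# If the exchange were purely NIC-bound, run time would scale with the
# number of messages each NIC has to send.
model_slowdown = two_ppn[0] / one_ppn[0]
observed_slowdown = MEASURED_1PPN / MEASURED_2PPN

print("NIC messages per node, 1 per node:", one_ppn[0])
print("NIC messages per node, 2 per node:", two_ppn[0])
print("model slowdown:    %.2f" % model_slowdown)
print("observed slowdown: %.2f" % observed_slowdown)
```

Under these assumptions each NIC sends 15 messages with 1 process per
node and 28 with 2 processes per node, so sharing the NIC alone would
account for a slowdown of a little under 2x, whereas the measured
slowdown is roughly 9x.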
Finally, I notice that CPU utilization is 100%; could this be part of
the problem?
I apologize if this has been gone over before, but I am new to MPICH2.
Thanks
Tony
--
Tony Ladd
Chemical Engineering Department
University of Florida
Gainesville, Florida 32611-6005
USA
Email: tladd-"(AT)"-che.ufl.edu
Web: http://ladd.che.ufl.edu
Tel: (352)-392-6509
FAX: (352)-392-9514