[mpich-discuss] MPICH channel (ssm vs. sock)

Jayesh Krishna jayesh at mcs.anl.gov
Mon Mar 30 10:26:27 CDT 2009


Hi,
 The performance difference could depend on your MPI program. Are you
using a publicly available benchmark (or is it your code)?
 
# Do you consistently see the performance difference? (How many times did
you run your code; did you take an average; did you rule out the extreme
cases?)
# Can you send us your code?
 
 The newer Nemesis channel will soon replace ssm (Nemesis is available in
the latest 1.1b1 release; however, we are still working on the performance
of Nemesis on Windows).
 
Regards,
Jayesh

  _____  

From: mpich-discuss-bounces at mcs.anl.gov
[mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of ???
Sent: Sunday, March 29, 2009 10:21 PM
To: mpich-discuss at mcs.anl.gov
Subject: [mpich-discuss] MPICH channel (ssm vs. sock)


Hi all:
 
I have done a test with different channels.
 
I use 2 Windows XP machines (Quad-core Intel Q6600) to make a cluster.
 
I start 4 MPI processes (2 on each machine).
 
 
Case 1: without -channel command (this uses the default socket channel)
 
The elapsed time: 138 sec
The true consumed CPU time (obtained by Windows API): ~ 40 sec for all MPI
processes
 
From the result, I know that the difference between 138 sec and 40 sec
comes from network data transfer, since the CPU is idle while the data is
transferred over the network.
 
 
Case 2: Add -channel ssm right after mpiexec (mpiexec -channel ssm
-pwdfile pwd.txt .......)
 
The elapsed time: 167 sec
The true consumed CPU time (obtained by Windows API): ~ 160 sec for all
MPI processes
 
From Case 1, the CPU needs only 40 sec to do the job, but in Case 2 the
CPU needs 4 times as much CPU time. WHY???
 
Is the result of my test weird or normal? If it's normal, then the ssm
channel has no benefit at all!
 
I have found the following statements in the changelog of MPICH:
 
Unlike the ssm channel which waits for new data to
arrive by continuously polling the system in a busy loop, the essm channel
waits by blocking on an operating system event object.
 
 
Maybe the problem is the "continuously polling the system in a busy loop"
 
 
 
regards,
 
Seifer Lin

