[MPICH] FreeBSD and the ch3:smm channel?

Darius Buntinas buntinas at mcs.anl.gov
Wed Jan 31 16:49:59 CST 2007


First, what's the execution time of some of the codes you generally run on 
those machines for the different configurations?
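
If you don't already have numbers handy, a quick (if rough) way is to time 
an identical run under each build with time(1).  This is just a sketch; the 
install prefixes below are placeholders for wherever your ch3:sock and 
ch3:nemesis builds actually live:

#rough sketch: time the same run under each build
#(substitute the placeholder prefixes with your real install locations)
/usr/bin/time /usr/local/mpich2-sock/bin/mpiexec -n 24 ./ripmp
/usr/bin/time /usr/local/mpich2-nemesis/bin/mpiexec -n 24 ./ripmp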

Just to check message latency and bandwidth, you can run the netpipe 
benchmark that comes with the MPICH2 distribution:

#substitute ${MPICH2_SRCDIR} with the location of the source directory
cp ${MPICH2_SRCDIR}/test/mpi/basic/GetOpt.* .
cp ${MPICH2_SRCDIR}/test/mpi/basic/netmpi.c .

#assuming MPICH2 bin is in your path
mpicc GetOpt.c netmpi.c -o netmpi

#just to see whether it's running on the same node or not
mpiexec -n 2 hostname

mpiexec -n 2 ./netmpi

Try this for the different configurations, and with both processes on the 
same node and on different nodes.
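
To force the same-node and different-node cases, you can hand mpiexec a 
machine file.  This is just a sketch: node21 and node22 are placeholders 
for your actual host names, and it assumes the mpd version of mpiexec, 
which takes a -machinefile option (check mpiexec -help on your install):

#both processes on the same node (node21/node22 are placeholder host names)
cat > hosts.same <<EOF
node21
node21
EOF
mpiexec -machinefile hosts.same -n 2 ./netmpi

#processes on different nodes
cat > hosts.diff <<EOF
node21
node22
EOF
mpiexec -machinefile hosts.diff -n 2 ./netmpi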


It's possible that your codes aren't that sensitive to latency or 
bandwidth, so you won't see much of a difference.  Also, if certain 
processes communicate more than others, it would be beneficial, especially 
with nemesis, to put them on the same node.  By default, mpd distributes 
processes round-robin across the nodes, so processes 0 and 1 will end up 
on separate nodes.

-d


On Wed, 31 Jan 2007, Steve Kargl wrote:

> On Wed, Jan 31, 2007 at 03:40:25PM -0600, Darius Buntinas wrote:
>>
>> On Wed, 31 Jan 2007, Steve Kargl wrote:
>>
>>> On Wed, Jan 31, 2007 at 01:34:24PM -0600, Darius Buntinas wrote:
>>>>
>>>> We're working on fixing this the right way, but in the meantime, as
>>>> long as you don't need to use --enable-fast, edit the file
>>>> src/mpid/ch3/channels/nemesis/setup_channel.args and remove everything
>>>> after, and including, the line that starts with "eval" (i.e., the eval
>>>> line and the whole for loop).  Then give configure another try.
>>>>
>>>> Let me know if this helps.
>>>>
>>>
>>> That appears to work.  I can build and install mpich2 with
>>> the nemesis device.  Unfortunately, it doesn't appear to
>>> help performance on the SMP systems.
>>>
>>
>> What kind of latency/bandwidth are you getting, and what kind of machine
>> are you running on?
>>
>
> The cluster topology can be seen at
>
> http://troutmask.apl.washington.edu/~kargl/hpc.html
>
> The nodes are connected with gigE ethernet using standard TCP/IP
> packets.  I tried jumbo frames, but that also seemed to reduce
> performance.  Each node has two dual-core Opteron processors (i.e.,
> 4 effective CPUs per node).  I was hoping the ch3:smm (or
> ch3:nemesis) channel would improve communication for same-node processes.
>
> I'm still trying to determine the best way to measure the latency
> for our codes.  All I have at the moment are anecdotal measures
> (i.e., wall-clock time and Fortran's cpu_time).  My code is a standard
> master-slave algorithm and ch3:sock works well.  My colleague uses
> a scatter-gather algorithm and communication appears to be killing
> him.  If I switch us to ch3:nemesis, performance appears to go
> down for both codes.
>
> With my code and ch3:nemesis and an otherwise idle cluster, I do
>
> $ mpiexec -n 24 ./ripmp
>
> and top(1) immediately shows
>
>  PID USERNAME    THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
> 4079 kargl         1  96    0 33880K 10276K select 0   0:01  0.00% python2.4
> 54225 kargl         1  96    0 32960K  9524K select 0   0:00  0.00% python2.4
> 54228 kargl         1  96    0 34148K 10548K select 0   0:00  0.00% python2.4
> 54227 kargl         1  96    0 34148K 10548K select 0   0:00  0.00% python2.4
> 54226 kargl         1  96    0 34148K 10548K select 3   0:00  0.00% python2.4
> 54229 kargl         1  96    0 34148K 10548K select 0   0:00  0.00% python2.4
> 54231 kargl         1   4    0  7156K  2304K sbwait 0   0:00  0.00% ripmp
> 54232 kargl         1   4    0  7156K  2304K sbwait 2   0:00  0.00% ripmp
> 54230 kargl         1   4    0  7156K  2304K sbwait 0   0:00  0.00% ripmp
> 54233 kargl         1   4    0  7156K  2304K sbwait 3   0:00  0.00% ripmp
>
> ripmp is my code and the python2.4 jobs are from mpiexec.  The ripmp
> jobs remain in the sbwait state for at least 45 seconds, then the state
> changes to accept and back to sbwait.  After 60+ seconds the ripmp jobs
> suddenly start to run:
>
> 54233 kargl         1 112    0 40860K  3176K RUN    0   0:43 84.52% ripmp
> 54230 kargl         1 112    0 40836K  3192K CPU3   1   0:41 84.27% ripmp
> 54231 kargl         1 112    0 40836K  3200K CPU2   3   0:40 83.98% ripmp
> 54232 kargl         1 112    0 40836K  3196K CPU1   2   0:38 83.78% ripmp
>
> With ch3:sock, the ripmp jobs start to run almost immediately and
> actually reach 98% CPU utilization.
>
>
> If you have a suggestion on how to measure the difference in
> latency for ch3:sock and ch3:nemesis, then I'll try to gather
> some numbers.
>
>
