[mpich-discuss] Hydra framework (thru Grid Engine)

Bernard Chambon bernard.chambon at cc.in2p3.fr
Tue Dec 13 02:20:39 CST 2011


Hello,

Thank you very much for your answer.

On 12 Dec 2011, at 16:39, Pavan Balaji wrote:

>> 
>> 2/ With Hydra (1.4.1p1), is it possible to use the 10 GigE TCP interface?
>> We have machines with two TCP interfaces (eth0 = 1 Gb/s, eth2 = 10 Gb/s);
>> with Hydra I can't use eth2, even when specifying -iface eth2.
> 
> This should work correctly.  Just to make sure you are using this right: setting the -iface option makes sure that your application is using the eth2 interface, so MPI communication will use eth2.  Hydra itself will use the default network interface for launching processes, unless it finds a local interface in your host file list.  If the SGE provided host list is using the 1GE interface, it will use 1GE.
> 
> You might or might not care about what network interface is used for launching processes, but I want to make sure you are checking the network usage on the right machines.


I'm not sure I really understand.
Do you mean that the [optional] machine file (mpiexec -f machines_file) will only be used to launch processes?
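
In other words, outside of SGE I imagine the equivalent would be something like the line below (only a sketch of my understanding: the machine file lists the 10 GigE hostnames so that launching also goes over eth2, while -iface keeps the MPI traffic on eth2; the file name is just the one I use later as an illustration):

>mpiexec -f /tmp/machines.eth2 -iface eth2 -n 2 bin/advance_test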


To be clear, here is the behavior I want to get with Hydra + Grid Engine:

A manual test with a small piece of code, run between the machines ccwpge0061 | ccwpge0062 (ccwpge0061p | ccwpge0062p for the secondary interface),
using the old mpd solution:

### on ccwpge0061 ### :
>hostname; mpd --ifhn=ccwpge0061p --daemon --echo
ccwpge0061
mpd_port=50712

### on ccwpge0062 ### :
>hostname; mpd -h ccwpge0061 -p 50712 --ifhn=ccwpge0062p --daemon --echo
ccwpge0062
mpd_port=56112

>mpdtrace -l
ccwpge0062_56112 (10.158.175.62)
ccwpge0061_50712 (10.158.175.61)

### Back on ccwpge0061 ###
>mpdtrace -l
ccwpge0061_50712 (10.158.175.61)
ccwpge0062_56112 (10.158.175.62)

> mpiexec -machinefile /tmp/machines.eth2 -n 2 bin/advance_test
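
(/tmp/machines.eth2 is the usual one-hostname-per-line machine file; something like the following, with the *p names resolving to the eth2 addresses:)

>cat /tmp/machines.eth2
ccwpge0061p
ccwpge0062p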

> dstat -n -N eth0,eth2
--net/eth0- --net/eth2-
 recv  send: recv  send
   0     0 :   0     0 
  17k   42k:2065k 1158M
2209B   72k:2067k 1158M
1610B   24k:2063k 1158M
1738B   24k:2061k 1158M
2114B   24k:2064k 1158M

OK, eth2 (10 GigE) is being used (note the ~1 GB/s throughput).


The same code (advance_test), run with Hydra through Grid Engine:
  mpiexec -rmk sge -iface eth2 -n $NSLOTS ./bin/advance_test

does not use the secondary interface, but the first one (eth0, 1 Gb/s; note the ~118 MB/s throughput):
>dstat -n -N eth0,eth2
   --net/eth0- --net/eth2-
   recv  send: recv  send
     0     0 :   0     0 
   479k  118M: 432B  140B
   476k  118M:1438B  420B
   478k  118M:   0     0 

So, where do I specify the equivalent of mpd's --ifhn option? Shouldn't the -iface option be sufficient?
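
Or is the right approach to bypass the SGE-provided host list and build my own machine file with the 10 GigE names? Something like this in the job script (only a sketch; it assumes SGE reports the short hostnames and that the *p names resolve to the eth2 addresses):

  # build a machine file that points at the 10 GigE hostnames,
  # using the host/slot list SGE provides in $PE_HOSTFILE
  awk '{print $1 "p:" $2}' $PE_HOSTFILE > $TMPDIR/machines.eth2
  mpiexec -f $TMPDIR/machines.eth2 -iface eth2 -n $NSLOTS ./bin/advance_test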



Best regards
---------------
Bernard CHAMBON
IN2P3 / CNRS
04 72 69 42 18



