[mpich-discuss] Hydra framework (thru Grid Engine)
Bernard Chambon
bernard.chambon at cc.in2p3.fr
Tue Dec 13 02:20:39 CST 2011
Hello,
Thank you very much for your answer.
On Dec 12, 2011, at 16:39, Pavan Balaji wrote:
>>
>> 2/ with Hydra (1.4.1p1) , is it possible to use 10 GigE TCP interface
>> We have machines with two TCP interfaces (eth0 = 1Gb/s, eth2 = 10Gb/s)
>> with Hydra I can't use eth2 even when specifying -iface eth2
>
> This should work correctly. Just to make sure you are using this right: setting the -iface option makes sure that your application is using the eth2 interface, so MPI communication will use eth2. Hydra itself will use the default network interface for launching processes, unless it finds a local interface in your host file list. If the SGE provided host list is using the 1GE interface, it will use 1GE.
>
> You might or might not care about what network interface is used for launching processes, but I want to make sure you are checking the network usage on the right machines.
I am not sure I really understand.
Do you mean that the [optional] machine file (mpiexec -f machines_file) will only be used to launch processes?
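In other words, if I understand correctly, something like the following (just a sketch, the file content below is my own illustration) would launch processes over the default interface but carry the MPI traffic over eth2:
>cat /tmp/machines_file
ccwpge0061
ccwpge0062
>mpiexec -f /tmp/machines_file -iface eth2 -n 2 ./bin/advance_test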
To be clear, here is the behavior I want to get with Hydra + Grid Engine.
A manual test with a piece of code, between the machines ccwpge0061 | ccwpge0062 (ccwpge0061p | ccwpge0062p for the secondary interface), using the old mpd solution:
### on ccwpge0061 ### :
>hostname; mpd --ifhn=ccwpge0061p --daemon --echo
ccwpge0061
mpd_port=50712
### on ccwpge0062 ### :
>hostname; mpd -h ccwpge0061 -p 50712 --ifhn=ccwpge0062p --daemon --echo
ccwpge0062
mpd_port=56112
>mpdtrace -l
ccwpge0062_56112 (10.158.175.62)
ccwpge0061_50712 (10.158.175.61)
### Back on ccwpge0061 ###
>mpdtrace -l
ccwpge0061_50712 (10.158.175.61)
ccwpge0062_56112 (10.158.175.62)
> mpiexec -machinefile /tmp/machines.eth2 -n 2 bin/advance_test
> dstat -n -N eth0,eth2
--net/eth0- --net/eth2-
recv send: recv send
0 0 : 0 0
17k 42k:2065k 1158M
2209B 72k:2067k 1158M
1610B 24k:2063k 1158M
1738B 24k:2061k 1158M
2114B 24k:2064k 1158M
OK, eth2 is used at 10GigE speed (note the ~1GB/s usage).
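For completeness, /tmp/machines.eth2 simply lists the secondary-interface ("p") hostnames; its content is something like this (from memory):
>cat /tmp/machines.eth2
ccwpge0061p
ccwpge0062p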
The same code (advance_test), run with Hydra through Grid Engine:
mpiexec -rmk sge -iface eth2 -n $NSLOTS ./bin/advance_test
does not use the secondary interface, but the first one (eth0 - 1Gb/s, note the ~118MB/s):
>dstat -n -N eth0,eth2
--net/eth0- --net/eth2-
recv send: recv send
0 0 : 0 0
479k 118M: 432B 140B
476k 118M:1438B 420B
478k 118M: 0 0
So, where should I specify the equivalent of mpd's --ifhn option? Should the -iface option be sufficient on its own?
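If -iface alone is not enough, the only workaround I can think of (just a sketch, assuming the 10GigE hostname is always the SGE hostname with a trailing "p", and that an explicit -f host file can be used instead of -rmk sge) would be to rewrite the SGE host list before calling mpiexec:
# build a host file with the 10GigE ("p") hostnames, keeping the SGE slot counts
awk '{print $1 "p:" $2}' $PE_HOSTFILE > /tmp/machines.eth2
# point both process launching and MPI traffic at the secondary interface
mpiexec -f /tmp/machines.eth2 -iface eth2 -n $NSLOTS ./bin/advance_test
But I would prefer to keep the plain -rmk sge integration if -iface is supposed to be sufficient.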
Best regards,
---------------
Bernard CHAMBON
IN2P3 / CNRS
04 72 69 42 18