[mpich-discuss] Hydra framework (thru Grid Engine)

Bernard Chambon bernard.chambon at cc.in2p3.fr
Mon Dec 12 08:58:26 CST 2011


Hello,

I am working with MPICH2, using Grid Engine as the batch system, and I am testing the Hydra framework to get rid of the burden of launching mpd through pe_start (*).

With Hydra, everything seems to be OK; however, I have two questions:

1/ After compiling the Hydra framework, I didn't find mpicc in the Hydra directories
   (so I used the mpicc from an older mpich2 install).
   Should I understand that Hydra is "only" an execution framework (mpiexec, etc.)?

2/ With Hydra (1.4.1p1), is it possible to use a 10 GigE TCP interface?
   We have machines with two TCP interfaces (eth0 = 1 Gb/s, eth2 = 10 Gb/s).
   With Hydra I can't use eth2, even when specifying -iface eth2.

   To be more explicit, running:
     mpiexec -rmk sge -iface eth2 -n $NSLOTS ./bin/advance_test

   I see the traffic going over eth0, not eth2:
  >dstat -n -N eth0,eth2
   --net/eth0- --net/eth2-
   recv  send: recv  send
     0     0 :   0     0 
   479k  118M: 432B  140B
   476k  118M:1438B  420B
   478k  118M:   0     0 
   475k  118M:   0     0 
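
To put the dstat figures above in perspective, here is a minimal Python sketch that converts the send column to Gb/s. It is only an illustration: the helper names dstat_to_bytes and gbit_per_s are hypothetical, and it assumes that dstat's k/M/G suffixes are 1024-based multiples of bytes per second.

```python
# Sketch: convert dstat's human-readable per-second rates to Gb/s.
# Assumption: dstat's k/M/G suffixes are binary (1024-based) multiples of bytes.

UNITS = {"B": 1, "k": 1024, "M": 1024**2, "G": 1024**3}


def dstat_to_bytes(field: str) -> int:
    """Turn a dstat field such as '118M', '432B' or '0' into bytes per second."""
    field = field.strip()
    if field[-1] in UNITS:
        return int(field[:-1]) * UNITS[field[-1]]
    return int(field)  # plain number without a suffix, e.g. '0'


def gbit_per_s(field: str) -> float:
    """Convert a dstat field to gigabits per second (decimal, as link speeds are quoted)."""
    return dstat_to_bytes(field) * 8 / 1e9


# Send rates taken from the second line of the dstat sample above.
print(f"eth0 send: {gbit_per_s('118M'):.2f} Gb/s")
print(f"eth2 send: {gbit_per_s('140B'):.6f} Gb/s")
```

Under that assumption, 118M per second works out to about 0.99 Gb/s, so eth0 is running at essentially its 1 Gb/s line rate while eth2 carries almost nothing.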


  Do I need to use MVAPICH for a 10 GigE TCP network?

Best regards


* PS:
 Another reason for using Hydra is that the mpd processes are not very stable (in my configuration).
 mpd is launched correctly by Grid Engine (qrsh -inherit …), but the mpd processes are unstable
 and I (randomly) get messages like:
   handle_rhs_input 1098): back in ring
   handle_lhs_challenge_response 1052): INVALID msg for lhs response msg=:{}:

 and, of course, my job aborts.
 
---------------
Bernard CHAMBON
IN2P3 / CNRS
04 72 69 42 18
