Hello,
<div><br></div><div>I have a system with two Ethernet interfaces (eth0 and eth2). eth0 is connected to a 10GigE switch, and eth2 is connected to a separate GigE switch.</div><div><br></div><div>Using HYDRA from version 1.2.1p1, I get the results I expect when I select a different interface. The file "hostsGigE" has two host names for the GigE interface IPs, while "hosts10GigE" has two 10GigE IPs. The following commands work:</div>
<div> # mpiexec -f hostsGigE -n 2 `pwd`/osu_bw ---Shows bandwidth around 117MB/s</div><div> # mpiexec -f hosts10GigE -n 2 `pwd`/osu_bw ---Shows bandwidth around 900MB/s</div><div><br></div><div>With the latest MPICH2 (1.3a2 and 1.3b1), it always seems to use the 10GigE network:</div>
<div> # mpiexec -f hostsGigE -n 2 `pwd`/osu_bw --Shows bandwidth around 900MB/s</div><div> # mpiexec -f hosts10GigE -n 2 `pwd`/osu_bw --Shows bandwidth around 900MB/s</div>
<div><br></div><div>The following commands, which use the -iface, -hosts, and -f flags (including one that uses IPs instead of host names), also show bandwidth around 900MB/s:</div><div>
# mpiexec -f hosts10GigE -n 2 -iface eth2 `pwd`/osu_bw </div><div> # mpiexec -hosts node01-eth2,node02-eth2 -iface eth2 -n 2 `pwd`/osu_bw</div><div> # mpiexec -hosts 172.20.101.1,172.20.101.2 -n 2 `pwd`/osu_bw</div>
<div><br></div><div><br></div><div>Does anyone know what I am doing wrong, and why this works as expected in the HYDRA 1.2.1p1 version but not in the latest 1.3b1? I am a little confused about how it even knows about the 10GigE network when I only gave it GigE host names. Perhaps my system is sending the traffic out on the 10GigE network, but then why does it work fine in 1.2.1p1?</div>
<div><br></div><div>The system I am running on is Linux (CentOS 5.5). It is a cluster running PBS (Torque). I do have HYDRA_RMK set to "pbs", but I also tried with this environment variable unset. It seems the command-line parameters take precedence. The info here "<a href="http://wiki.mcs.anl.gov/mpich2/index.php/Using_the_Hydra_Process_Manager#Hydra_with_Non-Ethernet_Networks">http://wiki.mcs.anl.gov/mpich2/index.php/Using_the_Hydra_Process_Manager#Hydra_with_Non-Ethernet_Networks</a>" indicates that what I am doing should work. My "ifconfig" output is below.</div>
<div><br></div><div>Any help would be appreciated.</div><div><br></div><div>~cody</div><div><br></div><div><br></div><div><div>eth0 Link encap:Ethernet HWaddr 00:1B:21:69:79:A0 </div>
<div> inet addr:192.168.20.1 Bcast:192.168.20.255 Mask:255.255.255.0</div><div> inet6 addr: fe80::21b:21ff:fe69:79a0/64 Scope:Link</div><div> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1</div>
<div> RX packets:7454839 errors:0 dropped:0 overruns:0 frame:0</div><div> TX packets:149930410 errors:0 dropped:0 overruns:0 carrier:0</div><div> collisions:0 txqueuelen:1000 </div><div> RX bytes:45436437528 (42.3 GiB) TX bytes:221935890089 (206.6 GiB)</div>
</div><div><div>eth2 Link encap:Ethernet HWaddr E4:1F:13:4D:13:0E </div><div> inet addr:172.20.101.1 Bcast:172.20.101.255 Mask:255.255.255.0</div><div> inet6 addr: fe80::e61f:13ff:fe4d:130e/64 Scope:Link</div>
<div> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1</div><div> RX packets:556581 errors:0 dropped:0 overruns:0 frame:0</div><div> TX packets:8745499 errors:0 dropped:0 overruns:0 carrier:0</div>
<div> collisions:0 txqueuelen:1000 </div><div> RX bytes:39219489 (37.4 MiB) TX bytes:12766433186 (11.8 GiB)</div><div> Memory:92b60000-92b80000</div></div>
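<div><br></div><div>The TX byte counters in the ifconfig output above suggest one way to confirm which interface a given run actually uses, independent of the bandwidth number: read each interface's counter before and after the run and compare the deltas. The following is only a minimal sketch, not something from the MPICH2 tools. It assumes the kernel exposes per-interface counters under /sys/class/net/IFACE/statistics/tx_bytes, and the names tx_bytes, iface_delta, and SYSFS_NET are hypothetical helpers introduced here for illustration.</div>

```shell
#!/bin/sh
# Sketch only: compare per-interface TX byte counters around a command.
# Assumption: counters live in /sys/class/net/<iface>/statistics/tx_bytes.
# SYSFS_NET is a hypothetical override for that directory (handy for testing).

# Print the current transmitted-byte counter for one interface.
tx_bytes() {
    cat "${SYSFS_NET:-/sys/class/net}/$1/statistics/tx_bytes"
}

# Run a command and report how many bytes each listed interface sent meanwhile.
# Usage: iface_delta "eth0 eth2" mpiexec -f hostsGigE -n 2 `pwd`/osu_bw
iface_delta() {
    ifaces=$1
    shift

    # Snapshot the counters before the run, in the same order as $ifaces.
    before=""
    for i in $ifaces; do
        before="$before $(tx_bytes "$i")"
    done

    # Run the command being measured.
    "$@"

    # Reuse the positional parameters to walk the saved snapshots in order.
    set -- $before
    for i in $ifaces; do
        after=$(tx_bytes "$i")
        echo "$i: $((after - $1)) bytes sent"
        shift
    done
}
```

<div>With "hostsGigE", most of the bytes would be expected on eth2. If the large delta shows up on eth0 instead, the traffic is going over the 10GigE network regardless of the host file. The counters only cover the node where the function runs, and other traffic on that node is counted too.</div>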