[MPICH] Using non-default ethernet interfaces

Jeff Squyres jsquyres at cisco.com
Wed May 16 18:32:55 CDT 2007


I forgot to mention that I also tried the usual trick of supplying a
hostfile whose hostnames correspond to the IPoIB IP addresses.
Unfortunately, mpdboot fails with that hostfile, even though all the
interfaces are up and I can ping them:

-----
[16:29] svbu-mpi001:~ % cat h
svbu-mpi001-ib0
svbu-mpi002-ib0
[16:29] svbu-mpi001:~ % ping svbu-mpi001-ib0
PING svbu-mpi001-ib.cisco.com (192.168.0.1) 56(84) bytes of data.
64 bytes from svbu-mpi001-ib.cisco.com (192.168.0.1): icmp_seq=0 ttl=64 time=0.049 ms
64 bytes from svbu-mpi001-ib.cisco.com (192.168.0.1): icmp_seq=1 ttl=64 time=0.022 ms

--- svbu-mpi001-ib.cisco.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.022/0.035/0.049/0.014 ms, pipe 2
[16:29] svbu-mpi001:~ % ping svbu-mpi002-ib0
PING svbu-mpi002-ib.cisco.com (192.168.0.2) 56(84) bytes of data.
64 bytes from svbu-mpi002-ib.cisco.com (192.168.0.2): icmp_seq=0 ttl=64 time=1.79 ms
64 bytes from svbu-mpi002-ib.cisco.com (192.168.0.2): icmp_seq=1 ttl=64 time=0.107 ms

--- svbu-mpi002-ib.cisco.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.107/0.950/1.793/0.843 ms, pipe 2
[16:29] svbu-mpi001:~ % mpdboot -n 2 -f h
mpdboot_svbu-mpi001.cisco.com (handle_mpd_output 374): failed to ping mpd on svbu-mpi001-ib0; recvd output={}

[16:29] svbu-mpi001:~ %
-----
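
One detail visible in the ping output above: the hostfile name
svbu-mpi001-ib0 resolves to 192.168.0.1, but the name reported back for
that address is svbu-mpi001-ib.cisco.com, not the -ib0 name.  Whether
mpdboot is sensitive to that is not established by this transcript, but
it is cheap to confirm what each hostfile entry resolves to, forward and
reverse.  A minimal Python sketch, assuming a hostfile with one bare
hostname per line (as in the file h above); the names read_hostfile and
check_hosts are made up for illustration and are not part of MPD:

```python
import socket
import sys


def read_hostfile(path):
    """Return the non-empty lines of a one-hostname-per-line hostfile."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def check_hosts(names, forward=socket.gethostbyname,
                reverse=socket.gethostbyaddr):
    """Resolve each name to an IPv4 address, then look that address up again.

    Returns a list of (name, ip, reverse_name) tuples.  ip and/or
    reverse_name are None when the corresponding lookup fails.
    """
    results = []
    for name in names:
        try:
            ip = forward(name)
        except OSError:  # socket.gaierror and socket.herror are OSErrors
            results.append((name, None, None))
            continue
        try:
            reverse_name = reverse(ip)[0]
        except OSError:
            reverse_name = None
        results.append((name, ip, reverse_name))
    return results


if __name__ == "__main__":
    # "h" is the hostfile used in the transcript above.
    path = sys.argv[1] if len(sys.argv) > 1 else "h"
    for name, ip, reverse_name in check_hosts(read_hostfile(path)):
        print(f"{name} -> {ip} -> {reverse_name}")
```

A line where the first and last columns differ only shows that forward
and reverse lookups disagree; it does not by itself explain the mpdboot
failure.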

Did I do something wrong?

Many thanks.



On May 16, 2007, at 3:34 PM, Jeff Squyres wrote:

> Greetings.  I'm trying to run MVAPICH2 over ethernet to do some  
> performance comparisons, but I'm having a heck of a time trying to  
> figure out how to use a non-default TCP interface.
>
> Specifically, eth0 is my "normal" gigE network (the IP address  
> associated with the hostname).  But I want to run an MVAPICH2 job  
> over ib0 -- my IPoIB interface.
>
> I looked through the user documentation and didn't see anything  
> about how to do this -- did I miss it?  Pointers would be greatly  
> appreciated.
>
> Thanks.
>
> -- 
> Jeff Squyres
> Cisco Systems


-- 
Jeff Squyres
Cisco Systems



