[MPICH] Using non-default ethernet interfaces
Jeff Squyres
jsquyres at cisco.com
Wed May 16 18:32:55 CDT 2007
I forgot to mention that I tried the usual trick of giving a hostfile
with hostnames that correspond to the IPoIB IP addresses.
Unfortunately, mpdboot fails on it for some reason, even though all
the interfaces are up and I am able to ping them:
-----
[16:29] svbu-mpi001:~ % cat h
svbu-mpi001-ib0
svbu-mpi002-ib0
[16:29] svbu-mpi001:~ % ping svbu-mpi001-ib0
PING svbu-mpi001-ib.cisco.com (192.168.0.1) 56(84) bytes of data.
64 bytes from svbu-mpi001-ib.cisco.com (192.168.0.1): icmp_seq=0 ttl=64 time=0.049 ms
64 bytes from svbu-mpi001-ib.cisco.com (192.168.0.1): icmp_seq=1 ttl=64 time=0.022 ms
--- svbu-mpi001-ib.cisco.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.022/0.035/0.049/0.014 ms, pipe 2
[16:29] svbu-mpi001:~ % ping svbu-mpi002-ib0
PING svbu-mpi002-ib.cisco.com (192.168.0.2) 56(84) bytes of data.
64 bytes from svbu-mpi002-ib.cisco.com (192.168.0.2): icmp_seq=0 ttl=64 time=1.79 ms
64 bytes from svbu-mpi002-ib.cisco.com (192.168.0.2): icmp_seq=1 ttl=64 time=0.107 ms
--- svbu-mpi002-ib.cisco.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.107/0.950/1.793/0.843 ms, pipe 2
[16:29] svbu-mpi001:~ % mpdboot -n 2 -f h
mpdboot_svbu-mpi001.cisco.com (handle_mpd_output 374): failed to ping mpd on svbu-mpi001-ib0; recvd output={}
[16:29] svbu-mpi001:~ %
-----
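The ping output above shows the hostfile names (svbu-mpi001-ib0, svbu-mpi002-ib0) resolving to different canonical names (svbu-mpi001-ib.cisco.com, svbu-mpi002-ib.cisco.com). Whether that has anything to do with the mpdboot failure is not established here. As a diagnostic aid only, here is a minimal sketch, assuming Python is available on the node, that lists what each hostfile entry resolves to. The function name resolve_hostfile and its resolver parameter are hypothetical and exist only for illustration; socket.gethostbyname_ex is the standard-library call. It assumes one host name per line, as in the hostfile "h" above.

```python
import socket


def resolve_hostfile(text, resolver=socket.gethostbyname_ex):
    """Map each hostfile entry to (canonical_name, ip_addresses).

    Entries that fail to resolve map to None. The resolver argument is
    a hypothetical hook so the lookup can be replaced when no DNS is
    available; by default it is the standard socket.gethostbyname_ex.
    """
    results = {}
    for line in text.splitlines():
        # Assumption: one host name per line; blank lines are skipped.
        fields = line.split()
        if not fields:
            continue
        host = fields[0]
        try:
            canonical, _aliases, addresses = resolver(host)
            results[host] = (canonical, addresses)
        except OSError:
            # socket.gaierror and socket.herror are OSError subclasses.
            results[host] = None
    return results
```

For example, print(resolve_hostfile(open("h").read())) would show, for each name in the hostfile, the canonical name and the IPv4 addresses the resolver returns, which can then be compared with the addresses configured on ib0.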
Did I do something wrong?
Many thanks.
On May 16, 2007, at 3:34 PM, Jeff Squyres wrote:
> Greetings. I'm trying to run MVAPICH2 over ethernet to do some
> performance comparisons, but I'm having a heck of a time trying to
> figure out how to use a non-default TCP interface.
>
> Specifically, eth0 is my "normal" gigE network (the IP address
> associated with the hostname). But I want to run an MVAPICH2 job
> over ib0 -- my IPoIB interface.
>
> I looked through the user documentation and didn't see anything
> about how to do this -- did I miss it? Pointers would be greatly
> appreciated.
>
> Thanks.
>
> --
> Jeff Squyres
> Cisco Systems
--
Jeff Squyres
Cisco Systems