[MPICH] Using non-default ethernet interfaces

Anthony Chan chan at mcs.anl.gov
Wed May 16 19:14:28 CDT 2007


AFAIK, the hostfile is used only by mpd, so it can contain the "standard"
machine names, like svbu-mpi001 and svbu-mpi002 (which correspond to the
ethernet hostnames). Run mpdboot with this hostfile.

Now create a machinefile, e.g. machinefile.txt, that maps each ethernet
hostname to its IPoIB hostname:

> cat machinefile.txt
svbu-mpi001 ifhn=svbu-mpi001-ib
svbu-mpi002 ifhn=svbu-mpi002-ib
...
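
If there are many nodes, the machinefile can be generated from the mpd
hostfile instead of being typed by hand. This is only a rough sketch: the
file name mpd.hosts is hypothetical, and it assumes every IPoIB hostname is
the ethernet hostname with "-ib" appended, as in the listing above. Adjust
the suffix to match your own naming.

```shell
# Hypothetical hostfile: one ethernet hostname per line.
printf '%s\n' svbu-mpi001 svbu-mpi002 > mpd.hosts

# For every non-empty line, emit "<host> ifhn=<host>-ib".
# The "-ib" suffix is an assumption about the site's naming scheme.
awk 'NF { print $1 " ifhn=" $1 "-ib" }' mpd.hosts > machinefile.txt

cat machinefile.txt
```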

Now launch your MPI job as:

> mpiexec -machinefile machinefile.txt -n 2 <your_benchmark_program>

Hope this helps.

A.Chan

On Wed, 16 May 2007, Jeff Squyres wrote:

> I forgot to mention that I tried the usual trick of giving a hostfile
> with hostnames that correspond to the IPoIB IP addresses.
> Unfortunately, mpdboot fails on it for some reason, even though all
> the interfaces are up and I am able to ping them:
>
> -----
> [16:29] svbu-mpi001:~ % cat h
> svbu-mpi001-ib0
> svbu-mpi002-ib0
> [16:29] svbu-mpi001:~ % ping svbu-mpi001-ib0
> PING svbu-mpi001-ib.cisco.com (192.168.0.1) 56(84) bytes of data.
> 64 bytes from svbu-mpi001-ib.cisco.com (192.168.0.1): icmp_seq=0
> ttl=64 time=0.049 ms
> 64 bytes from svbu-mpi001-ib.cisco.com (192.168.0.1): icmp_seq=1
> ttl=64 time=0.022 ms
>
> --- svbu-mpi001-ib.cisco.com ping statistics ---
> 2 packets transmitted, 2 received, 0% packet loss, time 1000ms
> rtt min/avg/max/mdev = 0.022/0.035/0.049/0.014 ms, pipe 2
> [16:29] svbu-mpi001:~ % ping svbu-mpi002-ib0
> PING svbu-mpi002-ib.cisco.com (192.168.0.2) 56(84) bytes of data.
> 64 bytes from svbu-mpi002-ib.cisco.com (192.168.0.2): icmp_seq=0
> ttl=64 time=1.79 ms
> 64 bytes from svbu-mpi002-ib.cisco.com (192.168.0.2): icmp_seq=1
> ttl=64 time=0.107 ms
>
> --- svbu-mpi002-ib.cisco.com ping statistics ---
> 2 packets transmitted, 2 received, 0% packet loss, time 1001ms
> rtt min/avg/max/mdev = 0.107/0.950/1.793/0.843 ms, pipe 2
> [16:29] svbu-mpi001:~ % mpdboot -n 2 -f h
> mpdboot_svbu-mpi001.cisco.com (handle_mpd_output 374): failed to ping
> mpd on svbu-mpi001-ib0; recvd output={}
>
> [16:29] svbu-mpi001:~ %
> -----
>
> Did I do something wrong?
>
> Many thanks.
>
>
>
> On May 16, 2007, at 3:34 PM, Jeff Squyres wrote:
>
> > Greetings.  I'm trying to run MVAPICH2 over ethernet to do some
> > performance comparisons, but I'm having a heck of a time trying to
> > figure out how to use a non-default TCP interface.
> >
> > Specifically, eth0 is my "normal" gigE network (the IP address
> > associated with the hostname).  But I want to run an MVAPICH2 job
> > over ib0 -- my IPoIB interface.
> >
> > I looked through the user documentation and didn't see anything
> > about how to do this -- did I miss it?  Pointers would be greatly
> > appreciated.
> >
> > Thanks.
> >
> > --
> > Jeff Squyres
> > Cisco Systems
>
>
> --
> Jeff Squyres
> Cisco Systems
>
>



