[MPICH] Selecting which ethernet interface to use

Jeff Squyres jsquyres at cisco.com
Thu Aug 23 07:07:29 CDT 2007


On Aug 22, 2007, at 10:41 PM, Darius Buntinas wrote:

> I don't know much about slurm, but looking at the docs, I bet it  
> doesn't support that.
>
> But if you can set environment variables for each process in slurm,  
> you can set the address other processes will use to connect to a  
> process by setting the MPICH_INTERFACE_HOSTNAME environment  
> variable for that process.
>
> E.g., if the address for a node is bb01 and the address for the ib  
> interface is bb01-ib, set MPICH_INTERFACE_HOSTNAME=bb01-ib for any  
> processes on that node.

Hmm.  Is there a way to *not* specify the hostname?  SLURM can  
distribute environment variables to the launched processes, but it  
usually sends the same values to all processes.  E.g.:

setenv FOO bar
srun -N 4 env | grep FOO

results in

FOO=bar
FOO=bar
FOO=bar
FOO=bar

(where each of those came from a different node)

So it would be pretty messy to set up an MPICH_INTERFACE_HOSTNAME
specific to each host.  The desired ethernet interface is the same
on all of my hosts (ib0); is there a way to tell all MPICH2 processes  
to use ib0?
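
For reference, the per-host approach could be sketched as a small
wrapper that each launched process runs first, so the value is computed
on the node rather than passed in by srun.  This is only a sketch: the
file name ib-wrap.sh and the helper ib_hostname are hypothetical, and it
assumes every node's IB address is "<short hostname>-ib", as in the
bb01 / bb01-ib example quoted above.

```shell
#!/bin/sh
# ib-wrap.sh (hypothetical name): per-node wrapper launched by srun.
# Assumption: every node's IB address resolves as "<short hostname>-ib",
# as in the bb01 / bb01-ib example quoted above.

# Print the assumed IB name for a given node name (domain part stripped).
ib_hostname() {
    printf '%s-ib\n' "${1%%.*}"
}

# uname -n is POSIX and prints this node's name.
MPICH_INTERFACE_HOSTNAME=$(ib_hostname "$(uname -n)")
export MPICH_INTERFACE_HOSTNAME

# Replace the wrapper with the real MPI program, if one was given.
if [ "$#" -gt 0 ]; then
    exec "$@"
fi
```

It would be invoked along the lines of "srun -N 4 ./ib-wrap.sh ./my_mpi_app"
(my_mpi_app being a placeholder), which is exactly the extra script I
would rather not need.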

FWIW: I tried setting MPICH_INTERFACE_HOSTNAME to "ib0" and that  
didn't work; for the heckuvit I also tried setting MPICH_INTERFACE to  
"ib0" and that didn't work either (on the long shot that  
MPICH_INTERFACE was a host-unspecific variant of  
MPICH_INTERFACE_HOSTNAME).

Also, it seems a little odd that you specify "<hostname>-<interface>"  
when the name of the variable is MPICH_INTERFACE_HOSTNAME --  
shouldn't it be MPICH_HOSTNAME_INTERFACE to match the ordering?  Just  
a nit.  :-)

> Another way, if you can use mpd, would be to create a machinefile  
> that looks like:

I don't really want to use mpd -- kinda the point of SLURM exporting  
a PMI interface is to avoid using mpd and directly launch MPI  
processes through the SLURM interface itself (i.e., I don't have to  
write a script -- I can just srun my MPI processes directly).

Plus, if I used mpd, I'd have to glean the hosts that were allocated  
to me from SLURM to create a hostfile, then make the translations in  
that hostfile for what the corresponding public ethernet interface  
name is in the ifhn clause, etc.  It would be much simpler if I could  
just setenv a variable that says what ethernet interface to use on  
every host...

Am I stuck?  Do I need to go this route (use mpd/create a hostfile)  
to use something other than eth0?

-- 
Jeff Squyres
Cisco Systems



