[MPICH] Selecting which ethernet interface to use

William Gropp gropp at mcs.anl.gov
Thu Aug 23 11:11:39 CDT 2007


This is why there is a rank-specific interface name:
MPICH_INTERFACE_HOSTNAME_Rnnn.  It isn't really scalable, but
it does give you a route for now.  Your note does point out the need
for a more implicit specification, such as a regular pattern that
could be used in a well-designed cluster.

Bill

On Aug 23, 2007, at 7:07 AM, Jeff Squyres wrote:

> On Aug 22, 2007, at 10:41 PM, Darius Buntinas wrote:
>
>> I don't know much about slurm, but looking at the docs, I bet it  
>> doesn't support that.
>>
>> But if you can set environment variables for each process in  
>> slurm, you can set the address other processes will use to connect  
>> to a process by setting the MPICH_INTERFACE_HOSTNAME environment  
>> variable for that process.
>>
>> E.g., if the address for a node is bb01 and the address for the ib  
>> interface is bb01-ib, set MPICH_INTERFACE_HOSTNAME=bb01-ib for any  
>> processes on that node.
>
> Hmm.  Is there a way to *not* specify the hostname?  SLURM can  
> distribute environment variables to the launched processes, but it  
> usually sends the same values to all processes.  E.g.:
>
> setenv FOO bar
> srun -N 4 env | grep FOO
>
> results in
>
> FOO=bar
> FOO=bar
> FOO=bar
> FOO=bar
>
> (where each of those came from a different node)
>
> So it would be pretty messy to set up an MPICH_INTERFACE_HOSTNAME
> value specific to each host.  The desired ethernet interface is the same
> on all of my hosts (ib0); is there a way to tell all MPICH2  
> processes to use ib0?
>
> FWIW: I tried setting MPICH_INTERFACE_HOSTNAME to "ib0" and that  
> didn't work; for the heckuvit I also tried setting MPICH_INTERFACE  
> to "ib0" and that didn't work either (on the long shot that  
> MPICH_INTERFACE was a host-unspecific variant of  
> MPICH_INTERFACE_HOSTNAME).
>
> Also, it seems a little odd that you specify "<hostname>-<interface>"
> when the name of the variable is MPICH_INTERFACE_HOSTNAME -- shouldn't
> it be MPICH_HOSTNAME_INTERFACE to match the ordering?  Just a nit.  :-)
>
>> Another way, if you can use mpd, would be to create a machinefile  
>> that looks like:
>
> I don't really want to use mpd -- kinda the point of SLURM  
> exporting a PMI interface is to avoid using mpd and directly launch  
> MPI processes through the SLURM interface itself (i.e., I don't  
> have to write a script -- I can just srun my MPI processes directly).
>
> Plus, if I used mpd, I'd have to glean the hosts that were  
> allocated to me from SLURM to create a hostfile, then make the  
> translations in that hostfile for what the corresponding public  
> ethernet interface name is in the ifhn clause, etc.  It would be  
> much simpler if I could just setenv a variable that says what  
> ethernet interface to use on every host...
>
> Am I stuck?  Do I need to go this route (use mpd/create a hostfile)  
> to use something other than eth0?
>
> -- 
> Jeff Squyres
> Cisco Systems
>
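
One possible way around the "same value on every node" problem described in the thread is to launch each process through a small wrapper that computes MPICH_INTERFACE_HOSTNAME locally. The following is only a sketch, not something tested in this thread. It assumes that every node's InfiniBand address follows the `<short hostname>-ib` pattern from the bb01 / bb01-ib example, and that MPICH2 honours the variable when processes are started directly by srun. The script name `ib-wrap.sh` and the function name `ib_hostname` are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper (ib-wrap.sh): sets MPICH_INTERFACE_HOSTNAME per node,
# then runs the real MPI program.
# ASSUMPTION: each node's IB address is "<short hostname>-ib" (e.g. bb01 -> bb01-ib).

ib_hostname() {
    # Strip any domain part from the name, then append the assumed "-ib" suffix.
    printf '%s-ib\n' "${1%%.*}"
}

# Only act when a command was given on the command line.
if [ "$#" -gt 0 ]; then
    MPICH_INTERFACE_HOSTNAME=$(ib_hostname "$(hostname)")
    export MPICH_INTERFACE_HOSTNAME
    # Replace the wrapper with the real program so no extra process lingers.
    exec "$@"
fi
```

It would be used as, for example, `srun -N 4 ./ib-wrap.sh ./my_mpi_app`, where `my_mpi_app` stands in for the actual MPI executable. Because the value is derived on each node at launch time, nothing host-specific has to be distributed by SLURM.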



