[MPICH] Selecting which ethernet interface to use

Jeff Squyres jsquyres at cisco.com
Fri Aug 24 06:53:22 CDT 2007


On Aug 23, 2007, at 12:11 PM, William Gropp wrote:

> This is why there is a rank-specific interface name:  
> MPICH_INTERFACE_HOSTNAME_Rnnn.  So this isn't really scalable, but  
> it does give you a route for now.

Good enough for the moment; thanks.
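
For the archives, here is a rough sketch of how the rank-specific
variables could be generated for a whole job and then handed to srun,
which passes the same environment to every process. Everything here is
an assumption rather than something confirmed in this thread: one
process per node, ranks assigned in the order the nodes are listed, an
IB hostname of the form "<node>-ib" (as in Darius's bb01/bb01-ib
example), and an unpadded integer after "_R". The function name
gen_rank_interface_vars is made up for illustration.

```shell
#!/bin/sh
# Sketch only: print one "export" line per rank.
# Assumptions (not confirmed by the thread):
#   - one MPI process per node, ranks numbered in argument order
#   - the IB address of node X resolves as "X-ib"
#   - the rank suffix is an unpadded integer (R0, R1, ...)
gen_rank_interface_vars() {
    rank=0
    for node in "$@"; do
        echo "export MPICH_INTERFACE_HOSTNAME_R${rank}=${node}-ib"
        rank=$((rank + 1))
    done
}

# Example node names taken from the bb01 naming used in this thread.
gen_rank_interface_vars bb01 bb02 bb03 bb04
```

The output could be applied with eval "$(gen_rank_interface_vars ...)"
in a Bourne-style shell before calling srun. As Bill says, this does
not scale well: the environment grows by one variable per rank.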

> Your note does point out a need for a more implicit specification,  
> such as a regular pattern that could be used in a well-designed  
> cluster.

Perhaps something as simple as MPICH_INTERFACE=ib0 (meaning: use  
"ib0" for *all* MPI processes, perhaps unless overridden by the _R<x>  
or _HOSTNAME variants).
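
Until something like that exists, the same effect might be
approximated outside of MPICH with a small wrapper that each process
runs before the real program. This is only a sketch under stated
assumptions: MPICH_INTERFACE is the variable proposed here (it did
nothing when I tried it), the wrapper name ifwrap.sh and the function
derive_interface_hostname are hypothetical, and it assumes that
"<short hostname>-<value>" resolves to the address of the desired
interface on every node, which depends entirely on how the site names
its hosts.

```shell
#!/bin/sh
# ifwrap.sh (hypothetical name): derive a per-host
# MPICH_INTERFACE_HOSTNAME from one job-wide MPICH_INTERFACE value.
# Assumption: "<short hostname>-<value>" is a resolvable name for the
# desired interface on each node; adjust to the local naming scheme.
derive_interface_hostname() {
    # $1 = short hostname of this node, $2 = job-wide interface suffix
    printf '%s-%s\n' "$1" "$2"
}

# An explicitly set MPICH_INTERFACE_HOSTNAME wins over the derived one.
# Rank-specific _R variables are left untouched for MPICH to interpret.
if [ -n "${MPICH_INTERFACE:-}" ] && [ -z "${MPICH_INTERFACE_HOSTNAME:-}" ]; then
    MPICH_INTERFACE_HOSTNAME=$(derive_interface_hostname "$(hostname -s)" "$MPICH_INTERFACE")
    export MPICH_INTERFACE_HOSTNAME
fi

# Hand off to the real MPI program, if one was given.
if [ "$#" -gt 0 ]; then
    exec "$@"
fi
```

Usage would be along the lines of "setenv MPICH_INTERFACE ib" followed
by "srun -N 4 ./ifwrap.sh ./my_mpi_app" (my_mpi_app being a
placeholder). It still needs a script, which is what I was hoping to
avoid, but it needs no per-host settings.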

Just my $0.02.


> Bill
>
> On Aug 23, 2007, at 7:07 AM, Jeff Squyres wrote:
>
>> On Aug 22, 2007, at 10:41 PM, Darius Buntinas wrote:
>>
>>> I don't know much about slurm, but looking at the docs, I bet it  
>>> doesn't support that.
>>>
>>> But if you can set environment variables for each process in  
>>> slurm, you can set the address other processes will use to  
>>> connect to a process by setting the MPICH_INTERFACE_HOSTNAME  
>>> environment variable for that process.
>>>
>>> E.g., if the address for a node is bb01 and the address for the  
>>> ib interface is bb01-ib, set MPICH_INTERFACE_HOSTNAME=bb01-ib for  
>>> any processes on that node.
>>
>> Hmm.  Is there a way to *not* specify the hostname?  SLURM can  
>> distribute environment variables to the launched processes, but it  
>> usually sends the same values to all processes.  E.g.:
>>
>> setenv FOO bar
>> srun -N 4 env | grep FOO
>>
>> results in
>>
>> FOO=bar
>> FOO=bar
>> FOO=bar
>> FOO=bar
>>
>> (where each of those came from a different node)
>>
>> So it would be pretty messy to set up an MPICH_INTERFACE_HOSTNAME  
>> specific for each host.  The desired ethernet interface is the  
>> same on all of my hosts (ib0); is there a way to tell all MPICH2  
>> processes to use ib0?
>>
>> FWIW: I tried setting MPICH_INTERFACE_HOSTNAME to "ib0" and that  
>> didn't work; for the heckuvit I also tried setting MPICH_INTERFACE  
>> to "ib0" and that didn't work either (on the long shot that  
>> MPICH_INTERFACE was a host-unspecific variant of  
>> MPICH_INTERFACE_HOSTNAME).
>>
>> Also, it seems a little odd that you specify  
>> "<hostname>-<interface>" when the name of the variable is  
>> MPICH_INTERFACE_HOSTNAME -- shouldn't it be  
>> MPICH_HOSTNAME_INTERFACE to match the ordering?  Just a nit.  :-)
>>
>>> Another way, if you can use mpd, would be to create a machinefile  
>>> that looks like:
>>
>> I don't really want to use mpd -- kinda the point of SLURM  
>> exporting a PMI interface is to avoid using mpd and directly  
>> launch MPI processes through the SLURM interface itself (i.e., I  
>> don't have to write a script -- I can just srun my MPI processes  
>> directly).
>>
>> Plus, if I used mpd, I'd have to glean the hosts that were  
>> allocated to me from SLURM to create a hostfile, then make the  
>> translations in that hostfile for what the corresponding public  
>> ethernet interface name is in the ifhn clause, etc.  It would be  
>> much simpler if I could just setenv a variable that says what  
>> ethernet interface to use on every host...
>>
>> Am I stuck?  Do I need to go this route (use mpd/create a  
>> hostfile) to use something other than eth0?
>>
>> -- 
>> Jeff Squyres
>> Cisco Systems
>>


-- 
Jeff Squyres
Cisco Systems
