[MPICH] Selecting which ethernet interface to use
William Gropp
gropp at mcs.anl.gov
Thu Aug 23 11:11:39 CDT 2007
This is why there is a rank-specific interface name:
MPICH_INTERFACE_HOSTNAME_Rnnn. This isn't really scalable, but
it does give you a route for now. Your note does point out a need
for a more implicit specification, such as a regular pattern that
could be used in a well-designed cluster.
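
As a rough illustration of the rank-specific route, the sketch below
generates one variable per rank from a list of node names, which could then
be exported identically to every process. It assumes that "nnn" is the
plain decimal rank with no padding and that each interface hostname is the
node name plus "-ib" (as in the bb01 / bb01-ib example quoted below); the
function name rank_interface_env and the host list are hypothetical, and
the exact variable spelling should be checked against your MPICH version.

```python
def rank_interface_env(hosts_by_rank, suffix="-ib",
                       prefix="MPICH_INTERFACE_HOSTNAME_R"):
    """Map each rank to the interface hostname of the node it runs on.

    hosts_by_rank[i] is the node name hosting rank i. Both the "-ib"
    suffix and the unpadded rank number are assumptions, not confirmed
    MPICH behavior.
    """
    env = {}
    for rank, host in enumerate(hosts_by_rank):
        env[f"{prefix}{rank}"] = host + suffix
    return env


if __name__ == "__main__":
    # Hypothetical allocation: two ranks on each of two nodes.
    hosts = ["bb01", "bb01", "bb02", "bb02"]
    for name, value in rank_interface_env(hosts).items():
        # Emit csh-style assignments, matching the setenv usage in this thread.
        print(f"setenv {name} {value}")
```

The number of variables grows with the number of ranks, which is the
scalability concern mentioned above.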
Bill
On Aug 23, 2007, at 7:07 AM, Jeff Squyres wrote:
> On Aug 22, 2007, at 10:41 PM, Darius Buntinas wrote:
>
>> I don't know much about slurm, but looking at the docs, I bet it
>> doesn't support that.
>>
>> But if you can set environment variables for each process in
>> slurm, you can set the address other processes will use to connect
>> to a process by setting the MPICH_INTERFACE_HOSTNAME environment
>> variable for that process.
>>
>> E.g., if the address for a node is bb01 and the address for the ib
>> interface is bb01-ib, set MPICH_INTERFACE_HOSTNAME=bb01-ib for any
>> processes on that node.
>
> Hmm. Is there a way to *not* specify the hostname? SLURM can
> distribute environment variables to the launched processes, but it
> usually sends the same values to all processes. E.g.:
>
> setenv FOO bar
> srun -N 4 env | grep FOO
>
> results in
>
> FOO=bar
> FOO=bar
> FOO=bar
> FOO=bar
>
> (where each of those came from a different node)
>
> So it would be pretty messy to set up an MPICH_INTERFACE_HOSTNAME
> specific to each host. The desired ethernet interface is the same
> on all of my hosts (ib0); is there a way to tell all MPICH2
> processes to use ib0?
>
> FWIW: I tried setting MPICH_INTERFACE_HOSTNAME to "ib0" and that
> didn't work; for the heckuvit I also tried setting MPICH_INTERFACE
> to "ib0" and that didn't work either (on the long shot that
> MPICH_INTERFACE was a host-unspecific variant of
> MPICH_INTERFACE_HOSTNAME).
>
> Also, it seems a little odd that you specify
> "<hostname>-<interface>" when the name of the variable is
> MPICH_INTERFACE_HOSTNAME -- shouldn't it be
> MPICH_HOSTNAME_INTERFACE to match the ordering? Just a nit. :-)
>
>> Another way, if you can use mpd, would be to create a machinefile
>> that looks like:
>
> I don't really want to use mpd -- kinda the point of SLURM
> exporting a PMI interface is to avoid using mpd and directly launch
> MPI processes through the SLURM interface itself (i.e., I don't
> have to write a script -- I can just srun my MPI processes directly).
>
> Plus, if I used mpd, I'd have to glean the hosts that were
> allocated to me from SLURM to create a hostfile, then make the
> translations in that hostfile for what the corresponding public
> ethernet interface name is in the ifhn clause, etc. It would be
> much simpler if I could just setenv a variable that says what
> ethernet interface to use on every host...
>
> Am I stuck? Do I need to go this route (use mpd/create a hostfile)
> to use something other than eth0?
>
> --
> Jeff Squyres
> Cisco Systems
>