[MPICH] Selecting which ethernet interface to use
Jeff Squyres
jsquyres at cisco.com
Thu Aug 23 07:07:29 CDT 2007
On Aug 22, 2007, at 10:41 PM, Darius Buntinas wrote:
> I don't know much about slurm, but looking at the docs, I bet it
> doesn't support that.
>
> But if you can set environment variables for each process in slurm,
> you can set the address other processes will use to connect to a
> process by setting the MPICH_INTERFACE_HOSTNAME environment
> variable for that process.
>
> E.g., if the address for a node is bb01 and the address for the ib
> interface is bb01-ib, set MPICH_INTERFACE_HOSTNAME=bb01-ib for any
> processes on that node.
Hmm. Is there a way to *not* specify the hostname? SLURM can
distribute environment variables to the launched processes, but it
usually sends the same values to all processes. E.g.:

    setenv FOO bar
    srun -N 4 env | grep FOO

results in

    FOO=bar
    FOO=bar
    FOO=bar
    FOO=bar

(where each of those came from a different node)
So it would be pretty messy to set up an MPICH_INTERFACE_HOSTNAME
value specific to each host. The desired ethernet interface is the same
on all of my hosts (ib0); is there a way to tell all MPICH2 processes
to use ib0?
FWIW: I tried setting MPICH_INTERFACE_HOSTNAME to "ib0" and that
didn't work; for the heckuvit I also tried setting MPICH_INTERFACE to
"ib0" and that didn't work either (on the long shot that
MPICH_INTERFACE was a host-unspecific variant of
MPICH_INTERFACE_HOSTNAME).
Also, it seems a little odd that you specify "<hostname>-<interface>"
when the name of the variable is MPICH_INTERFACE_HOSTNAME --
shouldn't it be MPICH_HOSTNAME_INTERFACE to match the ordering? Just
a nit. :-)
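
A minimal sketch of one per-host workaround: since srun hands every process the same environment, a small wrapper can compute the value on each node before starting the real program. It assumes the interface hostname is the node's short hostname plus "-ib" (as in the bb01 / bb01-ib example quoted above); the function name ib_name_for and the wrapper file name are hypothetical, and whether MPICH2 honors the variable under SLURM's PMI is not confirmed here.

```shell
#!/bin/sh
# ib-wrapper.sh (hypothetical name): set MPICH_INTERFACE_HOSTNAME per node,
# then run the real MPI program given on the command line.

# Assumption: the IB hostname is "<short hostname>-ib", e.g. bb01 -> bb01-ib.
ib_name_for() {
    # ${1%%.*} strips any domain suffix (bb01.example.com -> bb01)
    printf '%s-ib' "${1%%.*}"
}

# uname -n is the POSIX way to get this node's hostname.
MPICH_INTERFACE_HOSTNAME=$(ib_name_for "$(uname -n)")
export MPICH_INTERFACE_HOSTNAME

# Replace this shell with the real program, if one was given.
if [ "$#" -gt 0 ]; then
    exec "$@"
fi
```

It would be launched as something like "srun -N 4 ./ib-wrapper.sh ./my_mpi_app" (my_mpi_app being a placeholder), which does reintroduce a script, but only a fixed one that never needs per-job editing.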
> Another way, if you can use mpd, would be to create a machinefile
> that looks like:
I don't really want to use mpd -- kinda the point of SLURM exporting
a PMI interface is to avoid using mpd and directly launch MPI
processes through the SLURM interface itself (i.e., I don't have to
write a script -- I can just srun my MPI processes directly).
Plus, if I used mpd, I'd have to glean the hosts that were allocated
to me from SLURM to create a hostfile, then make the translations in
that hostfile for what the corresponding public ethernet interface
name is in the ifhn clause, etc. It would be much simpler if I could
just setenv a variable that says what ethernet interface to use on
every host...
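
For comparison, the hostfile generation described above could be sketched roughly as follows. This assumes the "<host>-ib" naming from the quoted example, that "scontrol show hostnames" is available to expand the allocation, that the node list is in SLURM_JOB_NODELIST (older releases may use a different variable name), and that each machinefile line takes the form "<host> ifhn=<interface hostname>". The function name make_machinefile and the output file name are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read one hostname per line on stdin and emit one
# machinefile line per host, assuming the form "<host> ifhn=<host>-ib".
make_machinefile() {
    while IFS= read -r host; do
        [ -n "$host" ] || continue      # skip blank lines
        short=${host%%.*}               # drop any domain suffix
        printf '%s ifhn=%s-ib\n' "$host" "$short"
    done
}

# Inside a SLURM allocation: expand the compact node list
# (e.g. bb[01-04]) into one name per line and write the file.
if [ -n "${SLURM_JOB_NODELIST:-}" ] && command -v scontrol >/dev/null 2>&1; then
    scontrol show hostnames "$SLURM_JOB_NODELIST" | make_machinefile > machinefile.ib
fi
```

Even in this short form it is an extra step per job, which is the overhead a single environment variable would avoid.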
Am I stuck? Do I need to go this route (use mpd/create a hostfile)
to use something other than eth0?
--
Jeff Squyres
Cisco Systems