[mpich2-dev] mpiexec to call launcher for all names in machines file without hostname resolution (was -nolocal option)

Dave Goodell goodell at mcs.anl.gov
Fri Jul 8 17:14:11 CDT 2011


I'll let Pavan field this from here.  I have no idea if/how hydra was designed to deal with machinefile entries that don't correspond to regular hosts.  I suspect there is a way, but the "right way" probably involves writing more code to fully support your custom process launcher instead of trying to shoehorn it into an ssh-like model.  He will know much more about all of this.

FWIW, I'm surprised that the "-nolocal" option ever did what you were expecting when using MPD.  You may have been exploiting a bug or lucky coincidence rather than using an intentional feature.

-Dave

On Jul 8, 2011, at 5:03 PM CDT, John Marshall wrote:

> On 07/08/2011 05:35 PM, Dave Goodell wrote:
>> Was that the option to MPD's mpiexec that said "don't launch any processes on the local node, even though the local node is in the MPD ring"?
>> 
>> If so, then hydra just doesn't need such an option.  Simply don't include the local/head node in the machinefile and hydra won't launch any processes there.
>> 
>> Or are you trying to obtain the effect of setting the "MPICH_NOLOCAL" environment variable to "1"?  That says don't use shared memory to communicate between processes on the same node.
> The mpd option is closer to what I am looking for, but it is still not quite right, because I do also want to be able to start a process on the local node.
> 
> For example, with a machines file:
> 
> 00
> 01
> 02
> 
> I want mpiexec to blindly call my launcher with the machine names of 00, 01, and 02 without trying to resolve the names (of course, 00, 01, 02 are not hostnames). So, in effect, my machine names are really just labels which the launcher will interpret. The problem is, mpiexec wants to resolve the entries in the machines file, expecting that they are hostnames.
> 
> My change simply forces an is_local = 0 for all names. Is there an alternative?
> 
> Thanks,
> John
> 
>> -Dave
>> 
>> On Jul 8, 2011, at 4:29 PM CDT, John Marshall wrote:
>> 
>>> Hi,
>>> 
>>> From what I can tell, there is no longer a nolocal option. For what I am doing, I currently need this kind of functionality since the entries in my "machines" list are not actual machine names but labels. I have made a quick change to src/pm/hydra/utils/sock/sock.c so that if an env var is set, all machines are treated as non-local (*is_local = 0).
>>> 
>>> I know I'm late to the party on this, but can someone explain why the -nolocal option was removed? Or maybe I have missed some other way to get this functionality, i.e., to pass the machine name/label to the launcher as is, without any complaints/errors, and let the launcher interpret it.
>>> 
>>> Thanks,
>>> John
> 
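For readers following the thread, the kind of change John describes (treating every machines-file entry as non-local when an environment variable is set) might look roughly like the sketch below. This is not hydra's actual code: the function name entry_is_local, the variable name EXAMPLE_FORCE_NONLOCAL, and the simplified hostname comparison are all assumptions made for illustration, since the thread names neither the function nor the variable.

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical variable name; the thread does not say which one was used. */
#define FORCE_NONLOCAL_ENV "EXAMPLE_FORCE_NONLOCAL"

/*
 * Decide whether a machines-file entry refers to the local node.
 * Returns 0 on success, -1 on failure. Illustrative only, not hydra's code.
 */
int entry_is_local(const char *name, int *is_local)
{
    char self[256];

    *is_local = 0;

    /* Override: treat every entry as remote, so the launcher is always
     * invoked and receives the entry verbatim. No resolution is attempted. */
    if (getenv(FORCE_NONLOCAL_ENV) != NULL)
        return 0;

    /* Default path, simplified here to a string comparison against our own
     * hostname. A real implementation would also compare resolved addresses. */
    if (gethostname(self, sizeof(self)) != 0)
        return -1;
    self[sizeof(self) - 1] = '\0';

    if (strcmp(name, self) == 0 || strcmp(name, "localhost") == 0)
        *is_local = 1;

    return 0;
}
```

With the variable set, labels such as 00, 01, and 02 are never compared against or resolved as hostnames, so each one would be handed to the launcher unchanged. With it unset, the usual local-node check still applies.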


