[mpich-discuss] mpdboot behaviour
Reuti
reuti at Staff.Uni-Marburg.DE
Fri Jan 15 11:44:09 CST 2010
On 15.01.2010 at 18:31, Dave Goodell wrote:
> On Jan 15, 2010, at 10:57 AM, Cezary Śliwa wrote:
>
>> Regarding mpdboot in mpich2-1.2.1. The default is to use --ncpus=1
>> rather than the value from mpd.hosts. Does it make sense? Why not
>> use the value from mpd.hosts as the default?
>
> This is a longstanding known user interface problem. It trips
> people up all the time, including me on occasion. Unfortunately,
> we are unlikely to change the behavior for two reasons: (1) it will
> break compatibility with the thousands of scripts out there that
> invoke mpdboot assuming the old
A new option to mpdboot could be introduced to select "version 2" of
mpdboot with its new behavior.
> behavior and (2) mpd is receiving a bare minimum of maintenance and
> development at this point because it is being replaced by hydra.
>
>> In case of running a job under PBS or SGE, the correct number is
>> in the file. At present, one has to extract this information from
>> the file and put it on the mpdboot command line. This is cumbersome.
For SGE it's best to avoid mpdboot and instead start dedicated daemons
for each job in the PE's start_proc_args, i.e. each job gets its own ring:
http://gridengine.sunsource.net/howto/mpich2-integration/mpich2-integration.html
This way the daemons can also be started w/o any rsh or ssh between
the nodes.
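
For setups that still call mpdboot directly, the manual step described
in the quoted text (reading the CPU count from mpd.hosts and passing it
as --ncpus) could be sketched roughly as below. This is only an
illustration: the helper names are made up, and it assumes entries of
the form "host:ncpus" with optional trailing fields, "#" comments, and
a count of 1 when none is given.

```python
import socket


def ncpus_for_host(hosts_text, hostname):
    """Return the CPU count listed for hostname, or None if it is absent.

    Assumed format: one "host[:ncpus] [other fields]" entry per line.
    """
    short = hostname.split(".")[0]
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # assumption: "#" starts a comment
        if not line:
            continue
        entry = line.split()[0]               # ignore trailing fields such as ifhn=...
        host, _, count = entry.partition(":")
        if host == hostname or host.split(".")[0] == short:
            return int(count) if count else 1  # assumption: no count means 1
    return None


def mpdboot_command(hosts_path, total, hostname=None):
    """Build an mpdboot argument list with --ncpus taken from the hosts file."""
    hostname = hostname or socket.gethostname()
    with open(hosts_path) as f:
        ncpus = ncpus_for_host(f.read(), hostname)
    cmd = ["mpdboot", "--totalnum=%d" % total, "--file=%s" % hosts_path]
    if ncpus is not None:
        # Only override the default when the local host is listed in the file.
        cmd.append("--ncpus=%d" % ncpus)
    return cmd
```

The resulting list can be handed to subprocess.call(). If the local
host is not in the file, --ncpus is simply left out and mpdboot falls
back to its default.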
-- Reuti
> I agree wholeheartedly.
>
>> Moreover, if the host running mpdboot is not in mpd.hosts, it
>> makes sense to use --ncpus=0 as the default.
>
> Just FYI, this doesn't actually work the way you would expect.
> mpdboot still basically sets up an mpd as though you had specified
> --ncpus=1. You can't use mpd to have a "head node" in a
> straightforward fashion. The best you can do is use the "-1"
> option to mpiexec to avoid placing processes locally first, but
> that is a pretty weak option too.
>
> Hydra supports running mpiexec on a node that isn't in the hostfile
> (at least with all bootstrap servers that support remote process
> creation).
>
> -Dave
>
> _______________________________________________
> mpich-discuss mailing list
> mpich-discuss at mcs.anl.gov
> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>