[MPICH] one shot jobs in mpich2?

Ralph M. Butler rbutler at mtsu.edu
Mon Jun 20 15:49:35 CDT 2005


> Date: Mon, 20 Jun 2005 17:12:39 +0200
> From: Reuti <reuti at staff.uni-marburg.de>
> To: Ralph M. Butler <rbutler at mtsu.edu>
> Cc: mpich-discuss at mcs.anl.gov
> Subject: Re: [MPICH] one shot jobs in mpich2?
>
> Hi Ralph,
>
> Ralph M. Butler wrote:
> >>Date: Fri, 17 Jun 2005 18:09:39 +0200
> >>From: Alexander Spiegel <spiegel at rz.rwth-aachen.de>
> >>To: Reuti <reuti at staff.uni-marburg.de>
> >>Cc: mpich-discuss at mcs.anl.gov
> >>Subject: Re: [MPICH] one shot jobs in mpich2?
> >>
> >>Hi,
> >>
> >>Reuti wrote:
> >>
> >>>I didn't find a way to integrate the mpd method into SGE, as it creates
> >>>many new process groups, which prevents a proper shutdown of a job when
> >>>you issue a qdel for it.
> >>
> >>What about removing the lines with 'setpgrp' in the mpd.py and mpdman.py
> >>scripts? Or introducing an environment variable to control the creation
> >>of new process groups?
> >>
> >>I made some small tests and it seems to work. The new process groups were
> >>no longer created by mpd.
> >
> >
> > Over the years, we have had a variety of differing requests regarding
> > the use of setpgrp, setsid, etc.  For example, we have worked closely
> > with the managers of the Chiba City cluster here at Argonne on this
> > matter on multiple occasions.  mpd is a process management system.
> > This may sometimes conflict with use of mpd in an environment where
> > another package thinks that it is the process management system.  One
> > reason mpd creates process groups is the one mentioned for SGE
> > in a prior email, i.e. as a process management system, mpd wants to be
> > able to kill the entire process group at once.  However, what mpd views as
> > a killable set may differ from someone else's.  Frequently, it is an MPI
> > 'rank' and any progeny it may further generate.  (However, mpd makes no
> > guarantees because it accepts the fact that users may do their own
> > setpgrp/setsid sorts of operations.)
> >
> > Having said all that, we could probably use options (perhaps in the
> > .mpd.conf file) to determine when mpd does (or does not) use the
> > setpgrp/setsid syscalls.  We just need to make sure that we are able to
> > balance the conflicting sets of requests so that all needs are met.
>
> Thanks for the comment. For use with SGE there is another thing to
> mention: mpd shouldn't vanish on all the nodes into daemon land. For
> smpd I use the "-d 0", so that the daemons are bound to the started rsh
> (in SGE terms: qrsh) command and it's working fine this way. Is there
> any similar thing for mpd? - Reuti

Yes, by default, mpds do *not* start as daemons, e.g.:
    mpd &
The -d (or --daemon) option can be used to run them as daemons.
mpdboot uses the -d option because it uses ssh/rsh/etc. to start
the daemons and does not want the ssh's left hanging.
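
For anyone following the setpgrp part of this thread, here is a minimal
Python sketch of the behaviour under discussion. It is not code from mpd.py
or mpdman.py. The environment variable name MPD_DEMO_NEW_PGRP and the
helper functions are hypothetical, made up only for this illustration.
It shows a child that is optionally placed in its own process group, and
how the parent can then signal that whole group at once.

```python
import os
import signal
import time

# Hypothetical switch for this sketch only; not a real mpd or .mpd.conf option.
ENV_FLAG = "MPD_DEMO_NEW_PGRP"


def want_new_pgrp():
    """Return True unless the (hypothetical) variable is set to "0"."""
    return os.environ.get(ENV_FLAG, "1") != "0"


def spawn(new_pgrp):
    """Fork a sleeping child, optionally as leader of a new process group."""
    pid = os.fork()
    if pid == 0:
        if new_pgrp:
            os.setpgrp()  # child becomes leader of its own process group
        time.sleep(30)
        os._exit(0)
    if new_pgrp:
        # Also set it from the parent, so the group exists before we use it
        # regardless of which process runs first after the fork.
        os.setpgid(pid, pid)
    return pid


def demo(new_pgrp):
    """Spawn a child, kill it, and report (had_own_group, terminating_signal)."""
    pid = spawn(new_pgrp)
    pgid = os.getpgid(pid)
    own_group = (pgid == pid)
    if own_group:
        # One call reaches the child and any progeny still in its group.
        os.killpg(pgid, signal.SIGKILL)
    else:
        # The child shares our group; killpg here would also kill us.
        os.kill(pid, signal.SIGKILL)
    _, status = os.waitpid(pid, 0)
    termsig = os.WTERMSIG(status) if os.WIFSIGNALED(status) else None
    return own_group, termsig


if __name__ == "__main__":
    print(demo(want_new_pgrp()))
```

With the new group, the child is no longer in the group of whatever started
the parent, which is the situation Reuti describes for qdel under SGE.
Without it, the child stays in the parent's group and a signal sent to that
group reaches it as well.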



