[MPICH] one shot jobs in mpich2?

Ralph M. Butler rbutler at mtsu.edu
Mon Jun 20 09:23:30 CDT 2005


> Date: Fri, 17 Jun 2005 18:09:39 +0200
> From: Alexander Spiegel <spiegel at rz.rwth-aachen.de>
> To: Reuti <reuti at staff.uni-marburg.de>
> Cc: mpich-discuss at mcs.anl.gov
> Subject: Re: [MPICH] one shot jobs in mpich2?
>
> Hi,
>
> Reuti wrote:
> > I didn't find a way to integrate the mpd method into SGE, as it creates
> > many new process groups, which prevents a proper shutdown of a job in
> > case you issue a qdel for it.
>
> What about removing the lines with 'setpgrp' in the mpd.py and mpdman.py
> scripts? Or introducing an environment variable to control the creation
> of new process groups?
>
> I made some small tests and it seems to work. New process groups were
> no longer created by mpd.

Over the years, we have had a variety of differing requests regarding
the use of setpgrp, setsid, etc.  For example, we have worked closely
with the managers of the Chiba City cluster here at Argonne on this
matter on multiple occasions.  mpd is a process management system.
This role may sometimes conflict with use of mpd in an environment where
another package thinks that it is the process management system.  One
reason mpd creates process groups is the one mentioned for SGE in a
prior email: as a process management system, mpd wants to be able to
kill an entire process group at once.  However, what mpd views as a
killable set may differ from someone else's.  For mpd, it is frequently
an MPI 'rank' and any progeny that rank may further generate.  (However,
mpd makes no guarantees, because it accepts the fact that users may do
their own setpgrp/setsid sorts of operations.)
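
As a rough illustration of the process-group mechanism described above
(this is not mpd's actual code), the following Python sketch starts a
child as the leader of its own process group and later signals the
whole group at once.  The helper names launch_in_own_group and
kill_group are made up for this example.

```python
import os
import signal
import subprocess
import time


def launch_in_own_group(argv):
    """Start argv as the leader of a new process group (hypothetical helper)."""
    # os.setpgrp runs in the child between fork and exec, so the child's
    # process group ID becomes its own PID.
    return subprocess.Popen(argv, preexec_fn=os.setpgrp)


def kill_group(proc, sig=signal.SIGTERM):
    """Signal the child and every descendant still in its process group."""
    # The child is the group leader, so its PID doubles as the group ID.
    os.killpg(proc.pid, sig)
    return proc.wait()


if __name__ == "__main__":
    # Stand-in for an MPI rank that starts a helper process of its own.
    rank = launch_in_own_group(["sh", "-c", "sleep 60 & wait"])
    time.sleep(0.5)
    kill_group(rank)
```

A descendant that calls setpgrp or setsid itself leaves the group and is
not reached by killpg, which is the caveat noted in the parenthetical
above.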

Having said all that, we could probably use options (perhaps in the
.mpd.conf file) to determine when mpd does (or does not) use the
setpgrp/setsid syscalls.  We just need to make sure that we are able to
balance the conflicting sets of requests so that all needs are met.
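
A minimal sketch of how such an option might look, assuming a
hypothetical use_setpgrp key in key=value configuration text and a
hypothetical MPD_USE_SETPGRP environment variable that overrides it.
Neither name is taken from mpd; both are placeholders for whatever
option would actually be chosen.

```python
import os

_FALSE_VALUES = ("0", "no", "false", "off")


def want_new_pgrp(conf_text, environ=None):
    """Return True if a new process group should be created.

    The key 'use_setpgrp' and the variable 'MPD_USE_SETPGRP' are
    hypothetical names used only for this sketch.
    """
    environ = os.environ if environ is None else environ
    value = "yes"  # default: keep creating a new process group
    for line in conf_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, val = line.partition("=")
        if key.strip().lower() == "use_setpgrp":
            value = val.strip()
    # The environment variable, if set, overrides the configuration text.
    value = environ.get("MPD_USE_SETPGRP", value)
    return value.strip().lower() not in _FALSE_VALUES


def maybe_setpgrp(conf_text, environ=None):
    """Call os.setpgrp() only when the option allows it."""
    if want_new_pgrp(conf_text, environ):
        os.setpgrp()
        return True
    return False
```

With this arrangement the default stays as it is today, and a site
running under another process manager could switch the call off without
editing the scripts.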

--ralph



