[mpich-discuss] mpd as system process?

Marc Moreau jebnor at gmail.com
Thu Aug 5 12:46:17 CDT 2010


Hello all,

I'm setting up MPICH2 on my cluster, where users run many relatively
short jobs (2-10 hours).  I am using Sun Grid Engine (SGE) to manage
the scheduling.  The problem that I am running into is that SGE kills
the mpd process when a job finishes, even when other jobs are still
using it.  So if there are multiple MPI jobs running on the same node,
they all die when the first one finishes.

As a solution, I'd like to set everything up so that users can just
'run' MPI jobs and not need to worry about starting and killing mpd
within each job.  I'm thinking it would be nice to set up mpd as a
system process and then have all the jobs run on the system mpd.  Is
this sane and possible?  Any other solutions?

Pointers to documents accepted at par,
-- Marc
