[mpich-discuss] mpd as system process?

Reuti reuti at staff.uni-marburg.de
Thu Aug 5 12:53:19 CDT 2010


Hi Marc,

On 05.08.2010 at 19:46, Marc Moreau wrote:

> I'm setting up MPICH2 on my cluster where users run many relatively
> short processes (2-10 hours).  I am using Sun Grid Engine to manage
> the scheduling. The problem that I am running into is that SGE kills
> the mpd process when the job is done, even when other jobs are using
> it.  So if there are multiple MPI jobs running on the same node, they
> all die when the first process dies.
> 
> As a solution I'd like to set everything up so that users can just
> 'run' MPI jobs and not need to worry about starting and killing mpd
> within each job.  I'm thinking it would be nice to set up mpd as a
> system process and then have all the jobs run on the system mpd.  Is
> this sane and possible? Any other solutions?

Please have a look here:

http://gridengine.sunsource.net/howto/mpich2-integration/mpich2-integration.html

It will create one dedicated mpd ring per job, so jobs sharing a node no longer depend on each other's mpd. The ring is set up and removed by the PE's start_proc_args and stop_proc_args scripts. Users just need to set the correct port number in their job scripts (please check the demo script included in the archive for this).
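
To illustrate the port number idea: since every job gets its own ring, each ring needs a port that does not clash with another job's ring on the same node. A minimal sketch of one way to derive such a port from the job number is below. The function name job_port, the base port 20000 and the range of 5000 are assumptions made for this illustration only; the formula your installation really expects is the one in the demo script from the archive. JOB_ID is the environment variable SGE sets inside a job.

```python
import os


def job_port(job_id, base=20000, span=5000):
    """Map an SGE job number to a port in [base, base + span).

    Hypothetical helper: base and span are illustrative values,
    not taken from the howto. Two jobs whose numbers differ by a
    multiple of span would get the same port.
    """
    if job_id < 0:
        raise ValueError("job_id must be non-negative")
    return base + (job_id % span)


if __name__ == "__main__":
    # SGE exports JOB_ID inside a running job; fall back to 0 so the
    # sketch can also be tried outside the queueing system.
    job_id = int(os.environ.get("JOB_ID", "0"))
    print(job_port(job_id))
```

The start script of the PE and the job script have to agree on the same formula, otherwise the job will not find the ring that was started for it.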

-- Reuti

