[mpich-discuss] multiple mpich2 jobs on a single node

Rajeev Thakur thakur at mcs.anl.gov
Thu May 1 12:44:09 CDT 2008


In a PBS environment, you need not use MPD at all. You can simply use the
mpiexec written by Pete Wyckoff: http://www.osc.edu/~pw/mpiexec/index.php 

Rajeev
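
As a rough sketch of what a PBS job script along those lines could look
like: the resource request, the launch helper, and the dry-run message
below are illustrative assumptions, not something taken from this thread.
There is no mpdboot or mpdallexit anywhere in it, so one job finishing
cannot take down a daemon that another job depends on. Depending on how
that mpiexec was built, an extra option selecting the MPICH2 startup
protocol may be needed; its man page is the place to check.

```shell
#!/bin/sh
#PBS -l nodes=1:ppn=4
# Illustrative resource request: 4 cores on one node.

# launch NP PROGRAM [ARGS...]   (hypothetical helper)
# Inside a PBS job it calls mpiexec, assumed here to be the PBS-aware
# one from the URL above and first in PATH. Outside a job it only
# prints the command, so the script can be dry-run on a login node.
launch() {
    np=$1
    shift
    if [ -n "${PBS_JOBID:-}" ]; then
        mpiexec -n "$np" "$@"
    else
        echo "not in a PBS job; would run: mpiexec -n $np $*"
    fi
}

# PBS sets PBS_O_WORKDIR to the directory qsub was invoked from.
cd "${PBS_O_WORKDIR:-.}" || exit 1

launch 4 ./a.out
```

The two jobs from the original question would each be submitted with
qsub as a separate script of this shape, one per executable.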

> -----Original Message-----
> From: owner-mpich-discuss at mcs.anl.gov 
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Bryan Putnam
> Sent: Thursday, May 01, 2008 12:40 PM
> To: mpich-discuss at mcs.anl.gov
> Subject: RE: [mpich-discuss] multiple mpich2 jobs on a single node
> 
> On Thu, 1 May 2008, Rajeev Thakur wrote:
> 
> > Can you try running them with a single mpdboot?
> > 
> > Rajeev 
> 
> Rajeev,
> 
> Yes, it does work correctly when I use only one mpdboot as you describe.
> However, this is within a PBS environment, and the user doesn't know
> whether or not the daemon is already running.
> 
> Bryan 
> 
> 
> >
> > > -----Original Message-----
> > > From: owner-mpich-discuss at mcs.anl.gov 
> > > [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Bryan Putnam
> > > Sent: Thursday, May 01, 2008 7:52 AM
> > > To: mpich-discuss at mcs.anl.gov
> > > Subject: [mpich-discuss] multiple mpich2 jobs on a single node
> > > 
> > > Hi, perhaps this is an FAQ, but I can't seem to find it.
> > > 
> > > I've noticed that if I have two mpich2 jobs running on, for
> > > example, the same 8-processor node, submitted as
> > > 
> > > mpdboot
> > > mpirun -np 4 ./a.out
> > > 
> > > 
> > > mpdboot
> > > mpirun -np 4 ./a.out2
> > > 
> > > 
> > > then when one of the jobs completes, it kills all the mpd processes
> > > along with it, and the remaining job dies with the message
> > > 
> > > job aborted; reason = mpd disappeared
> > > 
> > > 
> > > Is there an easy fix for this?
> > > 
> > > Thanks,
> > > Bryan
> > > 
> > > 
> > > --
> > > Bryan Putnam
> > > Rosen Center for Advanced Computing, Purdue University
> > > Young Hall (Rm. 519)
> > > 302 Wood Street
> > > West Lafayette, IN 47907-2108
> > > Ph 765-496-8225 Fax 765-494-0566
> > > bfp at purdue.edu
> > > http://www.purdue.edu/itap



