[mpich-discuss] multiple mpich2 jobs on a single node
Bryan Putnam
bfp at purdue.edu
Thu May 1 18:31:29 CDT 2008
On Thu, 1 May 2008, Rajeev Thakur wrote:
> Try doing this. For one job,
>
> setenv MPD_CON_EXT job1
> mpdboot
> mpirun -np 4 ./a.out
>
> For the other one,
>
> setenv MPD_CON_EXT job2
> mpdboot
> mpirun -np 4 ./a.out2
>
> Rajeev

That worked! Thanks!
Bryan
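
For a batch script, where hand-picked names like job1 and job2 are not
practical, the same idea can be sketched as below. This is only a sketch
under stated assumptions: it uses sh syntax (export rather than the csh
setenv shown in the thread), it assumes PBS sets PBS_JOBID inside the job,
and the helper name mpd_ext, the "job_" prefix, and the fallback to the
shell's PID outside PBS are hypothetical choices, not anything from MPICH2.

```shell
#!/bin/sh
# Sketch: give each job its own MPD ring through a unique MPD_CON_EXT.

# Build a per-job extension. PBS_JOBID is assumed to be set by PBS;
# the PID fallback is an illustrative assumption for runs outside PBS.
mpd_ext() {
    printf 'job_%s\n' "${PBS_JOBID:-$$}"
}

MPD_CON_EXT=$(mpd_ext)
export MPD_CON_EXT

# Start the ring and run only if the MPD tools are on PATH.
if command -v mpdboot >/dev/null 2>&1; then
    mpdboot
    mpirun -np 4 ./a.out
    # With MPD_CON_EXT set, this should shut down only this job's ring.
    mpdallexit
fi
```

Because each job gets a different extension, one job finishing should no
longer take down the mpd processes that another job on the same node is
still using.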
> > -----Original Message-----
> > From: owner-mpich-discuss at mcs.anl.gov
> > [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Bryan Putnam
> > Sent: Thursday, May 01, 2008 1:23 PM
> > To: mpich-discuss at mcs.anl.gov
> > Subject: RE: [mpich-discuss] multiple mpich2 jobs on a single node
> >
> > On Thu, 1 May 2008, Rajeev Thakur wrote:
> >
> > > In a PBS environment, you need not use MPD at all. You can simply use the
> > > mpiexec written by Pete Wyckoff: http://www.osc.edu/~pw/mpiexec/index.php
> >
> > Hi Rajeev,
> >
> > Yes, we used to use that program when we used OpenPBS, but it doesn't work
> > with the latest PBSPro.
> >
> > Bryan
> > >
> > > Rajeev
> > >
> > > > -----Original Message-----
> > > > From: owner-mpich-discuss at mcs.anl.gov
> > > > [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Bryan Putnam
> > > > Sent: Thursday, May 01, 2008 12:40 PM
> > > > To: mpich-discuss at mcs.anl.gov
> > > > Subject: RE: [mpich-discuss] multiple mpich2 jobs on a single node
> > > >
> > > > On Thu, 1 May 2008, Rajeev Thakur wrote:
> > > >
> > > > > Can you try running them with a single mpdboot?
> > > > >
> > > > > Rajeev
> > > >
> > > > Rajeev,
> > > >
> > > > Yes, it does work correctly when I use only one mpdboot as you describe.
> > > > However, this is within a PBS environment, and the user doesn't know
> > > > whether or not the daemon is already running.
> > > >
> > > > Bryan
> > > >
> > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: owner-mpich-discuss at mcs.anl.gov
> > > > > > [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Bryan Putnam
> > > > > > Sent: Thursday, May 01, 2008 7:52 AM
> > > > > > To: mpich-discuss at mcs.anl.gov
> > > > > > Subject: [mpich-discuss] multiple mpich2 jobs on a single node
> > > > > >
> > > > > > Hi, perhaps this is an FAQ, but I can't seem to find it.
> > > > > >
> > > > > > I've noticed that if I have two mpich2 jobs running on, for
> > > > > > example, the same 8-processor node, submitted as
> > > > > >
> > > > > > mpdboot
> > > > > > mpirun -np 4 ./a.out
> > > > > >
> > > > > >
> > > > > > mpdboot
> > > > > > mpirun -np 4 ./a.out2
> > > > > >
> > > > > >
> > > > > > then when one of the jobs completes, it kills all the mpd
> > > > > > processes along with it, and the remaining job dies with the
> > > > > > message
> > > > > >
> > > > > > job aborted; reason = mpd disappeared
> > > > > >
> > > > > >
> > > > > > Is there an easy fix for this?
> > > > > >
> > > > > > Thanks,
> > > > > > Bryan
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Bryan Putnam
> > > > > > Rosen Center for Advanced Computing, Purdue University
> > > > > > Young Hall (Rm. 519)
> > > > > > 302 Wood Street
> > > > > > West Lafayette, IN 47907-2108
> > > > > > Ph 765-496-8225 Fax 765-494-0566
> > > > > > bfp at purdue.edu
> > > > > > http://www.purdue.edu/itap
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> >
> >
> >
> >
>
>