[mpich-discuss] hydra and PBS

Bryan Putnam bfp at purdue.edu
Fri Aug 7 21:50:04 CDT 2009


On Fri, 7 Aug 2009, Bryan Putnam wrote:

> On Fri, 7 Aug 2009, Pavan Balaji wrote:
> 
> > 
> > Which version are you using? I remember there was a bug with respect to MPMD
> > launches at some point. Can you update to 1.1.1p1?
> 
> I'll try that and let you know. We're currently using the previous release, mpich2-1.1.

The MPMD launch worked correctly with mpich2-1.1.1p1.
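
For reference, the master/slave launch asked about earlier in the thread would look roughly like the sketch below. It assumes it runs inside a PBS job (so $PBS_NODEFILE is set) with mpich2-1.1.1p1 or later. ./master and ./slave are the placeholder names from the original question, not binaries that were run here; the test in this thread used ./hellof for both halves.

```shell
# Sketch only: MPMD launch with Hydra under PBS.
# Assumes $PBS_NODEFILE is set by the batch system and that
# ./master and ./slave (placeholder names) are MPI executables.
# Hydra takes the host file via "-f" rather than "-machinefile".
mpiexec.hydra -f "$PBS_NODEFILE" -np 1 ./master : -np 3 ./slave
```

The colon separates the two executable specifications, so this asks for one rank of ./master and three ranks of ./slave in a single job.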

Thanks,
Bryan
> 
> Bryan
> > 
> >  -- Pavan
> > 
> > On 08/07/2009 09:13 PM, Bryan Putnam wrote:
> > > On Fri, 7 Aug 2009, Pavan Balaji wrote:
> > > 
> > > > > > Could you please also tell me if it's possible to set up a master/slave
> > > > > > type job using mpiexec.hydra. That is the equivalent of something like
> > > > > 
> > > > > mpiexec -machinefile $PBS_NODEFILE -np 1 ./master : -np 3 ./slave
> > > > > This should work, except mpiexec takes the option "-f" instead of
> > > > > "-machinefile". I'll add in an alias to "-machinefile" to do the same
> > > > > thing in the next release; it might be useful.
> > > 
> > > > Pavan, that didn't seem to work for me. I see
> > > 
> > > coates-a001 1010% mpiexec.hydra -f $PBS_NODEFILE -np 4 ./hellof
> > >  node           2 : Hello world!
> > >  node           3 : Hello world!
> > >  node           0 : Hello world!
> > >  node           1 : Hello world!
> > > 
> > > > The above is OK, but the MPMD form below fails:
> > > 
> > > > coates-a001 1011% mpiexec.hydra -f $PBS_NODEFILE -np 1 ./hellof : -np 3 ./hellof
> > > Fatal error in MPI_Init: Other MPI error, error stack:
> > > MPIR_Init_thread(377): Initialization failed
> > > (unknown)(): Other MPI error
> > > Fatal error in MPI_Init: Other MPI error, error stack:
> > > MPIR_Init_thread(377): Initialization failed
> > > (unknown)(): Other MPI error
> > > Fatal error in MPI_Init: Other MPI error, error stack:
> > > MPIR_Init_thread(377): Initialization failed
> > > (unknown)(): Other MPI error
> > > Fatal error in MPI_Init: Other MPI error, error stack:
> > > MPIR_Init_thread(377): Initialization failed
> > > (unknown)(): Other MPI error
> > > coates-a001 1012% 
> > > 
> > > >  -- Pavan
> > > > 
> > > > -- 
> > > > Pavan Balaji
> > > > http://www.mcs.anl.gov/~balaji
> > > > 
> > > 
> > > 
> > 
> > -- 
> > Pavan Balaji
> > http://www.mcs.anl.gov/~balaji
> > 
> 
> 
> 



