[MPICH] mpich and pbs

Steve Young chemadm at hamilton.edu
Wed Jun 20 15:25:55 CDT 2007


Thanks Anthony. I was figuring it was going to come to that ;). 

I also just tried running amber, with the same job mentioned earlier, on
a ring I created from the command line:

% mpirun -np 8 /usr/local/amber9/exe/sander.MPI -ng 4 -groupfile groupfile.txt

  Running multisander version of sander amber9
     Total processors =            8
     Number of groups =            4
<snip>

So it really does appear that, under mpiexec, the processes are NOT
talking to one another. As for PBS: yes, it can allocate 4 CPUs per node
nicely =). I'll see what I can find out from Pete W. Anyhow, I
appreciate the help.
Thanks,

-Steve
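
The symptom in the quoted cpi output below (every rank printing "Process 0 of 1") can be checked mechanically on a saved log. What follows is only a rough sketch: the function name count_singletons and the file name cpi.out are made up for illustration, and it assumes the output uses the same "Process R of N is on HOST" lines shown in this thread.

```shell
# Count ranks that believe they are alone in MPI_COMM_WORLD.
# Reads cpi-style output on stdin and prints the number of
# "Process 0 of 1 is on ..." lines. A healthy 8-process run should
# print 0 here, because its ranks report "Process R of 8".
count_singletons() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps the
    # function usable in scripts that run under "set -e".
    grep -c '^Process 0 of 1 is on ' || true
}

# Hypothetical usage, assuming the job output was saved to cpi.out:
#   count_singletons < cpi.out
```

A count equal to the number of requested processes would match what the cpi run in this thread shows: each process started as its own one-process MPI job.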



On Wed, 2007-06-20 at 15:12 -0500, Anthony Chan wrote:
> 
> On Wed, 20 Jun 2007, Steve Young wrote:
> 
> > Yes, that is the only way you can run their version of it.
> >
> > % mpiexec -np 8 ./cpi
> > mpiexec: Error: PBS_JOBID not set in environment.  Code must be run from
> > a PBS script, perhaps interactively using "qsub -I".
> 
> The small cluster on which I tested PBS with OSC's mpiexec is down now,
> so I can't test the various scenarios that may cause your problem.
> Did you set up your PBS so that it allows 4 processes on each
> node? I think it is called virtual CPUs or something.
> It appears that the MPI processes are NOT talking to each other...
> You may also want to ask Pete Wyckoff why this happens.
> 
> A.Chan
> 
> >
> > >
> > > On Wed, 20 Jun 2007, Steve Young wrote:
> > >
> > > > Ok I did the cpi test and had the following results:
> > > >
> > > > Process 0 of 1 is on node0038
> > > > pi is approximately 3.1415926544231341, Error is 0.0000000008333410
> > > > wall clock time = 0.000107
> > > > Process 0 of 1 is on node0038
> > > > pi is approximately 3.1415926544231341, Error is 0.0000000008333410
> > > > wall clock time = 0.000101
> > > > Process 0 of 1 is on node0038
> > > > pi is approximately 3.1415926544231341, Error is 0.0000000008333410
> > > > wall clock time = 0.000118
> > > > Process 0 of 1 is on node0038
> > > > pi is approximately 3.1415926544231341, Error is 0.0000000008333410
> > > > wall clock time = 0.000107
> > > > Process 0 of 1 is on node0037
> > > > Process 0 of 1 is on node0037
> > > > pi is approximately 3.1415926544231341, Error is 0.0000000008333410
> > > > wall clock time = 0.000108
> > > > pi is approximately 3.1415926544231341, Error is 0.0000000008333410
> > > > wall clock time = 0.000126
> > > > Process 0 of 1 is on node0037
> > > > pi is approximately 3.1415926544231341, Error is 0.0000000008333410
> > > > wall clock time = 0.000102
> > > > Process 0 of 1 is on node0037
> > > > pi is approximately 3.1415926544231341, Error is 0.0000000008333410
> > > > wall clock time = 0.000101
> > > > mpiexec: Warning: tasks 0-7 exited before completing MPI startup.
> > >
> > > Did you run OSC's mpiexec in a PBS script (i.e. did you qsub
> > > the job to PBS)?
> > >
> > > A.Chan
> >
> >



