[Nek5000-users] Parallel on several nodes

Thu Feb 12 17:43:55 CST 2015

Hi Tanmoy

I still get errors using the shell script. Here is what happens:
Our cluster has 12 processors per node. If I set the number of 
processes to 12 or fewer, results are generated and the total 
simulation time decreases. Everything works fine with the script below:

#!/bin/bash
#PBS -l nodes=2:ppn=12
#PBS -l mem=16gb
#PBS -l walltime=1:00:00
cd /data/User/Nek5000/nek5_svn/examples/eddy
echo eddy_uv > SESSION.NAME
echo `pwd`'/' >> SESSION.NAME
rm -f eddy_uv.his1
rm -f eddy_uv.sch1
rm -f eddy_uv.log1
mv eddy_uv.log eddy_uv.log1
mv eddy_uv.his eddy_uv.his1
mv eddy_uv.sch eddy_uv.sch1
rm -f logfile
rm -f ioinfo
sleep 5
mpiexec -n 12 -machinefile $PBS_NODEFILE nek5000 > eddy_uv.log
sleep 5
ln eddy_uv.log logfile
exit 0;

However, when I set the number of processes to 16, no results are 
generated. Here is the script:

#!/bin/bash
#PBS -l nodes=4:ppn=4
#PBS -l mem=16gb
#PBS -l walltime=1:00:00
cd /data/User/Nek5000/nek5_svn/examples/eddy
echo eddy_uv > SESSION.NAME
echo `pwd`'/' >> SESSION.NAME
rm -f eddy_uv.his1
rm -f eddy_uv.sch1
rm -f eddy_uv.log1
mv eddy_uv.log eddy_uv.log1
mv eddy_uv.his eddy_uv.his1
mv eddy_uv.sch eddy_uv.sch1
rm -f logfile
rm -f ioinfo
sleep 5
mpiexec -n 16 -machinefile $PBS_NODEFILE nek5000 > eddy_uv.log
sleep 5
ln eddy_uv.log logfile
exit 0;

I'm not sure what I'm missing here. In short, I'm not able to use 
more than 12 CPUs, which is the maximum number of processors per 
node.
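
One possible check, sketched here under assumptions rather than offered as a confirmed fix: the first script requests nodes=2:ppn=12 (24 slots) but launches 12 processes, while the second requests nodes=4:ppn=4 (16 slots) and launches 16. The snippet below reads the process count from $PBS_NODEFILE instead of hard-coding it, and prints what the scheduler actually granted. It assumes a Torque/PBS-style node file with one line per granted slot. The helper names count_slots and count_hosts are hypothetical, and the added 2>&1 (so that MPI error messages end up in the log) is not part of the original scripts.

```shell
#!/bin/bash
# Sketch only: derive the MPI process count from the PBS node file so that
# "-n" always matches the slots the scheduler granted.
# Assumption: $PBS_NODEFILE lists one hostname per granted slot.

# count_slots FILE: number of non-empty lines, i.e. granted slots
count_slots() {
    grep -c . "$1"
}

# count_hosts FILE: number of distinct hostnames in the node file
count_hosts() {
    sort -u "$1" | grep -c .
}

# Launch only when running inside a PBS job (node file set and readable)
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    NP=$(count_slots "$PBS_NODEFILE")
    echo "PBS granted $NP slots on $(count_hosts "$PBS_NODEFILE") host(s)"
    # Same launch line as in the scripts above, with -n taken from the node
    # file; 2>&1 also captures MPI error messages in the log (an addition).
    mpiexec -n "$NP" -machinefile "$PBS_NODEFILE" nek5000 > eddy_uv.log 2>&1
fi
```

If the reported host count is 1 for the 16-process job, or the slot count differs from 16, the mismatch would be between the PBS request and the mpiexec line rather than inside Nek5000.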

On 1/14/2015 11:06 PM, nek5000-users at lists.mcs.anl.gov wrote:
> #PBS -j oe
> #PBS -o sparkyLog