[Nek5000-users] Parallel on several nodes

nek5000-users at lists.mcs.anl.gov nek5000-users at lists.mcs.anl.gov
Thu Feb 12 18:19:38 CST 2015


Hi Saleh,

CPUs here would refer to the number of processors in each core that sits on a
node. You may have multiple cores on a node. So, theoretically, you can
set CPUs = 12x2 in your PBS script, but that will not speed up your
code, since it will only turn on the 2 hyperthreads per processor in the core.

I see that in the 2nd PBS script you have 16 processors in total, at 4
processors per node. Even though it is a very inefficient way of running your
script, I would say it should still work, i.e. 4 nodes and 4 processors per
node. However, I don’t have the expertise to comment further without knowing
the architecture of your nodes.
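
If it helps, here is a rough way to look at the layout of a node, assuming
lscpu is installed on the compute nodes (I have not checked that for your
cluster). summarise_cpu is just a name I made up for this sketch; it reads
lscpu-style text and prints the number of physical cores and of hardware
threads, so you can see whether the 12 per node are real cores or hyperthreads.

```shell
#!/bin/bash
# Hypothetical helper: summarise lscpu-style output read from stdin.
# physical cores   = sockets x cores per socket
# hardware threads = physical cores x threads per core
summarise_cpu() {
    awk -F: '
        /^Thread\(s\) per core/ { t = $2 + 0 }
        /^Core\(s\) per socket/ { c = $2 + 0 }
        /^Socket\(s\)/          { s = $2 + 0 }
        END { printf "physical_cores=%d hardware_threads=%d\n", s * c, s * c * t }
    '
}

# Run on the node itself only if lscpu is available (assumption).
if command -v lscpu >/dev/null 2>&1; then
    lscpu | summarise_cpu
fi
```

Run it inside a job (or an interactive session on a compute node), since the
login node may be built differently.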

However, I can say it is good practice to keep ppn =
<max_no_processors_in_a_node> (always 12 in your case) and change nodes =
x, where x = 1, 2, 3, 4 or whatever. That is a lot more efficient, and a
scaling analysis is also much cleaner that way.
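
As a sketch of that advice (not a tested script for your cluster): assuming a
Torque-style PBS that writes one line per allocated slot into $PBS_NODEFILE,
the rank count can be taken from the node file, so that -n always equals
nodes x ppn. count_ranks is a helper name made up for this illustration, and
the cd, SESSION.NAME and log-rotation lines from your script are left out here
and would stay as they are.

```shell
#!/bin/bash
#PBS -l nodes=2:ppn=12
#PBS -l walltime=1:00:00

# Hypothetical helper: count the lines of a node file.
# Assumption: one line per allocated slot (Torque-style), so
# nodes=2:ppn=12 would give 24.
count_ranks() {
    wc -l < "$1" | tr -d ' '
}

# Only launch inside a PBS job, where PBS_NODEFILE is set.
if [ -n "${PBS_NODEFILE:-}" ] && [ -f "$PBS_NODEFILE" ]; then
    # (cd, SESSION.NAME and log rotation as in the original script)
    NP=$(count_ranks "$PBS_NODEFILE")
    mpiexec -n "$NP" -machinefile "$PBS_NODEFILE" nek5000 > eddy_uv.log
fi
```

For a scaling run you would then only change nodes=1, 2, 3, 4 in the #PBS
line and leave everything else alone.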

Best Regards,
Tanmoy

On Thu, Feb 12, 2015 at 4:43 PM, <nek5000-users at lists.mcs.anl.gov> wrote:

> Hi Tanmoy
>
> I still get errors using the shell script. Here is what happens:
> On our cluster, we have 12 processors for each node; if I set the number
> of processors up to 12 there are results generated and the total simulation
> time decreases. Everything works fine and below is the script:
>
> #!/bin/bash
> #PBS -l nodes=2:ppn=12
> #PBS -l mem=16gb
> #PBS -l walltime=1:00:00
> cd /data/User/Nek5000/nek5_svn/examples/eddy
> echo eddy_uv > SESSION.NAME
> echo `pwd`'/' >> SESSION.NAME
> rm -f eddy_uv.his1
> rm -f eddy_uv.sch1
> rm -f eddy_uv.log1
> mv eddy_uv.log eddy_uv.log1
> mv eddy_uv.his eddy_uv.his1
> mv eddy_uv.sch eddy_uv.sch1
> rm -f logfile
> rm -f ioinfo
> sleep 5
> mpiexec -n 12 -machinefile $PBS_NODEFILE nek5000 > eddy_uv.log
> sleep 5
> ln eddy_uv.log logfile
> exit 0;
>
> However, when I set the number of CPUs to 16 there is no outcome. Here is
> the script:
>
> #!/bin/bash
> #PBS -l nodes=4:ppn=4
> #PBS -l mem=16gb
> #PBS -l walltime=1:00:00
> cd /data/User/Nek5000/nek5_svn/examples/eddy
> echo eddy_uv > SESSION.NAME
> echo `pwd`'/' >> SESSION.NAME
> rm -f eddy_uv.his1
> rm -f eddy_uv.sch1
> rm -f eddy_uv.log1
> mv eddy_uv.log eddy_uv.log1
> mv eddy_uv.his eddy_uv.his1
> mv eddy_uv.sch eddy_uv.sch1
> rm -f logfile
> rm -f ioinfo
> sleep 5
> mpiexec -n 16 -machinefile $PBS_NODEFILE nek5000 > eddy_uv.log
> sleep 5
> ln eddy_uv.log logfile
> exit 0;
>
>
> I'm not sure what I'm missing here? (in short I'm not able to increase the
> number of CPUs more than 12, which is the maximum number of processors per
> node.)
>
>
>
> On 1/14/2015 11:06 PM, nek5000-users at lists.mcs.anl.gov wrote:
>
>> #PBS -j oe
>> #PBS -o sparkyLog
>>
>
>