[Nek5000-users] problem with running nek5000 on supercomputers
nek5000-users at lists.mcs.anl.gov
Tue Nov 11 07:25:01 CST 2014
Hi Ami,
Great to hear the problem is fixed!
In principle, it should suffice to simply
makenek blah
after editing the SIZE file, since make is smart enough to know that you've touched SIZE.
Normally I don't recompile when changing processor count (unless it's by a huge factor).
So, suppose my job has 4000 elements and I'm running on anywhere from 16 to 2048
processors. It's sufficient to set lelt=(4000/16) and lp=2048. Then you can run on that
range, P=16 to 2048, without recompiling.
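The sizing arithmetic above can be sketched in a few lines of Python. This is only an illustration, not part of Nek5000: the function name `size_params` and its arguments are hypothetical. It assumes that lelt must hold the largest per-processor share of elements (which occurs at the smallest processor count) and that lp only needs to cover the largest processor count.

```python
def size_params(n_elements, p_min, p_max):
    """Return (lelt, lp) covering every processor count in [p_min, p_max].

    Hypothetical helper for illustration; not a Nek5000 utility.
    """
    if not (1 <= p_min <= p_max):
        raise ValueError("need 1 <= p_min <= p_max")
    # Fewest processors means most elements per processor,
    # so lelt is sized for p_min (integer ceiling division).
    lelt = (n_elements + p_min - 1) // p_min
    # lp is an upper bound on the processor count, so it is sized for p_max.
    lp = p_max
    return lelt, lp


print(size_params(4000, 16, 2048))  # (250, 2048)
```

The resulting values would then be entered in the SIZE file by hand before running makenek.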
Paul
________________________________
From: nek5000-users-bounces at lists.mcs.anl.gov [nek5000-users-bounces at lists.mcs.anl.gov] on behalf of nek5000-users at lists.mcs.anl.gov [nek5000-users at lists.mcs.anl.gov]
Sent: Tuesday, November 11, 2014 3:23 AM
To: nek5000-users at lists.mcs.anl.gov
Subject: Re: [Nek5000-users] problem with running nek5000 on supercomputers
It seems that each time I had to clean the compiled files before changing the processor count and running again. It works now!
Thanks,
Ami
On Mon, Nov 10, 2014 at 4:29 PM, Amirreza Hashemi <amirezahashemi at gmail.com> wrote:
Dr. Fischer,
Thanks for your answer. I do change lelt and lelg when I increase the number of CPUs, keeping them at the lowest even number I can pick. I also picked lp=32768, but I still get this "allocation failed" error. Is there anything else that affects the allocation that I need to change?
Thanks,
Ami
On Mon, Nov 10, 2014 at 8:55 AM, <nek5000-users at lists.mcs.anl.gov> wrote:
Hi Ami,
When you increase the processor count, you can recompile with a smaller lelt.
Suppose you have lelt=1000 and you go from P=64 to P=128. Then set lelt=500 and
recompile.
Make certain lelg >= 500*P and lp >= 128.
I would suggest lelg="big enough" (and an even number) and lp=32768.
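The constraints above can be written as a small Python check. This is a sketch only: `check_size` is a hypothetical helper, not a Nek5000 tool. The lelg and lp conditions are taken from this message; the condition that lelt*P must cover the total element count is an added assumption.

```python
def check_size(lelt, lelg, lp, n_procs, n_elements):
    """Return a list of violated constraints (empty if all hold).

    Hypothetical helper for illustration; not a Nek5000 utility.
    """
    problems = []
    # Assumption: each processor can store at most lelt elements.
    if lelt * n_procs < n_elements:
        problems.append("lelt * P < total elements")
    # From the message: lelg >= lelt * P.
    if lelg < lelt * n_procs:
        problems.append("lelg < lelt * P")
    # From the message: lp >= P.
    if lp < n_procs:
        problems.append("lp < P")
    return problems


# lelt=1000 at P=64 implies at most 64000 elements; halve lelt for P=128.
print(check_size(500, 64000, 32768, 128, 64000))  # []
print(check_size(500, 64000, 64, 128, 64000))     # ['lp < P']
```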
Paul
________________________________
From: nek5000-users-bounces at lists.mcs.anl.gov [nek5000-users-bounces at lists.mcs.anl.gov] on behalf of nek5000-users at lists.mcs.anl.gov [nek5000-users at lists.mcs.anl.gov]
Sent: Monday, November 10, 2014 5:05 AM
To: nek5000-users at lists.mcs.anl.gov
Subject: [Nek5000-users] problem with running nek5000 on supercomputers
Hi Neks,
I am trying to run my simulation with nek5000 on supercomputers. The size of the executable is something like:
   text     data         bss         dec       hex  filename
2233955   187412  2554984280  2557405647  986ee9cf  nek5000
I don't have any problem when I use 32~64 CPUs to run in parallel on supercomputers, but once I go above 64 CPUs, I always get an "allocation failed" error. I have tried different supercomputers, such as Gordon at SDSC, OSC, and TACC, but I got the same type of error each time. I have checked the memory in use, and on each of them a large portion of the memory appears to be unused. Does anyone have any idea why I get this allocation error when I increase the number of CPUs?
Thank you,
Ami
_______________________________________________
Nek5000-users mailing list
Nek5000-users at lists.mcs.anl.gov
https://lists.mcs.anl.gov/mailman/listinfo/nek5000-users