[Nek5000-users] problem with running nek5000 on supercomputers

nek5000-users at lists.mcs.anl.gov nek5000-users at lists.mcs.anl.gov
Tue Nov 11 07:25:01 CST 2014


Hi Ami,

Great to hear the problem is fixed!

In principle, it should suffice to simply

makenek blah

after editing the SIZE file, since make is smart enough to know that you've touched SIZE.

Normally I don't recompile when changing processor count (unless it's by a huge factor).

So, suppose my job has 4000 elements and I'm running on anywhere from 16 to 2048
processors.  It's sufficient to set lelt=(4000/16) and lp=2048.  Then you can run anywhere
in the range P=16 to 2048 without recompiling.
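
The arithmetic behind this rule can be sketched in a few lines of Python. This is only an illustration, assuming the elements are spread as evenly as possible so that the busiest rank holds ceil(N/P) of them. The names size_bounds and fits are made up here and are not part of Nek5000.

```python
import math

def size_bounds(n_elements, p_min, p_max):
    """Return (lelt, lp) that cover every rank count in [p_min, p_max].

    Assumption: the busiest rank holds ceil(n_elements / P) elements,
    so the smallest rank count sets the per-rank element limit.
    """
    lelt = math.ceil(n_elements / p_min)
    lp = p_max
    return lelt, lp

def fits(n_elements, p, lelt, lp):
    """True if a run on p ranks stays within the compiled-in limits."""
    return p <= lp and math.ceil(n_elements / p) <= lelt

lelt, lp = size_bounds(4000, 16, 2048)
print(lelt, lp)                    # 250 2048
print(fits(4000, 64, lelt, lp))    # True: 63 elements per rank at most
print(fits(4000, 8, lelt, lp))     # False: 500 elements per rank exceeds lelt
```

Running on fewer than 16 ranks would need a larger lelt, and running on more than 2048 would need a larger lp, which is when a recompile becomes necessary.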

Paul


________________________________
From: nek5000-users-bounces at lists.mcs.anl.gov [nek5000-users-bounces at lists.mcs.anl.gov] on behalf of nek5000-users at lists.mcs.anl.gov [nek5000-users at lists.mcs.anl.gov]
Sent: Tuesday, November 11, 2014 3:23 AM
To: nek5000-users at lists.mcs.anl.gov
Subject: Re: [Nek5000-users] problem with running nek5000 on supercomputers

It seems that each time I change the processor count, I have to clean the compiled files before running again. It works now!

Thanks,
Ami

On Mon, Nov 10, 2014 at 4:29 PM, Amirreza Hashemi <amirezahashemi at gmail.com> wrote:
Dr. Fischer,

Thanks for your answer. I do change lelt and lelg when I increase the number of CPUs, keeping them at the lowest even numbers I can pick. I also picked lp=32768, but I still get this "allocation failed" error. Is there anything else that affects the allocation that I have to change?

Thanks,
Ami

On Mon, Nov 10, 2014 at 8:55 AM, <nek5000-users at lists.mcs.anl.gov> wrote:

Hi Ami,

When you increase the processor count, you can recompile with a smaller lelt.

Suppose you have lelt=1000 and you go from P=64 to P=128.  Then set lelt=500 and
recompile.

Make certain lelg >= 500*P and lp >= 128.

I would suggest lelg="big enough" (and an even number) and lp=32768.
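
The two conditions above are easy to check before recompiling. A minimal sketch in Python, using only the inequalities stated in this message (lelg >= lelt*P and lp >= P). The function check_size is hypothetical and not a Nek5000 utility.

```python
def check_size(lelt, lelg, lp, p):
    """Return a list of violated conditions for a run on p ranks.

    Only encodes the two inequalities from the message above:
    lelg >= lelt * p and lp >= p.
    """
    problems = []
    if lelg < lelt * p:
        problems.append(f"lelg={lelg} is below lelt*P={lelt * p}")
    if lp < p:
        problems.append(f"lp={lp} is below P={p}")
    return problems

# Going from P=64 to P=128 with lelt halved from 1000 to 500.
print(check_size(lelt=500, lelg=64000, lp=32768, p=128))  # []
print(check_size(lelt=500, lelg=32000, lp=64, p=128))     # two problems reported
```

An empty list means both conditions hold for that processor count.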

Paul

________________________________
From: nek5000-users-bounces at lists.mcs.anl.gov [nek5000-users-bounces at lists.mcs.anl.gov] on behalf of nek5000-users at lists.mcs.anl.gov [nek5000-users at lists.mcs.anl.gov]
Sent: Monday, November 10, 2014 5:05 AM
To: nek5000-users at lists.mcs.anl.gov
Subject: [Nek5000-users] problem with running nek5000 on supercomputers

Hi Neks,

I am trying to run my simulation with nek5000 on supercomputers. The size of the problem is something like:

   text    data         bss         dec       hex  filename
2233955  187412  2554984280  2557405647  986ee9cf  nek5000

I don't have any problem when I run in parallel on 32 to 64 CPUs, but once I go beyond 64 CPUs I always get an "allocation failed" error. I have tried different supercomputers, such as Gordon at SDSC, OSC, and TACC, but I got the same type of error each time. I checked the memory in use, and on each machine a large portion of the memory appears to be unused. Does anyone have any idea why I get this allocation error when I increase the number of CPUs?
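
For scale, the bss column in the size output above can be converted to GiB. This is a rough sketch in Python: bss is the statically reserved, uninitialized data of the executable, and every MPI rank is a separate process with its own copy. The value of 16 ranks per node is a made-up assumption for illustration, not something stated in this thread, and the memory actually touched at run time may be smaller than the reservation.

```python
# bss value copied from the `size` output quoted above.
BSS_BYTES = 2554984280

def static_gib(bss_bytes, ranks=1):
    """Static (bss) memory in GiB reserved by `ranks` copies of the executable."""
    return bss_bytes * ranks / 2**30

per_rank = static_gib(BSS_BYTES)
# ranks=16 is a hypothetical ranks-per-node value; substitute your own.
per_node = static_gib(BSS_BYTES, ranks=16)
print(f"{per_rank:.2f} GiB per rank, {per_node:.2f} GiB for 16 ranks on one node")
```

Because the static arrays are dimensioned from SIZE, this number changes when lelt is changed and the code is recompiled.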

Thank you,
Ami

_______________________________________________
Nek5000-users mailing list
Nek5000-users at lists.mcs.anl.gov
https://lists.mcs.anl.gov/mailman/listinfo/nek5000-users


