[Nek5000-users] NEK gets stuck

nek5000-users at lists.mcs.anl.gov nek5000-users at lists.mcs.anl.gov
Tue Oct 11 10:37:32 CDT 2011


I sent the logs, but they are awaiting moderator approval before they can be
posted to the list.

On Tue, Oct 11, 2011 at 9:06 PM, Mani Chandra <mc0710 at gmail.com> wrote:

> Hi Stefan,
>
> The problem starts to appear when I use 128 processors. It works with 64
> processors.
>
> Mani
>
> On Tue, Oct 11, 2011 at 7:06 PM, <nek5000-users at lists.mcs.anl.gov> wrote:
>
>> Doesn't sound like a memory problem given 4GB of memory per core and a
>> total static data size of ~350MB (according to the output of size).
>> The size of the executable doesn't matter in this case.
>>
>> - Can you post your logfile again (for the case where the SEMG was
>> disabled)?
>> - What's the lowest number of processors at which you can reproduce the
>> problem? (Try with lx1=4.)
>>
>> -Stefan
>>
>> On 10/11/11, nek5000-users at lists.mcs.anl.gov
>> <nek5000-users at lists.mcs.anl.gov> wrote:
>> > Hi Stefan,
>> >
>> > Each node has 64 GB of RAM. There are 16 cores in each node. Each core
>> > has 4096 KB of cache. The size of the executable 'nek5000' is 5.6 MB. I
>> > tried running with p43=1 and it still gets stuck. The full
>> > specifications of each core are given below:
>> >
>> > vendor_id       : GenuineIntel
>> > cpu family      : 6
>> > model           : 15
>> > model name      : Intel(R) Xeon(R) CPU           X7350  @ 2.93GHz
>> > stepping        : 11
>> > cpu MHz         : 2933.445
>> > cache size      : 4096 KB
>> > physical id     : 6
>> > siblings        : 4
>> > core id         : 3
>> > cpu cores       : 4
>> > fpu             : yes
>> > fpu_exception   : yes
>> > cpuid level     : 10
>> > wp              : yes
>> > flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
>> > cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall lm
>> > constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
>> > bogomips        : 5866.92
>> > clflush size    : 64
>> > cache_alignment : 64
>> > address sizes   : 40 bits physical, 48 bits virtual
>> >
>> > Thanks,
>> > Mani
>> >
>> >
>> > On Mon, Oct 10, 2011 at 11:56 PM, <nek5000-users at lists.mcs.anl.gov> wrote:
>> >
>> >> What's the memory size per core?
>> >>
>> >> Sure, p43=0 is correct if you want to use the multilevel Schwarz
>> >> solver. Just as a cross-check, set p43=1 and try again.
>> >>
>> >> On 10/10/11, nek5000-users at lists.mcs.anl.gov
>> >> <nek5000-users at lists.mcs.anl.gov> wrote:
>> >> > Hi Stefan,
>> >> >
>> >> > The following is the output of 'size nek5000'
>> >> >
>> >> >    text    data       bss       dec      hex filename
>> >> > 5163006   59896 333337824 338560726 142e06d6 nek5000
>> >> >
>> >> >
>> >> > In the .rea file, p43 has been set to 0.
>> >> >
>> >> > Mani
>> >> >
>> >> _______________________________________________
>> >> Nek5000-users mailing list
>> >> Nek5000-users at lists.mcs.anl.gov
>> >> https://lists.mcs.anl.gov/mailman/listinfo/nek5000-users
>> >>
>> >
>>
>
>
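The memory argument made in the thread can be checked with a few lines of arithmetic. The sketch below is only an illustration: it uses the figures quoted above (64 GB per node, 16 cores per node, and the `size nek5000` output), assumes one MPI rank per core, and treats "64 GB" as 64 GiB. The helper names are made up for this example and are not part of Nek5000. It covers static data only; memory allocated at run time (for example MPI buffers) is not counted.

```python
# Figures quoted in the thread. All names below are illustrative only.
NODE_RAM_BYTES = 64 * 1024**3   # "64 GB of RAM" per node, assumed to mean 64 GiB
CORES_PER_NODE = 16             # one MPI rank per core is an assumption

# Columns from the `size nek5000` output quoted in the thread (bytes)
TEXT, DATA, BSS = 5163006, 59896, 333337824


def static_footprint(text, data, bss):
    """Static size of one process image: code + initialized + zero-initialized data."""
    return text + data + bss


def ram_per_core(node_ram, cores):
    """Memory available to each core if the node's RAM is shared evenly."""
    return node_ram // cores


footprint = static_footprint(TEXT, DATA, BSS)
per_core = ram_per_core(NODE_RAM_BYTES, CORES_PER_NODE)

print(f"static footprint per rank: {footprint} bytes ({footprint / 1e6:.0f} MB)")
print(f"memory per core:           {per_core} bytes ({per_core / 1024**3:.0f} GiB)")
print(f"fraction of per-core memory used by static data: {footprint / per_core:.1%}")
```

The sum of the three columns reproduces the `dec` value reported by `size`, and it is a small fraction of the memory available per core, which is consistent with the conclusion in the thread that static memory is unlikely to be the cause of the hang.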