[Nek5000-users] NEK gets stuck

nek5000-users at lists.mcs.anl.gov nek5000-users at lists.mcs.anl.gov
Tue Oct 11 10:36:36 CDT 2011


Hi Stefan,

The problem starts to appear when I use 128 processors. It works with 64
processors.

Mani

On Tue, Oct 11, 2011 at 7:06 PM, <nek5000-users at lists.mcs.anl.gov> wrote:

> Doesn't sound like a memory problem given 4GB of memory per core and a
> total static data size of ~350MB (according to the output of size).
> The size of the executable doesn't matter in this case.
>
> - Can you post your logfile again (for the case where the SEMG was
> disabled)?
> - What's the lowest number of processors with which you can reproduce the
> problem (try with lx1=4)?
>
> -Stefan
>
> On 10/11/11, nek5000-users at lists.mcs.anl.gov
> <nek5000-users at lists.mcs.anl.gov> wrote:
> > Hi Stefan,
> >
> > Each node has 64 GB of RAM. There are 16 cores in each node. Each core has
> > 4096 KB of cache. The size of the executable 'nek5000' is 5.6 MB. I
> > tried running with p43=1 and it still gets stuck. The full specifications of
> > each core are given below:
> >
> > vendor_id       : GenuineIntel
> > cpu family      : 6
> > model           : 15
> > model name      : Intel(R) Xeon(R) CPU           X7350  @ 2.93GHz
> > stepping        : 11
> > cpu MHz         : 2933.445
> > cache size      : 4096 KB
> > physical id     : 6
> > siblings        : 4
> > core id         : 3
> > cpu cores       : 4
> > fpu             : yes
> > fpu_exception   : yes
> > cpuid level     : 10
> > wp              : yes
> > flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
> > cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall lm
> > constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
> > bogomips        : 5866.92
> > clflush size    : 64
> > cache_alignment : 64
> > address sizes   : 40 bits physical, 48 bits virtual
> >
> > Thanks,
> > Mani
> >
> >
> > On Mon, Oct 10, 2011 at 11:56 PM, <nek5000-users at lists.mcs.anl.gov> wrote:
> >
> >> What's the memory size per core?
> >>
> >> Sure, p43=0 is correct if you want to use the multilevel Schwarz
> >> solver. Just as a cross-check: set p43=1 and try again.
> >>
> >> On 10/10/11, nek5000-users at lists.mcs.anl.gov
> >> <nek5000-users at lists.mcs.anl.gov> wrote:
> >> > Hi Stefan,
> >> >
> >> > The following is the output of 'size nek5000'
> >> >
> >> >    text    data       bss       dec      hex filename
> >> > 5163006   59896 333337824 338560726 142e06d6 nek5000
> >> >
> >> >
> >> > In the .rea file, p43 has been set to 0.
> >> >
> >> > Mani
> >> >
> >> _______________________________________________
> >> Nek5000-users mailing list
> >> Nek5000-users at lists.mcs.anl.gov
> >> https://lists.mcs.anl.gov/mailman/listinfo/nek5000-users
> >>
> >
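
The memory argument in the thread (about 350 MB of static data against 4 GB per core) can be checked with a few lines of Python. This is only a rough sketch: the helper name static_footprint is made up for illustration, and it assumes one MPI rank per core and that the static sizes reported by 'size' dominate the per-rank footprint. Neither assumption is confirmed in the thread.

```python
def static_footprint(size_row: str) -> dict:
    """Parse one data row of Berkeley-format `size` output (hypothetical helper)."""
    text, data, bss, dec, hex_total, _filename = size_row.split()
    fields = {"text": int(text), "data": int(data), "bss": int(bss)}
    total = sum(fields.values())
    # `size` prints the total twice (decimal and hex); use both as a sanity check.
    assert total == int(dec) == int(hex_total, 16)
    fields["total"] = total
    return fields


# Row taken from the 'size nek5000' output quoted in the thread.
row = "5163006   59896 333337824 338560726 142e06d6 nek5000"
fp = static_footprint(row)

# Node figures from the thread: 64 GB of RAM shared by 16 cores.
# Assumption: one MPI rank per core, and "GB" treated as GiB.
node_ram_bytes = 64 * 1024**3
cores_per_node = 16
per_core_bytes = node_ram_bytes // cores_per_node

print(f"static data per rank: {fp['total'] / 1e6:.1f} MB")
print(f"memory per core:      {per_core_bytes / 1024**3:.0f} GiB")
print(f"fraction used:        {fp['total'] / per_core_bytes:.1%}")
```

Almost all of the static size is bss, which is consistent with the remark that the size of the executable file itself does not matter here.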