[Nek5000-users] Performance problem

nek5000-users at lists.mcs.anl.gov nek5000-users at lists.mcs.anl.gov
Fri Aug 6 04:13:18 CDT 2010


Dear Mani,

I haven't checked your logfile yet, but here are my first thoughts:

N=4 is low
Your polynomial order (N=4) is low, so the tensor-product formulation won't buy you much. The performance of all matrix-matrix multiplies (MxM) will be limited by memory access times. This is a particular problem on multi-core and multi-socket machines, where we have seen significant performance drops.
On top of that, you carry around a large number of duplicate DOF, and your surface-to-volume ratio is high (more communication).
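
To put rough numbers on the duplicate-DOF and surface-to-volume remarks, here is a minimal sketch. It assumes n gridpoints per direction in each element, with face points shared between neighbouring elements, so that an element deep inside a large mesh contributes roughly (n-1)^3 unique points. The function names are hypothetical and not part of Nek5000. Whether "N=4" here means 4 or 5 points per direction is not stated, so both are tabulated.

```python
def duplicate_dof_ratio(n):
    """Stored points per element divided by the approximate number of
    unique points per element, for an element inside a large mesh.
    Assumption: n points per direction, faces shared with neighbours."""
    return n**3 / (n - 1)**3


def surface_fraction(n):
    """Fraction of an element's n^3 points that lie on its surface,
    i.e. points that may have to be exchanged with neighbours."""
    return (n**3 - (n - 2)**3) / n**3


if __name__ == "__main__":
    for n in (4, 5, 8, 12):
        print(f"n={n:2d}  duplicate ratio={duplicate_dof_ratio(n):.2f}  "
              f"surface fraction={surface_fraction(n):.2f}")
```

Both quantities shrink as n grows, which is the sense in which a low order carries more redundant storage and more communication per unit of work.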

 
Parallel Performance
Your number of gridpoints per core (~4700) is quite small!
On Blue Gene (BG) systems we can scale well (e.g. 70-80% parallel efficiency) with around 10k gridpoints per core. On other systems (e.g. Cray XT5) you need many more gridpoints per core (say 80k) because the network has a higher latency (NEK is sensitive to latency, not bandwidth) and the processors are much faster.
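
The ~4700 figure can be reproduced from the numbers in the original question (37632 elements, 512 cores) if one counts 4^3 gridpoints per element; that counting convention is an assumption made here to match the quoted value. The sketch below also estimates how many cores the 10k and 80k rules of thumb would suggest for this mesh. The function names are hypothetical, and the targets are only the rough, machine-dependent figures mentioned above.

```python
def gridpoints_per_core(n_elements, n, n_cores):
    """Gridpoints per core, counting n^3 points per element (assumption)."""
    return n_elements * n**3 / n_cores


def cores_for_target(n_elements, n, target_per_core):
    """Core count that gives roughly target_per_core gridpoints per core."""
    return max(1, round(n_elements * n**3 / target_per_core))


if __name__ == "__main__":
    E, n = 37632, 4  # elements and points per direction, from the question
    print(gridpoints_per_core(E, n, 512))   # current run: 4704.0
    print(cores_for_target(E, n, 10_000))   # ~10k points/core: 241
    print(cores_for_target(E, n, 80_000))   # ~80k points/core: 30
```

By these estimates the job would sit closer to the quoted guidelines on a few tens to a few hundred cores rather than 512.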

Cheers,
Stefan

On Aug 6, 2010, at 10:51 AM, <nek5000-users at lists.mcs.anl.gov> wrote:

> Hi,
> 
>    I'm solving for Rayleigh-Benard convection in a 3D box of 37632 4th-order elements. I ran the job on 512 processors on a machine with a quad-core, quad-socket configuration (32 nodes with 16 cores each) and a 20 Gbps InfiniBand interconnect. In 12 hours it has run 163 time steps. Is this normal, or is there some way to improve performance? Attached is the SIZE file.
> 
> Regards,
> Mani chandra
> 
> <SIZE.txt>



