<div class="gmail_quote">On Fri, Nov 12, 2010 at 14:51, Rodrigo R. Paz <span dir="ltr">&lt;<a href="mailto:rodrigop@intec.unl.edu.ar">rodrigop@intec.unl.edu.ar</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

IMHO, the main problem here is the low memory bandwidth (FSB) in Xeon E5420 nodes.</blockquote></div><br><div>This is well known and fundamental (not dependent on the programming model).  It appears that you are comparing OpenMP threading within a node to a single thread.  The comparison we are more interested in is with an equivalent number of MPI processes per node.</div>

<div><br></div><div>Jed</div>