On Sun, Oct 7, 2012 at 4:45 PM, TAY wee-beng <span dir="ltr">&lt;<a href="mailto:zonexo@gmail.com" target="_blank">zonexo@gmail.com</a>&gt;</span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


  <div bgcolor="#FFFFFF" text="#000000">

    <div>Hi,<br>

      <br>

      I have attached 3 results using 12, 24 and 32 processors. I am

      using a completely different cluster to test whether the problem is

      the cluster configuration. It seems that VecScatterEnd does

      not scale well from 12 to 24 to 32. Do these results show that

      there are still problems with the partition? My partition is

      clustered closely at the center, and I'm wondering if this has a great

      effect on scaling...<br></div></div></blockquote><div><br></div><div>Your partition is probably not well balanced, but you're also seeing what happens when an algorithm is limited by memory bandwidth. Vendors tend not to tell you that cores don't matter unless you can get data from memory to them. Sparse linear algebra is heavily bandwidth-limited, so you should not expect to see speedup from using all cores on a node. If you use additional nodes, you get additional memory buses, and the number of memory buses is a better indicator of performance than the number of cores.</div>
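<div><br></div><div>To make the bandwidth argument concrete, here is a rough back-of-the-envelope model in Python. It is only a sketch: the function name and the bandwidth figures (10 GB/s achievable by one core, 40 GB/s per node) are hypothetical and not measured on any cluster, and the model ignores communication cost (such as VecScatter) and load imbalance entirely.</div>

```python
def bandwidth_bound_speedup(cores_per_node, nodes, per_core_bw_gbs, node_bw_gbs):
    """Ideal speedup over one core for a kernel limited purely by memory bandwidth.

    All arguments are illustrative assumptions, not measured values.
    """
    # A node cannot deliver more than node_bw_gbs, however many cores are active.
    per_node_bw = min(cores_per_node * per_core_bw_gbs, node_bw_gbs)
    # Each additional node brings its own memory buses.
    total_bw = nodes * per_node_bw
    return total_bw / per_core_bw_gbs


# Hypothetical machine: one core can draw 10 GB/s, a full node tops out at 40 GB/s.
print(bandwidth_bound_speedup(4, 1, 10.0, 40.0))   # 4.0: the node is already saturated
print(bandwidth_bound_speedup(12, 1, 10.0, 40.0))  # 4.0: 8 more cores, no more bandwidth
print(bandwidth_bound_speedup(6, 2, 10.0, 40.0))   # 8.0: same 12 cores spread over 2 nodes
```

<div>Under these assumed numbers, 12 processes packed onto one node do no better than 4, while the same 12 processes spread over two nodes have twice the bandwidth available. Measuring the actual achievable bandwidth with a STREAM-type benchmark would give real values to plug in.</div>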

</div>