<div class="gmail_quote">On Sat, Jan 14, 2012 at 14:07, Junchao Zhang <span dir="ltr">&lt;<a href="mailto:junchao.zhang@gmail.com">junchao.zhang@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div>I&#39;m not sure whether I should reorder a matrix before benchmarking. From a benchmarking point of view, maybe I should also benchmark <i>bad</i> matrices.</div></blockquote></div><br><div>Sure, but using a bad ordering is essentially a contrived difficulty. I can take a tridiagonal matrix, permute it to a random ordering, and then every MatMult() implementation will perform terribly because the memory access pattern is scattered.</div>
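<div><br></div><div>To make the tridiagonal example concrete, here is a minimal sketch in plain Python (not PETSc code). It uses matrix bandwidth, the largest |i - j| over the nonzeros, as a rough proxy for memory locality, and a textbook reverse Cuthill-McKee pass to undo a random permutation. The helper names, the size n = 1000, and the random seed are all assumptions chosen for illustration.</div>

```python
import random
from collections import deque


def tridiagonal_adjacency(n):
    """Off-diagonal nonzero columns of each row of an n x n tridiagonal matrix."""
    return [[j for j in (i - 1, i + 1) if j in range(n)] for i in range(n)]


def permute(adj, new_index):
    """Symmetric permutation: old row/column i becomes new_index[i]."""
    out = [None] * len(adj)
    for i, nbrs in enumerate(adj):
        out[new_index[i]] = sorted(new_index[j] for j in nbrs)
    return out


def bandwidth(adj):
    """Largest |i - j| over all stored off-diagonal nonzeros."""
    return max((abs(i - j) for i, nbrs in enumerate(adj) for j in nbrs), default=0)


def reverse_cuthill_mckee(adj):
    """Textbook RCM; returns new_index with new_index[old] = new position."""
    n = len(adj)
    degree = [len(nbrs) for nbrs in adj]
    visited = [False] * n
    order = []
    # Start each connected component from a lowest-degree unvisited vertex.
    for start in sorted(range(n), key=degree.__getitem__):
        if visited[start]:
            continue
        visited[start] = True
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            # Breadth-first search, visiting neighbors in order of increasing degree.
            for w in sorted(adj[v], key=degree.__getitem__):
                if not visited[w]:
                    visited[w] = True
                    queue.append(w)
    order.reverse()  # the "reverse" in reverse Cuthill-McKee
    new_index = [0] * n
    for pos, old in enumerate(order):
        new_index[old] = pos
    return new_index


n = 1000                                 # illustrative size (assumption)
natural = tridiagonal_adjacency(n)
shuffle = list(range(n))
random.Random(0).shuffle(shuffle)        # fixed seed so the run is repeatable
scrambled = permute(natural, shuffle)
recovered = permute(scrambled, reverse_cuthill_mckee(scrambled))

print("natural bandwidth:  ", bandwidth(natural))
print("scrambled bandwidth:", bandwidth(scrambled))
print("after RCM:          ", bandwidth(recovered))
```

<div>Bandwidth is only a proxy. The actual MatMult() cost depends on how the accesses to the input vector map onto cache lines, but for this matrix the scrambled ordering spreads them across the whole vector while the RCM ordering restores neighboring accesses.</div>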

<div><br></div><div>At the end of the day, what matters is the ability to solve science and engineering problems of interest. If you can get a factor of 2 without changing the problem (just by choosing a good ordering) then it&#39;s probably a good idea to do so. Just because some legacy application inadvertently used a bad ordering doesn&#39;t mean that you would actually do that for a practical problem.</div>

<div><br></div><div>For the PDE problems and architectures that we have tested (and using decent orderings like RCM), we usually achieve a high fraction of peak memory bandwidth, measured against a performance model that assumes optimal vector reuse given cache size constraints.</div>
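<div><br></div><div>As a rough illustration of that kind of model (one common form, not necessarily the exact one we used), the sketch below estimates a bandwidth-bound MatMult rate for a square CSR matrix. It assumes 8-byte values, 4-byte column indices, one read of each input vector entry and one write of each output entry (the optimal-reuse assumption), and it ignores write-allocate traffic. The function names and the sample numbers are hypothetical.</div>

```python
def spmv_model_gflops(rows, nnz, bandwidth_gb_per_s, value_bytes=8, index_bytes=4):
    """Bandwidth-bound estimate of the CSR MatMult rate in GFlop/s.

    Assumes a square matrix and optimal vector reuse: every entry of the
    input vector is read from memory once and every output entry is
    written once.
    """
    # Matrix traffic: values, column indices, and row pointers.
    matrix_bytes = nnz * (value_bytes + index_bytes) + (rows + 1) * index_bytes
    # Vector traffic under optimal reuse: read x once, write y once.
    vector_bytes = 2 * rows * value_bytes
    flops = 2 * nnz  # one multiply and one add per stored nonzero
    return flops * bandwidth_gb_per_s / (matrix_bytes + vector_bytes)


def fraction_of_model(measured_gflops, rows, nnz, bandwidth_gb_per_s):
    """Measured rate divided by the bandwidth-bound estimate."""
    return measured_gflops / spmv_model_gflops(rows, nnz, bandwidth_gb_per_s)


# Hypothetical inputs: one million rows, five nonzeros per row, 10 GB/s of
# sustained memory bandwidth (for example, from a STREAM-style measurement).
model = spmv_model_gflops(rows=1_000_000, nnz=5_000_000, bandwidth_gb_per_s=10.0)
print("model GFlop/s:", round(model, 3))
```

<div>A bad ordering breaks the reuse assumption, because input vector entries get evicted and reloaded, so the measured rate falls well below this estimate even though the matrix itself is unchanged.</div>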