For clarification purposes: <br><div><br></div><div>1) What is the definition of "performance model" and "cache model"? I see those two terms used in this thread but want to know the exact difference, if any.</div><div><br></div><div>2) Is what's described in Dinesh's paper a "cache model"? What exactly are its caveats, or what major assumptions does it make?</div><div><br></div><div>3) Is quantifying the "useful bandwidth sustained for some level of cache" analogous/related to cache register reuse and/or vectorization (e.g., how well one can maximize SIMD on the machine, if that makes any sense)?</div><div><br></div><div>Thank you guys for all your help,</div><div>Justin </div><div><br>On Thursday, May 7, 2015, Jed Brown <<a>jed@jedbrown.org</a>> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Matthew Knepley <<a>knepley@gmail.com</a>> writes:<br>

<br>

> On Thu, May 7, 2015 at 1:32 PM, Jed Brown <<a>jed@jedbrown.org</a>> wrote:<br>

><br>

>> Matthew Knepley <<a>knepley@gmail.com</a>> writes:<br>

>> > You are making assumptions about the algorithm here. Pure streaming<br>

>> > computations, like VecAXPY, do not depend on the cache model.<br>

>><br>

>> I thought we were talking about Krylov solvers, which includes MatMult.<br>

>><br>

><br>

> Yes, and in the earlier mail I said to just use the model in Dinesh's<br>

> paper so he did not have to worry about this one.<br>

<br>

That is a cache model (analogous to "perfect cache" in the pTatin<br>

paper).  So don't claim you have no cache model.<br>

</blockquote></div>