<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Thu, May 7, 2015 at 9:23 AM, Justin Chang <span dir="ltr"><<a href="mailto:jychang48@gmail.com" target="_blank">jychang48@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">So to summarize, if I understand everything<span></span>, I should do the following:<div><br></div><div>1) calculate the flop/byte ratio for various problem sizes and solver methods on one process and:</div></blockquote><div><br></div><div>Should be roughly invariant to problem size.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>a) use the ratio to estimate the ideal flops/s based on the STREAM bandwidth. Compare this to the measured flops/s from my PETSc program?</div></blockquote><div><br></div><div>Yes</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>b) show the associated wall-clock times.</div></blockquote><div><br></div><div>Your strong scaling plots are fine here.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>c) warn that this model is uncached and merely gives a rough estimate of the performance.</div></blockquote><div><br></div><div>Yes, I would say "does not deal with cache effects".</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>2) Do a strong-scaling/speedup study to illustrate how these problems scale across multiple processes. 
Optionally, determine the optimal number of processes for various problem sizes and solvers.</div><div><br></div><div>Am I missing anything?</div></blockquote><div><br></div><div>Sounds good to me.</div><div><br></div><div>   Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>Thanks,</div><div>Justin<div><div class="h5"><br><br>On Thursday, May 7, 2015, Matthew Knepley <<a>knepley@gmail.com</a>> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Thu, May 7, 2015 at 8:35 AM, Jed Brown <span dir="ltr"><<a>jed@jedbrown.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>Matthew Knepley <<a>knepley@gmail.com</a>> writes:<br>

> I think it would be much more interesting, and no more work to<br>

><br>

>   a) Model the flop/byte \beta ratio simply<br>

><br>

>   b) Report how close you get to the max performance given \beta on your<br>

> machine<br>

<br>

</span>This is also easily gamed; just use high-memory data structures, extra<br>

STREAM copies, etc.<br>

</blockquote></div><br>I completely agree that all the performance measures can be gamed, which is why</div><div class="gmail_extra">it's important to always show time to solution as well.</div><div class="gmail_extra"><br></div><div class="gmail_extra">   Matt<br clear="all"><div><br></div>-- <br><div>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>-- Norbert Wiener</div>

</div></div>

</blockquote></div></div></div>

</blockquote></div><br>

</div></div>
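<div>The estimate discussed in this thread (use the flop/byte ratio and the STREAM bandwidth to bound the achievable flops/s, then compare with the measured rate) can be sketched as follows. This is only a rough illustration that does not deal with cache effects. The function names and every number are hypothetical placeholders: the per-nonzero counts assume a CSR sparse matrix-vector product and ignore vector and row-pointer traffic, and the bandwidth and measured rate are made up rather than taken from any run.</div>

```python
# Bandwidth-limited performance estimate; does not deal with cache effects.
# All names and numeric inputs are hypothetical, chosen only for illustration.


def flop_byte_ratio(flops, bytes_moved):
    """Return beta: flops performed per byte moved to or from memory."""
    return flops / bytes_moved


def ideal_flops_per_sec(beta, stream_bw):
    """Upper bound on flops/s for a kernel limited only by memory bandwidth."""
    return beta * stream_bw


def fraction_of_ideal(measured_flops_per_sec, ideal):
    """Fraction of the bandwidth-limited bound that was actually achieved."""
    return measured_flops_per_sec / ideal


# Assumed per-nonzero cost of a CSR matrix-vector product:
# one multiply and one add; one 8-byte value plus one 4-byte column index.
beta = flop_byte_ratio(flops=2.0, bytes_moved=12.0)

stream_bw = 10.0e9  # bytes/s, placeholder for a STREAM result on one process
measured = 1.0e9    # flops/s, placeholder for the rate reported by the program

ideal = ideal_flops_per_sec(beta, stream_bw)
efficiency = fraction_of_ideal(measured, ideal)

print(f"beta = {beta:.3f} flops/byte")
print(f"ideal = {ideal:.3e} flops/s")
print(f"fraction of ideal = {efficiency:.2f}")
```

<div>Because a ratio like this can be gamed (for example with high-memory data structures or extra copies), it is best reported next to the wall-clock time to solution rather than on its own.</div>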