<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><br></div>Anyway, what I really wanted to say is, it's good to know that these "dynamic range/performance spectrum/static scaling" plots are designed to go past the sweet spots. I also agree that it would be interesting to see a time vs dofs*iterations/time plot. Would it then also be useful to look at the preconditioner setup step?<span class="HOEnZb"><font color="#888888"><br><div><br></div></font></span></div></blockquote><div><br></div><div>Yes, I generally split up timing between "mesh setup" (e.g., symbolic factorization for LU), "matrix setup" (e.g., factorizations), and solve time. The degree of amortization that you get for the two setup phases depends on your problem, so it is useful to separate them.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><span class="HOEnZb"><font color="#888888"><div></div><div>Justin</div></font></span></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Aug 30, 2016 at 1:55 PM, Jed Brown <span dir="ltr"><<a href="mailto:jed@jedbrown.org" target="_blank">jed@jedbrown.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>Mark Adams <<a href="mailto:mfadams@lbl.gov" target="_blank">mfadams@lbl.gov</a>> writes:<br>

> I would guess it is the latter.<br>

<br>

</span>In this case, definitely.<br>

<span><br>

> It is hard to get "rollover" to the right.  You could get it on KNL<br>

> (cache configuration of HBM) when you spill out of HBM.<br>

<br>

</span>Yes, but the same occurs if you start repeatedly spilling from some<br>

level of cache, which can happen even if the overall data structure is<br>

much larger than cache.  Not all algorithms have the flexibility to<br>

choose tile sizes independently from problem size and specification;<br>

it's easy to forget that this luxury is not universal when focusing on<br>

dense linear algebra, for example.<br>

<span><br>

> Personally, if you are going to go into this much detail (e.g., more than<br>

> just one plot) I would show a plot of iteration count vs problem size, and<br>

> be done with it, and then fix the iteration count for the weak scaling and<br>

> dynamic range plot (I agree we could use a better name).<br>

<br>

</span>Alternatively, plot the performance spectrum (dynamic range) for the<br>

end-to-end solve and per iteration.  The end user ultimately doesn't<br>

care about the cost per iteration (and it's meaningless when comparing<br>

to an algorithm that converges differently), so I'd prefer that the<br>

spectrum for the end-to-end application always be shown.<br>

</blockquote></div><br></div>

</div></div></blockquote></div><br></div></div>
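<div dir="ltr">A minimal sketch of the bookkeeping discussed above, assuming the three timings (mesh setup, matrix setup, solve) have already been measured separately. The <code>Run</code> record, the function names, the amortization parameters, and the sample numbers are all hypothetical and only illustrate the arithmetic; they are not taken from any library or from this thread.</div>

```python
from dataclasses import dataclass


@dataclass
class Run:
    dofs: int            # problem size (degrees of freedom)
    mesh_setup: float    # seconds, e.g. symbolic factorization
    matrix_setup: float  # seconds, e.g. factorizations
    solve: float         # seconds for one solve
    iterations: int      # iterations taken by that solve


def end_to_end_time(run, solves_per_matrix=1, matrices_per_mesh=1):
    """Time per solve with both setup phases amortized over their reuse."""
    return (run.solve
            + run.matrix_setup / solves_per_matrix
            + run.mesh_setup / (solves_per_matrix * matrices_per_mesh))


def end_to_end_point(run, solves_per_matrix=1, matrices_per_mesh=1):
    """One point of the end-to-end spectrum: (time, dofs / time)."""
    t = end_to_end_time(run, solves_per_matrix, matrices_per_mesh)
    return t, run.dofs / t


def per_iteration_point(run):
    """One point of the per-iteration plot: (time, dofs * iterations / time).

    Only the solve phase is counted here; setup is deliberately excluded.
    """
    return run.solve, run.dofs * run.iterations / run.solve


# Made-up numbers, purely for illustration.
run = Run(dofs=1_000_000, mesh_setup=2.0, matrix_setup=4.0,
          solve=2.0, iterations=10)

print(end_to_end_point(run))                                   # no reuse of setup
print(end_to_end_point(run, solves_per_matrix=4, matrices_per_mesh=2))
print(per_iteration_point(run))
```

<div dir="ltr">Repeating this for a range of problem sizes gives one curve per choice of amortization, which is one way to see how much the setup phases matter for a given workload.</div>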