<div dir="ltr"><div><div>Last question<br><br></div>I would like to report the efficiency of my code, that is, the achieved FLOPS/s divided by the theoretical peak performance (TPP) on n cores, where TPP = clock rate * FLOPS/cycle * n. My current machine is an Intel® Core™ i7-4790 CPU @ 3.60GHz, and I am assuming that the FLOPS/cycle is 4.<br><br>One of my serial test runs achieved 2.01e+09 FLOPS/s, which translates to an efficiency of almost 14%. I know these are crude measurements, but would these manual flop counts be appropriate for this kind of measurement? Or would hardware counts from PAPI be more appropriate?<br><br></div><div>Thanks,<br></div><div>Justin<br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Apr 21, 2015 at 11:16 AM, Jed Brown <span dir="ltr"><<a href="mailto:jed@jedbrown.org" target="_blank">jed@jedbrown.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">Matthew Knepley <<a href="mailto:knepley@gmail.com">knepley@gmail.com</a>> writes:<br>

> Flop is Floating Point Operation. The index calculation is an Integer<br>

> Operation. I agree that we could probably start counting<br>

> those as well since in some sorts of applications it's important, but right<br>

> now we don't.<br>

<br>

</span>Index calculations often satisfy recurrences that the compiler folds<br>

into pointer increments and the like.  Also, some architectures, like<br>

PowerPC, have floating point instructions that include mutating index<br>

operations in the true spirit of RISC. ;-)<br>

</blockquote></div><br></div>
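<div>For reference, the efficiency estimate in the question at the top of this message can be sketched as below. This is only a minimal illustration: the function names are hypothetical, and the value of 4 FLOPS/cycle is the assumption stated in the question, not a verified figure for this CPU.</div>

```python
def theoretical_peak(clock_hz, flops_per_cycle, n_cores):
    """Theoretical peak performance (FLOPS/s): clock * FLOPS/cycle * n."""
    return clock_hz * flops_per_cycle * n_cores


def efficiency(measured_flops_per_s, clock_hz, flops_per_cycle, n_cores):
    """Fraction of theoretical peak achieved by a measured FLOPS/s rate."""
    return measured_flops_per_s / theoretical_peak(
        clock_hz, flops_per_cycle, n_cores
    )


if __name__ == "__main__":
    # Values taken from the question: 3.60 GHz clock, serial run (n = 1),
    # measured rate of 2.01e+09 FLOPS/s.
    # flops_per_cycle = 4 is the poster's assumption; adjust it to match
    # the vector width and FMA capability of the actual hardware.
    eff = efficiency(2.01e9, clock_hz=3.6e9, flops_per_cycle=4, n_cores=1)
    print(f"efficiency = {eff:.1%}")
```

<div>The result depends linearly on the assumed FLOPS/cycle, so a different assumption for that value changes the reported efficiency by the same factor.</div>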