[petsc-dev] Hardware counter logging in PETSc (was Re: Where next with PETSc and KNL?)

Jed Brown jed at jedbrown.org
Thu Sep 29 18:35:07 CDT 2016

Barry Smith <bsmith at mcs.anl.gov> writes:
>   Ok, but this is somewhat orthogonal, you are proposing something
>   like PetscLogBytes() ? Which of course we should have put in
>   initially with PetscLogFlops() twenty years ago?  I don't object to
>   such a thing.

What would it mean?  Depending on the vector sizes and whatever ran
last, the data could already be in cache.  If you log those bytes anyway,
you can get a "bandwidth" that is basically the instruction rate.  Maybe
we can still interpret it.
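
To make the caching caveat concrete, here is a minimal sketch of analytic byte counting for a vector update y = a*x + y. The function names, the 32 MiB cache size, and the vector lengths are illustrative assumptions and not PETSc API. The count also ignores write-allocate traffic.

```python
DOUBLE_BYTES = 8  # size of one double-precision value


def axpy_analytic_bytes(n):
    """Bytes touched by y = a*x + y on n doubles: read x, read y, write y."""
    return 3 * n * DOUBLE_BYTES


def working_set_fits(n, cache_bytes):
    """True if both vectors (x and y) fit in a cache of the given size."""
    return 2 * n * DOUBLE_BYTES <= cache_bytes


def apparent_bandwidth(n, seconds):
    """Logged bytes divided by elapsed time, in bytes per second."""
    return axpy_analytic_bytes(n) / seconds


if __name__ == "__main__":
    cache = 32 * 1024 * 1024  # assumed last-level cache size (hypothetical)
    for n in (10_000, 100_000_000):
        if working_set_fits(n, cache):
            where = "may already be in cache"
        else:
            where = "must stream from memory"
        print(f"n={n}: {axpy_analytic_bytes(n)} bytes logged, data {where}")
```

For the small vector, the logged bytes say nothing about DRAM traffic, so dividing them by the elapsed time measures how fast the core issues loads and stores rather than memory bandwidth.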

With sparse or irregular operations, it's very common that you don't use
a whole cache line every time you fetch it.  The performance counters
don't know that, so they say you got 64 bytes even though you might have
only used 4-8 bytes.  You could easily conclude that you are
bandwidth-bound and have saturated DRAM bandwidth so there is little
opportunity for improvement.  Then you restructure the code and get a
huge performance gain despite lower claimed memory bandwidth.
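
The arithmetic behind that trap can be sketched as follows. This is only an illustration: the helper names and the 80 GB/s counter reading are hypothetical, and the 64-byte line and the 4-8 bytes used come from the paragraph above.

```python
LINE_BYTES = 64  # cache line size mentioned above


def line_utilization(bytes_used_per_line, line_bytes=LINE_BYTES):
    """Fraction of each fetched cache line that the code actually uses."""
    return bytes_used_per_line / line_bytes


def useful_bandwidth(counter_bandwidth, bytes_used_per_line, line_bytes=LINE_BYTES):
    """Bandwidth of useful data, given the bandwidth the counters report."""
    return counter_bandwidth * line_utilization(bytes_used_per_line, line_bytes)


if __name__ == "__main__":
    reported = 80e9  # hypothetical counter-reported bandwidth in bytes/s
    for used in (4, 8, 64):
        useful = useful_bandwidth(reported, used) / 1e9
        print(f"{used:2d} of {LINE_BYTES} bytes used: {useful:.1f} GB/s useful")
```

With only 8 of every 64 bytes used, a counter reading that looks saturated corresponds to one eighth of that rate in useful data, which is why a restructured code can move fewer total bytes and still run much faster.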

Anyway, I think it will be somewhat hard to precisely define the
analytic counting and that you will absolutely need both analytic and
perf counters (like cache misses) to make any sense of bandwidth-limited