[petsc-users] Obtaining bytes per second

Wed May 6 10:42:22 CDT 2015

Matthew Knepley <knepley at gmail.com> writes:
>> This is for a perfect cache model -- each byte of the data structures
>> needs to be fetched from DRAM only once.
>>
>
> I meant uncached, in which you count # Vecs for any operation you are
> doing. 

Wrong, you're describing a perfect cache model [1].  The initial data
does not reside in cache, but you only need to fetch each byte from DRAM
once.  The reality for larger problem sizes is that the entire wavefront
is not resident (e.g., because matrix entries evict vector entries) and
thus a single SpMV needs to re-load some vector entries.  For heavier
matrices (more nonzeros per row), this does not significantly change the
bandwidth requirements because matrix traffic dominates.  If the access
pattern is predictable (which it often is if you choose a
locality-preserving ordering), then the perfect-cache pure bandwidth
model is a good approximation.
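
As a rough sketch of that accounting, here is the byte count for one
y = A*x under the perfect-cache model.  The assumptions are mine, not
from the thread: CSR (AIJ-like) storage, 8-byte scalars, 4-byte indices,
and y written once without being read first.  The function name and the
x_loads parameter are hypothetical, not PETSc API; x_loads > 1 is a
crude way to model re-loading vector entries after eviction.

```python
def spmv_bytes(nrows, ncols, nnz, scalar_bytes=8, index_bytes=4, x_loads=1):
    """Estimate DRAM traffic in bytes for one CSR SpMV, y = A*x.

    With x_loads=1 this is the perfect-cache model: nothing starts in
    cache, but every byte crosses the memory bus exactly once.
    """
    matrix = nnz * (scalar_bytes + index_bytes)  # values + column indices
    row_ptr = (nrows + 1) * index_bytes          # row offsets
    x_read = x_loads * ncols * scalar_bytes      # input vector loads
    y_write = nrows * scalar_bytes               # output vector stores
    return matrix + row_ptr + x_read + y_write


def bytes_per_second(nbytes, seconds):
    """Achieved bandwidth implied by a measured time for that traffic."""
    return nbytes / seconds
```

For a 10^6 x 10^6 matrix, loading x twice instead of once adds about
8% to the traffic at 7 nonzeros per row but only about 2% at 27
nonzeros per row, which is the sense in which heavier matrices are less
sensitive to imperfect caching of the vector.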

> If you count # Vecs for the whole program, then you have perfect
> cache.

[1] Or we have different definitions of "perfect cache", but I don't
think it's useful to discuss DRAM bandwidth if all data is resident in
cache, so I'm referring to perfect caching of inputs that are not
resident in cache.