[petsc-users] Obtaining bytes per second

Wed May 6 11:26:20 CDT 2015

On Wed, May 6, 2015 at 10:42 AM, Jed Brown <jed at jedbrown.org> wrote:

> Matthew Knepley <knepley at gmail.com> writes:
> >> This is for a perfect cache model -- each byte of the data structures
> >> needs to be fetched from DRAM only once.
> >>
> >
> > I meant uncached, in which you count # Vecs for any operation you are
> > doing.
>
> Wrong, you're describing a perfect cache model [1].  The initial data
> does not reside in cache, but you only need to fetch each byte from DRAM
> once.  The reality for larger problem sizes is that the entire wavefront
> is not resident (e.g., perhaps because matrix entries evict vector
> entries) and thus a single SpMV needs to re-load some vector entries.
> For heavier matrices, this does not significantly change the bandwidth
> requirements.  If the access pattern is predictable (i.e., you often get
> lucky if you choose a locality-preserving order) then the perfect-cache
> pure bandwidth model is good.
>
> > If you count # Vecs for the whole program, then you have perfect
> > cache.
>
> [1] Or we have different definitions of "perfect cache", but I don't
> think it's useful to discuss DRAM bandwidth if all data is resident in
> cache, so I'm referring to perfect caching of inputs that are not
> resident in cache.
>

Yes, this is only a difference in terminology.
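
For anyone following along, here is a rough sketch of the perfect-cache byte count for one SpMV as described above. It assumes a CSR-like (AIJ) layout with 8-byte scalars and 4-byte column indices, and it counts y as written once. The function names and the x_loads knob (average number of times each entry of x is fetched from DRAM, with 1.0 meaning perfect cache) are made up for illustration and are not PETSc API.

```python
def spmv_bytes(nrows, ncols, nnz, scalar_bytes=8, index_bytes=4, x_loads=1.0):
    """Estimate DRAM traffic in bytes for one y = A*x with A in a CSR-like format.

    x_loads is a hypothetical knob: 1.0 models a perfect cache (each entry
    of x comes from DRAM exactly once); larger values model re-loading
    vector entries that were evicted, e.g. by matrix entries.
    """
    matrix = nnz * (scalar_bytes + index_bytes)   # values + column indices
    row_offsets = (nrows + 1) * index_bytes       # CSR row pointer array
    x_traffic = x_loads * ncols * scalar_bytes    # input vector reads
    y_traffic = nrows * scalar_bytes              # output vector, written once
    return matrix + row_offsets + x_traffic + y_traffic


def achieved_bandwidth(nbytes, seconds):
    """Bytes per second implied by the model for a measured SpMV time."""
    return nbytes / seconds


# Assumed example sizes: 1M rows, 27 nonzeros per row.
ideal = spmv_bytes(1_000_000, 1_000_000, 27_000_000)
reloaded = spmv_bytes(1_000_000, 1_000_000, 27_000_000, x_loads=2.0)
print("perfect cache bytes:", ideal)
print("x fetched twice, relative increase:", reloaded / ideal - 1.0)
```

With this many nonzeros per row the matrix term dominates, so fetching x twice raises the estimate by only a few percent, which matches the point that re-loads matter little for heavier matrices. Dividing the modeled bytes by a measured MatMult time gives an estimate of bytes per second under these assumptions.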

  Matt

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener