[petsc-users] Multi-DOF DMDA Vec

Barry Smith bsmith at mcs.anl.gov
Fri Jun 13 21:22:40 CDT 2014


  The main reason to “pull out” a single component is that you need to work on that component a great deal by itself, for example to solve a linear system for just that component. You wouldn’t pull out the individual components only to iterate over all of them together.

  Barry
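
For illustration, "pulling out" one component of an interlaced vector amounts to a strided copy out of the array and, after the work on that component is done, a strided copy back in. The plain C below is only a sketch of that idea: the helper names gather_component and scatter_component are hypothetical and are not PETSc functions. For real PETSc Vecs the library provides VecStrideGather() and VecStrideScatter() for this purpose.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy component `comp` of every grid point out of an
 * interlaced array holding `dof` values per point (n points in total)
 * into the contiguous array `out` of length n. */
void gather_component(const double *interlaced, size_t n, size_t dof,
                      size_t comp, double *out)
{
    assert(dof > 0 && comp < dof);
    for (size_t i = 0; i < n; i++) {
        out[i] = interlaced[i * dof + comp];
    }
}

/* Hypothetical helper: copy the contiguous array `in` of length n back
 * into component `comp` of the interlaced array, leaving the other
 * components untouched. */
void scatter_component(const double *in, size_t n, size_t dof,
                       size_t comp, double *interlaced)
{
    assert(dof > 0 && comp < dof);
    for (size_t i = 0; i < n; i++) {
        interlaced[i * dof + comp] = in[i];
    }
}
```

The two copies are extra memory traffic, so this presumably only pays off when a lot of work (such as a full linear solve) is done on the extracted component between the gather and the scatter.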

On Jun 13, 2014, at 9:09 PM, Jed Brown <jed at jedbrown.org> wrote:

> Anush Krishnan <anush at bu.edu> writes:
>> With regard to the interlaced memory performing better: If I used three
>> vectors created from the same DMDA for each degree of freedom, how
>> different would that be in performance compared to a fully interlaced
>> vector? Wouldn't cache reuse be about the same for both cases?
> 
> No, when you traverse the grid accessing all three components, you will
> have three times as many prefetch streams (typically reducing prefetch
> capability, thus generating more cold cache misses) and will spill
> irregularly over cache lines more frequently, thus reducing the
> effective cache size.  This can result in an integer-factor slowdown as
> compared to interlaced storage.  By all means, run the experiment, but
> the expected result for memory bandwidth/cache-limited operations is
> that interlaced delivers significantly better performance.
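
To make the two layouts concrete, here is a minimal plain C sketch (not PETSc code) of the same traversal over a three-component field stored both ways. The function names and the choice of a sum of squares as the kernel are assumptions made for illustration only. In the interlaced version the three values of a grid point are adjacent in memory and the loop reads one stream; in the split version each iteration reads from three separate arrays, i.e. three streams.

```c
#include <assert.h>
#include <stddef.h>

/* Interlaced layout: u = [x0 y0 z0 x1 y1 z1 ...], one memory stream. */
double sum_squares_interlaced(const double *u, size_t n)
{
    assert(u != NULL || n == 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        const double *p = &u[3 * i];   /* all three components are adjacent */
        sum += p[0] * p[0] + p[1] * p[1] + p[2] * p[2];
    }
    return sum;
}

/* Split layout: one array per component, three memory streams. */
double sum_squares_split(const double *ux, const double *uy,
                         const double *uz, size_t n)
{
    assert((ux != NULL && uy != NULL && uz != NULL) || n == 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += ux[i] * ux[i] + uy[i] * uy[i] + uz[i] * uz[i];
    }
    return sum;
}
```

Both functions compute the same quantity; only the memory access pattern differs. Timing them on large arrays on the target machine is one way to run the experiment suggested above.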


