[petsc-users] Multi-DOF DMDA Vec
Anush Krishnan
anush at bu.edu
Fri Jun 13 21:41:25 CDT 2014
On 13 June 2014 22:22, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
> The main reason to “pull out” a single component is, for example, to
> solve a linear system for that single component; that is, to work on that
> single component a great deal. You wouldn’t pull out the individual
> components to iterate on them all together.
>
I needed to pull out a single component to perform a matrix-vector multiply
for further processing, but I realised I could just rearrange the matrix
instead.
Thank you, Barry and Jed. That was very helpful.
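
For anyone reading this in the archive, here is a minimal sketch in plain C (not PETSc) of the two options mentioned above. The message does not say how the matrix was rearranged, so this is only one possible reading: rather than copying component c into its own vector (which is roughly what VecStrideGather does for a PETSc Vec), the multiply reads the interlaced array in place with stride dof. The names gather_component and matvec_component, and the dense row-major matrix, are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Option 1: copy component c of an interlaced array (dof values per
 * grid point) into a contiguous array of length npoints. */
void gather_component(const double *interlaced, size_t npoints, size_t dof,
                      size_t c, double *out)
{
    assert(c < dof);
    for (size_t i = 0; i < npoints; ++i)
        out[i] = interlaced[i * dof + c];
}

/* Option 2: compute y = A * x_c without a temporary copy, where x_c is
 * component c of the interlaced vector. A is a dense n-by-n matrix in
 * row-major order (an assumption; a real code would use a sparse Mat). */
void matvec_component(const double *A, size_t n, const double *interlaced,
                      size_t dof, size_t c, double *y)
{
    assert(c < dof);
    for (size_t r = 0; r < n; ++r) {
        double sum = 0.0;
        for (size_t j = 0; j < n; ++j)
            sum += A[r * n + j] * interlaced[j * dof + c];
        y[r] = sum;
    }
}
```

Both routes give the same product; the second avoids the extra vector and the copy.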
>
> Barry
>
> On Jun 13, 2014, at 9:09 PM, Jed Brown <jed at jedbrown.org> wrote:
>
> > Anush Krishnan <anush at bu.edu> writes:
> >> With regard to the interlaced memory performing better: If I used three
> >> vectors created from the same DMDA for each degree of freedom, how
> >> different would that be in performance compared to a fully interlaced
> >> vector? Wouldn't cache reuse be about the same for both cases?
> >
> > No, when you traverse the grid accessing all three components, you will
> > have three times as many prefetch streams (typically reducing prefetch
> > capability, thus generating more cold cache misses) and will spill
> > irregularly over cache lines more frequently, thus reducing the
> > effective cache size. This can result in an integer-factor slowdown as
> > compared to interlaced storage. By all means, run the experiment, but
> > the expected result for memory bandwidth/cache-limited operations is
> > that interlaced delivers significantly better performance.
>
>
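
To make the two layouts in Jed's answer concrete, here is a small sketch in plain C (not PETSc). It computes the same reduction over a 3-component field stored interlaced and stored as three separate arrays. The Field3 struct and both function names are hypothetical, chosen only for illustration. The interlaced loop walks one contiguous stream; the separate-array loop walks three streams at once, which is the pattern the reply describes as harder on the prefetcher and on cache lines. The sketch shows the access patterns only; any performance difference has to be measured on a grid much larger than cache.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 3-DOF node: all components of one grid point are adjacent. */
typedef struct { double u, v, w; } Field3;

/* Interlaced storage: a single memory stream, u0 v0 w0 u1 v1 w1 ... */
double sumsq_interlaced(const Field3 *x, size_t n)
{
    assert(x != NULL || n == 0);
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += x[i].u * x[i].u + x[i].v * x[i].v + x[i].w * x[i].w;
    return s;
}

/* Separate storage: three independent memory streams read in lockstep. */
double sumsq_separate(const double *u, const double *v, const double *w,
                      size_t n)
{
    assert((u != NULL && v != NULL && w != NULL) || n == 0);
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += u[i] * u[i] + v[i] * v[i] + w[i] * w[i];
    return s;
}
```

Timing both functions on the same data is one simple way to run the experiment suggested in the reply.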