[petsc-users] Multi-DOF DMDA Vec

Anush Krishnan anush at bu.edu
Fri Jun 13 21:41:25 CDT 2014


On 13 June 2014 22:22, Barry Smith <bsmith at mcs.anl.gov> wrote:

>
>   The main reason to “pull out” a single component is, for example, to
> solve a linear system for that single component; that is, to work on that
> single component a great deal. You wouldn’t pull out the individual
> components to iterate on them all together.
>

I needed to pull out a single component to perform a matrix-vector multiply
for further processing, but I realised I could just rearrange the matrix
instead.
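
For anyone reading this later, here is a rough plain-C sketch of one way to apply a matrix to a single component of an interlaced vector without copying that component out first. It is not PETSc code and not necessarily what I ended up doing: the function name, the dense row-major matrix and the ndof/c parameters are all made up for illustration. PETSc itself has VecStrideGather() if you do want to extract the component.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative only: y = A * x_c, where x_c[j] = u[ndof*j + c] is
 * component c of the interlaced vector u.  A is a dense n-by-n matrix
 * stored row-major.  None of these names come from PETSc. */
void matvec_component(const double *A, const double *u, double *y,
                      size_t n, size_t ndof, size_t c)
{
    assert(c < ndof);
    for (size_t i = 0; i < n; i++) {
        double s = 0.0;
        for (size_t j = 0; j < n; j++) {
            /* Strided read: step over the other components of node j. */
            s += A[i * n + j] * u[ndof * j + c];
        }
        y[i] = s;
    }
}
```

The strided index ndof*j + c does the job that a separate extracted vector would otherwise do, at the cost of reading only one value from each group of ndof.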

Thank you, Barry and Jed. That was very helpful.


>
>   Barry
>
> On Jun 13, 2014, at 9:09 PM, Jed Brown <jed at jedbrown.org> wrote:
>
> > Anush Krishnan <anush at bu.edu> writes:
> >> With regard to the interlaced memory performing better: If I used three
> >> vectors created from the same DMDA for each degree of freedom, how
> >> different would that be in performance compared to a fully interlaced
> >> vector? Wouldn't cache reuse be about the same for both cases?
> >
> > No, when you traverse the grid accessing all three components, you will
> > have three times as many prefetch streams (typically reducing prefetch
> > capability, thus generating more cold cache misses) and will spill
> > irregularly over cache lines more frequently, thus reducing the
> > effective cache size.  This can result in an integer-factor slowdown as
> > compared to interlaced storage.  By all means, run the experiment, but
> > the expected result for memory bandwidth/cache-limited operations is
> > that interlaced delivers significantly better performance.
>
>
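
To make the two layouts in Jed's comparison concrete, here is a minimal plain-C sketch of the same pointwise loop over interlaced storage and over three separate arrays. It only illustrates the access patterns; it is not PETSc code, the function names are hypothetical, and it makes no claim about measured timings.

```c
#include <assert.h>
#include <stddef.h>

enum { NDOF = 3 };  /* three degrees of freedom per grid point (assumed) */

/* Interlaced layout: u[NDOF*i + c].  All components of point i are
 * adjacent in memory, so the loop walks a single contiguous stream. */
double sum_sq_interlaced(const double *u, size_t n)
{
    assert(u != NULL);
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        const double *p = &u[NDOF * i];
        s += p[0] * p[0] + p[1] * p[1] + p[2] * p[2];
    }
    return s;
}

/* Separate layout: one array per component.  The same loop now reads
 * from three independent memory streams at once. */
double sum_sq_split(const double *a, const double *b, const double *c,
                    size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        s += a[i] * a[i] + b[i] * b[i] + c[i] * c[i];
    }
    return s;
}
```

Both functions compute the same quantity; only the number of memory streams touched per iteration differs, which is the effect Jed describes. Timing the two on a grid much larger than the cache would be the experiment he suggests.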