[petsc-dev] Cache detection

Jed Brown jed at 59A2.org
Wed Feb 3 04:31:25 CST 2010

On Tue, 2 Feb 2010 17:28:34 -0600, Barry Smith <bsmith at mcs.anl.gov> wrote:
>     That would be nice. :-)

Done.  The downside of the way I'm currently doing this is that we have
no way to tell the user that autodetection failed, so we silently fall
back to an arbitrary default.  Being loud about the failure is not
acceptable either, because it doesn't make a big difference and the user
is unlikely to know these details off the top of their head (or know how
to find them).
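
To make the tradeoff concrete, here is a minimal sketch of detection
with a quiet fallback.  It is not the code in PETSc: the helper name,
the `detected` flag, and the 32-byte default are assumptions for
illustration, and `_SC_LEVEL1_DCACHE_LINESIZE` is a glibc extension that
other systems may not provide.

```c
#include <unistd.h>

/* Assumed fallback; 32 bytes is the figure mentioned in this thread. */
#define DEFAULT_CACHELINE_SIZE 32

/* Hypothetical helper: returns the L1 data cache line size in bytes.
   Sets *detected to 1 if the system reported a value, or to 0 if the
   fallback was used, so a caller can mention it in verbose output
   without being loud by default. */
static long cache_line_size(int *detected)
{
  long size = -1;
#if defined(_SC_LEVEL1_DCACHE_LINESIZE)
  /* glibc extension; may return 0 or -1 when the value is unknown */
  size = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
#endif
  *detected = (size > 0);
  return *detected ? size : DEFAULT_CACHELINE_SIZE;
}
```

Keeping the flag separate from the value means the failure can be
reported only when someone asks for it.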

The current defaults underestimate the values on x86-64, but my tests
(of MatMult_SeqAIJ_Inode) with an incorrect line size (32 instead of 64)
show little impact.  I expect that overestimating the values (using 64
when the machine has 32) would be quite bad (maybe worse than no
software prefetch at all).  If issuing a bunch of prefetches is
expensive (e.g. I don't know what it costs on IBM hardware), then the
underestimated line sizes could hurt performance too.
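
The asymmetry can be sketched with a loop that issues one prefetch per
assumed cache line.  This is an illustration only, not the PETSc macro:
`prefetch_block` is a hypothetical helper, and `__builtin_prefetch` is a
GCC/Clang builtin.

```c
#include <stddef.h>

/* Hypothetical helper: prefetch `bytes` bytes starting at `p`, issuing
   one prefetch per assumed cache line.  Returns the number of
   prefetches issued. */
static size_t prefetch_block(const void *p, size_t bytes, size_t assumed_line)
{
  const char *c = (const char *)p;
  size_t      issued = 0;

  if (assumed_line == 0) return 0; /* avoid an infinite loop */
  for (size_t off = 0; off < bytes; off += assumed_line) {
    __builtin_prefetch(c + off, 0, 3); /* read access, keep in cache */
    issued++;
  }
  return issued;
}
```

For a 256-byte block, assuming 32-byte lines issues 8 prefetches and
assuming 64-byte lines issues 4.  On hardware with 64-byte lines the
first case touches every line twice, which is redundant but complete.
On hardware with 32-byte lines the second case touches only every other
line, so half of the block is never prefetched.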

Also, I just pushed prefetch for the forward-solve part of
MatSolve_SeqAIJ_Inode.  For ex19, this gives me a 10-12 percent speedup
for the whole operation, but I was unsuccessful at speeding up the
back-solve using prefetch.
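
As a rough picture of what prefetching in a forward solve looks like,
here is a sketch for a plain CSR matrix.  It is not the inode kernel in
MatSolve_SeqAIJ_Inode: it assumes a unit lower-triangular factor with
only the strictly lower part stored, the function name is hypothetical,
and `__builtin_prefetch` is a GCC/Clang builtin.

```c
/* Forward substitution L x = b for a unit lower-triangular L in CSR
   form.  Only the strictly lower part is stored: row i occupies
   aj[ai[i]..ai[i+1]-1] (column indices) and aa[...] (values). */
static void forward_solve_csr(int n, const int *ai, const int *aj,
                              const double *aa, const double *b, double *x)
{
  for (int i = 0; i < n; i++) {
    double sum = b[i];

    /* Hint that the next row's indices and values are needed soon.
       These are streamed once, so request low temporal locality. */
    if (i + 1 < n) {
      __builtin_prefetch(&aj[ai[i + 1]], 0, 0);
      __builtin_prefetch(&aa[ai[i + 1]], 0, 0);
    }
    for (int k = ai[i]; k < ai[i + 1]; k++) sum -= aa[k] * x[aj[k]];
    x[i] = sum; /* unit diagonal, so no division */
  }
}
```

The prefetch is only a hint, so the result is the same with or without
it; any gain depends on the matrix and the hardware.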

