[petsc-dev] More prefetch

Jed Brown jed at 59A2.org
Fri Dec 3 13:35:36 CST 2010


I just pushed some prefetching for the (S)BAIJ kernels.  The biggest win is for
MatSolve_SeqSBAIJ_*_NaturalOrdering_inplace, where this patch shows 30 to
50% speedups on Core 2 and Opteron.  The other kernels tend to improve by
20 to 30% on Opteron; improvements on Core 2 are less consistent but usually
near 20%.

I have not found any case that this patch slows down, provided the
matrix does not fit in cache.  If the matrix does fit in cache, then the
non-temporal hint is bad and will cause matrix entries to be unnecessarily
fetched from memory again.  I think scenarios in which end-to-end
performance is limited by matrix kernels and the entire matrix fits in cache
are much rarer than those where the matrix does not fit in cache, so I
consider the non-temporal hint to be an unambiguous win.  If anyone sees a
negative performance impact, please report it.
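
For readers unfamiliar with the idea, here is a minimal sketch of a
non-temporal read prefetch in a streaming matrix kernel.  It is not the code
from the patch: it uses a plain CSR matrix-vector product rather than the
(S)BAIJ solve kernels, the names csr_matvec_prefetch and PREFETCH_NTA are
hypothetical, and it assumes GCC or Clang for __builtin_prefetch (whose third
argument, 0, means "no temporal locality").

```c
#include <stddef.h>

/* Assumption: GCC or Clang, which provide __builtin_prefetch.
 * Arguments: address, 0 = prefetch for read, 0 = no temporal locality. */
#if defined(__GNUC__) || defined(__clang__)
#  define PREFETCH_NTA(addr) __builtin_prefetch((addr), 0, 0)
#else
#  define PREFETCH_NTA(addr) ((void)0) /* no-op on other compilers */
#endif

/* Illustrative helper (not a PETSc function): y = A*x for a CSR matrix. */
void csr_matvec_prefetch(size_t nrows, const size_t *rowptr,
                         const size_t *colidx, const double *val,
                         const double *x, double *y)
{
  for (size_t i = 0; i < nrows; i++) {
    /* Matrix entries are streamed through once per product, so hint that
     * the start of the next row need not be kept in cache after use. */
    if (i + 1 < nrows) {
      PREFETCH_NTA(&val[rowptr[i + 1]]);
      PREFETCH_NTA(&colidx[rowptr[i + 1]]);
    }
    double sum = 0.0;
    for (size_t k = rowptr[i]; k < rowptr[i + 1]; k++) {
      sum += val[k] * x[colidx[k]];
    }
    y[i] = sum;
  }
}
```

The hint is applied only to the matrix arrays; the vector x is reused across
rows and is left to the normal cache policy.  This also illustrates the
caveat above: if val and colidx would otherwise stay resident in cache
between products, marking them non-temporal can force them to be fetched
again.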

Jed