[petsc-dev] More prefetch
Jed Brown
jed at 59A2.org
Fri Dec 3 13:35:36 CST 2010
I just pushed some prefetch for the (S)BAIJ kernels. The biggest win is for
MatSolve_SeqSBAIJ_*_NaturalOrdering_inplace, where this patch shows 30 to 50%
speedups on Core 2 and Opteron. The other kernels tend to improve by 20 to
30% on Opteron; improvements on Core 2 are less consistent but usually near
20%.
I have not found any case that this patch slows down, provided the matrix
does not fit in cache. If the matrix does fit in cache, then the
non-temporal hint is bad: it tells the processor the entries will not be
reused, so they can be evicted early and then unnecessarily fetched from
memory again on the next pass. I think scenarios in which end-to-end
performance is limited by matrix kernels and the entire matrix fits in cache
are much rarer than those where the matrix does not fit in cache, so I
consider the non-temporal hint to be an unambiguous win. If anyone sees a
negative performance impact, please report it.
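For anyone unfamiliar with the technique, here is a minimal sketch of a read
prefetch with a non-temporal hint in a sparse kernel. It is not the code from
this patch: it uses a plain CSR mat-vec rather than the (S)BAIJ solves, the
names csr_matvec_prefetch and PREFETCH_READ_NTA are made up for illustration,
and it assumes GCC or Clang for __builtin_prefetch (on other compilers the
macro expands to nothing). It only touches the first cache line of the next
row, which is enough to show the idea.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper, not a PETSc macro. With GCC/Clang,
   __builtin_prefetch(addr, rw, locality) takes rw = 0 for a read and
   locality = 0 for "no temporal locality" (the non-temporal hint). */
#if defined(__GNUC__) || defined(__clang__)
#  define PREFETCH_READ_NTA(p) __builtin_prefetch((p), 0, 0)
#else
#  define PREFETCH_READ_NTA(p) ((void)0)
#endif

/* y = A*x for a matrix in CSR format (rowptr has nrows+1 entries). */
void csr_matvec_prefetch(int nrows, const int *rowptr, const int *colidx,
                         const double *val, const double *x, double *y)
{
  assert(nrows >= 0);
  for (int i = 0; i < nrows; i++) {
    /* While working on row i, request the start of row i+1. The matrix
       entries and column indices are streamed once per product, so they
       are hinted as non-temporal; x is reused and is left alone. */
    if (i + 1 < nrows) {
      PREFETCH_READ_NTA(&val[rowptr[i + 1]]);
      PREFETCH_READ_NTA(&colidx[rowptr[i + 1]]);
    }
    double sum = 0.0;
    for (int k = rowptr[i]; k < rowptr[i + 1]; k++) {
      sum += val[k] * x[colidx[k]];
    }
    y[i] = sum;
  }
}
```

The prefetch is only a hint, so the result is identical with or without it;
whether it helps depends on the machine and on whether the matrix fits in
cache, as discussed above.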
Jed