[petsc-users] Question about matrix permutation

Jed Brown jed at 59A2.org
Sun Jan 31 16:44:38 CST 2010


On Sun, 31 Jan 2010 16:02:17 -0600, Barry Smith <bsmith at mcs.anl.gov> wrote:
>    Check config/PETSc/Configure.py configurePrefetch() for the
> current support in PETSc. It is used, for example, in
> src/mat/impls/sbaij/seq/relax.h; you may have found better ways of
> doing this so feel free to change the current support if you have
> something better.

Aha, but PETSC_Prefetch (naming violation?) just wraps
__builtin_prefetch or _mm_prefetch, which means it's only fetching one
cache line of the preceding row.  From what I can tell, it's a bad idea
to software-prefetch part of a row and rely on hardware prefetch to
pick up the rest.  If you're using software prefetch for a particular
access pattern, you want to ask for exactly what you need, which
normally means calling prefetch more than once.  The Intel optimization
manual says that you should overlap the prefetch calls with computation
(because the prefetch instruction occupies the same execution unit as
loads), but since we're so far from actually being CPU bound, I don't
think there is a real penalty to issuing several prefetch instructions
at once and then working on the block that should already be in cache
while the prefetch results trickle in.
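
To make the intent concrete, here is a minimal sketch of issuing one
prefetch per cache line so the whole next row is requested; this is not
the code in relax.h, and CACHE_LINE, aj, aa, and nz are placeholder
names for a cache line size and a CSR row:

  #include <xmmintrin.h>

  #define CACHE_LINE 64  /* bytes; ideally detected at configure time */

  /* Prefetch every cache line of the next row's column indices and
     values before starting work on the current row. */
  static void PrefetchRow(const int *aj,const double *aa,int nz)
  {
    const char *j = (const char*)aj, *v = (const char*)aa;
    int b;
    for (b=0; b<(int)(nz*sizeof(int)); b+=CACHE_LINE)
      _mm_prefetch(j+b,_MM_HINT_NTA);
    for (b=0; b<(int)(nz*sizeof(double)); b+=CACHE_LINE)
      _mm_prefetch(v+b,_MM_HINT_NTA);
  }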

Also, I'm pretty sure we want to be using _MM_HINT_NTA (0) instead of
_MM_HINT_T2 (1), since the latter also brings the values into the
higher levels of cache.  Since it's the next row that we'll work with,
it's very rare (i.e. only matrices with an absurd number of nonzeros
per row) that the row would actually be evicted from L1 before we get
to it.  We don't want to pollute the higher levels because we want to
keep as much of that space as possible for the vector.  (At least,
using _MM_HINT_T2 took several percent off the MatMult flop rate, and I
would anticipate the same effect for MatSolve and MatSOR.)
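
For reference, with a GCC-style compiler the same distinction can be
expressed through __builtin_prefetch, whose third argument is the
temporal-locality hint; this is just a sketch of the mapping on x86,
not how PETSC_Prefetch is actually defined:

  /* locality 0 is the non-temporal hint (prefetchnta on x86):
     keeps the data out of the higher cache levels */
  #define PREFETCH_NTA(p) __builtin_prefetch((p),0,0)
  /* locality 1 corresponds to _MM_HINT_T2 (prefetcht2 on x86) */
  #define PREFETCH_T2(p)  __builtin_prefetch((p),0,1)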

Do we currently detect the cache line size anywhere?  I don't know how
to do that kind of thing portably, though this claims to do it

  http://www.open-mpi.org/projects/hwloc/
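
For what it's worth, glibc exposes the L1 line size through sysconf();
the following is a Linux-specific sketch rather than a portable answer
(hwloc would be the portable route):

  #include <unistd.h>
  #include <stdio.h>

  int main(void)
  {
  #if defined(_SC_LEVEL1_DCACHE_LINESIZE)
    long line = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
    printf("L1 dcache line size: %ld bytes\n",line > 0 ? line : 64L);
  #else
    printf("_SC_LEVEL1_DCACHE_LINESIZE not available here\n");
  #endif
    return 0;
  }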


Jed

