[petsc-dev] A closer look at the Xeon Phi
Karl Rupp
rupp at mcs.anl.gov
Tue Feb 12 18:54:27 CST 2013
Hi Matt,
> Karl, I am assuming that the places in the article where the Phi beats
> the K20 are for denser matrices where they have explicitly vectorized?
A quick check with the matrices in the paper showed that the Xeon Phi
indeed outperforms the K20 on the matrices with a higher number of
nonzeros per row (correlation, not causality).
Reordering the dofs still has a considerable impact (and I think one
can also modify reordering algorithms to better suit accelerators/GPUs),
but overall I support your observation.
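
As a rough illustration of the metric above (not taken from the paper or from PETSc): for a matrix stored in CSR, the number of nonzeros per row can be read directly off the row-pointer array. The function name `nnz_per_row` and the small example array are hypothetical and chosen only for this sketch.

```python
def nnz_per_row(row_ptr):
    """Return the number of stored nonzeros in each row of a CSR matrix.

    row_ptr has length nrows + 1; row i owns the entries with indices
    row_ptr[i] up to (but not including) row_ptr[i + 1].
    """
    return [row_ptr[i + 1] - row_ptr[i] for i in range(len(row_ptr) - 1)]


# Hypothetical row pointers for a 4-row matrix (made up for illustration).
row_ptr = [0, 2, 3, 6, 8]
counts = nnz_per_row(row_ptr)
avg = sum(counts) / len(counts)
print("nonzeros per row:", counts)
print("average nonzeros per row:", avg)
```

Comparing the average (or the full distribution) across a set of matrices is one simple way to repeat this kind of quick check.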
The CSR format used in the paper is not necessarily optimal for MIC and
GPUs, but that's a different story...
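
For readers less familiar with CSR, here is a minimal sketch of a row-wise sparse matrix-vector product in plain Python. It is not the kernel used in the paper; the name `csr_spmv` and the 3x3 example are assumptions made for illustration. The point is only that the inner loop runs once per nonzero in the row and reads `x` indirectly through the column indices, which may be part of why short rows are harder to vectorize.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A * x for a matrix A stored in CSR format."""
    nrows = len(row_ptr) - 1
    y = [0.0] * nrows
    for i in range(nrows):
        acc = 0.0
        # Inner loop length equals the number of nonzeros in row i.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # Indirect (gather) access into x via the column index.
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Hypothetical 3x3 example matrix:
#   [2 0 1]
#   [0 3 0]
#   [4 0 5]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [2.0, 1.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]
y = csr_spmv(row_ptr, col_idx, values, x)
print(y)
```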
Best regards,
Karli