[petsc-dev] A closer look at the Xeon Phi
    Karl Rupp
    rupp at mcs.anl.gov
    Tue Feb 12 18:54:22 CST 2013

Hi Matt,
> Karl, I am assuming that the places in the article where the Phi beats
> the K20 are for denser matrices
> where they have explicitly vectorized?
A quick check of the matrices in the paper showed that it is indeed 
the matrices with a higher number of nonzeros per row for which the Xeon 
Phi offers higher performance than the K20 (a correlation, not 
necessarily causality). Reordering the degrees of freedom still has a 
noticeable impact (and I think reordering algorithms could also be 
modified to better suit accelerators/GPUs), but overall I support your 
observation.
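
For illustration only, here is a minimal sketch of how the "nonzeros per 
row" figure can be read off a matrix stored in CSR form. It is not taken 
from the paper; the function names and the small row-pointer array are 
hypothetical, and it assumes the usual CSR convention that row i occupies 
entries row_ptr[i] to row_ptr[i+1]-1.

```python
def nonzeros_per_row(row_ptr):
    """Return the nonzero count of every row, given a CSR row-pointer array."""
    # Row i holds the entries row_ptr[i] .. row_ptr[i+1]-1, so the count
    # is simply the difference of consecutive row pointers.
    return [row_ptr[i + 1] - row_ptr[i] for i in range(len(row_ptr) - 1)]


def mean_nonzeros_per_row(row_ptr):
    """Average number of nonzeros per row (0.0 for a matrix with no rows)."""
    counts = nonzeros_per_row(row_ptr)
    return sum(counts) / len(counts) if counts else 0.0


# Hypothetical 4-row matrix with 2, 1, 3 and 1 nonzeros in its rows.
example_row_ptr = [0, 2, 3, 6, 7]
example_counts = nonzeros_per_row(example_row_ptr)
example_mean = mean_nonzeros_per_row(example_row_ptr)
```

Sorting a set of test matrices by this average is one simple way to 
check whether the speedup lines up with row density.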
The CSR format used in the paper is not necessarily optimal for MIC and 
GPUs, but that's a different story...
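
As a reference point, a plain CSR sparse matrix-vector product can be 
sketched as follows. This is a generic textbook formulation, not the 
kernel used in the paper; the function name and the 3x3 example matrix 
are assumptions made for illustration.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A*x for a matrix A stored in CSR format."""
    num_rows = len(row_ptr) - 1
    y = [0.0] * num_rows
    for i in range(num_rows):
        acc = 0.0
        # The inner loop runs once per nonzero in row i, so its trip count
        # equals the number of nonzeros in that row.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Hypothetical example matrix:
#   [2 0 1]
#   [0 3 0]
#   [4 0 5]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [2.0, 1.0, 3.0, 4.0, 5.0]
y = csr_spmv(row_ptr, col_idx, values, [1.0, 1.0, 1.0])
```

Because the inner loop length is the row's nonzero count, rows with few 
nonzeros give wide vector units little work per row, which would be 
consistent with the observation above (though, again, not a proof of 
causality).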
Best regards,
Karli
    
    