[petsc-dev] KNL MatMult performance and unrolling.

Barry Smith bsmith at mcs.anl.gov
Wed Sep 28 15:40:44 CDT 2016



   Mr Hong Zhang has found that removing the manual unrolling from MatMult_SeqAIJ_Inode() (at least with inode size 2) gives a good bump in performance on KNL, and he pointed me to the Intel gospel https://software.intel.com/en-us/articles/avoid-manual-loop-unrolling, which we've always ignored in the past. It would be good to try the unrolled and non-unrolled versions on Xeon as well.

   We've never done a good job of managing our unrolling: where, how, and when we do it, and the macros we use for it, such as PetscSparseDensePlusDot. Intel would say just throw it all away.
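
   For illustration only, here is a minimal sketch of the two styles being compared, for a pair of rows that share one column-index list (an inode of size 2). This is not the PETSc source: the function names inode2_mult_plain and inode2_mult_unrolled, the argument layout, and the unroll factor of two are all assumptions made for the example. Which variant is faster depends on the compiler and the hardware, so it has to be measured.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernels for illustration; not PETSc's MatMult_SeqAIJ_Inode().
   idx holds the n column indices shared by both rows,
   v1 and v2 hold the nonzero values of row 1 and row 2. */

/* Plain loop: unrolling and vectorization are left to the compiler. */
void inode2_mult_plain(size_t n, const int *idx, const double *v1,
                       const double *v2, const double *x,
                       double *y1, double *y2)
{
  double sum1 = 0.0, sum2 = 0.0;

  assert(n == 0 || (idx && v1 && v2 && x));
  for (size_t j = 0; j < n; j++) {
    const double xj = x[idx[j]];
    sum1 += v1[j] * xj;
    sum2 += v2[j] * xj;
  }
  *y1 = sum1;
  *y2 = sum2;
}

/* Manually unrolled by two over the columns (assumed unroll factor). */
void inode2_mult_unrolled(size_t n, const int *idx, const double *v1,
                          const double *v2, const double *x,
                          double *y1, double *y2)
{
  double sum1 = 0.0, sum2 = 0.0;
  size_t j = 0;

  assert(n == 0 || (idx && v1 && v2 && x));
  if (n & 1) { /* peel one entry so the remaining count is even */
    const double x0 = x[idx[0]];
    sum1 = v1[0] * x0;
    sum2 = v2[0] * x0;
    j    = 1;
  }
  for (; j < n; j += 2) {
    const double x0 = x[idx[j]];
    const double x1 = x[idx[j + 1]];
    sum1 += v1[j] * x0 + v1[j + 1] * x1;
    sum2 += v2[j] * x0 + v2[j + 1] * x1;
  }
  *y1 = sum1;
  *y2 = sum2;
}
```

   Both functions compute the same two row sums, so a timing comparison on KNL and Xeon only needs to swap one call for the other. In general the two can differ in the last bits, because the unrolled version adds the products in a different order.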

   Barry





