[petsc-users] questions about vectorization

Zhang, Hong hongzhang at anl.gov
Mon Nov 13 10:32:03 CST 2017


Most operations in PETSc would not benefit much from vectorization since they are memory-bound, but that should not discourage you from compiling PETSc with AVX2/AVX512. We have added a new matrix format (currently named ELL, but it will be renamed SELL shortly) that can make MatMult ~2X faster than the AIJ format. Its MatMult kernel is hand-optimized with AVX intrinsics and works on any Intel processor that supports AVX, AVX2, or AVX512, e.g. Haswell, Broadwell, Xeon Phi, and Skylake. We have also been optimizing the AIJ MatMult kernel for these architectures. Note that one has to build PETSc with AVX compiler flags in order to take advantage of the optimized kernels and the new matrix format.
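To make that concrete, below is a minimal sketch (not taken from any PETSc example) of how the new format would be selected at runtime once PETSc is configured with AVX flags, e.g. COPTFLAGS="-O3 -march=native" with GCC or -xCORE-AVX2 / -xMIC-AVX512 with the Intel compilers. It assumes the format ends up being exposed under the name "sell" as mentioned above; on a branch where it is still called ELL, substitute that name.

/* Sketch: let the matrix type be chosen from the command line,
 * e.g.  ./app -mat_type sell
 * The default remains AIJ when no option is given. */
#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat            A;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;

  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, 1000, 1000);CHKERRQ(ierr);
  /* Honors -mat_type <aij|sell|...> from the options database. */
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatSetUp(A);CHKERRQ(ierr);

  /* ... fill with MatSetValues(), then assemble ... */
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  /* MatMult on this matrix dispatches to the AVX-optimized kernel
   * when the type is the new (S)ELL format and PETSc was built
   * with the AVX flags mentioned above. */

  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}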

Hong (Mr.)

> On Nov 12, 2017, at 10:35 PM, Xiangdong <epscodes at gmail.com> wrote:
> 
> Hello everyone,
> 
> Can someone comment on the vectorization of PETSc? For example, for the MatMult function, will it perform better or run faster if it is compiled with AVX2 or AVX512?
> 
> Thank you.
> 
> Best,
> Xiangdong


