[petsc-dev] AVX kernels, old gcc, still broken

Lisandro Dalcin dalcinl at gmail.com
Thu Oct 24 17:33:04 CDT 2019


Yes, that should work, I coded it myself to workaround the issue, but you
were faster than me. Thanks!

On Thu, 24 Oct 2019 at 22:00, Zhang, Hong <hongzhang at anl.gov> wrote:

> Hi Lisandro,
>
> Can you please check if the following patch fixes the problem? I will
> create a MR.
>
> diff --git a/src/mat/impls/aij/seq/aijperm/aijperm.c
> b/src/mat/impls/aij/seq/aijperm/aijperm.c
> index 577dfc6713..568535117a 100644
> --- a/src/mat/impls/aij/seq/aijperm/aijperm.c
> +++ b/src/mat/impls/aij/seq/aijperm/aijperm.c
> @@ -12,7 +12,7 @@
>
>  #include <../src/mat/impls/aij/seq/aij.h>
>
> -#if defined(PETSC_HAVE_IMMINTRIN_H) && defined(__AVX512F__) &&
> defined(PETSC_USE_REAL_DOUBLE) && !defined(PETSC_USE_COMPLEX) &&
> !defined(PETSC_USE_64BIT_INDICES)
> +#if defined(PETSC_USE_AVX512_KERNELS) && defined(PETSC_HAVE_IMMINTRIN_H)
> && defined(__AVX512F__) && defined(PETSC_USE_REAL_DOUBLE) &&
> !defined(PETSC_USE_COMPLEX) && !defined(PETSC_USE_64BIT_INDICES) &&
> !defined(PETSC_SKIP_IMMINTRIN_H_CUDAWORKAROUND)
>  #include <immintrin.h>
>
>  #if !defined(_MM_SCALE_8)
> @@ -301,7 +301,7 @@ PetscErrorCode MatMult_SeqAIJPERM(Mat A,Vec xx,Vec yy)
>  #if !(defined(PETSC_USE_FORTRAN_KERNEL_MULTAIJPERM) &&
> defined(notworking))
>    PetscInt          i,j;
>  #endif
> -#if defined(PETSC_HAVE_IMMINTRIN_H) && defined(__AVX512F__) &&
> defined(PETSC_USE_REAL_DOUBLE) && !defined(PETSC_USE_COMPLEX) &&
> !defined(PETSC_USE_64BIT_INDICES)
> +#if defined(PETSC_USE_AVX512_KERNELS) && defined(PETSC_HAVE_IMMINTRIN_H)
> && defined(__AVX512F__) && defined(PETSC_USE_REAL_DOUBLE) &&
> !defined(PETSC_USE_COMPLEX) && !defined(PETSC_USE_64BIT_INDICES) &&
> !defined(PETSC_SKIP_IMMINTRIN_H_CUDAWORKAROUND)
>    __m512d           vec_x,vec_y,vec_vals;
>    __m256i           vec_idx,vec_ipos,vec_j;
>    __mmask8           mask;
> @@ -401,7 +401,7 @@ PetscErrorCode MatMult_SeqAIJPERM(Mat A,Vec xx,Vec yy)
>  #pragma _CRI prefervector
>  #endif
>
> -#if defined(PETSC_HAVE_IMMINTRIN_H) && defined(__AVX512F__) &&
> defined(PETSC_USE_REAL_DOUBLE) && !defined(PETSC_USE_COMPLEX) &&
> !defined(PETSC_USE_64BIT_INDICES)
> +#if defined(PETSC_USE_AVX512_KERNELS) && defined(PETSC_HAVE_IMMINTRIN_H)
> && defined(__AVX512F__) && defined(PETSC_USE_REAL_DOUBLE) &&
> !defined(PETSC_USE_COMPLEX) && !defined(PETSC_USE_64BIT_INDICES) &&
> !defined(PETSC_SKIP_IMMINTRIN_H_CUDAWORKAROUND)
>              vec_y = _mm512_setzero_pd();
>              ipos = ip[i];
>              for (j=0; j<(nz>>3); j++) {
> @@ -436,7 +436,7 @@ PetscErrorCode MatMult_SeqAIJPERM(Mat A,Vec xx,Vec yy)
>             * worthwhile to vectorize across the rows, that is, to do the
>             * matvec by operating with "columns" of the chunk. */
>            for (j=0; j<nz; j++) {
> -#if defined(PETSC_HAVE_IMMINTRIN_H) && defined(__AVX512F__) &&
> defined(PETSC_USE_REAL_DOUBLE) && !defined(PETSC_USE_COMPLEX) &&
> !defined(PETSC_USE_64BIT_INDICES)
> +#if defined(PETSC_USE_AVX512_KERNELS) && defined(PETSC_HAVE_IMMINTRIN_H)
> && defined(__AVX512F__) && defined(PETSC_USE_REAL_DOUBLE) &&
> !defined(PETSC_USE_COMPLEX) && !defined(PETSC_USE_64BIT_INDICES) &&
> !defined(PETSC_SKIP_IMMINTRIN_H_CUDAWORKAROUND)
>              vec_j = _mm256_set1_epi32(j);
>              for (i=0; i<((isize>>3)<<3); i+=8) {
>                vec_y    = _mm512_loadu_pd(&yp[i]);
>
>
> Thanks,
> Hong
>
> On Oct 24, 2019, at 2:47 PM, Lisandro Dalcin via petsc-dev <
> petsc-dev at mcs.anl.gov> wrote:
>
> This is with master, but I bet the issue is also in maint.
>
> * Running on Ubuntu 16
>
> $ uname -a
> Linux flamingo 4.4.0-104-generic #127-Ubuntu SMP Mon Dec 11 12:16:42 UTC
> 2017 x86_64 x86_64 x86_64 GNU/Linux
>
> * With system gcc 5.4
>
> $ mpicc -show
> /usr/bin/gcc-5
> -I/sw/workstations/apps/linux-ubuntu16.04-x86_64/mpich/3.3.1/gcc-5.4.0/nvejoe25snmak6a7fnjghabxjukjkuiu/include
> -L/sw/workstations/apps/linux-ubuntu16.04-x86_64/mpich/3.3.1/gcc-5.4.0/nvejoe25snmak6a7fnjghabxjukjkuiu/lib
> -Wl,-rpath
> -Wl,/sw/workstations/apps/linux-ubuntu16.04-x86_64/mpich/3.3.1/gcc-5.4.0/nvejoe25snmak6a7fnjghabxjukjkuiu/lib
> -lmpi
>
> $ mpicc --version
> gcc-5 (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
> Copyright (C) 2015 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.  There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>
> * PETSc configured to NOT USE AVX512 kernels
>
> $ grep avx arch-gnu-opt/lib/petsc/conf/reconfigure-arch-gnu-opt.py
>     '--with-avx512-kernels=0',
>
> * Bang!
>
> $ touch src/mat/impls/aij/seq/aijperm/aijperm.c
> $ make -f gmakefile
> Use "/usr/bin/make V=1" to see verbose compile lines, "/usr/bin/make V=0"
> to suppress.
>           CC arch-gnu-opt/obj/mat/impls/aij/seq/aijperm/aijperm.o
> /home/dalcin/Devel/petsc/src/mat/impls/aij/seq/aijperm/aijperm.c: In
> function ‘MatMult_SeqAIJPERM’:
> /home/dalcin/Devel/petsc/src/mat/impls/aij/seq/aijperm/aijperm.c:426:22:
> warning: implicit declaration of function ‘_mm512_reduce_add_pd’
> [-Wimplicit-function-declaration]
>              yp[i] += _mm512_reduce_add_pd(vec_y);
>
>
> --
> Lisandro Dalcin
> ============
> Research Scientist
> Extreme Computing Research Center (ECRC)
> King Abdullah University of Science and Technology (KAUST)
> http://ecrc.kaust.edu.sa/
>
>
>

-- 
Lisandro Dalcin
============
Research Scientist
Extreme Computing Research Center (ECRC)
King Abdullah University of Science and Technology (KAUST)
http://ecrc.kaust.edu.sa/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20191025/859c5aae/attachment-0001.html>


More information about the petsc-dev mailing list