[petsc-dev] AVX kernels, old gcc, still broken
Jed Brown
jed at jedbrown.org
Fri Oct 25 13:54:36 CDT 2019
"Smith, Barry F. via petsc-dev" <petsc-dev at mcs.anl.gov> writes:
> This needs to be fixed properly with a configure test(s) and not with huge and inconsistent checks like this
>
> #if defined(PETSC_HAVE_IMMINTRIN_H) && defined(__AVX512F__) && defined(PETSC_USE_REAL_DOUBLE) && !defined(PETSC_USE_COMPLEX) && !defined(PETSC_USE_64BIT_INDICES)
>
> or this
>
> #elif defined(PETSC_USE_AVX512_KERNELS) && defined(PETSC_HAVE_IMMINTRIN_H) && defined(__AVX512F__) && defined(PETSC_USE_REAL_DOUBLE) && !defined(PETSC_USE_COMPLEX) && !defined(PETSC_USE_64BIT_INDICES) && !defined(PETSC_SKIP_IMMINTRIN_H_CUDAWORKAROUND)
>
>
>
> self.useAVX512Kernels = self.framework.argDB['with-avx512-kernels']
> if self.useAVX512Kernels:
>   self.addDefine('USE_AVX512_KERNELS', 1)
>
> Here you should check that the needed include files are available, that integers are 32-bit, that __AVX512F__ is defined, and that the appropriate functions exist.
What happens when one AVX512 kernel gets a 64-bit integer or
single-precision implementation? You'll have new macros for each
combination of criteria instead of using && on the normal macros?
We should investigate whether #pragma omp simd or ivdep can generate
comparable vectorization without directly using the intrinsics -- those
are quite a bit more portable when they work, and they often work with
some experimentation.
> Maybe you need two configure checks if there are two different types of functionality you are trying to catch?
>
> Continuing to hack away with gross #ifdefs means this crud is unmaintainable and will always haunt you. Yes, the incremental cost of doing a proper configure test is there, but once that is done the maintenance costs (which have been haunting us for months) will be gone.
>
> Barry
>
>> On Oct 25, 2019, at 2:49 AM, Lisandro Dalcin via petsc-dev <petsc-dev at mcs.anl.gov> wrote:
>>
>>
>>
>> On Fri, 25 Oct 2019 at 01:40, Balay, Satish <balay at mcs.anl.gov> wrote:
>> I'm curious why this issue comes up for you. The code was unrelated to the --with-avx512-kernels=0 option.
>>
>> It's relying on the __AVX512F__ and PETSC_HAVE_IMMINTRIN_H flags. And
>> assumes immintrin.h has a definition for _mm512_reduce_add_pd()
>>
>> Is the flag __AVX512F__ always set on your machine by gcc?
>>
>> And does this change based on the hardware? I just tried this build
>> [same os/compiler] on "Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz" - and
>> can't reproduce the issue.
>>
>> I do see _mm512_reduce_add_pd is missing from immintrin.h - but the
>> flag __AVX512F__ is not set for me.
>>
>>
>> Of course it is not set, you are just invoking the preprocessor. Try this way:
>>
>> $ cat xyz.c
>> #if defined __AVX512F__
>> #error "avx512f flag set"
>> #endif
>>
>> $ gcc -march=native -c xyz.c
>> xyz.c:2:2: error: #error "avx512f flag set"
>> #error "avx512f flag set"
>>
>> I forgot to mention my XXXOPTFLAGS, full reconfigure script below.
>> Do we have some Ubuntu 16 builder using system GCC?
>> Maybe we should use `-march=native -O3 -g3` in one of these builders?
>>
>> $ cat arch-gnu-opt/lib/petsc/conf/reconfigure-arch-gnu-opt.py
>> #!/usr/bin/python
>> if __name__ == '__main__':
>>   import sys
>>   import os
>>   sys.path.insert(0, os.path.abspath('config'))
>>   import configure
>>   configure_options = [
>>     '--COPTFLAGS=-march=native -mtune=native -O3',
>>     '--CXXOPTFLAGS=-march=native -mtune=native -O3',
>>     '--FOPTFLAGS=-march=native -mtune=native -O3',
>>     '--download-metis=1',
>>     '--download-p4est=1',
>>     '--download-parmetis=1',
>>     '--with-avx512-kernels=0',
>>     '--with-debugging=0',
>>     '--with-fortran-bindings=0',
>>     '--with-zlib=1',
>>     'CC=mpicc',
>>     'CXX=mpicxx',
>>     'FC=mpifort',
>>     'PETSC_ARCH=arch-gnu-opt',
>>   ]
>>   configure.petsc_configure(configure_options)
>>
>>
>> --
>> Lisandro Dalcin
>> ============
>> Research Scientist
>> Extreme Computing Research Center (ECRC)
>> King Abdullah University of Science and Technology (KAUST)
>> http://ecrc.kaust.edu.sa/