[petsc-dev] AVX kernels, old gcc, still broken
Smith, Barry F.
bsmith at mcs.anl.gov
Fri Oct 25 23:24:09 CDT 2019
The proposed fix is #if defined(PETSC_USE_AVX512_KERNELS) && && && && && in https://gitlab.com/petsc/petsc/merge_requests/2213/diffs
but note that PETSC_USE_AVX512_KERNELS does not even do a configure check to make sure it is valid. The user has to guess that passing that flag will work. Of course a proper configure test is needed, and since a proper test is needed, it can handle all the issues in one place instead of having one issue in configure and n - 1 in the source code.
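To make the "one place" idea concrete, here is a minimal sketch of what such a consolidated test might look like. It is standalone Python, not written against PETSc's BuildSystem; the names `AVX512_PROBE`, `can_link`, and `avx512_kernels_usable` are hypothetical, and the default compiler name `cc` is an assumption. The probe has to link, not just preprocess, so a missing `_mm512_reduce_add_pd()` is caught.

```python
import os
import shutil
import subprocess
import tempfile

# Hypothetical probe program. It builds only if immintrin.h is present,
# __AVX512F__ is defined for the given flags, and _mm512_reduce_add_pd()
# is actually provided by the header.
AVX512_PROBE = r"""
#include <immintrin.h>
#if !defined(__AVX512F__)
#error "__AVX512F__ is not defined"
#endif
int main(void)
{
  __m512d v = _mm512_set1_pd(1.0);
  return _mm512_reduce_add_pd(v) == 8.0 ? 0 : 1;
}
"""


def can_link(source, compiler="cc", flags=()):
    """Return True if `compiler` builds `source` into an executable with `flags`."""
    if shutil.which(compiler) is None:
        return False
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "probe.c")
        exe = os.path.join(tmp, "probe")
        with open(src, "w") as f:
            f.write(source)
        # Compile and link, but never run: the build host may lack AVX512.
        result = subprocess.run([compiler, *flags, src, "-o", exe],
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
        return result.returncode == 0


def avx512_kernels_usable(compiler="cc", flags=(), real_double=True,
                          complex_scalars=False, index64=False):
    """Fold the criteria from the long #if chains into one yes/no answer."""
    # Assumption taken from the #if chains quoted in this thread: the kernels
    # require real double precision and 32-bit indices.
    if not real_double or complex_scalars or index64:
        return False
    return can_link(AVX512_PROBE, compiler, flags)
```

A configure step could call something like `avx512_kernels_usable(compiler='mpicc', flags=('-march=native', '-O3'))` and emit a single define only when it returns True, so the source code tests one macro instead of six.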
This is a basic implementation disagreement: I hate CPP and think it should be used minimally; you hate configure and think it should be used minimally.
Barry
> On Oct 25, 2019, at 1:54 PM, Jed Brown <jed at jedbrown.org> wrote:
>
> "Smith, Barry F. via petsc-dev" <petsc-dev at mcs.anl.gov> writes:
>
>> This needs to be fixed properly with a configure test(s) and not with huge and inconsistent checks like this
>>
>> #if defined(PETSC_HAVE_IMMINTRIN_H) && defined(__AVX512F__) && defined(PETSC_USE_REAL_DOUBLE) && !defined(PETSC_USE_COMPLEX) && !defined(PETSC_USE_64BIT_INDICES)
>>
>> or this
>>
>> #elif defined(PETSC_USE_AVX512_KERNELS) && defined(PETSC_HAVE_IMMINTRIN_H) && defined(__AVX512F__) && defined(PETSC_USE_REAL_DOUBLE) && !defined(PETSC_USE_COMPLEX) && !defined(PETSC_USE_64BIT_INDICES) && !defined(PETSC_SKIP_IMMINTRIN_H_CUDAWORKAROUND)
>>
>>
>>
>> self.useAVX512Kernels = self.framework.argDB['with-avx512-kernels']
>> if self.useAVX512Kernels:
>>   self.addDefine('USE_AVX512_KERNELS', 1)
>>
>> Here you should check that the needed include files are available, that it is 32 bit integers, that defined(__AVX512F__) exists and that appropriate functions exist.
>
> What happens when one AVX512 kernel gets a 64-bit integer or
> single-precision implementation? You'll have new macros for each
> combination of criteria instead of using && on the normal macros?
>
> We should investigate whether #pragma omp simd or ivdep can generate
> comparable vectorization without directly using the intrinsics -- those
> are quite a bit more portable when they work, and they often work with
> some experimentation.
>
>> Maybe you need two configure checks if there are two different types of functionality you are trying to catch?
>>
>> Continuing to hack away with gross #ifdefs means this crud is unmaintainable and will always haunt you. Yes, the incremental cost of doing a proper configure test is there, but once that is done the maintenance costs (which have been haunting us for months) will be gone.
>>
>> Barry
>>
>>> On Oct 25, 2019, at 2:49 AM, Lisandro Dalcin via petsc-dev <petsc-dev at mcs.anl.gov> wrote:
>>>
>>>
>>>
>>> On Fri, 25 Oct 2019 at 01:40, Balay, Satish <balay at mcs.anl.gov> wrote:
>>> I'm curious why this issue comes up for you. The code was unrelated to the --with-avx512-kernels=0 option.
>>>
>>> It's relying on the __AVX512F__ and PETSC_HAVE_IMMINTRIN_H flags. And
>>> assumes immintrin.h has a definition for _mm512_reduce_add_pd()
>>>
>>> Is the flag __AVX512F__ always set on your machine by gcc?
>>>
>>> And does this change based on the hardware? I just tried this build
>>> [same os/compiler] on "Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz" - and
>>> can't reproduce the issue.
>>>
>>> I do see _mm512_reduce_add_pd is missing from immintrin.h - but the
>>> flag __AVX512F__ is not set for me.
>>>
>>>
>>> Of course it is not set, you are just invoking the preprocessor. Try this way:
>>>
>>> $ cat xyz.c
>>> #if defined __AVX512F__
>>> #error "avx512f flag set"
>>> #endif
>>>
>>> $ gcc -march=native -c xyz.c
>>> xyz.c:2:2: error: #error "avx512f flag set"
>>> #error "avx512f flag set"
>>>
>>> I forgot to mention my XXXOPTFLAGS, full reconfigure script below.
>>> Do we have some Ubuntu 16 builder using system GCC?
>>> Maybe we should use `-march=native -O3 -g3` in one of these builders?
>>>
>>> $ cat arch-gnu-opt/lib/petsc/conf/reconfigure-arch-gnu-opt.py
>>> #!/usr/bin/python
>>> if __name__ == '__main__':
>>>   import sys
>>>   import os
>>>   sys.path.insert(0, os.path.abspath('config'))
>>>   import configure
>>>   configure_options = [
>>>     '--COPTFLAGS=-march=native -mtune=native -O3',
>>>     '--CXXOPTFLAGS=-march=native -mtune=native -O3',
>>>     '--FOPTFLAGS=-march=native -mtune=native -O3',
>>>     '--download-metis=1',
>>>     '--download-p4est=1',
>>>     '--download-parmetis=1',
>>>     '--with-avx512-kernels=0',
>>>     '--with-debugging=0',
>>>     '--with-fortran-bindings=0',
>>>     '--with-zlib=1',
>>>     'CC=mpicc',
>>>     'CXX=mpicxx',
>>>     'FC=mpifort',
>>>     'PETSC_ARCH=arch-gnu-opt',
>>>   ]
>>>   configure.petsc_configure(configure_options)
>>>
>>>
>>> --
>>> Lisandro Dalcin
>>> ============
>>> Research Scientist
>>> Extreme Computing Research Center (ECRC)
>>> King Abdullah University of Science and Technology (KAUST)
>>> http://ecrc.kaust.edu.sa/