[petsc-dev] AVX kernels, old gcc, still broken

Smith, Barry F. bsmith at mcs.anl.gov
Fri Oct 25 23:24:09 CDT 2019


  The proposed fix is #if defined(PETSC_USE_AVX512_KERNELS)   && && && && && in https://gitlab.com/petsc/petsc/merge_requests/2213/diffs

but note that PETSC_USE_AVX512_KERNELS does not even do a configure check to make sure it is valid. The user has to guess that passing that flag will work. Of course a proper configure test is needed and since a proper test is needed it can handle all the issues in one place instead of having one issue in  configure and n - 1 in the source code. 

  This is a basic implementation disagreement, I hate CPP and think it should be used minimally, you hate configure and think it should be used minimally.

   Barry



> On Oct 25, 2019, at 1:54 PM, Jed Brown <jed at jedbrown.org> wrote:
> 
> "Smith, Barry F. via petsc-dev" <petsc-dev at mcs.anl.gov> writes:
> 
>>   This needs to be fixed properly with a configure test(s) and not with huge and inconsistent checks like this  
>> 
>> #if defined(PETSC_HAVE_IMMINTRIN_H) && defined(__AVX512F__) && defined(PETSC_USE_REAL_DOUBLE) && !defined(PETSC_USE_COMPLEX) && !defined(PETSC_USE_64BIT_INDICES)  or this
>> 
>> #elif defined(PETSC_USE_AVX512_KERNELS) && defined(PETSC_HAVE_IMMINTRIN_H) && defined(__AVX512F__) && defined(PETSC_USE_REAL_DOUBLE) && !defined(PETSC_USE_COMPLEX) && !defined(PETSC_USE_64BIT_INDICES) && !defined(PETSC_SKIP_IMMINTRIN_H_CUDAWORKAROUND)
>> 
>> 
>> 
>>   self.useAVX512Kernels = self.framework.argDB['with-avx512-kernels']
>>    if self.useAVX512Kernels:
>>      self.addDefine('USE_AVX512_KERNELS', 1)
>> 
>>     Here you should check that the needed include files are available, that it is 32 bit integers, that defined(__AVX512F__) exists and that appropriate functions exist. 
> 
> What happens when one AVX512 kernel gets a 64-bit integer or
> single-precision implementation?  You'll have new macros for each
> combination of criteria instead of using && on the normal macros?
> 
> We should investigate whether #pragma omp simd or ivdep can generate
> comparable vectorization without directly using the intrinsics -- those
> are quite a bit more portable when they work, and they often work with
> some experimentation.
> 
>>    Maybe you need two configure checks if there are two different types of functionality you are trying to catch? 
>> 
>>    Continuing to hack away with gross #if def means this crud is unmaintainable and will always haunt you. Yes the incremental cost of doing a proper configure test is there but once that is done the maintenance costs (which have been haunting use for months) will be gone.
>> 
>>  Barry
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>>> On Oct 25, 2019, at 2:49 AM, Lisandro Dalcin via petsc-dev <petsc-dev at mcs.anl.gov> wrote:
>>> 
>>> 
>>> 
>>> On Fri, 25 Oct 2019 at 01:40, Balay, Satish <balay at mcs.anl.gov> wrote:
>>> I'm curious why this issue comes up for you. The code was unrelated to --with-avx512-kernels=0 option.
>>> 
>>> Its relying on __AVX512F__and PETSC_HAVE_IMMINTRIN_H flags. And
>>> assumes immintrin.h has a definition for _mm512_reduce_add_pd()
>>> 
>>> Is the flag __AVX512F__ always set on your machine by gcc? 
>>> 
>>> And does this change based on the hardware? I just tried this build
>>> [same os/compiler] on "Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz" - and
>>> can't reproduce the issue.
>>> 
>>> I do see _mm512_reduce_add_pd is missing from immintrin.h - but the
>>> flag __AVX512F__ is not set for me.
>>> 
>>> 
>>> Of course it is not set, you are just invoking the preprocessor. Try this way:
>>> 
>>> $ cat xyz.c
>>> #if defined __AVX512F__
>>> #error "avx512f flag set"
>>> #endif
>>> 
>>> $ gcc -march=native -c xyz.c
>>> xyz.c:2:2: error: #error "avx512f flag set"
>>> #error "avx512f flag set"
>>> 
>>> I forgot to mention my XXXOPTFLAGS, full reconfigure script below.
>>> Do we have some Ubuntu 16 builder using system GCC?
>>> Maybe we should use `-march=native -O3 -g3` in one of these builders?
>>> 
>>> $ cat arch-gnu-opt/lib/petsc/conf/reconfigure-arch-gnu-opt.py 
>>> #!/usr/bin/python
>>> if __name__ == '__main__':
>>>  import sys
>>>  import os
>>>  sys.path.insert(0, os.path.abspath('config'))
>>>  import configure
>>>  configure_options = [
>>>    '--COPTFLAGS=-march=native -mtune=native -O3',
>>>    '--CXXOPTFLAGS=-march=native -mtune=native -O3',
>>>    '--FOPTFLAGS=-march=native -mtune=native -O3',
>>>    '--download-metis=1',
>>>    '--download-p4est=1',
>>>    '--download-parmetis=1',
>>>    '--with-avx512-kernels=0',
>>>    '--with-debugging=0',
>>>    '--with-fortran-bindings=0',
>>>    '--with-zlib=1',
>>>    'CC=mpicc',
>>>    'CXX=mpicxx',
>>>    'FC=mpifort',
>>>    'PETSC_ARCH=arch-gnu-opt',
>>>  ]
>>>  configure.petsc_configure(configure_options)
>>> 
>>> 
>>> -- 
>>> Lisandro Dalcin
>>> ============
>>> Research Scientist
>>> Extreme Computing Research Center (ECRC)
>>> King Abdullah University of Science and Technology (KAUST)
>>> http://ecrc.kaust.edu.sa/



More information about the petsc-dev mailing list