[petsc-dev] error with flags PETSc uses for determining AVX

Barry Smith bsmith at petsc.dev
Sun Feb 14 12:04:20 CST 2021


   For our handcoded AVX functions this is fine; we can handle the dispatching ourselves.

  But what about the tons of regular code in PETSc? Somehow we need to have the same function compiled twice and dispatched properly. Do we use what Hong suggested with fat binaries? So fat binaries PLUS _may_i_use_cpu_feature together are the way to portable, transportable libraries?

  And we always do this for --with-debugging=0 builds, so everyone, packages and users alike, gets portable libraries but also the best performance possible.
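
  One possible way to get "the same function compiled twice and dispatched properly" for ordinary code is GCC's function multi-versioning. This is only a sketch, and not necessarily what Hong proposed: the target_clones attribute is a GCC extension, the MULTIVERSION macro and the sketch_dot function are hypothetical names made up for illustration, and the preprocessor guard assumes x86-64 Linux with glibc (elsewhere the code builds as a single plain version).

```c
#include <stdlib.h> /* included first so __GLIBC__ is visible where it applies */

/* Assumption: GCC on x86-64 Linux/glibc, where target_clones is available.
   On any other toolchain the macro expands to nothing and only one
   ordinary version of the function is built. */
#if defined(__GNUC__) && !defined(__clang__) && defined(__x86_64__) && \
    defined(__linux__) && defined(__GLIBC__)
#define MULTIVERSION __attribute__((target_clones("avx2", "default")))
#else
#define MULTIVERSION
#endif

/* Hypothetical "regular" kernel. With target_clones the compiler emits an
   AVX2 clone and a baseline clone of this one source function, plus a
   resolver that picks between them when the program is loaded. */
MULTIVERSION
double sketch_dot(size_t n, const double *x, const double *y)
{
  double sum = 0.0;
  for (size_t i = 0; i < n; i++) sum += x[i] * y[i];
  return sum;
}
```

  Callers just call sketch_dot(n, x, y); the choice of clone happens once at load time, so there is no per-call feature check in the source.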

  Barry


> On Feb 14, 2021, at 11:50 AM, Jed Brown <jed at jedbrown.org> wrote:
> 
> immintrin.h provides
> 
> if (_may_i_use_cpu_feature(_FEATURE_FMA | _FEATURE_AVX2)) {
>  fancy_version_that_needs_fma_and_avx2();
> } else {
>  fallback_version();
> }
> 
> https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_may_i_use&expand=3677,3677
> 
> I believe this function is slightly expensive because it probably calls the CPUID instruction each time. BLIS has code to cache the result and query features with simple bitwise math.
> 
> https://github.com/flame/blis/blob/master/frame/base/bli_cpuid.h
> https://github.com/flame/blis/blob/master/frame/base/bli_cpuid.c
> 
> Of course this bit of dispatch should typically be done at object creation time, not every iteration.
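
A minimal sketch of the two points above: cache the feature query, and choose the kernel once at object creation rather than on every iteration. It swaps in the GCC/Clang builtin __builtin_cpu_supports in place of Intel's _may_i_use_cpu_feature, and it is not BLIS's or PETSc's actual code. The names axpy_fn, VecOps, vec_ops_create, and both kernels are hypothetical; the "AVX2" kernel is a portable stand-in so the example runs anywhere.

```c
#include <stddef.h>

/* Hypothetical kernel signature: y[i] += a * x[i]. */
typedef void (*axpy_fn)(size_t n, double a, const double *x, double *y);

static void axpy_fallback(size_t n, double a, const double *x, double *y)
{
  for (size_t i = 0; i < n; i++) y[i] += a * x[i];
}

/* Stand-in for a hand-coded AVX2/FMA kernel. A real one would use
   intrinsics; this portable loop keeps the sketch runnable everywhere. */
static void axpy_avx2_fma(size_t n, double a, const double *x, double *y)
{
  for (size_t i = 0; i < n; i++) y[i] += a * x[i];
}

/* Query the CPU once and cache the answer. The static cache is not
   thread-safe; a real library would initialize it during startup. */
static int cpu_has_avx2_fma(void)
{
  static int cached = -1;
  if (cached < 0) {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    cached = __builtin_cpu_supports("avx2") && __builtin_cpu_supports("fma");
#else
    cached = 0; /* unknown compiler or architecture: assume baseline only */
#endif
  }
  return cached;
}

/* Hypothetical object holding the kernel chosen at creation time. */
typedef struct {
  axpy_fn axpy;
} VecOps;

static void vec_ops_create(VecOps *ops)
{
  ops->axpy = cpu_has_avx2_fma() ? axpy_avx2_fma : axpy_fallback;
}
```

After vec_ops_create(&ops), inner loops call ops.axpy(n, a, x, y) directly, so the feature test costs nothing per iteration.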


