[petsc-dev] How are we deciding the memory alignment that BuildSystem should choose?
Barry Smith
bsmith at mcs.anl.gov
Wed Sep 6 18:33:23 CDT 2017
Is there a "tuning" part of BuildSystem that could be added?
We would pick out the settings that benefit from tuning and have configure select a good value for each. For example, with memalign, could it run some "benchmark" with the various sizes and select the smallest size that works best?
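To make "the smallest size that works best" concrete, here is a minimal sketch of just the selection step. It assumes the benchmark has already produced one timing per candidate alignment; pick_alignment and its tolerance parameter are hypothetical names, not anything in BuildSystem.

```python
def pick_alignment(timings, tolerance=0.02):
    """Pick the smallest alignment whose time is within `tolerance` of the best.

    timings: dict mapping alignment in bytes -> measured benchmark time in
    seconds (lower is better). Both the function and the 2% default tolerance
    are hypothetical choices made for this sketch.
    """
    if not timings:
        raise ValueError("no benchmark timings supplied")
    best = min(timings.values())
    # Every alignment that is "as good as the best" up to the tolerance.
    good_enough = [a for a, t in timings.items() if t <= best * (1.0 + tolerance)]
    # Prefer the smallest one so small allocations waste less memory.
    return min(good_enough)
```

The tolerance is there so that benchmark noise does not push the choice toward a larger alignment than is actually needed.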
If this is just way too much thought and coding effort to be worth it, which I think it is, we can use the "predefined based on architecture" approach, where configure looks up a value in a table keyed on the arch/sub-arch etc. (and of course the value is verified to work correctly). This approach may not always produce the optimal value, but it is simple and will usually select a pretty good one. It only requires the discipline of adding new archs over time (which often will not happen), but at least for some archs we may have good values.
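A rough sketch of the table approach, assuming a lookup keyed on (arch, sub-arch). The table name, the keys, and the function names are all hypothetical; the values are simply the ones mentioned in this thread, not tuned results, and the validity check here is only a stand-in for the real "verify it works" step that configure would run.

```python
# Hypothetical lookup table; entries are illustrative values from this thread.
MEMALIGN_TABLE = {
    ("x86_64", "knl"): 64,   # matches the cache line width on the Cori KNL nodes
    ("x86_64", "avx"): 32,   # AVX vector width
    ("x86_64", "sse2"): 16,  # SSE2 vector width
}

# What configure chooses today when --with-memalign is not given.
DEFAULT_MEMALIGN = 16


def is_valid_alignment(value):
    """Return True if value is a positive power of two."""
    return value > 0 and (value & (value - 1)) == 0


def select_memalign(arch, subarch=None):
    """Look up an alignment for (arch, subarch), falling back to the default."""
    value = MEMALIGN_TABLE.get((arch, subarch), DEFAULT_MEMALIGN)
    if not is_valid_alignment(value):
        raise ValueError("alignment %r is not a power of two" % (value,))
    return value
```

An arch that nobody has added to the table just gets the current default, so a missing entry is never worse than what we do now.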
Barry
> On Sep 5, 2017, at 12:55 PM, Jed Brown <jed at jedbrown.org> wrote:
>
> Richard Tran Mills <rtmills at anl.gov> writes:
>
>> Folks,
>>
>> I am wondering how PETSc's BuildSystem currently chooses the memory
>> alignment to use. In the example file I provide in the repo for the Cori
>> KNL nodes, I specify '--with-memalign=64' to match the cache line width. If
>> I don't do this, then configure chooses a 16 byte alignment. Do people
>> think that we should try to make a better effort to choose a more
>> appropriate alignment?
>>
>> I believe that all modern x86 CPUs use a 64 byte cache width, so should we
>> be defaulting to that? I don't know how much this matters on a Xeon
>> processor, but it can be important on Xeon Phi.
>
> The preference for 16 was for SSE2, prior to AVX and AVX512. With AVX,
> I believe there is no particular reason to prefer more than 32-byte
> alignment -- it just ensures that objects cannot share cache lines.
> Harmless with large arrays, but not ideal for smaller allocations like
> nodes in a linked list. (PETSc doesn't use these very often, but it
> would be nice if control structures don't soak up more cache than they
> need.)
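
To illustrate the point about small allocations: a minimal sketch of the arithmetic, assuming every allocation starts on an alignment boundary and nothing else is placed in the leftover space. The 24-byte node size is a made-up example, and padded_size is a hypothetical helper.

```python
def padded_size(nbytes, alignment):
    """Round nbytes up to the next multiple of alignment."""
    return -(-nbytes // alignment) * alignment


# Footprint of a hypothetical 24-byte linked-list node at each alignment.
for alignment in (16, 32, 64):
    print(alignment, padded_size(24, alignment))
```

Under those assumptions the node occupies 32 bytes with 16- or 32-byte alignment and a full 64 bytes with 64-byte alignment, while a large array pays the padding only once.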