-malign-double
Barry Smith
bsmith at mcs.anl.gov
Sun Nov 15 18:25:32 CST 2009
Agreed, this would be a good option to have. The question is how to
do it without a morass of nasty nested #ifdefs. Note that
portably getting alignment out of malloc() alone is already ugly, and
the code is not as simple as I would like.
To simplify things, we could always require 16-byte alignment
everywhere. But is that desirable?
Barry
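
For illustration, here is a minimal sketch of one common way to get an
aligned block out of plain malloc(): over-allocate, round the address up,
and stash the original pointer just below the aligned block. The names
aligned_malloc_sketch and aligned_free_sketch are hypothetical and are not
PETSc functions; the sketch assumes the alignment is a power of two and
that sizeof(void *) is as well.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not part of PETSc): return at least `size` bytes
   whose address is a multiple of `align`, which must be a power of two.
   The pointer obtained from malloc() is stored immediately below the
   aligned block so that aligned_free_sketch() can recover it. */
static void *aligned_malloc_sketch(size_t size, size_t align)
{
  void     *raw;
  uintptr_t addr;

  assert(align != 0 && (align & (align - 1)) == 0);
  if (align < sizeof(void *)) align = sizeof(void *);
  if (size > SIZE_MAX - align - sizeof(void *)) return NULL; /* size overflow */

  raw = malloc(size + align + sizeof(void *));
  if (!raw) return NULL;

  /* Leave room for the stashed pointer, then round up to a multiple of align. */
  addr = ((uintptr_t)raw + sizeof(void *) + align - 1) & ~(uintptr_t)(align - 1);
  ((void **)addr)[-1] = raw;
  return (void *)addr;
}

/* Release a block obtained from aligned_malloc_sketch(). */
static void aligned_free_sketch(void *ptr)
{
  if (ptr) free(((void **)ptr)[-1]);
}
```

The cost is at most align + sizeof(void *) extra bytes per allocation and
no preprocessor conditionals, at the price of requiring a matching free
routine.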
On Nov 14, 2009, at 12:43 PM, Jed Brown wrote:
> Barry Smith wrote:
>>
>> On Nov 14, 2009, at 11:28 AM, Jed Brown wrote:
>>
>>> Barry Smith wrote:
>>>>
>>>>
>>>> Good point, I have removed it.
>>>>
>>>> I put it in because I wanted an easy way to test that PETSc double
>>>> arrays are always 8-byte aligned (and the unaligned struct values
>>>> were giving me lots of false alarms).
>>>
>>> A related issue is PetscMallocN, which (in optimized mode) gives
>>> unaligned arrays even if malloc always returns aligned memory.
>>> Consider
>>>
>>> PetscMalloc2(3,PetscInt,&ai,3,PetscScalar,&a);
>>
>> This call is illegal. You are required to pass the arrays in order
>> from the longest alignment to the shortest. Hence the PetscScalar
>> should come before the PetscInt.
>>
>> The debug mode should check that this requirement is satisfied; it
>> currently does not check anything.
>
> If nothing ever checks and the docs don't specify, it's valid.
>
>> In the real world, where any of these beasts can be 32-bit or 64-bit,
>> one cannot always put them in the right order; but for now they
>> should be ordered PetscScalar, pointer, PetscInt.
>
> What advantage does this have over just aligning all the pointers
> (even to 16 bytes, to enable movapd and prevent many loads across a
> cacheline split; see http://x264dev.multimedia.cx/?p=8)? The arithmetic
> and the few wasted bytes are trivial compared to the cost of the
> allocation, so I don't see a reason not to.
>
> Jed
>
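
The suggestion above, aligning every pointer inside a packed allocation
rather than relying on argument order, might look roughly like the
following sketch. It is not the PETSc macro: malloc2_aligned_sketch,
round_up, and SKETCH_ALIGN are hypothetical names, the 16-byte target is
an assumption taken from the discussion, and overflow checks on the size
arithmetic are omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_ALIGN 16 /* assumed alignment target; must be a power of two */

/* Round n up to the next multiple of SKETCH_ALIGN. */
static size_t round_up(size_t n)
{
  return (n + SKETCH_ALIGN - 1) & ~(size_t)(SKETCH_ALIGN - 1);
}

/* Hypothetical two-array packed allocation: both arrays live in a single
   malloc() block and each one starts on a SKETCH_ALIGN boundary, whatever
   order the element types are given in. free(*base) releases both arrays.
   Returns 0 on success and 1 if malloc() fails. */
static int malloc2_aligned_sketch(size_t n1, size_t s1, void **p1,
                                  size_t n2, size_t s2, void **p2,
                                  void **base)
{
  size_t    len1 = round_up(n1 * s1); /* pad array 1 so array 2 stays aligned */
  uintptr_t start;

  assert(p1 && p2 && base);
  *base = malloc(len1 + n2 * s2 + SKETCH_ALIGN - 1);
  if (!*base) return 1;

  /* Round the block start up to a SKETCH_ALIGN boundary. */
  start = ((uintptr_t)*base + SKETCH_ALIGN - 1) & ~(uintptr_t)(SKETCH_ALIGN - 1);
  *p1 = (void *)start;
  *p2 = (void *)(start + len1);
  return 0;
}
```

With three 4-byte integers followed by three doubles, as in the
PetscMalloc2 example quoted earlier, the unpadded layout would start the
doubles at byte offset 12; here the integers are padded to 16 bytes, so
the doubles begin on a 16-byte boundary. The overhead is at most
2*SKETCH_ALIGN - 2 bytes per call.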