-malign-double

Barry Smith bsmith at mcs.anl.gov
Sun Nov 15 18:25:32 CST 2009


    Agreed, this would be a good option to have. The question is how to
do it without a morass of nasty nested #ifdefs.  Note that portably
getting alignment out of malloc() alone is already ugly, and the code
is not as simple as I would like.
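
For reference, a minimal sketch of the usual portable trick for getting
aligned memory from plain malloc(): over-allocate, round the address up,
and stash the raw pointer just below the returned block. The names
aligned_malloc_sketch and aligned_free_sketch are hypothetical and are
not PETSc functions; the sketch assumes `align` is a power of two and at
least sizeof(void *).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not part of PETSc): return `size` bytes aligned to
   `align`, which is assumed to be a power of two >= sizeof(void *). */
static void *aligned_malloc_sketch(size_t size, size_t align)
{
  assert(align >= sizeof(void *) && (align & (align - 1)) == 0);

  /* Room for the payload, worst-case padding, and the saved raw pointer. */
  void *raw = malloc(size + align - 1 + sizeof(void *));
  if (!raw) return NULL;

  /* Leave space for the saved pointer, then round up to the alignment. */
  uintptr_t start   = (uintptr_t)raw + sizeof(void *);
  uintptr_t aligned = (start + align - 1) & ~(uintptr_t)(align - 1);

  /* Stash the pointer malloc() returned so it can be freed later. */
  ((void **)aligned)[-1] = raw;
  return (void *)aligned;
}

/* Release a block obtained from aligned_malloc_sketch(). */
static void aligned_free_sketch(void *p)
{
  if (p) free(((void **)p)[-1]);
}
```

The cost is up to align - 1 + sizeof(void *) extra bytes per allocation,
and every such block has to be released with the matching free routine.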

     To simplify things we could always require 16-byte alignment
everywhere. But is that desirable?

    Barry

On Nov 14, 2009, at 12:43 PM, Jed Brown wrote:

> Barry Smith wrote:
>>
>> On Nov 14, 2009, at 11:28 AM, Jed Brown wrote:
>>
>>> Barry Smith wrote:
>>>>
>>>>
>>>>  Good point, I have removed it.
>>>>
>>>>  I put it in because I wanted an easy way to test that PETSc double
>>>> arrays are always 8-byte aligned (and the unaligned struct values were
>>>> giving me lots of false alarms).
>>>
>>> A related issue is PetscMallocN, which (in optimized mode) gives
>>> unaligned arrays even if malloc always returns aligned memory. Consider
>>>
>>> PetscMalloc2(3,PetscInt,&ai,3,PetscScalar,&a);
>>
>>   This call is illegal. You are required to pass the arrays in order
>> of decreasing alignment requirement, from the largest to the smallest.
>> Hence the PetscScalar should be before the PetscInt.
>>
>>   The debug mode should check that this requirement is satisfied; it
>> currently does not check anything.
>
> If nothing ever checks and the docs don't specify, it's valid.
>
>>   In the real world, when any of these beasts can be 32 bit or 64 bit,
>> one cannot always put them in the right order; but for now they should
>> be ordered PetscScalar, pointer, PetscInt.
>
> What advantage does this have over just aligning all the pointers (even
> to 16 bytes, to enable movapd and prevent many loads across a cacheline
> split, see http://x264dev.multimedia.cx/?p=8)?  The arithmetic and the
> few wasted bytes are trivial compared to the cost of the allocation, so
> I don't see a reason not to.
>
> Jed
>
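
The alternative raised in the quoted message, aligning every pointer
inside a chunked allocation, can be sketched roughly as follows. This is
not the PetscMalloc2 implementation: malloc2_sketch, round_up_sketch and
SKETCH_ALIGN are hypothetical names, and int and double stand in for
PetscInt and PetscScalar. With 4-byte ints, three of them occupy 12
bytes, so a double array packed directly behind them would start at
offset 12; rounding the offset up to 16 avoids that regardless of the
argument order.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_ALIGN 16 /* assumed alignment, as discussed in the thread */

/* Round n up to the next multiple of SKETCH_ALIGN (a power of two). */
static size_t round_up_sketch(size_t n)
{
  return (n + SKETCH_ALIGN - 1) & ~(size_t)(SKETCH_ALIGN - 1);
}

/* Hypothetical two-array allocation from a single malloc() call.
   Both *ai and *a come back 16-byte aligned; *raw is what the caller
   must later pass to free(). Returns 0 on success, 1 on failure. */
static int malloc2_sketch(size_t n1, int **ai, size_t n2, double **a, void **raw)
{
  /* Offset of the second array, padded so it starts on a 16-byte boundary. */
  size_t off2 = round_up_sketch(n1 * sizeof(int));

  /* Extra SKETCH_ALIGN - 1 bytes so the base itself can be rounded up. */
  *raw = malloc(SKETCH_ALIGN - 1 + off2 + n2 * sizeof(double));
  if (!*raw) return 1;

  uintptr_t base = ((uintptr_t)*raw + SKETCH_ALIGN - 1) & ~(uintptr_t)(SKETCH_ALIGN - 1);
  *ai = (int *)base;
  *a  = (double *)(base + off2);
  return 0;
}
```

The padding costs at most 15 bytes per array plus 15 bytes for the base,
which is the "few wasted bytes" trade-off mentioned above.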



