[petsc-users] AOSOA configuration using DMDA
Jed Brown
jedbrown at mcs.anl.gov
Sat Nov 23 15:48:26 CST 2013
Mani Chandra <mc0710 at gmail.com> writes:
> Hi,
>
> Is it possible to use an Arrays of Structs of Arrays (AOSOA) configuration
> using DMDAs? Something like
>
> struct node {
> float var1[16], var2[16], var3[16];
> };
Yes, you can manually manage this dimension/chunking, and use
DMDASetBlockFills() so that the resulting matrix retains proper
sparsity. Neighbor exchange will not automatically understand the
blocks, so you would have to use a different ghost-point (fringe) layout
if you want to organize data as AoSoA.
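
A minimal sketch of what managing the chunking by hand could look like in plain C, outside of PETSc. The names NodeChunk, chunk_of, lane_of, and get_var1 are hypothetical and chosen only for illustration; the chunk width of 16 comes from the struct in the question. It assumes the chunked dimension is x and that the row length is a multiple of 16.

```c
#include <assert.h>

#define CHUNK 16 /* lanes per chunk, matching var1[16] in the question */

/* One AoSoA node: CHUNK consecutive grid points in x, stored field by field. */
typedef struct {
  float var1[CHUNK], var2[CHUNK], var3[CHUNK];
} NodeChunk;

/* Map a logical x index to the chunk that holds it. */
static int chunk_of(int i) {
  assert(i >= 0);
  return i / CHUNK;
}

/* Map a logical x index to its position (lane) inside that chunk. */
static int lane_of(int i) {
  assert(i >= 0);
  return i % CHUNK;
}

/* Read var1 at logical point i from a row stored as an array of chunks. */
static float get_var1(const NodeChunk *row, int i) {
  return row[chunk_of(i)].var1[lane_of(i)];
}
```

With this mapping, logical point 37 lives in chunk 2, lane 5, and a row of N points is stored as N/16 chunks. A stencil that reads point i-1 or i+1 crosses a chunk boundary whenever the lane is 0 or 15, which is the case the stock neighbor exchange does not know about.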
> Instead of
>
> struct node {
> float var1, var2, var3;
> };
>
> as is the usual way of using DMDAs.
>
> The global grid size of say a 2D grid would then decrease from NxN to (N/16)xN
>
> I'm interested in doing this for ease of vectorization as described in
> http://software.intel.com/en-us/articles/memory-layout-transformations
Note that sparse iterative methods are overwhelmingly limited by memory
bandwidth rather than vectorization, so vectorization alone is unlikely
to give you much speedup here.
Heavy optimization of stencil operations requires either unaligned loads
or a "roll" operation, at which point the benefit over register
transposition fades. So instead of trying to change the global memory
layout, I recommend packing aligned representations at whichever
granularity makes sense (in registers, in L1-cache tiles, etc). Make
sure to benchmark the real memory access patterns before leaping to
conclusions about optimal memory layout.
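
A rough sketch of the "pack at a small granularity" idea in plain C: the global array keeps the usual interleaved layout, and a small struct-of-arrays scratch tile is filled, processed, and written back. All names here (Node, Tile, pack_tile, unpack_tile, kernel, apply) are hypothetical, the tile width of 16 is taken from the question, and the pointwise kernel is a stand-in for real work, not a stencil. Whether this pays off depends on the actual access pattern, so treat it as something to benchmark rather than a recommendation.

```c
#include <assert.h>
#include <stddef.h>

#define TILE 16 /* scratch tile width; an assumption taken from the question */

/* Usual interleaved (AoS) node, as in a DMDA with 3 dof. */
typedef struct { float var1, var2, var3; } Node;

/* Small SoA scratch tile, intended to stay in L1 cache or registers. */
typedef struct { float var1[TILE], var2[TILE], var3[TILE]; } Tile;

/* Gather TILE consecutive AoS nodes into the SoA scratch tile. */
static void pack_tile(const Node *src, Tile *t) {
  for (int l = 0; l < TILE; l++) {
    t->var1[l] = src[l].var1;
    t->var2[l] = src[l].var2;
    t->var3[l] = src[l].var3;
  }
}

/* Scatter the SoA scratch tile back into TILE consecutive AoS nodes. */
static void unpack_tile(const Tile *t, Node *dst) {
  for (int l = 0; l < TILE; l++) {
    dst[l].var1 = t->var1[l];
    dst[l].var2 = t->var2[l];
    dst[l].var3 = t->var3[l];
  }
}

/* Placeholder pointwise kernel: unit-stride loop over each field. */
static void kernel(Tile *t) {
  for (int l = 0; l < TILE; l++) t->var3[l] = t->var1[l] + 2.0f * t->var2[l];
}

/* Process n nodes tile by tile; n must be a multiple of TILE. */
static void apply(Node *x, size_t n) {
  assert(n % TILE == 0);
  Tile t;
  for (size_t i = 0; i < n; i += TILE) {
    pack_tile(x + i, &t);
    kernel(&t);
    unpack_tile(&t, x + i);
  }
}
```

The global vector, the ghost exchange, and the matrix structure are untouched in this arrangement; only the inner loop sees the transposed data.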