[petsc-dev] making DA more light weight

Thu May 15 08:03:35 CDT 2014

Barry Smith <bsmith at mcs.anl.gov> writes:
>    Hmm, why does VecScatter size depend on dof? Since it handles bs >
>    1 it should not. Perhaps it is things like localtoglobal (not block
>    version) that is taking the memory. Issues of collective
>    construction come up if we make those produced on demand.

The ltogmap has one integer per dof, but the scatter uses more memory
and is responsible for peak usage.

From the G+ thread:

  A nice tool for this is massif/massif_visualizer

  http://59A2.org/files/dmda-memory.png

  The memory is in two scatters: L2G defines the non-overlapping space
  and G2L defines the overlapping space.  These could be built lazily,
  but they have to be built collectively.  The memory spikes come from
  getting indices for the blocked spaces.  This could be optimized in
  VecScatter at the expense of slightly more special-case code.  Anyway,
  this is only relevant if you are not using matrices or Krylov, so
  there hasn't been much demand in the past.  We can optimize further if
  it is important.

We could make an ISLocalToGlobalMapping that stores only the blocks
while translating scalar indices.  It would require an integer division,
but you can do a lot of divisions for the cost of a cache miss, so I
would expect reasonable performance.  In any case, once you have the
block version, the scalar version could be created non-collectively.
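
To make the idea concrete, here is a rough sketch in plain C of a
mapping that stores one integer per block and still answers scalar
queries.  The struct and function names (BlockL2G, BlockL2GApply) are
made up for illustration and are not part of PETSc; the real
ISLocalToGlobalMapping would differ in types and error handling.

```c
#include <assert.h>

/* Hypothetical structure, NOT PETSc's ISLocalToGlobalMapping:
 * it stores one global block index per local block, so the storage
 * is nblocks integers instead of nblocks*bs. */
typedef struct {
  int        bs;      /* block size (dof per node) */
  int        nblocks; /* number of local blocks */
  const int *blocks;  /* blocks[i] = global block index of local block i */
} BlockL2G;

/* Translate a local scalar index to a global scalar index.
 * Costs one integer division and one remainder per lookup. */
static int BlockL2GApply(const BlockL2G *map, int local)
{
  assert(local >= 0 && local < map->bs * map->nblocks);
  int block = local / map->bs; /* which local block */
  int comp  = local % map->bs; /* component within that block */
  return map->blocks[block] * map->bs + comp;
}
```

With bs = 3 and blocks = {4, 7, 2}, local scalar index 4 lies in local
block 1 at component 1, so it maps to 7*3 + 1 = 22.  The table needs 3
integers rather than 9.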

But the ISGetIndices() and other allocations in VecScatterCreate() are
the real killers, responsible for most of the memory usage and the peak
usage in particular.
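
For a sense of where the extra memory goes, the following plain-C
sketch shows what expanding block indices into scalar indices costs.
It only illustrates the arithmetic and is not the actual code in
ISGetIndices() or VecScatterCreate(); the helper names are
hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Expand nblocks block indices into nblocks*bs scalar indices.
 * The caller frees the result.  Returns NULL if allocation fails. */
static int *ExpandBlockIndices(int nblocks, int bs, const int *blocks)
{
  assert(nblocks >= 0 && bs > 0);
  int *idx = malloc((size_t)nblocks * (size_t)bs * sizeof(*idx));
  if (!idx) return NULL;
  for (int i = 0; i < nblocks; i++)
    for (int c = 0; c < bs; c++)
      idx[i * bs + c] = blocks[i] * bs + c; /* scalar index of component c */
  return idx;
}

/* Bytes needed to hold only the block indices. */
static size_t BlockBytes(int nblocks)
{
  return (size_t)nblocks * sizeof(int);
}

/* Bytes needed once every scalar index is materialized. */
static size_t ExpandedBytes(int nblocks, int bs)
{
  return (size_t)nblocks * (size_t)bs * sizeof(int);
}
```

The expanded array is bs times the size of the block array, and it
exists only temporarily, which fits the pattern of a spike during
setup rather than steady-state usage.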