[petsc-dev] Adding support memkind allocators in PETSc

Richard Mills rtm at utk.edu
Wed Jun 3 20:37:16 CDT 2015

On Wed, Jun 3, 2015 at 6:04 PM, Jed Brown <jed at jedbrown.org> wrote:

> Richard Mills <rtm at utk.edu> writes:
> > It's been a while, but I'd like to pick up this discussion of adding a
> > context to memory allocations again.
> Have you heard anything back about whether move_pages() will work?

move_pages() will work to move pages between MCDRAM and DRAM right now, but
it screws up memkind's partitioning of the heap (memkind won't be aware that
the pages have been moved).  (Which calls to mind the question I raised
earlier in this thread of whether we even need a heap manager for the large
allocations.)

> > hbw_preferred_str = (char *)memkind_malloc(MEMKIND_HBW_PREFERRED, size);
> How much would you prefer it?  If we stupidly ask for HBM in VecCreate_*
> and MatCreate_*, then our users will see catastrophic performance drops
> at magic sizes and will have support questions like "I swapped these two
> independent lines and my code ran 5x faster".  Then they'll hack the
> source by writing
>   if (moon_is_waxing() && operator_holding_tongue_in_right_cheek()) {
>   }
> eventually making all decisions based on nonlocal information, ignoring
> the advice parameter.
> Then they'll get smart and register their own malloc so they don't have
> to hack the library.  Then they'll try to couple their application with
> another that does the same thing and now they have to write a new malloc
> that makes a new set of decisions in light of the fact that multiple
> libraries are being coupled.
> I think we can agree that this is madness.  Where do you draw the line
> and say that crappy performance is just reality?
> It's hard for me not to feel like the proposed system will be such a
> nightmarish maintenance burden with such little benefit over a simple
> size-based allocation that it would be better for everyone if it doesn't
> exist.

Jed, I'm with you in thinking that, ultimately, there needs to be a way to
make these kinds of decisions based on global information.  We don't have
that right now.  But even if we get some smart allocator (and migrator) that
gives us, say, malloc_use_oracle() to always make the right decision, we
should still have something like a PetscAdvMalloc() that provides a context
through which we can pass advice to that smart allocator: hints about how
the memory will be accessed, and so on.

I know you don't like the memkind model, and I'm not thrilled with it
either (though it's what I've got to work with right now), but the
interface changes I'm proposing are applicable to other approaches.

> For example, we've already established that small allocations should
> generally go in DRAM because they're either cached or not prefetched and
> thus limited by latency instead of bandwidth.  Large allocations that
> get used a lot should go in HBM so long as they fit.  Since we can't
> determine "used a lot" or "fit" from any information possibly available
> in the calling scope, there's literally no useful advice we can provide
> at that point.  So don't try, just set a dumb threshold (crude tuning
> parameter) or implement a profile-guided allocation policy (brittle).

In a lot of cases, simple size-based allocation is probably the way to go.
An option to do automatic size-based placement is even in the latest
memkind sources on GitHub now, but it applies to the entire application.
I'd like to be able to restrict this to only the PETSc portion: a code that
uses PETSc may also need to allocate some enormous lookup tables whose
accesses are latency- rather than bandwidth-sensitive.  Or, to be specific
to a code I actually know, I believe that in PFLOTRAN there are some pretty
large allocations required for auxiliary variables that don't need to go in
high-bandwidth memory, even though we will want all of the large PETSc
objects to go there.

> Or ignore all this nonsense, implement move_pages(), and we'll have PETSc
> track accesses so we can balance the pages once the app gets going.
> > Of course, we'll need some way to ensure that the "advanced malloc"
> I thought AdvMalloc was short for AdvisedMalloc.

Oh, hey, I do like "Advised" better.
