[petsc-dev] Adding support for memkind allocators in PETSc

Richard Mills rtm at utk.edu
Wed Jun 3 21:11:14 CDT 2015

On Wed, Jun 3, 2015 at 6:54 PM, Jed Brown <jed at jedbrown.org> wrote:

> Richard Mills <rtm at utk.edu> writes:
> > On Wed, Jun 3, 2015 at 6:04 PM, Jed Brown <jed at jedbrown.org> wrote:
> >
> >> Have you heard anything back about whether move_pages() will work?
> >>
> >
> > move_pages() will work to move pages between MCDRAM and DRAM right
> > now,
> Great!
> > but it screws up memkind's partitioning of the heap (it won't be aware
> > that the pages have been moved).
> Then memkind is stupid or the kernel isn't exposing the correct
> information to memkind.  Tell them to not be lazy and do it right.

I believe that it really comes down to a problem with what the Linux kernel
allows right now.  To do this "right" we need to hack the kernel.  Memkind
is working within the constraints of what the kernel currently does.

> > Jed, I'm with you in thinking that, ultimately, there actually needs to
> > be a way to make these kinds of decisions based on global information.  We
> > don't have that right now.  But if we get some smart allocator (and
> > migrator) that gives us, say malloc_use_oracle() to always make the good
> > decision,
> The oracle has to see into the future.  move_pages() is so much more
> powerful.
> > we still should have something like a PetscAdvMalloc() that provides a
> > context to allow us to pass advice to this smart allocator to provide
> > hints about how it will be accessed, whatever.
> What does the caller know?  What good is the context if we always pass
> > In a lot of cases, simple size-based allocation is probably the way to
> > go.  An option to do automatic size-based placement is even in the latest
> > memkind sources on github now, but it will do that for the entire
> > application.
> That's crude; I'd rather have each library use its own threshold.
> > I'd like to be able to restrict this to only the PETSc portion: Maybe
> > a code that uses PETSc also needs to allocate some enormous lookup
> > tables that are big but have accesses that are really latency- rather
> > than bandwidth-sensitive.  Or, to be specific to a code I actually
> > know, I believe that in PFLOTRAN there are some pretty large
> > allocations required for auxiliary variables that don't need to go in
> > high-bandwidth memory, though we will want all of the large PETSc
> > objects to go in there.
> Fine.  That involves a couple lines of code.  Go into PetscMallocAlign
> and add the ability to use memkind.  Add a run-time option to control
> the threshold.  Done.

Hmm.  That's a simpler solution that may be better.  I'm not sure that it
will always be the best thing to do, but in cases where it is appropriate,
that simple option sounds like something we should support.
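
To make sure I understand the proposal, here is a rough sketch of a
size-threshold aligned allocator.  This is not PETSc code: every Sketch*
name and the sketch_hbw_threshold variable are hypothetical, and the
threshold would really be set from a run-time option.  The memkind calls
are only compiled when HAVE_MEMKIND is defined (an assumed build flag);
otherwise everything falls through to posix_memalign().

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#ifdef HAVE_MEMKIND /* assumed build flag; define it when linking against memkind */
#include <memkind.h>
#endif

/* Hypothetical names for illustration only; none of these exist in PETSc. */
typedef enum { SKETCH_MEM_DEFAULT, SKETCH_MEM_HBW } SketchMemPlace;

/* Size threshold in bytes; a real version would read it from a run-time option. */
size_t sketch_hbw_threshold = (size_t)1 << 20;

/* Placement decision: requests at or above the threshold are sent to HBW. */
SketchMemPlace SketchChoosePlace(size_t size)
{
  return (size >= sketch_hbw_threshold) ? SKETCH_MEM_HBW : SKETCH_MEM_DEFAULT;
}

/* Aligned allocation.  Returns 0 on success, nonzero on failure.
   *place records the decision so the matching free can use the same kind. */
int SketchMallocAlign(size_t size, size_t align, void **ptr, SketchMemPlace *place)
{
  /* posix_memalign requires a power-of-two multiple of sizeof(void *). */
  assert(align >= sizeof(void *) && (align & (align - 1)) == 0);
  *place = SketchChoosePlace(size);
#ifdef HAVE_MEMKIND
  if (*place == SKETCH_MEM_HBW)
    return memkind_posix_memalign(MEMKIND_HBW_PREFERRED, ptr, align, size);
#endif
  return posix_memalign(ptr, align, size);
}

/* Free memory obtained from SketchMallocAlign. */
void SketchFree(void *ptr, SketchMemPlace place)
{
#ifdef HAVE_MEMKIND
  if (place == SKETCH_MEM_HBW) { memkind_free(MEMKIND_HBW_PREFERRED, ptr); return; }
#endif
  (void)place;
  free(ptr);
}
```

The caller has to carry the placement around until the free, which is the
one piece of bookkeeping this adds over a plain aligned malloc.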

I assume you'd also like an option to specify that the allocation should
fail if high bandwidth memory cannot be allocated, so that a silent
fallback to DRAM doesn't produce very confusing performance.
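
Roughly, I am picturing a "strict" versus "preferred" switch along the
lines of the sketch below.  Again, the Sketch* names are hypothetical and
HAVE_MEMKIND is an assumed build flag.  As I understand memkind,
MEMKIND_HBW returns NULL when high bandwidth memory cannot satisfy the
request, so the fallback decision can live in our own code.  Without
memkind the sketch behaves as though no high bandwidth memory is present.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#ifdef HAVE_MEMKIND /* assumed build flag; define it when linking against memkind */
#include <memkind.h>
#endif

/* Hypothetical policy switch; a real version would be a run-time option. */
typedef enum { SKETCH_HBW_PREFERRED, SKETCH_HBW_STRICT } SketchHbwPolicy;

/* Try high bandwidth memory only.  Built without memkind, this acts as
   though the machine has no HBM and always returns NULL. */
void *SketchTryHbw(size_t size)
{
#ifdef HAVE_MEMKIND
  return memkind_malloc(MEMKIND_HBW, size);
#else
  (void)size;
  return NULL;
#endif
}

/* Returns 0 on success or ENOMEM on failure.  *in_hbw reports whether the
   memory actually landed in HBM, so the caller can free it correctly. */
int SketchHbwMalloc(size_t size, SketchHbwPolicy policy, void **ptr, int *in_hbw)
{
  assert(ptr != NULL && in_hbw != NULL);
  *ptr    = SketchTryHbw(size);
  *in_hbw = (*ptr != NULL);
  if (*ptr) return 0;
  if (policy == SKETCH_HBW_STRICT) return ENOMEM; /* fail loudly, no fallback */
  *ptr = malloc(size);                            /* preferred: fall back to DRAM */
  return *ptr ? 0 : ENOMEM;
}

/* Free memory obtained from SketchHbwMalloc. */
void SketchHbwFree(void *ptr, int in_hbw)
{
#ifdef HAVE_MEMKIND
  if (in_hbw) { memkind_free(MEMKIND_HBW, ptr); return; }
#endif
  (void)in_hbw;
  free(ptr);
}
```

With the strict policy a run on a node with exhausted or missing high
bandwidth memory stops with an error instead of quietly running slower.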

> If you want complexity to bleed into the library (and necessarily into
> user code if given any power at all), I think you need to demonstrate a
> tangible benefit that cannot be obtained by something simpler.  Consider
> the simple and dumb threshold above to be the null hypothesis.
> This is just my opinion.  Feel free to make a branch with whatever you
> prefer.