<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Tue, Apr 28, 2015 at 10:47 PM, Jed Brown <span dir="ltr"><<a href="mailto:jed@jedbrown.org" target="_blank">jed@jedbrown.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">Richard Mills <<a href="mailto:rtm@utk.edu">rtm@utk.edu</a>> writes:<br></span><br>

<span class="">[...]<br>

> I think many users are going to want more control than what something like<br>

> AutoHBW provides, but, as you say, a lot of the time one will only care<br>

> about the substantial allocations for things like matrices and vectors,<br>

> and these also tend to be long lived--plenty of codes will do something<br>

> like allocate a matrix for Jacobians once and keep it around for the<br>

> lifetime of the run.  Maybe we should consider not using a heap manager for<br>

> these allocations, then.  For allocations above some specified threshold,<br>

> perhaps we (PETSc) should simply do the appropriate mmap() and mbind()<br>

> calls to allocate the pages we need in the desired type of memory, and then<br>

> we could use things like move_pages() if/when appropriate (yes, I know<br>

> we don't yet have a good way to make such decisions).  This would mean<br>

> PETSc getting more into the lower level details of memory management, but<br>

> maybe this is appropriate (and unavoidable) as more kinds of<br>

> user-addressable memory get introduced.  I think this is actually less horrible<br>

> than it sounds, because, really, we would just want to do this for the<br>

> largest allocations.  (And this is somewhat analogous to how many malloc()<br>

> implementations work, anyway: Use sbrk() for the small stuff, and mmap()<br>

> for the big stuff.)<br>

<br>

</span>I say just use malloc (or posix_memalign) for everything.  PETSc can't<br>

do a better job of the fancy stuff and these normal functions are<br>

perfectly sufficient.<br>

<span class=""><br>

>> That is a regression relative to move_pages.  Just make move_pages work.<br>

>> That's the granularity I've been asking for all along.<br>

><br>

> Cannot practically be done using a heap manager system like memkind.  But<br>

> we can do this if we do our own mmap() calls, as discussed above.<br>

<br>

</span>In practice, we would still use malloc(), but set mallopt<br>

M_MMAP_THRESHOLD if needed and call move_pages.  The reality is that<br>

with 4 KiB pages, it doesn't even matter if your "large" allocation is<br>

not page aligned.  The first and last page don't matter--they're small<br>

enough to be inexpensive to re-fetch from DRAM and don't use up that<br>

much extra space if you map them into MCDRAM.<br>

</blockquote></div><br></div><div class="gmail_extra">Hmm.  That may be a pretty good solution for DRAM vs. MCDRAM.  What about when we further complicate things by adding some large pool of NVRAM?  One might want some sufficiently large arrays to go into MCDRAM, but other large arrays to go to NVRAM or DRAM.  I guess we can still do the appropriate move_pages() to get things into the right places, but I can also see wanting to do things like use a much larger page size for giant data sets going into NVRAM (which you won't be able to do without a copy to a different mapped region).  And if there are these and other complications... then maybe we should be using a heap manager like memkind.  It would simplify quite a few things EXCEPT we'd have to deal with the virtual address changing when we want to change the "kind" of memory.  But maybe this would not be so bad, using an approach like Karli outlined.</div><div class="gmail_extra"><br></div><div class="gmail_extra">--Richard</div></div>
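<div class="gmail_extra"><br></div><div class="gmail_extra">For concreteness, here is a minimal sketch of the malloc()-plus-move_pages() approach discussed above, assuming Linux with glibc. The helper names (use_mmap_above, pages_spanned, migrate_buffer) are made up for illustration and are not PETSc functions. The sketch calls the raw move_pages system call so that it does not need libnuma, and it moves the partial first and last pages along with the rest, in line with the point that they are too small to matter.</div>

```c
#define _GNU_SOURCE
#include <assert.h>
#include <malloc.h>      /* mallopt, M_MMAP_THRESHOLD (glibc) */
#include <stdint.h>
#include <stdlib.h>
#include <sys/syscall.h> /* SYS_move_pages */
#include <unistd.h>      /* syscall, sysconf */

/* Normally provided by <numaif.h> (libnuma); defined here so the sketch
 * needs no extra library. The value matches the Linux UAPI headers. */
#ifndef MPOL_MF_MOVE
#define MPOL_MF_MOVE (1 << 1)
#endif

/* Hypothetical helper: ask glibc to serve requests of at least `bytes`
 * with mmap(), so large arrays get their own page-aligned mappings.
 * Returns 1 on success and 0 on failure, as mallopt() does. */
int use_mmap_above(size_t bytes)
{
    return mallopt(M_MMAP_THRESHOLD, (int)bytes);
}

/* Number of pages touched by the byte range [addr, addr + len).
 * `page` must be a power of two. */
size_t pages_spanned(uintptr_t addr, size_t len, size_t page)
{
    if (len == 0)
        return 0;
    uintptr_t first = addr & ~(uintptr_t)(page - 1);
    uintptr_t last = (addr + len - 1) & ~(uintptr_t)(page - 1);
    return (size_t)((last - first) / page) + 1;
}

/* Hypothetical helper: ask the kernel to move every page touched by
 * buf[0..len) to NUMA node `target_node`. Returns the number of pages
 * reported on that node afterwards, or -1 if the call failed (for
 * example, no permission or no NUMA support). */
long migrate_buffer(void *buf, size_t len, int target_node)
{
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0);
    size_t page = (size_t)ps;
    size_t n = pages_spanned((uintptr_t)buf, len, page);
    if (n == 0)
        return 0;

    void **pages = malloc(n * sizeof *pages);
    int *nodes = malloc(n * sizeof *nodes);
    int *status = malloc(n * sizeof *status);
    long moved = -1;
    if (pages && nodes && status) {
        uintptr_t first = (uintptr_t)buf & ~(uintptr_t)(page - 1);
        for (size_t i = 0; i < n; i++) {
            pages[i] = (void *)(first + i * page);
            nodes[i] = target_node;
            status[i] = -1;
        }
        /* pid 0 means "the calling process". */
        long rc = syscall(SYS_move_pages, 0, (unsigned long)n,
                          pages, nodes, status, MPOL_MF_MOVE);
        if (rc >= 0) {
            /* status[i] holds the node of page i, or a negative errno. */
            moved = 0;
            for (size_t i = 0; i < n; i++)
                if (status[i] == target_node)
                    moved++;
        }
    }
    free(pages);
    free(nodes);
    free(status);
    return moved;
}
```

<div class="gmail_extra">A caller might run use_mmap_above(1 << 20) once at startup, allocate and touch the array as usual, and then call migrate_buffer(ptr, len, node). Here node stands for whichever NUMA node the platform exposes for the faster memory, which is an assumption about the system and not something settled in this thread. Pages that have not been touched yet do not exist and will not be counted as moved.</div>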