<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Wed, Apr 29, 2015 at 2:39 AM, Karl Rupp <span dir="ltr"><<a href="mailto:rupp@iue.tuwien.ac.at" target="_blank">rupp@iue.tuwien.ac.at</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<br>
<br>
> (...)<span class=""><br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
If we want to move data from one memory kind to another, I believe that<br>
we need to be able to deal with the virtual address changing. Yes, this<br>
is a pain because extra bookkeeping is involved. Maybe we don't want to<br>
bother with supporting something like this in PETSc. But I don't know<br>
of any good way around this. I have discussed with Chris the idea of<br>
adding support for asynchronously copying pages between different kinds<br>
of memory (maybe have a memdup() analog to strdup()) and he had some<br>
ideas about how this might be done efficiently. But, again, I don't<br>
know of a good way to move data to a different memory kind while keeping<br>
the same virtual address. If I'm misunderstanding something about what<br>
is possible with Linux (or other *nix), please let me know--I'd really<br>
like to be wrong on this.<br>
</blockquote>
<br></span>
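(To make the quoted memdup() idea a bit more concrete, here is a minimal sketch -- untested, assuming the memkind library's hbw_malloc()/hbw_free() as the HBM allocator, and not an actual PETSc routine:)<br>
<pre>
#include &lt;string.h&gt;
#include &lt;hbwmalloc.h&gt;   /* memkind's hbw_malloc()/hbw_free() -- an assumption here */

/* Hypothetical memdup() analog: allocate a buffer in high-bandwidth memory
 * and copy the contents over.  The copy necessarily lives at a different
 * virtual address than the source, which is exactly the bookkeeping issue
 * discussed above. */
static void *memdup_hbw(const void *src, size_t n)
{
  void *dst = hbw_malloc(n);
  if (!dst) return NULL;        /* HBM exhausted: caller can fall back to DRAM */
  return memcpy(dst, src, n);
}
</pre>
<br>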
let me bring up another spin on this thought: we currently face related issues when managing memory on GPUs. The way we address this there is to keep a plain host buffer and a buffer allocated on the GPU; a separate flag keeps track of which buffer holds the most recent data (host, GPU, or both). What if we extend this system slightly so that it can also deal with HBM?<br>
<br>
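For illustration only, the bookkeeping I have in mind would look roughly like the sketch below (made-up names, not actual PETSc data structures; the GPU-to-host transfer path is omitted for brevity):<br>
<pre>
#include &lt;stddef.h&gt;

typedef enum {VALID_DRAM = 1, VALID_HBM = 2, VALID_GPU = 4} BufferValidFlag;

/* Illustrative only -- not the actual PETSc Vec layout. */
typedef struct {
  double   *dram;   /* plain host buffer, always available              */
  double   *hbm;    /* optional copy in high-bandwidth memory           */
  double   *gpu;    /* optional copy on the GPU                         */
  size_t    n;
  unsigned  valid;  /* OR of BufferValidFlag: which copies are current  */
} MultiKindArray;

/* A *GetArrayRead() analog simply hands out whichever up-to-date buffer
 * is preferred; a write variant would additionally clear the other bits. */
static const double *GetArrayRead(MultiKindArray *a)
{
  if (a->valid & VALID_HBM) return a->hbm;  /* prefer HBM when current */
  return a->dram;                           /* DRAM fallback           */
}
</pre>
<br>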
Benefits:<br>
- Changes to the code base are mainly in *GetArrayReadWrite(), which returns the 'correct' buffer.<br>
- Command line options as well as APIs for enabling/disabling HBM can be easily provided.<br>
- DRAM fallback always available, even if HBM exhausted.<br>
- Similar code and logic for dealing with HBM and GPUs.<br>
<br>
Disadvantages:<br>
- Depending on the actual implementation, we may need extra memory (data duplication in HBM and DRAM). Since DRAM >> HBM, this may not be a big issue.<br>
- Some parts of PETSc allocate memory directly rather than through the standard types. Those allocations would then not use HBM. This may not be performance-critical, though...<br>
- Asynchronous copies between DRAM and HBM remain tricky.<br>
- 'Politics': The approach is not as fancy as writing heap managers and other low-level stuff, so it is harder to sell to, e.g., program managers.<br></blockquote><div><br></div><div>I like several things about this proposal. I think it might make especially good sense for systems with very large amounts of NVRAM, and for the case Barry was talking about, in which one might want to mark an allocation as a candidate for eviction if needed. At the expense of data duplication, it also helps address problems that might arise if one tries to access a data structure that is in the middle of being copied from one memory kind to another.</div><div><br></div><div>--Richard</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
Best regards,<br>
Karli<br>
<br>
</blockquote></div><br></div></div>