[petsc-dev] Adding support memkind allocators in PETSc

Barry Smith bsmith at mcs.anl.gov
Wed Jun 3 21:26:31 CDT 2015


   To follow up on this, and going back to my idea of the "advise object" passed to malloc being a living object as opposed to just some flags: in the case where different vectors may have very different "importances" at different times in the runtime of the simulation, one could "switch" some vectors from slow to faster memory when one knows the code is switching to a different phase where the vector "importances" are different.

  Barry

  Note that even if Intel cannot provide a way to "switch" a memory address between fast and slow, it doesn't really matter from the PETSc point of view, since inside any particular PETSc vector we could switch the ->array pointer to a different memory location (and copy stuff over if needed) when changing a vector from important to unimportant or the opposite (since no code outside the vector object knows what the pointer is).
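
  The pointer switch described in the note above might look roughly like the following. This is only a sketch under stated assumptions: ToyVec, MemTier, tier_alloc() and toyvec_set_tier() are hypothetical names invented for illustration, not PETSc or memkind APIs, and both "tiers" simply use the system heap here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical memory tiers; a real build might map these to memkind kinds. */
typedef enum { MEM_SLOW = 0, MEM_FAST = 1 } MemTier;

/* Stand-in allocators (assumption): both tiers use the system heap here. */
static void *tier_alloc(MemTier tier, size_t bytes) { (void)tier; return malloc(bytes); }
static void  tier_free(MemTier tier, void *p)       { (void)tier; free(p); }

/* Toy vector: outside code never holds `array`, so the storage can be moved. */
typedef struct {
  size_t  n;
  double *array;
  MemTier tier;
} ToyVec;

/* Allocate storage for n > 0 entries in the given tier; returns 0 on success. */
static int toyvec_create(ToyVec *v, size_t n, MemTier tier) {
  assert(n > 0);
  v->array = tier_alloc(tier, n * sizeof(double));
  if (!v->array) return 1;
  v->n    = n;
  v->tier = tier;
  return 0;
}

/* Move the storage to another tier, copying the values only if requested. */
static int toyvec_set_tier(ToyVec *v, MemTier tier, int keep_values) {
  if (tier == v->tier) return 0;
  double *fresh = tier_alloc(tier, v->n * sizeof(double));
  if (!fresh) return 1;                 /* old storage is left intact on failure */
  if (keep_values) memcpy(fresh, v->array, v->n * sizeof(double));
  tier_free(v->tier, v->array);
  v->array = fresh;                     /* the only place the pointer changes */
  v->tier  = tier;
  return 0;
}

static void toyvec_destroy(ToyVec *v) {
  tier_free(v->tier, v->array);
  v->array = NULL;
  v->n     = 0;
}
```

  The keep_values flag corresponds to "copy stuff over if needed": a work vector whose contents are about to be overwritten could skip the copy.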


> On Jun 3, 2015, at 9:18 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> 
> 
>> On Jun 3, 2015, at 8:55 PM, Richard Mills <rtm at utk.edu> wrote:
>> 
>> Ha, yes.  I'll try this out, but I do wonder what people's thoughts are on the best way to "tag" an object like a Vec or Mat for some particular treatment of its placement in memory.  Does doing this at the level of a Mat or Vec (e.g., VecSetAdvMallocCtx() ) sound appropriate?  We could actually make this a part of any PetscObject, but I think that's not necessary.
> 
>  No idea.
> 
>  Perhaps, and this is just nonsense off the top of my head, you could have some measure of the importance of a vector (or matrix; I would start with vectors for simplicity and since we have more of them) based on how often its values would be "accessed". So a vector that you know is only used "once in a while" gets a lower "importance" than one that gets used "very often". Of course determining these vectors' importances may be difficult. You could do it experimentally: add some code that measures how often each vector gets its values "accessed" (whatever that means; read/write) and see if there is some distribution (do this for a nontrivial TS example) where some vectors are accessed often and others rarely. Now place the often "accessed" vectors in faster memory and see how much faster the code is.
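
The experiment suggested above could start from something like this sketch. All names (VecStats, record_access, pick_fast) are hypothetical, "access" is reduced to a bare counter, and vector ids are assumed to be 0..n-1; none of this is existing PETSc instrumentation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-vector statistics; ids are assumed to be 0..n-1. */
typedef struct {
  int           id;
  unsigned long accesses;
} VecStats;

/* Call wherever a vector's values are read or written. */
static void record_access(VecStats *s) { s->accesses++; }

/* qsort comparator: most-accessed first. */
static int by_accesses_desc(const void *a, const void *b) {
  const VecStats *x = a, *y = b;
  if (x->accesses < y->accesses) return 1;
  if (x->accesses > y->accesses) return -1;
  return 0;
}

/* Sort by access count and flag the nfast most-accessed vectors.
   is_fast is indexed by vector id; the stats array is reordered in place. */
static void pick_fast(VecStats *stats, int n, int nfast, int *is_fast) {
  qsort(stats, (size_t)n, sizeof(*stats), by_accesses_desc);
  for (int i = 0; i < n; i++) is_fast[stats[i].id] = (i < nfast);
}
```

A fuller version would probably weight by vector size and fast-memory capacity rather than use a fixed count nfast; that choice is left open here.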
> 
>  Barry
> 
> A related note is that "we" are not particularly careful about "reusing" work vectors. Say a code has ten different work vectors for different phases of the computation; now imagine a careful "global analysis" that determined it could get away with three work vectors (since at most three had relevant values at any one time); now pop those three work vectors into faster memory where the ten previous work vectors could not fit. Obviously I am being extreme here to make the point that careful memory decisions could potentially make a difference in complicated codes (and all we care about are complicated codes).
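
As a toy version of that "global analysis", assume each work vector's live range is known as a pair of phase indices (a strong assumption; LiveRange and buffers_needed are hypothetical names). The number of buffers needed is then the largest number of ranges that overlap in any single phase.

```c
#include <assert.h>

/* Hypothetical live range: the vector holds relevant values during
   phases first_use..last_use, inclusive. */
typedef struct {
  int first_use;
  int last_use;
} LiveRange;

/* Largest number of vectors live in the same phase. For interval-shaped
   live ranges, this many shared buffers is enough to hold all of them. */
static int buffers_needed(const LiveRange *r, int n, int nphases) {
  int best = 0;
  for (int p = 0; p < nphases; p++) {
    int live = 0;
    for (int i = 0; i < n; i++)
      if (r[i].first_use <= p && p <= r[i].last_use) live++;
    if (live > best) best = live;
  }
  return best;
}
```

This only counts buffers; actually assigning vectors to buffers would need a second pass (for example, greedily in order of first use).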
> 
> 
> 
> 
>> 
>> --Richard
>> 
>> On Wed, Jun 3, 2015 at 6:50 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>> 
>>  The beauty of git/bitbucket is one can make branches to try out anything they want even if some cranky old conservative PETSc developer thinks it is worse than consorting with the devil.
>> 
>>   As I said before I think that "additional argument" to advised_malloc should be a living object which one can change over time as opposed to just a "flag" type argument that only affects the malloc at malloc time. Of course the "living part" can be implemented later.
>> 
>>   Barry
>> 
>> Yes, Jed has already transformed himself into a cranky old conservative PETSc developer
>> 
>> 
>>> On Jun 3, 2015, at 7:33 PM, Richard Mills <rtm at utk.edu> wrote:
>>> 
>>> Hi Folks,
>>> 
>>> It's been a while, but I'd like to pick up this discussion of adding a context to memory allocations again.
>>> 
>>> The immediate motivation I have is that I'd like to support use of the memkind library (https://github.com/memkind/memkind), though adding a context to PetscMallocN() (or making some other interface, say PetscAdvMalloc() or whatever) could have much broader utility than simply memkind support (which Jed doesn't like anyway, and I share some of his concerns).  For the sake of having a concrete example, I'll discuss memkind here.
>>> 
>>> Memkind's memkind_malloc() works like malloc() but takes a memkind_t argument to specify some desired property of the memory being allocated.  For example,
>>> 
>>> hugetlb_str = (char *)memkind_malloc(MEMKIND_HUGETLB, size);
>>> 
>>> returns a pointer to memory allocated using huge pages, and
>>> 
>>> hbw_preferred_str = (char *)memkind_malloc(MEMKIND_HBW_PREFERRED, size);
>>> 
>>> allocates memory from a high-bandwidth region if it's available and elsewhere if not (specifying MEMKIND_HBW will insist on the allocation coming from high-bandwidth memory, failing if it's not available).
>>> 
>>> It should be straightforward to add a variant of PetscMalloc() that accepts a context: I'll call this PetscAdvMalloc(), for now, though we can come up with a better name later.  This will allow passing on the memkind_t via this context to the underlying memkind allocator, and we can have some mechanism to set a default context (in the case of Memkind, this is likely MEMKIND_DEFAULT) that gets used when plain PetscMalloc() gets called.
>>> 
>>> Of course, we'll need some way to ensure that the "advanced malloc" gets used to allocate the critical data structures.  As a low-level way to start, it may make sense to simply add a way to stash a context in Vec and Mat objects.  Maybe have VecSetAdvMallocCtx(), and if that context gets set, then PetscAdvMalloc() is used for the allocations associated with the contents of that object.  It would probably be better to eventually have a higher-level way to do this, e.g., support standard settings in the options database that PETSc uses to construct the appropriate arguments to underlying allocators that are supported, but I think just adding a way to set this context directly is an appropriate first step.
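
One possible shape for such a context-carrying allocator, purely as a sketch: AdvMallocCtx, adv_malloc(), CtxVec and the other names below are made up for illustration and are not the proposed PETSc interface. The opaque `kind` pointer is where something like a memkind_t could travel; here it carries a byte counter so the example needs nothing beyond the C standard library.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context: allocator callbacks plus an opaque payload. */
typedef struct {
  void *(*alloc)(void *kind, size_t bytes);
  void  (*release)(void *kind, void *ptr);
  void  *kind;
} AdvMallocCtx;

static void *heap_alloc(void *kind, size_t bytes) { (void)kind; return malloc(bytes); }
static void  heap_release(void *kind, void *ptr)  { (void)kind; free(ptr); }

/* Default context, used when the caller passes NULL. */
static AdvMallocCtx default_ctx = { heap_alloc, heap_release, NULL };

static void adv_set_default(const AdvMallocCtx *ctx) { default_ctx = *ctx; }

static void *adv_malloc(const AdvMallocCtx *ctx, size_t bytes) {
  if (!ctx) ctx = &default_ctx;
  return ctx->alloc(ctx->kind, bytes);
}

/* Must be called with the same context that produced ptr. */
static void adv_free(const AdvMallocCtx *ctx, void *ptr) {
  if (!ctx) ctx = &default_ctx;
  ctx->release(ctx->kind, ptr);
}

/* The plain entry point simply forwards to the default context. */
static void *plain_malloc(size_t bytes) { return adv_malloc(NULL, bytes); }

/* Example custom allocator: `kind` points at a running byte total. */
static void *counting_alloc(void *kind, size_t bytes) {
  *(size_t *)kind += bytes;
  return malloc(bytes);
}

/* Toy object that stashes a context, in the spirit of VecSetAdvMallocCtx(). */
typedef struct {
  size_t              n;
  double             *array;
  const AdvMallocCtx *ctx;   /* NULL means "use the default" */
} CtxVec;

static void ctxvec_set_ctx(CtxVec *v, const AdvMallocCtx *ctx) { v->ctx = ctx; }

static int ctxvec_allocate(CtxVec *v, size_t n) {
  v->array = adv_malloc(v->ctx, n * sizeof(double));
  if (!v->array) return 1;
  v->n = n;
  return 0;
}

static void ctxvec_free(CtxVec *v) {
  adv_free(v->ctx, v->array);
  v->array = NULL;
  v->n     = 0;
}
```

Storing a pointer to the context in the object (rather than a copy) is what would let the context later become a "living" object whose changes the vector can observe; whether that is the right ownership model is an open design question.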
>>> 
>>> Does this sound like a reasonable thing for me to prototype, or are others thinking something very different?  Please let me know.  I'm getting more access to early systems I can experiment on, and I'd really like to move forward on trying things with high bandwidth memory (imperfect as our APIs for using it are).
>>> 
>>> Best regards,
>>> Richard
>>> 
>>> 
>>> On Wed, Apr 29, 2015 at 11:10 PM, Richard Mills <rtm at utk.edu> wrote:
>>> On Wed, Apr 29, 2015 at 1:28 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>> 
>>>  Forget about the issue of "changing" PetscMallocN() or adding a new interface instead, that is a minor syntax and annoyance issue:
>>> 
>>>  The question is "is it worth exploring adding a context for certain memory allocations that would allow us to "do" various things to the memory and "indicate" properties of the memory"? I think, though I agree with Jed that it could be fraught with difficulties, that it is worthwhile playing around with this.
>>> 
>>>  Barry
>>> 
>>> 
>>> I vote "yes".  One might want to, say
>>> 
>>> * Give hints via something like madvise() on how/when the memory might be accessed.
>>> * Specify a preferred "kind" of memory (and behavior if the preferred kind is not available, or perhaps even specify a priority on how hard to try to get the preferred memory kind)
>>> * Specify something like a preference to interleave allocation blocks between different kinds of memory
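
To make the list above concrete, here is a sketch of what such a hints record could hold, and how the "preferred kind plus fallback behavior" item might be resolved. Every name here (MemHints, MemKind, resolve_kind, ...) is hypothetical; the strict versus fallback distinction mirrors the MEMKIND_HBW versus MEMKIND_HBW_PREFERRED behavior described earlier in the thread.

```c
#include <assert.h>

/* Hypothetical enumerations for the three kinds of hint in the list. */
typedef enum { KIND_DEFAULT = 0, KIND_HBW = 1, KIND_HUGETLB = 2 } MemKind;
typedef enum { FALLBACK_FAIL, FALLBACK_DEFAULT } Fallback;
typedef enum { ACCESS_NORMAL, ACCESS_SEQUENTIAL, ACCESS_RANDOM } AccessHint;

typedef struct {
  AccessHint access;      /* madvise()-style access pattern hint        */
  MemKind    preferred;   /* preferred kind of memory                   */
  Fallback   fallback;    /* what to do if the preferred kind is absent */
  int        interleave;  /* nonzero: interleave blocks across kinds    */
} MemHints;

/* Decide which kind to allocate from. available[k] is nonzero if kind k
   can currently be used. Returns 0 and sets *chosen on success, or 1 if
   the request cannot be satisfied. */
static int resolve_kind(const MemHints *h, const int *available, MemKind *chosen) {
  if (available[h->preferred]) {
    *chosen = h->preferred;
    return 0;
  }
  if (h->fallback == FALLBACK_DEFAULT && available[KIND_DEFAULT]) {
    *chosen = KIND_DEFAULT;
    return 0;
  }
  return 1;
}
```

The access and interleave fields are carried but not acted on in this sketch; an allocator backend would have to decide what, if anything, to do with them.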
>>> 
>>> I'm sure we can come up with plenty of other possibilities, some of which might actually be useful, many of which will be useful only for very contrived cases, and some that are not useful today but may become useful as memory systems evolve.
>>> 
>>> --Richard
>>> 
>> 
>> 
> 



