itaps-parallel Notes from today's meeting
Mark Shephard
shephard at scorec.rpi.edu
Tue Apr 13 06:49:29 CDT 2010
The things Mark Beall indicates below sound about right with respect to
the RPI iMeshP. I am not going into the implementation details of a "mesh
instance", since others will have to comment on those.
Since there are a limited number of iMeshP implementations, I do think
that considering the experience of those implementations in refining the
iMeshP design does make some sense.
In terms of the machines of the future, they are not that far down the
line. We all hope to be working on the Argonne Blue Gene/Q early in
2012. That machine will have 16 cores per node, and I expect that, to get
good performance out of it, we will do something on the order of
defining parts at the node level and doing something different
across the cores of the node. (This is instead of having the parts at
the core level.)
Mark
Mark Beall wrote:
>
> On Apr 12, 2010, at 2:46 PM, Devine, Karen D wrote:
>> The first big question: Can an entity exist in an iMesh_instance that
>> has a partition without being owned by any part?
>
> First, a disclaimer: somewhere in this I'm going to write partition
> instead of part (since what you call "parts" we call "partitions", what
> you call a "partition" we call a "partitioning"). I will try to avoid that.
>
> It seems to me that a not unreasonable implementation of a partitioned
> mesh is one where each part is equivalent to a mesh (when I say "mesh"
> in here I mean "mesh instance" in the iMesh sense). This is the way that
> our partitioned mesh works and, I believe, pretty much how RPI's works.
> So we have 2 out of (how many?) iMeshP implementations that are designed
> this way. From here on I'm going to assume that this is a reasonable
> implementation, so my conclusions are only valid if you agree with that.
>
> The pros of this implementation:
> - Since there is already a way of knowing what entities are in a mesh,
> there is zero overhead, for the single partition case, of knowing what
> entities are in a part. I think this is an important consideration since
> the vast majority of applications won't use multiple partitions.
> - There is no need for functions to loop over the contents of a part, as
> there are already those for a mesh - I would think that Tim would like
> this approach for this reason - smaller API :)
> - Code that works in serial on a mesh can be used for anything that can
> be done on just the interior of a part (or on the entire part if it
> doesn't modify the part boundary)
> - Since operations to create mesh entities are done on a mesh, there
> isn't ever any question on what part an entity belongs to.
>
> The cons of this implementation:
> - It's not obvious how to deal with multiple partitions (answer to this
> in second question below)
> - entities on the part boundary must be duplicated even if they are on
> the same process
> - (please feel free to add to this list since I can't think of any others)
>
> Some conceptual differences of this implementation:
> - rather than taking entities out of a part and putting them in another,
> you change their part ownership by telling them (or something doing the
> repartitioning), "I'm going to want you to be in this other part later".
> Then after you've done that for all the entities (in a consistent way),
> you repartition the mesh. The nice thing about this is that there is
> never a point in time (from the user perspective) that the mesh and the
> parts are inconsistent in any way.
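
A minimal sketch of the "mark now, repartition later" idea described above. This is a toy model, not the iMeshP API or either implementation: the class `PartitionedMesh` and the methods `request_move`, `repartition`, `part_of` and `entities` are hypothetical names chosen for illustration, and entities are plain integers.

```python
class PartitionedMesh:
    """Toy model: each part is its own container of entity ids."""

    def __init__(self, parts):
        # parts: dict mapping part id -> iterable of entity ids
        self._parts = {pid: set(ents) for pid, ents in parts.items()}
        self._pending = {}  # entity id -> requested destination part id

    def part_of(self, entity):
        """Return the part that currently holds the entity."""
        for pid, ents in self._parts.items():
            if entity in ents:
                return pid
        raise KeyError(entity)

    def entities(self, part_id):
        """Return a read-only view of the entities in a part."""
        return frozenset(self._parts[part_id])

    def request_move(self, entity, dest_part):
        """Record the wish to move an entity; nothing moves yet."""
        if dest_part not in self._parts:
            raise KeyError(dest_part)
        self.part_of(entity)  # raises KeyError if the entity is unknown
        self._pending[entity] = dest_part

    def repartition(self):
        """Apply all recorded requests in one step."""
        for entity, dest in self._pending.items():
            src = self.part_of(entity)
            if src != dest:
                self._parts[src].remove(entity)
                self._parts[dest].add(entity)
        self._pending.clear()
```

Between `request_move` and `repartition` every entity is still reported in its original part, so a caller never observes an entity that belongs to no part.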
>
> For this kind of implementation, allowing a situation where a mesh
> entity is not in a part is equivalent to it not being in a mesh. While
> one could certainly have implementations where that is allowed, in our
> implementation - which I think is consistent with the intents of iMesh -
> the mesh owns the mesh entities (in the OO sense - if the mesh is
> deleted it takes all of the mesh entities with it).
>
> As was mentioned in the paragraph I deleted below, when we have multiple
> parts per process, we have multiple meshes per process, one per part. We
> can certainly make those look like a single iMesh_instance through the
> interface. In reality we could also have an additional hidden mesh that
> was the home for any mesh entities that aren't on a partition. The
> downsides to that are a) it's relatively expensive to move entities from
> one mesh to another for us (you have to copy it and delete the old one -
> which really gets messy when something off-process might be referring to
> it), b) there are rules that don't allow an entity to attach to lower
> order entities in a different mesh (for very good reasons).
>
> This isn't an answer to "the first big question" but is an explanation of
> why it's a problem for some implementations.
>
>>
>> The second big question: How should we enable multiple partitions of a
>> single mesh in iMeshP?
>
> My understanding from what's been discussed on this mailing list is that
> it's expected that different partitions don't have any correlation to
> each other (in other words, it's the rare exception that two different
> partitions will result in entities being on the same process in both).
> If you don't agree with that, then, again, my conclusions based on that
> are invalid.
>
> If partitions are not correlated, then any entity in more than one
> partition must effectively be copied for each additional partition it
> is in. The only way for an entity to exist on more than one process is
> for there to be more than one copy of it (if someone has cases where
> there are multiple partitions but they don't actually need to know any
> information about the entities in one - other than that they are in it -
> that would be a counter example to this).
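
A rough way to see how rare co-location is under the "no correlation" assumption: model two partitions as independent, uniformly random assignments of entities to processes and count how often an entity lands on the same process in both. This is a crude model (real partitioners are not random), and the function name `colocated_fraction` is made up for this sketch. Under these assumptions the expected fraction is 1/P for P processes.

```python
import random


def colocated_fraction(num_entities, num_procs, seed=0):
    """Fraction of entities placed on the same process by two
    independent, uniformly random partitions (a crude model)."""
    rng = random.Random(seed)
    same = 0
    for _ in range(num_entities):
        proc_in_first = rng.randrange(num_procs)
        proc_in_second = rng.randrange(num_procs)
        if proc_in_first == proc_in_second:
            same += 1
    return same / num_entities


if __name__ == "__main__":
    for procs in (4, 16, 256):
        frac = colocated_fraction(100_000, procs)
        print(f"{procs:4d} processes: measured {frac:.4f}, expected {1 / procs:.4f}")
```

With many processes almost every entity in a second partition needs its own copy, which is the point made above.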
>
> Given this, a second (or nth) partition is really just a partition
> of a copy of a mesh (or a subset of entities in the mesh). So why try
> really hard to hide that fact?
>
> I've heard Mark S. say that there is a need for another mechanism for
> dividing up a mesh (I'm purposely avoiding the word partitioning here)
> on these currently mythical computers that will exist in 10 years. It's
> going to be scary if there are meshes, partitions and parts, and, umm,
> divisions and divs ...
>
>> The third question: Should iMeshP_destroyPart return an error if the
>> part to be destroyed is not empty?
>> The answer to this question depends, to some extent, on our answer to the
>> first big question above.
>> Cons: Requires users to explicitly remove entities from parts before
>>       deleting the parts. This operation requires three function
>>       calls: iMeshP_getEntities, iMesh_rmvEntFromSet (overloaded with
>>       part handle), iMeshP_destroyPart.
>> Pros: Prevents users from shooting themselves in the foot.
>>       Users would possibly call iMeshP_getEntities and
>>       iMesh_rmvEntFromSet anyway to do migration to new part.
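
The trade-off in the third question can be sketched with a toy container. This is not the iMeshP API: `ToyPartition`, its methods and `PartNotEmptyError` are hypothetical stand-ins, and the three-step helper only mirrors the get / remove / destroy sequence listed under "Cons".

```python
class PartNotEmptyError(Exception):
    """Raised when a non-empty part is destroyed under the strict policy."""


class ToyPartition:
    def __init__(self, strict=True):
        self.strict = strict      # True: destroying a non-empty part is an error
        self._parts = {}          # part handle -> set of entity ids
        self._next_handle = 0

    def create_part(self):
        handle = self._next_handle
        self._next_handle += 1
        self._parts[handle] = set()
        return handle

    def add_entity(self, part, entity):
        self._parts[part].add(entity)

    def get_entities(self, part):
        return sorted(self._parts[part])

    def remove_entity(self, part, entity):
        self._parts[part].remove(entity)

    def destroy_part(self, part):
        if self.strict and self._parts[part]:
            raise PartNotEmptyError(f"part {part} still holds entities")
        # Non-strict: any remaining entities silently lose their part.
        del self._parts[part]

    def has_part(self, part):
        return part in self._parts


def empty_and_destroy(partition, part):
    """The three-step sequence: list entities, remove each, destroy."""
    for entity in partition.get_entities(part):
        partition.remove_entity(part, entity)
    partition.destroy_part(part)
```

Under the strict policy a direct `destroy_part` on a populated part raises, so the caller has to go through something like `empty_and_destroy`; under the non-strict policy the entities are left without a part, which ties back to the first question.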
>
> This becomes a moot point in my world...
>
> mark
>
>