itaps-parallel Proposal for handling queries with parts, sets, and partitions

Mon Dec 17 09:39:33 CST 2007

Carl Ollivier-Gooch wrote:
> Hello, all.
> 
> During our last telecon, I had what I thought was an idea that would 
> unify the parallel query for sets stuff.  At the time, the idea was 
> vaguely formed, and I hadn't thought through all of the consequences.  I 
> may still not have, but I've at least got it to the point of being ready 
> to drag it out into the light of day.
> 
> Essentially, the issue is that we want to have much the same set of 
> queries for mesh instances, parts, and partitions, but we haven't worked 
> out how to handle this without exploding the number of functions.
> 
> In the serial interface, we have lots of queries that take both a mesh 
> instance and an entity set argument.  In the now more-or-less deprecated 
> SIDL paradigm, the mesh instance is an object (in the C++ sense), and we
> unified global and entity set calls by creating a root set that is,
> essentially, shorthand for "everything in the mesh instance".
> 
> In parallel, the mesh instance resides at the process level in a
> partition / process / part hierarchy (black / red, green in the attached
> picture; yes, it's a very obvious illustration, but perhaps useful for
> reference).  We also want to be able to query partitions and parts
> (although for partitions we only want to query for numbers of things,
> not lists of things).  Also, we'd like to be able to query sets other than
> the root set (like that lovely blue set that spans a bunch of parts...).
> 
> I propose that we think about both parts and mesh instances as coverings 
> of the global mesh (they are), and that we also think about parts, mesh
> instances, and the partition as containing -all- data about some 
> (topologically and/or geometrically compact subset) or all of the global 
> mesh (as opposed to entity sets, which are deliberately more selective). 
>  Yes, I know they're all collections of entities, but stick with me 
> while I justify why I think this semantic distinction is worthwhile. 
> Also yes, we haven't decided yet (have we?) whether a part will 
> officially contain everything or just the entities that were 
> partitioned; I don't -think- that any of what I'm about to propose is 
> adversely affected, either way.
> 
> Finally, it's worth noting that, while a partition contains many parts
> distributed over many processes, each process is associated with a
> unique partition (MPI communicator or equivalent) and each part is
> associated both with a unique process and a unique partition.  So
> specifying both part handle and partition handle is actually redundant.
> 
> Given that background, I propose that we overload our current query
> functions so that any place where a mesh instance is currently usable, a
> part handle or partition handle is also usable, and continue to use
> either a bona fide entity set or the (global placeholder) root set as
> the second argument.  In this scenario, we would have, for instance:
> 
>             / partition handle
> getNumOfType| mesh instance    , root set handle   , type, result, err )
>             \ part handle        entity set handle
> 
> The other iMesh function that is obviously of this type is getNumOfTopo.
> 
> A somewhat larger collection of functions would be able to take both
> mesh instances and part handles as the first argument, but not partition
> handles: getAllVtxCoords, getVtxCoordIndex, getEntities.
> 
> My guess is that iterator functions will fall into the second category.
> 
> Many of the set queries (number of sets, number of children, identities 
> of sets and children) are a bit more ambiguous here, to my surprise: 
> essentially, if you ask for the number of sets contained in a given set 
> for a particular part, you presumably mean the number of contained sets 
> that intersect that part.  While I don't have a problem with that 
> definition, I'm not sure it's an entirely straightforward one, either 
> conceptually or in implementation.  But I haven't thought about it in a 
> great deal of detail, either...  One way to finesse this would be to 
> create sets at the part level rather than the mesh instance level; then 
> all is easy again.
> 
> Now, there are a couple of beneath-the-hood requirements for
> implementations here:
> 
> 1.  All part handles must be unique, even in the presence of multiple
>     partitions.  Pointer-type handles will easily satisfy this;
>     integer-type handles may need to reserve some bits for partition ID
>     and some for part ID.

But then how do we find the data for the partition once we have its 'id'
from the part handle?  Look it up in a table?  How do we find the table?
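
To make the question concrete, here is a minimal sketch of the bit-packing
scheme, assuming a 32-bit handle with 8 bits of partition ID and 24 bits of
part ID.  The bit split, the names, and the file-scope table are all
hypothetical; none of this comes from an existing implementation.  The point
is that the lookup only works because the table is global state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: high 8 bits = partition ID, low 24 bits = part ID. */
#define PART_BITS      24u
#define PART_MASK      ((UINT32_C(1) << PART_BITS) - 1u)
#define MAX_PARTITIONS 256u

typedef uint32_t part_handle;

/* Hypothetical per-partition record. */
typedef struct {
    int      in_use;     /* nonzero once the partition has been created */
    uint32_t num_parts;  /* example of data a query would need */
} partition_data;

/* With no mesh instance argument, the table has to be global state. */
static partition_data partition_table[MAX_PARTITIONS];

static part_handle make_part_handle(uint32_t partition_id, uint32_t part_id)
{
    assert(partition_id < MAX_PARTITIONS);
    assert(part_id <= PART_MASK);
    return (partition_id << PART_BITS) | part_id;
}

static uint32_t partition_id_of(part_handle h) { return h >> PART_BITS; }
static uint32_t part_id_of(part_handle h)      { return h & PART_MASK; }

/* Returns the partition record for a part handle, or NULL if unused. */
static partition_data *partition_of(part_handle h)
{
    partition_data *p = &partition_table[partition_id_of(h)];
    return p->in_use ? p : NULL;
}
```

In this sketch the answer to "how do we find the table?" is "it is a
static", which means one table per implementation per process.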

> 
> 2.  In most (all?) implementations, finding the result for calls with
>     (part handle, entity set handle...) will require an implementation
>     to do some sort of intersection internally.  This will prove
>     especially challenging for iterators in the presence of mesh
>     modification and/or migration.
> 

I don't think your proposal is flawed in theory.  However, it is rather
difficult to implement in practice.  Your assumption that the
mesh_instance part of the interface is an unnecessary legacy from SIDL is
incorrect.  For id- or index-type handles, we need the mesh instance as
a pointer to the group of data that the handles reference.  We could
just as easily remove the mesh_instance argument from the serial
interface entirely.  It is possible to embed some "instance id" in the
handles, and look up the instance in some static table.  And further, to
be consistent, if we can remove the need for the mesh instance in some
cases, we should do so for all of them.  This brings us back around to
passing part and partition handles as the entity set argument, as that
is consistent with the serial interface.  Or to look at it from the
other direction:  if all handles are unique across mesh instances, what
is a mesh instance?
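
As a minimal sketch of what I mean by index-type handles, assume a handle is
just an array index and the mesh instance is a pointer to the arrays it
indexes.  The struct and function names below are made up for illustration
and are not part of iMesh.

```c
#include <assert.h>
#include <stddef.h>

/* Index-type handle: only meaningful relative to one mesh instance. */
typedef size_t vertex_handle;

/* Hypothetical group of data that the handles reference. */
typedef struct {
    const double *x;            /* x coordinate of each vertex */
    size_t        num_vertices;
} mesh_data;

typedef const mesh_data *mesh_instance;

/* Without the mesh argument there is no way to tell which array the
 * index refers to. */
static double get_vertex_x(mesh_instance mesh, vertex_handle v)
{
    assert(v < mesh->num_vertices);
    return mesh->x[v];
}
```

The same handle value (say, 1) is valid in every instance and names a
different vertex in each, so the instance argument is what resolves it.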

The multiplexer is an example where this would be unworkable.  In the
multiplexer, the mesh instance is a pointer to a function table.  It
must be able to determine which function from which shared library a
given handle is supposed to be passed to.  While we can guarantee that
handles are unique within an implementation, there is no way to
guarantee that they are unique /between/ implementations.
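
A rough sketch of the dispatch problem, assuming two implementations that
both use small integer handles.  The function table, the two "libraries",
and the mux_ wrapper are hypothetical stand-ins, not the actual multiplexer
code.

```c
#include <assert.h>
#include <stddef.h>

typedef size_t entity_handle;

/* Hypothetical function table; stands in for the entry points that one
 * implementation's shared library would provide. */
typedef struct {
    int (*get_dimension)(entity_handle ent);
} mesh_functions;

/* In this sketch the mesh instance is a pointer to the function table. */
typedef const mesh_functions *mesh_instance;

/* Implementation A: handles 0..2 are a vertex, an edge, and a face. */
static const int impl_a_dims[] = {0, 1, 2};
static int impl_a_get_dimension(entity_handle ent)
{
    assert(ent < sizeof impl_a_dims / sizeof impl_a_dims[0]);
    return impl_a_dims[ent];
}

/* Implementation B: the same handle values name different entities. */
static const int impl_b_dims[] = {3, 3, 2};
static int impl_b_get_dimension(entity_handle ent)
{
    assert(ent < sizeof impl_b_dims / sizeof impl_b_dims[0]);
    return impl_b_dims[ent];
}

static const mesh_functions impl_a = {impl_a_get_dimension};
static const mesh_functions impl_b = {impl_b_get_dimension};

/* The multiplexer cannot inspect the handle to pick a library; only the
 * mesh instance says where the call should go. */
static int mux_get_dimension(mesh_instance mesh, entity_handle ent)
{
    return mesh->get_dimension(ent);
}
```

Here handle 0 is a vertex in one implementation and a region in the other,
so a call that carries only the handle could not be routed correctly.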

So as I said above, if one only considers the theoretical meaning of a
mesh instance, your proposal would work.  However, the mesh instance
means something specific for implementations, and is necessary for any
non-pointer handle implementation.

- jason