itaps-parallel Proposal for handling queries with parts, sets, and partitions

Tue Dec 18 13:13:05 CST 2007

On Tue, 2007-12-18 at 10:39 -0600, Tim Tautges wrote:
> As usual, I picked the wrong telecon to miss.  There should be some rule 
> about not allowing major interface change proposals coming out on 
> Thursday after COB (in the midwest) and essentially ratifying those 
> changes Monday morning.  But anyway...
> 
> Carl Ollivier-Gooch wrote:
> > Hello, all.
> > 
> > During our last telecon, I had what I thought was an idea that would 
> > unify the parallel query for sets stuff.  At the time, the idea was 
> > vaguely formed, and I hadn't thought through all of the consequences.  I 
> > may still not have, but I've at least got it to the point of being ready 
> > to drag it out into the light of day.
> > 
> > Essentially, the issue is that we want to have much the same set of 
> > queries for mesh instances, parts, and partitions, but we haven't worked 
> > out how to handle this without exploding the number of functions.
> > 
> > In the serial interface, we have lots of queries that take both a mesh 
> > instance and an entity set argument.  In the now more-or-less deprecated 
> > SIDL paradigm, the mesh instance is an object (in the C++ sense), and we
> > unified global and entity set calls by creating a root set that is,
> > essentially, shorthand for "everything in the mesh instance".
> 
> I think the above is a very useful thing to keep in mind.  The two 
> things that change when you go from there to parallel are that a) mesh 
> instances together form a collection that in some way describes a global 
> mesh, and b) some sets can be thought of as spanning processors, but 
> only implicitly (that is, we don't assume the handles are the same).
> 
> > 
> > In parallel, the mesh instance resides at the process level in a
> > partition / process / part hierarchy (black / red, green in the attached
> > picture; yes, it's a very obvious illustration, but perhaps useful for
> > reference).  We also want to be able to query partitions and parts
> > (although for partitions we only want to query for numbers of things,
> > not lists of things).  Also, we'd like to be able to query sets other than
> > the root set (like that lovely blue set that spans a bunch of parts...).
> > 
> 
> I think it's the other way around: the part/partition resides in the 
> mesh instance.  In parallel, there's an implicit relation between a 
> partition on one processor and on another processor, such that both/all 
> processors understand that their local partition information is a piece 
> of a collection over all processors.  This model reduces to the trivial 
> case for a single processor, both for partition(s) in mesh instances and 
> parts in partitions.
>

I thought that the mesh instance for a processor is what resides on that
processor alone.  This would make it impossible for a mesh instance to
contain a partition.

> > I propose that we think about both parts and mesh instances as coverings 
> > of the global mesh (they are), and that we also think about parts, mesh
> > instances, and the partition as containing -all- data about some 
> > (topologically and/or geometrically compact subset) or all of the global 
> > mesh (as opposed to entity sets, which are deliberately more selective). 
> >  Yes, I know they're all collections of entities, but stick with me 
> > while I justify why I think this semantic distinction is worthwhile. 
> > Also yes, we haven't decided yet (have we?) whether a part will 
> > officially contain everything or just the entities that were 
> > partitioned; I don't -think- that any of what I'm about to propose is 
> > adversely affected, either way.
> > 
> 
> I don't think the partition can be a covering of the whole mesh on a 
> given processor.  Given a collection of 2d and 3d elements, you need to 
> distinguish between the 2d elements that are just adjacent to objects in 
> the partition and the 2d elements which are in fact objects in the 
> partition.
> 
> I did not see anything following which justifies the semantic 
> distinction you're talking about.
> 
> > Finally, it's worth noting that, while a partition contains many parts
> > distributed over many processes, each process is associated with a
> > unique partition (MPI communicator or equivalent) and each part is
> > associated both with a unique process and a unique partition.  So
> > specifying both part handle and partition handle is actually redundant.
> > 
> 
> Each part is associated with a unique partition and process, but each 
> process is not associated with a unique partition.  There may be 
> multiple partitions in use at any given time, and a given process may be 
> responsible for one or more parts in each partition.  Take parallel 
> contact detection for instance.  One partition is over volume elements, 
> the other is over faces on the skin of those elements.  Processors 
> participate in calculations for both partitions at different times.
> 
> Specifying both a part and partition handle may not be strictly 
> redundant, and in terms of implementation I don't think it's useful to 
> require that the partition be derivable from the part handle.  I do 
> think that in most cases, an application will know the partition that's 
> being dealt with.  For the few cases where a part is a member of 
> multiple partitions (if we want to allow that, and I don't see a major 
> reason why not to), there should be a function to get the partition(s) 
> that a part is a member of.
> 
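The lookup suggested above, a function returning the partition(s) that
a part is a member of, could be sketched roughly as follows. This is
only an illustration: PartitionRegistry, add_part, and
get_partitions_of_part are hypothetical names, not existing iMesh or
iMeshP functions, and plain strings stand in for handles.

```python
from collections import defaultdict


class PartitionRegistry:
    """Hypothetical bookkeeping for parts that may sit in several partitions."""

    def __init__(self):
        # part handle -> set of partition handles the part belongs to
        self._partitions_of = defaultdict(set)

    def add_part(self, partition, part):
        """Record that `part` is a member of `partition`."""
        self._partitions_of[part].add(partition)

    def get_partitions_of_part(self, part):
        """Return every partition containing `part`, in sorted order."""
        return sorted(self._partitions_of.get(part, ()))


registry = PartitionRegistry()
# Contact-detection case from the text: one partition over volume
# elements and one over skin faces, both live on the same process.
registry.add_part("volume_partition", "part_0")
registry.add_part("skin_partition", "part_1")
# Only meaningful if a part may belong to more than one partition.
registry.add_part("skin_partition", "part_0")
```

With such a lookup available, the partition argument can stay explicit
in the query functions, and the part handle does not have to encode
its partition.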
> > Given that background, I propose that we overload our current query
> > functions so that any place where a mesh instance is currently usable, a
> > part handle or partition handle is also usable, and continue to use
> > either a bona fide entity set or the (global placeholder) root set as
> > the second argument.  In this scenario, we would have, for instance:
> > 
> >             / partition handle
> > getNumOfType| mesh instance    , root set handle   , type, result, err )
> >             \ part handle        entity set handle
> > 
> > The other iMesh function that is obviously of this type is getNumOfTopo.
> > 
> > A somewhat larger collection of functions would be able to take both
> > mesh instances and part handles as the first argument, but not partition
> > handles: getAllVtxCoords, getVtxCoordIndex, getEntities.
> > 
> > My guess is that iterator functions will fall into the second category.
> > 
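To make the two categories of overloaded query concrete, here is a
minimal single-process sketch. Everything in it is an assumption made
for illustration: the Part / MeshInstance / Partition classes, the use
of None as the root set placeholder, and the function names are not
the iMesh API. It also ignores entities shared between parts and the
global reduction a real partition-level count would need.

```python
from dataclasses import dataclass, field

ROOT_SET = None  # placeholder meaning "everything in the first argument"


@dataclass
class Part:
    entities: dict  # entity handle -> entity type, e.g. {"v1": "vertex"}


@dataclass
class MeshInstance:
    parts: list = field(default_factory=list)  # parts on this process


@dataclass
class Partition:
    mesh_instances: list = field(default_factory=list)  # one per process


def _parts_of(container):
    """Flatten a partition, mesh instance, or part into a list of parts."""
    if isinstance(container, Part):
        return [container]
    if isinstance(container, MeshInstance):
        return list(container.parts)
    if isinstance(container, Partition):
        return [p for m in container.mesh_instances for p in m.parts]
    raise TypeError("expected a Partition, MeshInstance, or Part")


def _selected(handle, entity_set):
    """True if the handle is in entity_set, or if the root set was given."""
    return entity_set is ROOT_SET or handle in entity_set


def get_num_of_type(container, entity_set, ent_type):
    """Counting query: accepts a partition, mesh instance, or part."""
    return sum(1 for part in _parts_of(container)
               for handle, kind in part.entities.items()
               if kind == ent_type and _selected(handle, entity_set))


def get_entities(container, entity_set, ent_type):
    """Listing query: accepts a mesh instance or part, not a partition."""
    if isinstance(container, Partition):
        raise TypeError("partition handles support counting queries only")
    return [handle for part in _parts_of(container)
            for handle, kind in part.entities.items()
            if kind == ent_type and _selected(handle, entity_set)]


p0 = Part({"v1": "vertex", "v2": "vertex", "f1": "face"})
p1 = Part({"v3": "vertex", "f2": "face"})
mesh = MeshInstance([p0, p1])
partition = Partition([mesh])
blue_set = {"v2", "v3", "f2"}  # an entity set spanning both parts
```

Here get_num_of_type(p0, blue_set, "vertex") restricts the blue set to
part p0, which is the internal intersection the proposal implies for
(part handle, entity set handle) calls.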
> > Many of the set queries (number of sets, number of children, identities 
> > of sets and children) are a bit more ambiguous here, to my surprise: 
> > essentially, if you ask for the number of sets contained in a given set 
> > for a particular part, you presumably mean the number of contained sets 
> > that intersect that part.  While I don't have a problem with that 
> > definition, I'm not sure it's an entirely straightforward one, either 
> > conceptually or in implementation.  But I haven't thought about it in a 
> > great deal of detail, either...  One way to finesse this would be to 
> > create sets at the part level rather than the mesh instance level; then 
> > all is easy again.
> > 
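One possible reading of "the number of contained sets that intersect
that part" is sketched below. The function name and the data layout (a
dict from child set name to its entity handles) are assumptions for
illustration only.

```python
def num_contained_sets_in_part(part_entities, child_sets):
    """Count child sets that share at least one entity with the part.

    part_entities: set of entity handles belonging to the part
    child_sets: dict mapping child set name -> set of entity handles
    """
    return sum(1 for members in child_sets.values()
               if members & part_entities)


part_entities = {"e1", "e2", "e3"}
child_sets = {
    "bc_inlet": {"e1", "e9"},     # overlaps the part through e1
    "bc_outlet": {"e7", "e8"},    # lies entirely in other parts
    "material_1": {"e2", "e3"},   # lies entirely inside this part
    "empty_set": set(),           # contains no entities at all
}
count = num_contained_sets_in_part(part_entities, child_sets)
```

Under this reading an empty child set is never counted for any part,
and a set that holds only other sets would need its own rule, which is
one way in which the definition is not entirely straightforward.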
> 
> So again, I go back to asking: what are the core needs that prevent us 
> from using sets as both parts and partitions?  The entity set mechanism 
> was designed with this specific usage in mind.
> 
> I know some haven't fully implemented sets, but is that going to be more 
> difficult than the degree of interface changes being discussed here?
> 
> The things that are missing in sets for use as partitions and parts are 
> exactly the extensions we'll already need in sets for going to parallel, 
> no matter what we choose for partition and part representation. 
> Specifically, we'll need some notion of a correspondence between sets on 
> different processors, to handle boundary conditions on mesh spanning 
> processors.
> 
> I'll make a diagram equivalent to Carl's today; maybe that will make 
> things a bit clearer.
> 
> - tim
> 
> > Now, there are a couple of beneath-the-hood requirements for
> > implementations here:
> > 
> > 1.  All part handles must be unique, even in the presence of multiple
> >     partitions.  Pointer-type handles will easily satisfy this;
> >     integer-type handles may need to reserve some bits for partition ID
> >     and some for part ID.
> > 
> > 2.  In most (all?) implementations, finding the result for calls with
> >     (part handle, entity set handle...) will require an implementation
> >     to do some sort of intersection internally.  This will prove
> >     especially challenging for iterators in the presence of mesh
> >     modification and/or migration.
> > 
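For requirement 1, reserving bits of an integer handle for the
partition ID and the part ID might look like the sketch below. The
32-bit width, the 12/20 bit split, and the function names are
assumptions chosen for illustration; an implementation would pick its
own layout.

```python
HANDLE_BITS = 32     # assumed total width of an integer handle
PARTITION_BITS = 12  # assumed high bits reserved for the partition ID
PART_BITS = HANDLE_BITS - PARTITION_BITS  # low bits hold the part ID

PART_MASK = (1 << PART_BITS) - 1
MAX_PARTITION_ID = (1 << PARTITION_BITS) - 1


def make_part_handle(partition_id, part_id):
    """Pack a partition ID and a part ID into one integer handle."""
    if not 0 <= partition_id <= MAX_PARTITION_ID:
        raise ValueError("partition ID does not fit in the reserved bits")
    if not 0 <= part_id <= PART_MASK:
        raise ValueError("part ID does not fit in the reserved bits")
    return (partition_id << PART_BITS) | part_id


def split_part_handle(handle):
    """Recover (partition_id, part_id) from a packed handle."""
    return handle >> PART_BITS, handle & PART_MASK


# The same part ID under two partitions yields two distinct handles.
h_a = make_part_handle(0, 5)
h_b = make_part_handle(1, 5)
```

The packing keeps part handles unique across partitions, at the cost
of capping both the number of partitions and the number of parts per
partition. Note that it assumes each part belongs to exactly one
partition.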
> > I recognize also that this paradigm shift potentially ambiguates the use
> > of calls like addEntToSet for adding entities to a part; we can either
> > overload those functions (probably renamed to addEntToCollection) or
> > create new addEntToPart functions (I think there are only going to be
> > four of these: add/rmv single/array).
> > 
> > Okay, so there's my proposal.  I'll now stand back and let people poke
> > holes in the idea; that may not prove to be too difficult.
> > 
> > Carl