itaps-parallel Part handles

Tim Tautges tautges at mcs.anl.gov
Mon Jun 23 09:45:47 CDT 2008



Devine, Karen D wrote:
> 
> 
> On 6/22/08 8:13 PM, "Tim Tautges" <tautges at mcs.anl.gov> wrote:
> 
>> So I'm confused here - are we or are we not requiring that applications
>> be able to derive the rank from the part handle *without making an iMesh
>> function call*?
> 
> The processor information would be obtained from a part handle via a
> function call iMeshP_getRankOfPart.
> 
>> If we are requiring that, a number of issues haven't been addressed (F77,
>> which bits and how many).
>>
>> If we are not requiring that, and we already have a function for getting
>> the rank from a part handle, why are we discussing this at all?  Isn't
>> it then an implementation choice?
> 
> For any part handle, iMeshP_getRankOfPart cannot invoke communication;
> otherwise, processors would have to call it synchronously.
> 
>> Or, and I'm guessing this is the real issue, are we saying applications
>> should be able to pass part handles around, then pass them back into the
>> interface and get information from them?  If so, this sounds pretty
>> unique.  It also sounds pretty unlikely, considering parts can't be
>> created on remote processes.
> 
> Yes, I believe we are saying applications should be able to pass part
> handles around, then pass them back into the interface and get information
> from them.  I assume this is true for other remote handles; that is, if I
> pass a remote entity handle to iMesh_getEntType or iMesh_getEntTopo, I get a
> correct answer, right?  As Carl has said, for remote copies, we'll need
> either part handles that provide processor information or [part handle,
> processor rank] pairs.
> 

I don't believe it is required that applications be able to remotely 
evaluate *entity* handles, even to get type information.  Consider that 
some implementations will use pointers as entity handles, and these will 
be meaningless on remote processors.

- tim

> Karen
> 
> 
>> - tim
>>
>> Carl Ollivier-Gooch wrote:
>>> Tim Tautges wrote:
>>>> Hi all,
>>>>   I realize now why I missed this discussion in March: it was in the
>>>> middle of a 3 week/3 trip stretch for me, and I got overwhelmed and
>>>> didn't follow the discussion closely.
>>>>
>>>> Anyway, this email is an attempt to summarize my objection to embedding
>>>> the proc rank in the part handle.  As background, you should know that I
>>>> almost went this way in MOAB (to the point where it's still implemented
>>>> in MOAB) but ultimately decided not to, for many of the same reasons I
>>>> note below.
>>>>
>>>> And, sorry if this is dumb and/or offensive, but I still don't see a
>>>> logical, non-pejorative statement of the problem whose only, or most
>>>> logical, solution is the two primary requirements causing me such
>>>> heartburn (part handles must be unique, and processor must be
>>>> derivable from part handle).  And that's after reading the March email
>>>> thread at least three times, and thinking about this for at least 4
>>>> hours over the last couple of days.
>>>>
>>>> The remainder of this email consists of 3 parts:
>>>> Part I: my objections
>>>> Part II: questions about the requirements Karen sent out Wed afternoon
>>>> Part III: questions/comments arising while looking through current
>>>> interface draft
>>> Definitions:
>>>
>>> Name: Complete unique identifier of something in the data model.
>>> Handle:  Some series of bits of the same size as a void*; interpretation
>>> of those bits is implementation-dependent.
>>>
>>> Summary (50,000 foot view)
>>>
>>> What this debate seems to me to reduce to is this:  Should the name of a
>>> part consist of a local handle plus a process rank (or equivalents); or
>>> of a single global handle that is also used locally and can be used to
>>> deduce rank?
>>>
>>> 10,000 foot view
>>>
>>> Personally, I don't see the efficiency issue as a big one:  IMO, the
>>> overhead of having to translate a part handle to a set handle (for
>>> instance) can be reduced to no more than a couple of bitwise operations
>>> and an array lookup, and I think all the affected functions are way more
>>> expensive than this, especially once you account for the work done on
>>> the returned data.  That's obviously true for anything with
>>> communication, anything with a large set of return data, and iteration
>>> initialization, so I'll omit those functions in my analysis below as
>>> being completely irrelevant from an overhead point of view.
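>>>
>>> (To make "a couple of bitwise operations and an array lookup" concrete,
>>> here is a minimal sketch in C; the typedefs stand in for the
>>> interface's opaque handle types, and every name is hypothetical:)
>>>
>>>    #include <stdint.h>
>>>
>>>    typedef void *PartHandle;    /* stand-ins for the opaque handle */
>>>    typedef void *SetHandle;     /* types in the interface          */
>>>
>>>    #define PART_INDEX_MASK 0xFFFu   /* low 12 bits: local part index */
>>>
>>>    static SetHandle part_sets[PART_INDEX_MASK + 1];  /* filled when
>>>                                                         parts are made */
>>>
>>>    static SetHandle part_to_set(PartHandle part)
>>>    {
>>>        uintptr_t idx = (uintptr_t)part & PART_INDEX_MASK;  /* bit op */
>>>        return part_sets[idx];                        /* array lookup */
>>>    }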
>>>
>>> The strongest argument for handle+rank (IMO) is that this is what's
>>> currently required for entities.  (Sets have no unique global ID, not
>>> even those that span processes and probably should have...).  On the
>>> other hand, a global entity handle space is Right Out for 32-bit
>>> machines: 4 billion entities / 100K processors = 40K entities/processor,
>>> which is on the thin side (and requires that all entities of all types
>>> be numbered consecutively... pointers need not apply).  I doubt that 4
>>> billion parts (or even a quarter billion, allowing for some wastage) is
>>> going to be a barrier...  (As an aside, are the huge parallel machines
>>> 32 bit or 64?)
>>>
>>> The strongest argument for global handle (IMO) is that the same handle
>>> is used both locally and globally for a part (perhaps only shifting
>>> rather than eliminating a translation burden, admittedly), and it looks
>>> and feels like all the other handles.  This is more elegant, and also
>>> eliminates some argument bloat in a number of functions.
>>>
>>> Single Global Handle
>>>
>>> Other advantages of the single global handle are that communication
>>> partners are clear from the single handle (whether for partitioning or
>>> for ghosting, etc), and that names for parts are congruent to names for
>>> all the other things in the data model.
>>>
>>> The disadvantage is that for some (probably for all, I'm guessing)
>>> implementations, there will have to be some intermediate work to get
>>> from a part handle to the actual memory location where that data is
>>> stored.  This is already a fact of life for all of us with entity
>>> handles (I know it is for GRUMMP, and I've seen enough of MOAB part
>>> handles while debugging the interaction between swapping and MOAB to
>>> know that it's an issue there, too; FMDB may manage to avoid this
>>> problem, though I can't imagine how).  Personally, I don't see this as a
>>> big deal (I've attached an example, based on my expectation that I'll
>>> probably implement parts as sets with extra stuff, but represent them
>>> externally as handles that are easily distinguishable from pointers,
>>> because the low-order bit will be set).
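>>>
>>> (A minimal sketch of that representation: set pointers are word-aligned,
>>> so their low-order bit is always clear, which leaves it free to mark
>>> part handles.  All names here are hypothetical:)
>>>
>>>    #include <stdint.h>
>>>
>>>    typedef void *Handle;
>>>
>>>    static Handle make_part_handle(uint32_t part_index)
>>>    {   /* shift the index up and force the low-order bit on */
>>>        return (Handle)(((uintptr_t)part_index << 1) | 1u);
>>>    }
>>>
>>>    static int is_part_handle(Handle h)
>>>    {   /* aligned set pointers always have a clear low-order bit */
>>>        return ((uintptr_t)h & 1u) != 0;
>>>    }
>>>
>>>    static uint32_t part_index_of(Handle h)
>>>    {
>>>        return (uint32_t)((uintptr_t)h >> 1);
>>>    }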
>>>
>>> Local functions where this may incur a cost that is not completely trivial:
>>>
>>> getNumPartNbors, getPartNbors, getNumPartBdryEnts, isEntOwner, getEntStatus
>>>
>>> Code that asks for entity ownership or status frequently is a potential
>>> problem from an efficiency point of view, but only if the search for an
>>> entity within a part is itself really cheap (read: constant in the size
>>> of the part); if that search dominates, the translation overhead is
>>> lost in it.
>>>
>>> I suspect that calls to get*PartNbors are going to typically be
>>> accompanied by communication with those neighbors and that calls to
>>> getNumPartBdryEnts are going to typically be accompanied by work
>>> proportional to the number of bdry entities.  If so, any translation
>>> overhead is lost in the noise for those functions.
>>>
>>> The part data access functions:
>>>
>>> getNumOfType, getNumOfTopo (both whole part and set within a part)
>>>
>>> The overhead for these funcs may be a noticeable percentage of the
>>> function cost.   Hard to see these being called enough to contribute
>>> significantly to overall runtime, though.
>>>
>>> Getting and setting tag data on parts.  (get/setEntSet*Data,
>>> getAllEntSetTags).  The relative cost here depends on how tags are
>>> stored.  Personally, I'm using hash tables (hashing the handle of the
>>> thing being tagged), and my hash function is way more expensive than an
>>> array lookup, so in my implementation, the overhead of the translation
>>> function will be a small percentage.  Other implementations may vary.
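>>>
>>> (For scale, a typical 64-bit handle hash costs several multiplies and
>>> shifts; the mix below is the Murmur3 finalizer, shown only as a
>>> representative example, not GRUMMP's actual code.  A mask-and-index
>>> translation is cheap next to it:)
>>>
>>>    #include <stdint.h>
>>>
>>>    static uint64_t hash_handle(const void *h)
>>>    {
>>>        uint64_t x = (uint64_t)(uintptr_t)h;
>>>        x ^= x >> 33;  x *= 0xff51afd7ed558ccdULL;
>>>        x ^= x >> 33;  x *= 0xc4ceb9fe1a85ec53ULL;
>>>        x ^= x >> 33;
>>>        return x;
>>>    }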
>>>
>>> As for various subsetting functions, those should never have a part
>>> handle passed to them, so they aren't affected.
>>>
>>> Local Handle Plus Process Rank
>>>
>>> The advantage here is that the same handle space can be used for sets
>>> and parts (applies to MOAB, probably GRUMMP; not FMDB).
>>>
>>> The disadvantage is that now the full name of a part has two pieces
>>> (handle and rank), which makes it a completely different animal than all
>>> other handles.
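>>>
>>> (That is, the full name becomes a pair, and every function that names a
>>> remote part has to carry both fields; a hypothetical sketch:)
>>>
>>>    typedef struct {
>>>        void *part_handle;  /* meaningful only on the owning process */
>>>        int   rank;         /* MPI rank of the owning process        */
>>>    } PartName;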
>>>
>>> getRankOfPart:   This function would have to be removed.
>>>
>>> Functions that will require a rank input arg in addition to part handle:
>>>
>>> testPart (which will now require communication), getNumPartBdryEnts,
>>> getPartBdryEnts, initPartBdryEnt*Iter, getCopyOnPart
>>>
>>> Functions that will require a rank or array of ranks to be returned (in
>>> addition to the current part return data):  (Note that copying the extra
>>> data also incurs a small cost.)
>>>
>>> getNumPartNbors, getPartNbors, getEntOwnerPart, getCopyParts, getCopies,
>>> getOwnerCopy
>>>
>>> All comm functions will now require a rank as well as a part (yes,
>>> internally the implementation has to do this anyway):
>>>
>>> exchEntArrToParts, addGhostOf, rmvGhostOf, exchTags, exchTagsEnt,
>>> IExchTags, IExchTagsEnt
>>>
>>> Functions that are a little dicey (because they are currently defined to
>>> give errors for remote parts):
>>>
>>> isEntOwner, getEntStatus
>>>
>>> While adding args to functions isn't necessarily that big a deal, why
>>> expand the call list if it isn't necessary?
>>>
>>>
>>>> Part I
>>>>
>>>> 1. If we are required to embed the proc rank in the handle, then
>>>> either I cannot use entity sets to represent parts, or I have to embed
>>>> the proc in all my set handles, which allows at most 2**(32-n)-1 other
>>>> sets available on any processor, where n is the log of the number of
>>>> processors; if we need to support millions of processors, n >= 20,
>>>> leaving me only 4k sets.  Putting that kind of limitation on the
>>>> number of sets will not be acceptable for some important applications
>>>> of sets, e.g. geometric topology sets, sets to represent kdtree
>>>> decomposition of the local mesh (for point location), or OBB tree
>>>> decomposition of surface triangulations (for ray firing).  Saying we
>>>> must not implement parts as sets is at least as bad as saying we must
>>>> implement parts as sets; furthermore, it prevents the dual-use of a
>>>> given set as a part also (something which I already do, and which is
>>>> quite useful).
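>>>>
>>>> (Spelling the arithmetic out with a small C program; the layout is
>>>> hypothetical, but the squeeze is real:)
>>>>
>>>>    #include <stdio.h>
>>>>
>>>>    int main(void)
>>>>    {   /* an n-bit rank in a 32-bit set handle leaves 32-n bits,
>>>>           i.e. 2**(32-n)-1 usable sets per processor */
>>>>        for (int n = 10; n <= 24; n += 2) {
>>>>            unsigned long procs = 1ul << n;
>>>>            unsigned long sets  = (1ul << (32 - n)) - 1;
>>>>            printf("n=%2d: %8lu procs -> %10lu sets\n", n, procs, sets);
>>>>        }
>>>>        return 0;   /* n = 20 (~1M procs) leaves only 4095 sets */
>>>>    }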
>>> Tim, as I understand the way MOAB manages its handle space (based on
>>> looking at entity handles), handles are partitioned integers:  the
>>> high-order bits tell what kind of entity, while the rest tell which of
>>> those entities it is.  So why not add another prefix specifically for
>>> parts?  Then you've got an ample space for dealing with globally unique
>>> part handles, the low order part of which can still be an index into an
>>> array of sets (now there will be two arrays of sets, a (typically) small
>>> one for parts and a large one for other sets).  It's no harder to grab
>>> the memory for a part/set than a regular set, -and- you have an easy way
>>> to identify erroneous calls to iMesh subsetting functions with part
>>> handles.  Now, it's possible (even likely) that my understanding of how
>>> MOAB does things is way off and it isn't nearly as easy as this.  But if
>>> it is, then your implementation issues for global part handles vanish,
>>> and we can lay this whole thing to rest without additional debate.
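>>>
>>> (A sketch of the layout I'm imagining; the encoding is hypothetical,
>>> not MOAB's actual one, which I may well have wrong:)
>>>
>>>    #include <stdint.h>
>>>
>>>    #define TYPE_SHIFT 28
>>>    #define TYPE_MASK  (0xFu << TYPE_SHIFT)  /* high bits: what kind */
>>>    #define TYPE_PART  0xEu                  /* new prefix for parts */
>>>
>>>    static uint32_t make_part_handle(uint32_t index)
>>>    {   /* low bits still index the (small) array of part sets */
>>>        return (TYPE_PART << TYPE_SHIFT) | (index & ~TYPE_MASK);
>>>    }
>>>
>>>    static int is_part_handle(uint32_t handle)
>>>    {   /* easy rejection of part handles passed to subsetting funcs */
>>>        return (handle & TYPE_MASK) == (TYPE_PART << TYPE_SHIFT);
>>>    }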
>>>
>>>> 2. Deriving the proc from a part handle will still need to be done
>>>> through a function in iMeshP, to avoid forcing applications to do
>>>> bit-wise operations on the handles (e.g. Fortran 77 doesn't have bitwise
>>>> operations).  If we need to do this interpretation in a function
>>>> anyway, why not also let the implementation decide how to track the
>>>> part/proc assignment?
>>>>
>>>> 3. If you allow applications to directly find the proc from the part
>>>> handle (without going through an iMesh function), then besides
>>>> specifying embedding the proc in the handle, you'll also
>>>> need to specify exactly how that's done, e.g. which bits hold the rank,
>>>> and whether that number of bits can change depending on the max # procs
>>>> you'd like to support with that mesh.  Are we confident we can get a
>>>> consensus on that?
>>> We've currently got a function for that in the interface.
>>>
>>>> 4. If the application is telling Zoltan about the parts, and a part
>>>> can reside on only one processor, then doesn't that mean that Zoltan
>>>> finds out where the parts reside at the same time?  If messages are
>>>> sent containing the part handle, is it so much trouble to add one
>>>> extra integer denoting the proc where that part resides?  If it's a
>>>> point to point message, you won't even have to do that, since you know
>>>> where the message came from (for MPI, anyway).
>>> Yes.  But (IMO) the main advantage to a global handle space for parts is
>>> in having all handles be congruent and in the iMeshP interface, not in
>>> how Zoltan massages data.
>>>
>>>> Part II
>>>>
>>>> Also, some comments on the proposed requirements and justification of
>>>> them:
>>>>
>>>>   + Part handles are globally unique (so that remote part handles make
>>>> sense and Zoltan doesn't go insane);
>>>> - remote part handles won't exist on a processor unless they were
>>>> communicated there somehow; in that case the implementation will be
>>>> able to store the proc # with (not in) the handle if it so chooses
>>> Right.  We're debating whether -passing- the rank through the iMeshP
>>> interface should be mandatory or forbidden (or at least, that's a
>>> consequence of the outcome).
>>>
>>>> - didn't I hear you say, Karen, that Zoltan computes its own 0..n-1
>>>> numbering of parts anyway?  Or, will you eliminate that if we have
>>>> unique part handles?
>>>> - are we assuming an application will allocate all the part handles
>>>> before calling Zoltan for the partitioning case?  Does it actually have
>>>> to migrate entities to the processor owning a part in order to assign
>>>> those entities to that part?  Or, can a given processor only assign
>>>> local entities to local parts?  That seems like a pretty severe
>>>> limitation (and would only allow you to partition for increasing #'s
>>>> of processors besides).
>>> For repartitionings with the same number of parts, when Zoltan hands
>>> back a list of part names (of whatever form) for each entity on each
>>> process, the local process has enough information to then spray data
>>> around to where it needs to go.
>>>
>>> For repartitionings with different numbers of parts, I can easily
>>> imagine a service that creates parts with no entities (for the case of
>>> the new partition having more parts), calls Zoltan, redistributes data,
>>> and deletes parts (for the case of the new partition having fewer parts).
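>>>
>>> (Such a service could be as simple as the sketch below; every type and
>>> function name here is hypothetical, neither iMeshP nor Zoltan API:)
>>>
>>>    typedef struct Mesh Mesh;               /* opaque, hypothetical  */
>>>    typedef struct Assignment Assignment;   /* part-per-entity list  */
>>>
>>>    Assignment *call_partitioner(Mesh *m, int nparts);
>>>    void create_empty_parts(Mesh *m, int count);
>>>    void delete_empty_parts(Mesh *m, int count);
>>>    void migrate_entities(Mesh *m, Assignment *a);
>>>
>>>    void repartition(Mesh *mesh, int old_nparts, int new_nparts)
>>>    {
>>>        if (new_nparts > old_nparts)       /* growing: add empty parts */
>>>            create_empty_parts(mesh, new_nparts - old_nparts);
>>>
>>>        Assignment *a = call_partitioner(mesh, new_nparts);
>>>        migrate_entities(mesh, a);         /* spray data where it goes */
>>>
>>>        if (new_nparts < old_nparts)       /* shrinking: drop extras   */
>>>            delete_empty_parts(mesh, old_nparts - new_nparts);
>>>    }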
>>>
>>> I don't see that either of these cases is materially affected (either in
>>> how it operates or in efficiency) by how the part name is represented,
>>> although Zoltan might have to be tweaked to accept a two-piece part name
>>> --- a longer collection of bits to go into the lookup table.
>>>
>>>>    +  Part handles allow identification (without communication or
>>>> O(#parts) storage) of what process the part lives on;
>>>>
>>>> - so let the implementation decide whether to store the process
>>>> alongside the part handle.  Even if an implementation decides to do
>>>> that, that still allows retrieval of the process assignment w/o
>>>> communication or O(#parts) storage, except in the case where you have
>>>> O(#parts) handles, in which case you're already storing that amount of
>>>> information.
>>> That's fine, but we have to either pass that rank through the interface
>>> or not, which is what we're trying to decide.
>>>
>>>> Part III: questions/comments arising from reading current draft:
>>>>
>>>> _createPartitionPar: do all processors have to assign the same handle to
>>>> the partition?  How does one designate partitions on the remote
>>>> processor?
>>> I thought we'd agreed that the partition handle isn't necessarily the
>>> same on all procs, but that since all processes will be running the same
>>> code, the partition handle will have the same variable name, so it won't
>>> matter.
>>>
>>>> _getNumPartsOnRank, _getPartsOnRank: having unique part handles and proc
>>>> rank embedded in the part handle is insufficient for satisfying these
>>>> functions.  These functions require either O(n) storage or communication
>>>> (or assumptions about how a total # parts is distributed).  I just
>>>> noted that that was discussed in the bootcamp and email thread.
>>>> However, these functions are still in the DraftInterface doc as of
>>>> 5/27 (with comments later about this issue).
>>> I agree with Tim here:  these funcs require either storage or
>>> communication.  And do we have a need for them?
>>>
>>> Carl
>>>

-- 
================================================================
"You will keep in perfect peace him whose mind is
   steadfast, because he trusts in you."               Isaiah 26:3

              Tim Tautges            Argonne National Laboratory
          (tautges at mcs.anl.gov)      (telecommuting from UW-Madison)
          phone: (608) 263-8485      1500 Engineering Dr.
            fax: (608) 263-4499      Madison, WI 53706



