[itaps-parallel] Part handles
Devine, Karen D
kddevin at sandia.gov
Mon Jun 23 09:36:29 CDT 2008
On 6/22/08 8:13 PM, "Tim Tautges" <tautges at mcs.anl.gov> wrote:
> So I'm confused here - are we or are we not requiring that applications
> be able to derive the rank from the part handle *without making an iMesh
> function call*?
The processor information would be obtained from a part handle via a
function call, iMeshP_getRankOfPart.
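For concreteness, a call would look something like this (a sketch only;
the draft is still in flux, so the exact argument list here is
illustrative):

    /* sketch: `instance', `partition', and `part_handle' are assumed to
       have been obtained earlier from iMesh/iMeshP calls */
    int rank, err;
    iMeshP_getRankOfPart(instance, partition, part_handle, &rank, &err);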
>
> If we are requiring that, a number of issues haven't been addressed (F77,
> which bits and how many).
>
> If we are not requiring that, and we already have a function for getting
> the rank from a part handle, why are we discussing this at all? Isn't
> it then an implementation choice?
For any part handle, iMeshP_getRankOfPart cannot invoke communication;
otherwise, processors would have to call it synchronously.
>
> Or, and I'm guessing this is the real issue, are we saying applications
> should be able to pass part handles around, then pass them back into the
> interface and get information from them? If so, this sounds pretty
> unique. It also sounds pretty unlikely, considering parts can't be
> created on remote processes.
Yes, I believe we are saying applications should be able to pass part
handles around, then pass them back into the interface and get information
from them. I assume this is true for other remote handles; that is, if I
pass a remote entity handle to iMesh_getEntType or iMesh_getEntTopo, I get a
correct answer, right? As Carl has said, for remote copies, we'll need
either part handles that provide processor information or [part handle,
processor rank] pairs.
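(For example, something like

    /* `remote_entity_handle' assumed to have arrived from another
       process; argument list is my reading of the iMesh C binding */
    int type, err;
    iMesh_getEntType(instance, remote_entity_handle, &type, &err);

should give the right answer regardless of where the handle originated.)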
Karen
>
> - tim
>
> Carl Ollivier-Gooch wrote:
>> Tim Tautges wrote:
>>> Hi all,
>>> I realize now why I missed this discussion in March: it was in the
>>> middle of a 3 week/3 trip stretch for me, and I got overwhelmed and
>>> didn't follow the discussion closely.
>>>
>>> Anyway, this email is an attempt to summarize my objection to embedding
>>> the proc rank in the part handle. As background, you should know that I
>>> almost went this way in MOAB (to the point where it's still implemented
>>> in MOAB) but ultimately decided not to, for many of the same reasons I
>>> note below.
>>>
>>> And, sorry if this is dumb and/or offensive, but I still don't see a
>>> logical, non-pejorative statement of the problem whose only, or most
>>> logical, solution is the two primary requirements causing me such
>>> heartburn (part handles must be unique, and processor must be
>>> derivable from part handle). And that's after reading the March email
>>> thread at least three times, and thinking about this for at least 4
>>> hours over the last couple of days.
>>>
>>> The remainder of this email consists of 3 parts:
>>> Part I: my objections
>>> Part II: questions about the requirements Karen sent out Wed afternoon
>>> Part III: questions/comments arising while looking through current
>>> interface draft
>>
>> Definitions:
>>
>> Name: Complete unique identifier of something in the data model.
>> Handle: Some series of bits of the same size as a void*; interpretation
>> of those bits is implementation-dependent.
>>
>> Summary (50,000 foot view)
>>
>> What this debate seems to me to reduce to is this: Should the name of a
>> part consist of a local handle plus a process rank (or equivalents); or
>> of a single global handle that is also used locally and can be used to
>> deduce rank.
>>
>> 10,000 foot view
>>
>> Personally, I don't see the efficiency issue as a big one: IMO, the
>> overhead of having to translate a part handle to a set handle (for
>> instance) can be reduced to no more than a couple of bitwise operations
>> and an array lookup, and I think all the affected functions are way more
>> expensive than this, especially once you account for the work done on
>> the returned data. That's obviously true for anything with
>> communication, anything with a large set of return data, and iteration
>> initialization, so I'll omit those functions in my analysis below as
>> being completely irrelevant from an overhead point of view.
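>>
>> To make "a couple of bitwise operations and an array lookup" concrete,
>> here is the sort of thing I have in mind (the bit layout is an
>> assumption for illustration, not something the draft mandates):
>>
>>   #include <stdint.h>
>>
>>   /* assumed layout: high bits hold the owning rank, low bits hold an
>>      index into a per-process table of part data */
>>   enum { RANK_BITS = 20, LOCAL_BITS = 32 - RANK_BITS };
>>   #define LOCAL_MASK ((1u << LOCAL_BITS) - 1u)
>>
>>   typedef struct PartData PartData;    /* implementation's part record */
>>   extern PartData *local_part_table[1u << LOCAL_BITS];
>>
>>   static int part_rank(uint32_t h)   { return (int)(h >> LOCAL_BITS); }
>>   static PartData *part_data(uint32_t h)
>>   { return local_part_table[h & LOCAL_MASK]; }   /* mask + lookup */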
>>
>> The strongest argument for handle+rank (IMO) is that this is what's
>> currently required for entities. (Sets have no unique global ID, not
>> even those that span processes and probably should have...). On the
>> other hand, a global entity handle space is Right Out for 32-bit
>> machines: 4 billion entities / 100K processors = 40K entities/processor,
>> which is on the thin side (and requires that all entities of all types
>> be numbered consecutively... pointers need not apply). I doubt that 4
>> billion parts (or even a quarter billion, allowing for some wastage) is
>> going to be a barrier... (As an aside, are the huge parallel machines
>> 32 bit or 64?)
>>
>> The strongest argument for global handle (IMO) is that the same handle
>> is used both locally and globally for a part (perhaps only shifting
>> rather than eliminating a translation burden, admittedly), and it looks
>> and feels like all the other handles. This is more elegant, and also
>> eliminates some argument bloat in a number of functions.
>>
>> Single Global Handle
>>
>> Other advantages of the single global handle are that communication
>> partners are clear from the single handle (whether for partitioning or
>> for ghosting, etc), and that names for parts are congruent to names for
>> all the other things in the data model.
>>
>> The disadvantage is that for some (probably for all, I'm guessing)
>> implementations, there will have to be some intermediate work to get
>> from a part handle to the actual memory location where that data is
>> stored. This is already a fact of life for all of us with entity
>> handles (I know it is for GRUMMP, and I've seen enough of MOAB part
>> handles while debugging the interaction between swapping and MOAB to
>> know that it's an issue there, too; FMDB may manage to avoid this
>> problem, though I can't imagine how). Personally, I don't see this as a
>> big deal (I've attached an example, based on my expectation that I'll
>> probably implement parts as sets with extra stuff, but represent them
>> externally as handles that are easily distinguishable from pointers,
>> because the low-order bit will be set).
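>>
>> In rough form, the scheme looks like this (a sketch with made-up names,
>> not the actual attachment):
>>
>>   #include <stddef.h>
>>   #include <stdint.h>
>>
>>   typedef void *SetHandle;          /* parts stored internally as sets */
>>   extern SetHandle part_sets[];     /* per-process array of part sets  */
>>
>>   /* external part handle: the index shifted left with the low bit set,
>>      so it can never collide with an aligned pointer (low bit clear) */
>>   static uintptr_t make_part_handle(size_t i)
>>   { return ((uintptr_t)i << 1) | 1u; }
>>   static int is_part_handle(uintptr_t h)    { return (h & 1u) != 0; }
>>   static SetHandle part_to_set(uintptr_t h) { return part_sets[h >> 1]; }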
>>
>> Local functions for which this may incur a cost that is not completely
>> trivial:
>>
>> getNumPartNbors, getPartNbors, getNumPartBdryEnts, isEntOwner, getEntStatus
>>
>> Code that often asks for entity ownership or status is a potential
>> problem from an efficiency point of view, but only if the search for an
>> entity within a part is itself very cheap (read: constant in the size
>> of the part).
>>
>> I suspect that calls to get*PartNbors are going to typically be
>> accompanied by communication with those neighbors and that calls to
>> getNumPartBdryEnts are going to typically be accompanied by work
>> proportional to the number of bdry entities. If so, any translation
>> overhead is lost in the noise for those functions.
>>
>> The part data access functions:
>>
>> getNumOfType, getNumOfTopo (both whole part and set within a part)
>>
>> The overhead for these funcs may be a noticeable percentage of the
>> function cost. Hard to see these being called enough to contribute
>> significantly to overall runtime, though.
>>
>> Getting and setting tag data on parts. (get/setEntSet*Data,
>> getAllEntSetTags). The relative cost here depends on how tags are
>> stored. Personally, I'm using hash tables (hashing the handle of the
>> thing being tagged), and my hash function is way more expensive than an
>> array lookup, so in my implementation, the overhead of the translation
>> function will be a small percentage. Other implementations may vary.
>>
>> As for various subsetting functions, those should never have a part
>> handle passed to them, so they aren't affected.
>>
>> Local Handle Plus Process Rank
>>
>> The advantage here is that the same handle space can be used for sets
>> and parts (applies to MOAB, probably GRUMMP; not FMDB).
>>
>> The disadvantage is that now the full name of a part has two pieces
>> (handle and rank), which makes it a completely different animal than all
>> other handles.
>>
>> getRankOfPart: This function would have to be removed.
>>
>> Functions that will require a rank input arg in addition to part handle:
>>
>> testPart (which will now require communication), getNumPartBdryEnts,
>> getPartBdryEnts, initPartBdryEnt*Iter, getCopyOnPart
>>
>> Functions that will require a rank or array of ranks to be returned (in
>> addition to the current part return data): (Note that copying the extra
>> data also incurs a small cost.)
>>
>> getNumPartNbors, getPartNbors, getEntOwnerPart, getCopyParts, getCopies,
>> getOwnerCopy
>>
>> All comm functions will now require a rank as well as a part (yes,
>> internally the implementation has to do this anyway):
>>
>> exchEntArrToParts, addGhostOf, rmvGhostOf, exchTags, exchTagsEnt,
>> IExchTags, IExchTagsEnt
>>
>> Functions that are a little dicey (because they are currently defined to
>> give errors for remote parts):
>>
>> isEntOwner, getEntStatus
>>
>> While adding args to functions isn't necessarily that big a deal, why
>> expand the call list if it isn't necessary?
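>>
>> Purely to make the argument bloat concrete, here is one function both
>> ways (argument lists and name suffixes invented for contrast, not the
>> draft's):
>>
>>   typedef void *EntityHandle;
>>   typedef void *PartHandle;
>>
>>   /* single global handle: one argument names the remote part */
>>   void getCopyOnPart_global(EntityHandle ent, PartHandle part,
>>                             EntityHandle *copy, int *err);
>>
>>   /* handle+rank: every remote-part argument grows a rank twin */
>>   void getCopyOnPart_pair(EntityHandle ent, PartHandle part, int rank,
>>                           EntityHandle *copy, int *err);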
>>
>>
>>> Part I
>>>
>>> 1. If we are required to embed the proc rank in the handle, then
>>> either I cannot use entity sets to represent parts, or I have to embed
>>> the proc in all my set handles, which leaves at most 2**(32-n)-1 other
>>> sets available on any processor, where n is the base-2 log of the number
>>> of processors; if we need to support millions of processors, n >= 20,
>>> leaving me only 4k sets. Putting that kind of limitation on the
>>> number of sets will not be acceptable for some important applications
>>> of sets, e.g. geometric topology sets, sets to represent kdtree
>>> decomposition of the local mesh (for point location), or OBB tree
>>> decomposition of surface triangulations (for ray firing). Saying we
>>> must not implement parts as sets is at least as bad as saying we must
>>> implement parts as sets; furthermore, it prevents the dual-use of a
>>> given set as a part also (something which I already do, and which is
>>> quite useful).
>>
>> Tim, as I understand the way MOAB manages its handle space (based on
>> looking at entity handles), handles are partitioned integers: the
>> high-order bits tell what kind of entity, while the rest tell which of
>> those entities it is. So why not add another prefix specifically for
>> parts? Then you've got an ample space for dealing with globally unique
>> part handles, the low order part of which can still be an index into an
>> array of sets (now there will be two arrays of sets, a (typically) small
>> one for parts and a large one for other sets). It's no harder to grab
>> the memory for a part/set than a regular set, -and- you have an easy way
>> to identify erroneous calls to iMesh subsetting functions with part
>> handles. Now, it's possible (even likely) that my understanding of how
>> MOAB does things is way off and it isn't nearly as easy as this. But if
>> it is, then your implementation issues for global part handles vanish,
>> and we can lay this whole thing to rest without additional debate.
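>>
>> (Again purely illustrative -- my guess at the flavor of such an
>> encoding, not MOAB's actual one:)
>>
>>   #include <stdint.h>
>>
>>   enum { KIND_BITS = 4, INDEX_BITS = 32 - KIND_BITS };
>>   enum { KIND_SET = 0x8, KIND_PART = 0x9 };  /* hypothetical prefixes */
>>
>>   static uint32_t make_handle(unsigned kind, uint32_t index)
>>   { return ((uint32_t)kind << INDEX_BITS) | index; }
>>   static unsigned handle_kind(uint32_t h)  { return h >> INDEX_BITS; }
>>   static uint32_t handle_index(uint32_t h)
>>   { return h & ((1u << INDEX_BITS) - 1u); }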
>>
>>> 2. Deriving the proc from a part handle will still need to be done
>>> through a function in iMeshP, to avoid forcing applications to do
>>> bit-wise operations on the handles (e.g. Fortran 77 doesn't have bitwise
>>> operations). If we need to do this interpretation in a function
>>> anyway, why not also let the implementation decide how to track the
>>> part/proc assignment?
>>>
>>> 3. If you allow applications to directly find the proc from the part
>>> handle (without going through an iMesh function), then besides
>>> specifying embedding the proc in the handle, you'll also
>>> need to specify exactly how that's done, e.g. which bits hold the rank,
>>> and whether that number of bits can change depending on the max # procs
>>> you'd like to support with that mesh. Are we confident we can get a
>>> consensus on that?
>>
>> We've currently got a function for that in the interface.
>>
>>> 4. If the application is telling Zoltan about the parts, and a part
>>> can reside on only one processor, then doesn't that mean that Zoltan
>>> finds out where the parts reside at the same time? If messages are
>>> sent containing the part handle, is it so much trouble to add one
>>> extra integer denoting the proc where that part resides? If it's a
>>> point-to-point message, you won't even have to do that, since you know
>>> where the message came from (for MPI, anyway).
>>
>> Yes. But (IMO) the main advantage of a global handle space for parts is
>> in having all handles be congruent, and it shows up in the iMeshP
>> interface, not in how Zoltan massages data.
>>
>>> Part II
>>>
>>> Also, some comments on the proposed requirements and justification of
>>> them:
>>>
>>> + Part handles are globally unique (so that remote part handles make
>>> sense and Zoltan doesn't go insane);
>>> - remote part handles won't exist on a processor unless they were
>>> communicated there somehow; in that case the implementation will be
>>> able to store the proc # with (not in) the handle if it so chooses
>>
>> Right. We're debating whether -passing- the rank through the iMeshP
>> interface should be mandatory or forbidden (or at least, that's a
>> consequence of the outcome).
>>
>>> - didn't I hear you say, Karen, that Zoltan computes its own 0..n-1
>>> numbering of parts anyway? Or, will you eliminate that if we have
>>> unique part handles?
>>> - are we assuming an application will allocate all the part handles
>>> before calling Zoltan for the partitioning case? Does it actually have
>>> to migrate entities to the processor owning a part in order to assign
>>> those entities to that part? Or, can a given processor only assign
>>> local entities to local parts? That seems like a pretty severe
>>> limitation (and would only allow you to partition for increasing #'s
>>> of processors besides)
>>
>> For repartitionings with the same number of parts, when Zoltan hands
>> back a list of part names (of whatever form) for each entity on each
>> process, the local process has enough information to then spray data
>> around to where it needs to go.
>>
>> For repartitionings with different numbers of parts, I can easily
>> imagine a service that creates parts with no entities (for the case of
>> the new partition having more parts), calls Zoltan, redistributes data,
>> and deletes parts (for the case of the new partition having fewer parts).
>>
>> I don't see that either of these cases is materially affected (either in
>> how it operates or in efficiency) by how the part name is represented,
>> although Zoltan might have to be tweaked to accept a two-piece part name
>> --- a longer collection of bits to go into the lookup table.
>>
>>> + Part handles allow identification (without communication or
>>> O(#parts) storage) of what process the part lives on;
>>>
>>> - so let the implementation decide whether to store the process
>>> alongside the part handle. Even if an implementation decides to do
>>> that, that still allows retrieval of the process assignment w/o
>>> communication or O(#parts) storage, except in the case where you have
>>> O(#parts) handles, in which case you're already storing that amount of
>>> information.
>>
>> That's fine, but we have to either pass that rank through the interface
>> or not, which is what we're trying to decide.
>>
>>> Part III: questions/comments arising from reading current draft:
>>>
>>> _createPartitionPar: do all processors have to assign the same handle to
>>> the partition? How does one designate partitions on the remote
>>> processor?
>>
>> I thought we'd agreed that the partition handle isn't necessarily the
>> same on all procs, but that since all processes will be running the same
>> code, the partition handle will have the same variable name, so it won't
>> matter.
>>
>>> _getNumPartsOnRank, _getPartsOnRank: having unique part handles and proc
>>> rank embedded in the part handle is insufficient for satisfying these
>>> functions. These functions require either O(n) storage or communication
>>> (or assumptions about how the total # of parts is distributed). I just
>>> noted that that was discussed in the bootcamp and email thread.
>>> However, these functions are still in the DraftInterface doc as of
>>> 5/27 (with comments later about this issue)
>>
>> I agree with Tim here: these funcs require either storage or
>> communication. And do we have a need for them?
>>
>> Carl
>>
>
> --
> ================================================================
> "You will keep in perfect peace him whose mind is
> steadfast, because he trusts in you." Isaiah 26:3
>
> Tim Tautges Argonne National Laboratory
> (tautges at mcs.anl.gov) (telecommuting from UW-Madison)
> phone: (608) 263-8485 1500 Engineering Dr.
> fax: (608) 263-4499 Madison, WI 53706
>
>
>