itaps-parallel Comments on parallel stuff so far.
Carl Ollivier-Gooch
cfog at mech.ubc.ca
Mon Oct 29 15:25:21 CDT 2007
Hello, all.
For a while there, things were moving too fast for me to keep up. :-)
Now that I've read through all the email, I've got some comments, which
I'm sure comes as a shock to all of you. :-)
First, some comments on the long thread about (essentially) using
existing set functionality versus not:
Part of the problem may be that we currently have no clearly-defined
top-level partition. As a result, we don't have a way to create a new
part (and keep track of it). We also don't have a way to segregate two
separate partitionings. Some of Tim's comments (for example, about the
use of a tag convention to identify which set handles are parts, and
presumably to use the value of that tag to identify which partition they
belong to) may stem from the need to carry multiple partitions at once,
for which there's currently no proposal about functionality.
(Question as an aside for Tim: Are you thinking about different
partitionings of the same mesh, or different meshes on the same domain
with different partitionings? If the former, are the partitionings
nested, or not necessarily?)
We should at least consider having a "partition" object that holds all
the top-level data, including handles for all the parts, part-processor
mapping, maybe an MPI communicator for the partition, etc. Then we can
create/destroy parts (including registering them with the partition;
obviously creating the partition with some number of parts is a great
piece of common shorthand), assign them to processors, report on the
mapping of parts to processors, etc, which is functionality that may be
(as I read it) only partially proposed at this point. I recognize that
some of these functions may turn out to be trivial, but conceptually
they're still worth considering because it's functionality that pretty
clearly needs to be there. Obviously some of this functionality is
already proposed in various pieces that are currently under discussion.
This level also seems to be the logical place for global calls, like
those Tim proposes, as well as some of those in the RPI document (like
getTotNumOfParts, getNumOfPartsArr, etc). An advantage to this is that
we can then use a partition handle of some sort in calls like
getGlobalPartOnProc (one of numerous examples) that require the ability
to distinguish between multiple partitions when multiple partitions exist.
Also, we need to be able to identify which partition a part belongs to.
This can be a function or a tag convention; I'm thinking of it just as
functionality at this point; whether it eventually is a function or a
conventional tag or something else is a question for a bit later.
Overall, what I'm proposing looks more or less like the following:
Partition level:
Data:
. IDs of all parts (every part belongs to a partition, like every
serial set belongs to a root set)
. mapping of parts to processors
. the existence of a list of processors implies an MPI communicator,
presumably; whether/how to expose this communicator to apps?
New Functionality:
. getTotNumOfParts
. getNumOfPartsArr
. global<->local part ID mapping and part<->processor mapping,
supplementing or superseding what's currently proposed
(getPartIDArrOnProc, isPartOnProc, getGlobalPartOnProc,
getLocalPartOnProc)
. addPartOnProc, rmvProcOnPart
. createPart, destroyPart
. getNumEntsPar, getNumEntSetsPar, getNumEntsProcPar,
getNumEntSetsProcPar
. getTagsPar, getTagsProcPar, get by tag
. global reduce on tags
. get/setGlobalIDSize
Processor level:
Data:
. IDs of all parts on that processor
. iMesh instance
New Functionality:
. getNumOfPartsOnProc
. getPartsOnProc (all part handles)
. part iterator (could be replaced by mapping local ID to part
handle)
. local ID <-> (local) part handle
. iMesh handle
. what partition do I belong to?
Part level:
Data:
. owned entities
. copies of non-owned interpart bdry entities
New Functionality:
. getNghbPartIds (if this returns global part IDs (proc rank + local
ID), then getNghbProcRanks is redundant)
. part bdry iterators (both single and array versions)
. part bdry size info
. getOwnerOfEnt, getCopiesOfEnt, getCopyOfEnt, getNumOfCopiesOfEnt
. entity categorization within part
. addCopyOfEnt, rmvCopyOfEnt
. disambiguate ownership of shared entity
. what partition do I belong to?
. get/set/createGlobalID[Arr]
. getEntHandle[Arr] from global ID
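To make that outline a bit more concrete, here's a rough C sketch of
what a few of those signatures might look like. Every name here (the
iPartition_* / iPart_* spellings, the handle typedefs) is just a
placeholder of mine, following the error-code-last and array-triple
conventions of the current iMesh C binding, not a proposal for actual
spelling:

    #include <mpi.h>
    #include "iMesh.h"  /* iMesh_Instance; pulls in iBase.h for entity handles */

    /* Placeholder handle types, not proposed spec. */
    typedef void *iPartition_Handle;  /* top-level partition object           */
    typedef void *iPart_Handle;       /* one part, resident on this processor */

    /* Partition level: creation, part bookkeeping, part<->processor mapping. */
    void iPartition_create(iMesh_Instance instance, MPI_Comm comm,
                           iPartition_Handle *partition, int *err);
    void iPartition_getTotNumOfParts(iPartition_Handle partition,
                                     int *num_global_parts, int *err);
    void iPartition_createPart(iPartition_Handle partition,
                               iPart_Handle *new_part, int *err);
    void iPartition_getPartOwnerProc(iPartition_Handle partition,
                                     int global_part_id, int *proc_rank,
                                     int *err);

    /* Processor level: which parts live here. */
    void iPartition_getPartsOnProc(iPartition_Handle partition,
                                   iPart_Handle **parts, int *parts_allocated,
                                   int *parts_size, int *err);

    /* Part level: ownership and copy queries for part-boundary entities. */
    void iPart_getOwnerOfEnt(iPart_Handle part, iBase_EntityHandle entity,
                             int *owner_global_part_id, int *err);
    void iPart_getNumOfCopiesOfEnt(iPart_Handle part, iBase_EntityHandle entity,
                                   int *num_copies, int *err);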
======================================================================
On the question of how to add entities to parts, I think we need to
examine the use cases for adding entities to parts. I can think of
several scenarios:
1. A new entity has been created by a modify operation and needs to
be associated with a part.
2. An entity is being migrated from one part to another (small
one-to-one migration due to change of ownership; data is almost
certainly moving from one part to another).
3. A copy is being created (depending on the implementation, some
data may have to be copied from one part to another).
4. A new partition has been created and parts must be populated
(massive all-to-all migration; vast amounts of data flying everywhere).
That list probably isn't exhaustive, but it'll do for now. Of those
four, I see one, possibly two, that -don't- involve parallel data
transfer. Also, all of them will potentially affect not only ownership of
the partitioned primary entities but also the part boundaries. At the very
least, we need to be very clear that we're doing things that involve
parallel communication here. One possible way to handle this:
o We could handle scenario 1 cleanly with set syntax, with the
-specific- caveat clearly documented that this function is for use only
with local entities. This is enforceable by requiring the entity args to
be handles (necessarily local to the processor, though not the part) and
mandating an error return when trying a back-door migration between
parts on a single processor. This is certainly possible, but is it
common enough and unambiguous enough to justify doing it, especially in
the face of a need for a different mechanism for other scenarios?
o As I just hinted at, I don't think the current set calls are
appropriate for Scenarios 2-4 (and other migration scenarios), because
the syntax then hides that there's parallel communication going on (as
well as the issue of needing to be able to grab a communicator,
possibly). Among other viable approaches, the migration scenarios could
be handled by an explicit sendEntToPart/receiveEnts call pair, which I
envision using syntax that is qualitatively like this:
    iPart_sendEntArrToParts(source_part_handle, ent_handle_array,
                            target_part_ID_array, MIGRATE);
    iPart_sendEntArrToParts(source_part_handle, ent_handle_array,
                            target_part_ID_array, COPY);
    iPart_receiveEnts();
Yes, I can envision potentially sending ghost data using this same
approach. And yes, the actual data packaged up and sent from part to
part is going to need to be implementation dependent (and possibly
application specifiable...). And yes again,
sendEntArrToParts(...MIGRATE) should delete local data; when is a good
question... No, I don't think single-entity versions of these are likely
to be a good idea for scalability reasons.
The last call (receiveEnts) would block waiting for data; a broadcast up
front is probably needed to make sure every processor knows what
messages it has to wait for. Once all the data is received, then the
receiveEnts call can also do whatever implementation-dependent things
are needed to make sure the interior/bdry/ghost info has been properly
updated internally.
I like the pair of calls not only because it gives a natural place to
update things like part bdry info, but also because it allows at least
some latency hiding, which an immediate, one-call migrate functionality
doesn't.
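To be concrete about what I mean by the pair, here's a rough C
elaboration of the calls sketched above, reusing the placeholder handle
types from the earlier sketch. The argument lists, the MIGRATE/COPY
enum, and the pre-count exchange are all placeholders of mine, not
proposed syntax:

    #include <mpi.h>
    #include "iMesh.h"

    /* Placeholder flag distinguishing migration from copy/ghost creation. */
    enum iPart_SendMode { iPart_MIGRATE, iPart_COPY };

    /* Queue entities for shipment to other parts; packing whatever
       implementation-dependent data is needed happens inside.  MIGRATE
       would eventually delete the local originals; COPY would not. */
    void iPart_sendEntArrToParts(iPart_Handle source_part,
                                 const iBase_EntityHandle *ent_handles,
                                 int num_ents,
                                 const int *target_global_part_ids,
                                 enum iPart_SendMode mode, int *err);

    /* Blocking receive.  Internally the implementation might first do an
       MPI_Alltoall of per-destination message counts so each processor
       knows how many messages to wait for, then post the receives, and
       finally update interior/boundary/ghost classification. */
    void iPart_receiveEnts(iPartition_Handle partition, int *err);

    /* Usage: ship a batch of entities, then collect whatever arrives here. */
    void migrate_batch(iPart_Handle my_part, iPartition_Handle partition,
                       iBase_EntityHandle *ents, int num_ents, int *dest_ids)
    {
      int err;
      iPart_sendEntArrToParts(my_part, ents, num_ents, dest_ids,
                              iPart_MIGRATE, &err);
      iPart_receiveEnts(partition, &err);
    }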
================================================================
And now that I've no doubt said something that everyone will agree with
and something that everyone will disagree with, I'll move on to some
specific comments. :-)
RPI (Parts, etc)
I had some comments about iProcParts, iPart, iPartMesh that are now
covered by the discussion above.
part iterator: Natural analog to entities, BUT do we really need this,
given that the array of part handles for a given processor is likely to
be small enough that we won't mind just grabbing the whole array and
iterating outside the interface?
getPartMesh: Given that we can call all applicable iMesh functions with
a part handle, do you see a role for this other than mesh modification?
[Comments on part<->processor mapping subsumed above.]
isEntOnPartBdry, isEntOwner, isEntGhost: I think we should combine these
into a single call with an enumerated "return" value. Simpler now, and
allows for future expansion in the list of properties we care about, if
needed.
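Something like the following is what I have in mind (the enum values and
the function name are only illustrative):

    /* One status query instead of three booleans; easy to extend later. */
    enum iPart_EntStatus {
      iPart_INTERIOR,        /* owned here, not on any part boundary  */
      iPart_BOUNDARY_OWNED,  /* on a part boundary, owned here        */
      iPart_BOUNDARY_COPY,   /* on a part boundary, owned elsewhere   */
      iPart_GHOST            /* ghost copy, owned elsewhere           */
    };

    void iPart_getEntStatus(iPart_Handle part, iBase_EntityHandle entity,
                            enum iPart_EntStatus *status, int *err);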
Also, need a way to set/negotiate ownership of shared entities on part
boundaries.
Tim's stuff:
getEntByTypeAndTagPar, getEntSetByTagPar: We don't have this
functionality in serial. Maybe we should, but we currently don't.
getTag*Operate: Again, we haven't got this in serial. Does the
existence of such operations imply that we expect to implement fields as
tags? (Because that wasn't what I was assuming about field
implementations at all, personally...) Note that I'm not opposed to
this sort of global reduction operation, I just wonder whether it'll see
use outside of field-like situations. If not, then it should be in
parallel fields, not parallel mesh, and usage for
fields-implemented-as-tags should be handled there.
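If such a reduction does go in somewhere, I'd picture something roughly
like this (again, the names and argument list are purely illustrative):

    /* Combine a tag's values across all parts holding copies of each
       entity, depositing the result in a (possibly different) tag. */
    enum iPart_ReduceOp { iPart_SUM, iPart_MIN, iPart_MAX };

    void iPartition_reduceTags(iPartition_Handle partition,
                               iBase_TagHandle source_tag,
                               iBase_TagHandle dest_tag,
                               enum iPart_ReduceOp op, int *err);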
Global IDs:
One question here. As I understand it, the intention is that
getEntHandle[Arr] will return a local handle (owned, copied, or ghosted)
for the global ID if possible, and otherwise give an error. Is that
correct?
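In other words, I'm picturing semantics roughly like this (the name and
the int-sized global ID are just placeholders for the sketch):

    /* Presumed semantics: hand back the local handle (owned, copy, or
       ghost) for a global ID, or set a not-found error if there is no
       local handle. */
    void iPart_getEntHandleFromGlobalID(iPart_Handle part, int global_id,
                                        iBase_EntityHandle *entity, int *err);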
I now return you to your regularly scheduled afternoon.
Carl
--
------------------------------------------------------------------------
Dr. Carl Ollivier-Gooch, P.Eng. Voice: +1-604-822-1854
Associate Professor Fax: +1-604-822-2403
Department of Mechanical Engineering email: cfog at mech.ubc.ca
University of British Columbia http://www.mech.ubc.ca/~cfog
Vancouver, BC V6T 1Z4 http://tetra.mech.ubc.ca/ANSLab/
------------------------------------------------------------------------