itaps-parallel Comments on parallel stuff so far.
Carl Ollivier-Gooch
cfog at mech.ubc.ca
Mon Oct 29 15:25:21 CDT 2007
Hello, all.
For a while there, things were moving too fast for me to keep up. :-)
Now that I've read through all the email, I've got some comments, which
I'm sure comes as a shock to all of you. :-)
First, some comments on the long thread about (essentially) using
existing set functionality versus not:
Part of the problem may be that we currently have no clearly-defined
top-level partition. As a result, we don't have a way to create a new
part (and keep track of it). We also don't have a way to segregate two
separate partitionings. Some of Tim's comments (for example, about the
use of a tag convention to identify which set handles are parts, and
presumably to use the value of that tag to identify which partition they
belong to) may stem from the need to carry multiple partitions at once,
for which there's currently no proposal about functionality.
(Question as an aside for Tim: Are you thinking about different
partitionings of the same mesh, or different meshes on the same domain
with different partitionings? If the former, are the partitionings
nested, or not necessarily?)
We should at least consider having a "partition" object that holds all
the top-level data, including handles for all the parts, part-processor
mapping, maybe an MPI communicator for the partition, etc. Then we can
create/destroy parts (including registering them with the partition;
obviously creating the partition with some number of parts is a great
piece of common shorthand), assign them to processors, report on the
mapping of parts to processors, etc, which is functionality that may be
(as I read it) only partially proposed at this point. I recognize that
some of these functions may turn out to be trivial, but conceptually
they're still worth considering because it's functionality that pretty
clearly needs to be there. Obviously some of this functionality is
already proposed in various pieces that are currently under discussion.
This level also seems to be the logical place for global calls, like
those Tim proposes, as well as some of those in the RPI document (like
getTotNumOfParts, getNumOfPartsArr, etc). An advantage to this is that
we can then use a partition handle of some sort in calls like
getGlobalPartOnProc (one of numerous examples) that require the ability
to distinguish between multiple partitions when multiple partitions exist.
Also, we need to be able to identify which partition a part belongs to.
This can be a function or a tag convention; I'm thinking of it just as
functionality at this point; whether it eventually is a function or a
conventional tag or something else is a question for a bit later.
Overall, what I'm proposing looks more or less like the following:
Partition level:
Data:
. IDs of all parts (every part belongs to a partition, like every
serial set belongs to a root set)
. mapping of parts to processors
. the existence of a list of processors implies an MPI communicator,
presumably; whether/how to expose this communicator to apps?
New Functionality:
. getTotNumOfParts
. getNumOfPartsArr
. global<->local part ID mapping and part<->processor mapping,
supplementing or superseding what's currently proposed
(getPartIDArrOnProc, isPartOnProc, getGlobalPartOnProc,
getLocalPartOnProc)
. addPartOnProc, rmvProcOnPart
. createPart, destroyPart
. getNumEntsPar, getNumEntSetsPar, getNumEntsProcPar,
getNumEntSetsProcPar
. getTagsPar, getTagsProcPar, get by tag
. global reduce on tags
. get/setGlobalIDSize
Processor level:
Data:
. IDs of all parts on that processor
. iMesh instance
New Functionality:
. getNumOfPartsOnProc
. getPartsOnProc (all part handles)
. part iterator (could be replaced by mapping local ID to part
handle)
. local ID <-> (local) part handle
. iMesh handle
. what partition do I belong to?
Part level:
Data:
. owned entities
. copies of non-owned interpart bdry entities
New Functionality:
. getNghbPartIds (if this returns global part IDs (proc rank + local
ID), then getNghbProcRanks is redundant)
. part bdry iterators (both single and array versions)
. part bdry size info
. getOwnerOfEnt, getCopiesOfEnt, getCopyOfEnt, getNumOfCopiesOfEnt
. entity categorization within part
. addCopyOfEnt, rmvCopyOfEnt
. disambiguate ownership of shared entity
. what partition do I belong to?
. get/set/createGlobalID[Arr]
. getEntHandle[Arr] from global ID
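To make that outline a bit more concrete, here's a rough C sketch of
what a few of those signatures might look like. Every name here (the
iPartition_* / iPart_* spellings, the handle typedefs) is just a
placeholder of mine, following the error-code-last and array-triple
conventions of the current iMesh C binding, not a proposal for actual
spelling:

    #include <mpi.h>
    #include "iMesh.h"  /* iMesh_Instance; pulls in iBase.h for entity handles */

    /* Placeholder handle types, not proposed spec. */
    typedef void *iPartition_Handle;  /* top-level partition object           */
    typedef void *iPart_Handle;       /* one part, resident on this processor */

    /* Partition level: creation, part bookkeeping, part<->processor mapping. */
    void iPartition_create(iMesh_Instance instance, MPI_Comm comm,
                           iPartition_Handle *partition, int *err);
    void iPartition_getTotNumOfParts(iPartition_Handle partition,
                                     int *num_global_parts, int *err);
    void iPartition_createPart(iPartition_Handle partition,
                               iPart_Handle *new_part, int *err);
    void iPartition_getPartOwnerProc(iPartition_Handle partition,
                                     int global_part_id, int *proc_rank,
                                     int *err);

    /* Processor level: which parts live here. */
    void iPartition_getPartsOnProc(iPartition_Handle partition,
                                   iPart_Handle **parts, int *parts_allocated,
                                   int *parts_size, int *err);

    /* Part level: ownership and copy queries for part-boundary entities. */
    void iPart_getOwnerOfEnt(iPart_Handle part, iBase_EntityHandle entity,
                             int *owner_global_part_id, int *err);
    void iPart_getNumOfCopiesOfEnt(iPart_Handle part, iBase_EntityHandle entity,
                                   int *num_copies, int *err);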
======================================================================
On the question of how to add entities to parts, I think we need to
examine the use cases for adding entities to parts. I can think of
several scenarios:
1. A new entity has been created by a modify operation and needs to
be associated with a part.
2. An entity is being migrated from one part to another (small
one-to-one migration due to change of ownership; data is almost
certainly moving from one part to another).
3. A copy is being created (depending on the implementation, some
data may have to be copied from one part to another).
4. A new partition has been created and parts must be populated
(massive all-to-all migration; vast amounts of data flying everywhere).
That list probably isn't exhaustive, but it'll do for now. Of those
four, I see one, possibly two, that -don't- involve parallel data
transfer. Also, all of them will potentially affect not only ownership of
the partitioned primary entities but also the part boundaries. At the very
least, we need to be very clear that we're doing things that involve
parallel communication here. One possible way to handle this:
o We could handle scenario 1 cleanly with set syntax, with the
-specific- caveat clearly documented that this function is for use only
with local entities. This is enforceable by requiring the entity args to
be handles (necessarily local to the processor, though not the part) and
mandating an error return when trying a back-door migration between
parts on a single processor. This is certainly possible, but is it
common enough and unambiguous enough to justify doing it, especially in
the face of a need for a different mechanism for other scenarios?
o As I just hinted at, I don't think the current set calls are
appropriate for Scenarios 2-4 (and other migration scenarios), because
the syntax then hides that there's parallel communication going on (as
well as the issue of needing to be able to grab a communicator,
possibly). Among other viable approaches, the migration scenarios could
be handled by an explicit sendEntToPart/receiveEnts call pair, which I
envision using syntax that is qualitatively like this:
    iPart_sendEntArrToParts(source_part_handle, ent_handle_array,
                            target_part_ID_array, MIGRATE);
    iPart_sendEntArrToParts(source_part_handle, ent_handle_array,
                            target_part_ID_array, COPY);
    iPart_receiveEnts();
Yes, I can envision potentially sending ghost data using this same
approach. And yes, the actual data packaged up and sent from part to
part is going to need to be implementation dependent (and possibly
application specifiable...). And yes again,
sendEntArrToParts(...MIGRATE) should delete local data; when is a good
question... No, I don't think single-entity versions of these are likely
to be a good idea for scalability reasons.
The last call (receiveEnts) would block waiting for data; a broadcast up
front is probably needed to make sure every processor knows what
messages it has to wait for. Once all the data is received, then the
receiveEnts call can also do whatever implementation-dependent things
are needed to make sure the interior/bdry/ghost info has been properly
updated internally.
I like the pair of calls not only because it gives a natural place to
update things like part bdry info, but also because it allows at least
some latency hiding, which an immediate, one-call migrate functionality
doesn't.
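To be concrete about what I mean by the pair, here's a rough C
elaboration of the calls sketched above, reusing the placeholder handle
types from the earlier sketch. The argument lists, the MIGRATE/COPY
enum, and the pre-count exchange are all placeholders of mine, not
proposed syntax:

    #include <mpi.h>
    #include "iMesh.h"

    /* Placeholder flag distinguishing migration from copy/ghost creation. */
    enum iPart_SendMode { iPart_MIGRATE, iPart_COPY };

    /* Queue entities for shipment to other parts; packing whatever
       implementation-dependent data is needed happens inside.  MIGRATE
       would eventually delete the local originals; COPY would not. */
    void iPart_sendEntArrToParts(iPart_Handle source_part,
                                 const iBase_EntityHandle *ent_handles,
                                 int num_ents,
                                 const int *target_global_part_ids,
                                 enum iPart_SendMode mode, int *err);

    /* Blocking receive.  Internally the implementation might first do an
       MPI_Alltoall of per-destination message counts so each processor
       knows how many messages to wait for, then post the receives, and
       finally update interior/boundary/ghost classification. */
    void iPart_receiveEnts(iPartition_Handle partition, int *err);

    /* Usage: ship a batch of entities, then collect whatever arrives here. */
    void migrate_batch(iPart_Handle my_part, iPartition_Handle partition,
                       iBase_EntityHandle *ents, int num_ents, int *dest_ids)
    {
      int err;
      iPart_sendEntArrToParts(my_part, ents, num_ents, dest_ids,
                              iPart_MIGRATE, &err);
      iPart_receiveEnts(partition, &err);
    }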
================================================================
And now that I've no doubt said something that everyone will agree with
and something that everyone will disagree with, I'll move on to some
specific comments. :-)
RPI (Parts, etc)
I had some comments about iProcParts, iPart, iPartMesh that are now
covered by the discussion above.
part iterator: Natural analog to entities, BUT do we really need this,
given that the array of part handles for a given processor is likely to
be small enough that we won't mind just grabbing the whole array and
iterating outside the interface?
getPartMesh: Given that we can call all applicable iMesh functions with
a part handle, do you see a role for this other than mesh modification?
[Comments on part<->processor mapping subsumed above.]
isEntOnPartBdry, isEntOwner, isEntGhost: I think we should combine these
into a single call with an enumerated "return" value. Simpler now, and
allows for future expansion in the list of properties we care about, if
needed.
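Something like the following is what I have in mind (the enum values and
the function name are only illustrative):

    /* One status query instead of three booleans; easy to extend later. */
    enum iPart_EntStatus {
      iPart_INTERIOR,        /* owned here, not on any part boundary  */
      iPart_BOUNDARY_OWNED,  /* on a part boundary, owned here        */
      iPart_BOUNDARY_COPY,   /* on a part boundary, owned elsewhere   */
      iPart_GHOST            /* ghost copy, owned elsewhere           */
    };

    void iPart_getEntStatus(iPart_Handle part, iBase_EntityHandle entity,
                            enum iPart_EntStatus *status, int *err);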
Also, need a way to set/negotiate ownership of shared entities on part
boundaries.
Tim's stuff:
getEntByTypeAndTagPar, getEntSetByTagPar: We don't have this
functionality in serial. Maybe we should, but we currently don't.
getTag*Operate: Again, we haven't got this in serial. Does the
existence of such operations imply that we expect to implement fields as
tags? (Because that wasn't what I was assuming about field
implementations at all, personally...) Note that I'm not opposed to
this sort of global reduction operation, I just wonder whether it'll see
use outside of field-like situations. If not, then it should be in
parallel fields, not parallel mesh, and usage for
fields-implemented-as-tags should be handled there.
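If such a reduction does go in somewhere, I'd picture something roughly
like this (again, the names and argument list are purely illustrative):

    /* Combine a tag's values across all parts holding copies of each
       entity, depositing the result in a (possibly different) tag. */
    enum iPart_ReduceOp { iPart_SUM, iPart_MIN, iPart_MAX };

    void iPartition_reduceTags(iPartition_Handle partition,
                               iBase_TagHandle source_tag,
                               iBase_TagHandle dest_tag,
                               enum iPart_ReduceOp op, int *err);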
Global IDs:
One question here. As I understand it, the intention is that
getEntHandle[Arr] will return a local handle (owned, copied, or ghosted)
for the global ID if possible, and otherwise give an error. Is that
correct?
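In other words, I'm picturing semantics roughly like this (the name and
the int-sized global ID are just placeholders for the sketch):

    /* Presumed semantics: hand back the local handle (owned, copy, or
       ghost) for a global ID, or set a not-found error if there is no
       local handle. */
    void iPart_getEntHandleFromGlobalID(iPart_Handle part, int global_id,
                                        iBase_EntityHandle *entity, int *err);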
I now return you to your regularly scheduled afternoon.
Carl
--
------------------------------------------------------------------------
Dr. Carl Ollivier-Gooch, P.Eng. Voice: +1-604-822-1854
Associate Professor Fax: +1-604-822-2403
Department of Mechanical Engineering email: cfog at mech.ubc.ca
University of British Columbia http://www.mech.ubc.ca/~cfog
Vancouver, BC V6T 1Z4 http://tetra.mech.ubc.ca/ANSLab/
------------------------------------------------------------------------