[itaps-parallel] One particularly difficult thing in one of the use cases...
Carl Ollivier-Gooch
cfog at mech.ubc.ca
Tue Jun 1 23:32:23 CDT 2010
On 10-06-01 08:52 PM, Tim Tautges wrote:
> Hi all,
> Here, I attempt to describe one of the particularly difficult parts of
> one of the use cases. I think this at least exposes a potential problem
> in the currently specified iMeshP; I think it also demonstrates the
> problem with entities always having to be in parts, but that's more of a
> subjective statement.
>
> The use case is radiation transport. In a nutshell, consider processes
> arranged in a 2d array, with each column representing an angle, and each
> row a subdomain in the spatial partition. The spatial partition of the
> mesh will be distributed across all processes in a column; a given
> spatial subdomain is copied onto each process in a row.
Oh, now I understand (at least sort of) what's going on with this case...
> Initializing the mesh onto this processor arrangement is done in 3 steps:
> 1. load the mesh onto the 1st column, in a domain-decomposed fashion
> typical of other DD-based codes
> 2. share the mesh from the 1st column across a row of processes
> 3. establish sharing relations in the other columns
>
> It's in step 3 that I see the problem. Here, the mesh for a spatial
> subdomain is already represented in the iMesh Instance on each process,
> but the Partition representing the column hasn't been created yet. After
> you create that Partition, and a Part in that Partition, you need to
> assign the mesh from the Part in the row-based Partition into that Part
> in the column-based Partition. How do you do that? The function
> iMeshP_exchEntArrToPartsAll is a collective call, and implies that
> you're moving entities from one Part to another. But, the entities we're
> talking about here aren't part of any Part in that column Partition. I'd
> prefer to use iMesh_addEntToSet, but I'm guessing that would break other
> implementations.
In fact, as I read it, the spec says specifically that iMesh_addEntToSet
gets used here, with a part handle passed as the set. This is no
different, as I see it, from any other situation in which you're
creating a partition rather than reading one. And yes, it's true that
in the transition of setting up this (or any similar) partition, there's
necessarily a limbo period in which entities haven't yet been assigned
to parts. IMO, this is different from creating entities with no part
assignment and leaving them in that state.
This is a common enough scenario that I'm sure FMDB and Simmetrix have
ways of handling this, even though new entities created once a partition
is established are created in a part. (For the GRUMMP implementation
that's on the drawing board, there's no problem here, either.)
My issue with the whole "entities must belong to a part" thing is a
consistency argument. I expect that if we don't require all entities
to be in parts, at least by the time each parallel service completes,
some subsequent parallel operations will not be properly defined. I don't
see that outcome as reasonable. Consider:
Premise 1 (written in the spec): An entity -must- have an owner if
it's going to be modified. (Ownership == right to modify)
Premise 2: It's never safe for a service to assume that no other
service will ever modify the mesh after it's done, nor that other
services (whether or not they modify the mesh) can tolerate partless
entities, even if they can be identified as such. (Ghosting patterns
could change, for instance, making it so that entities that weren't
ghosted now are, and need an owner for that reason.)
If those two premises hold up (and I obviously think they do), any
parallel service that wants to work and play well with others needs to
assign all entities to parts before completion. Yes, I think this even
applies to the parallel meshing scenario we discussed in the telecon:
the parallel version of this must, IMO, at the very least post-process
the mesh that the parallel-unaware mesher produced to get stuff into
parts. This would be a wrapper, not a mod to the mesher itself.
As an aside, if an entity isn't explicitly assigned to a part, what
should a function like getEntOwnerPart return? And does it pass the
giggle test for this to -not- return the one part on a process when
there is only one? And if unassigned entities get an inferred owner
from this function, how does one do that in the presence of multiple
parts per process and still avoid giggling?
From a (planned) implementation point of view, I don't think I'm really
going to care whether entities are assigned to the right part initially,
assigned to an arbitrary part and moved later, or not assigned to a part
initially and assigned later. These should all be cheap in time (O(1)
with a small constant) and space (no more than an int per entity).
> This example also demonstrates the need either for another function, to
> negotiate shared entities between Parts, or to expand the definition of
> iMeshP_createGhostEntsAll to include the functionality (the latter would
> be most natural for MOAB, since the same function is used in MOAB's
> parallel stuff to do either; I distinguish by allowing the # layers
> specified to be zero, in which case you're requesting the resolution of
> shared entities at an interface).
I agree this functionality is needed when any mesh is partitioned
instead of being read in parallel.
Carl
--
------------------------------------------------------------------------
Dr. Carl Ollivier-Gooch, P.Eng. Voice: +1-604-822-1854
Associate Professor Fax: +1-604-822-2403
Department of Mechanical Engineering email: cfog at mech.ubc.ca
University of British Columbia http://www.mech.ubc.ca/~cfog
Vancouver, BC V6T 1Z4 http://tetra.mech.ubc.ca/ANSLab/
------------------------------------------------------------------------