itaps-parallel Micro-migration update
Mark Shephard
shephard at scorec.rpi.edu
Thu Mar 6 12:12:41 CST 2008
A couple of quick comments.
Carl Ollivier-Gooch wrote:
> Tim Tautges wrote:
>> Ok, so I'm still mulling over how to express exchanging tag data
>> between parts, and in doing so I've gone back and reviewed Carl's
>> email below. In thinking about migration, the following occurs to me:
>>
>> 1. We already need the ability to send ghost entities between parts,
>> and I'd argue for the ability to remove ghost entities from a part
>> 2. We should have the ability for an application to change the
>> ownership of a given entity (at least, when it's shared between the
>> before and after owning parts)
>>
>> So, why not build migration from these constituent operations, e.g.
>> first create ghosts on the destination processor, then change the
>> ownership?
>
> Tim, I think your question is motivated by thinking about the push
> migration, right? In that scenario, I can see doing two calls as you suggest:
>
> 1. Push this entity as a ghost (A: sends to B; B: confirms local handle
> to A)
>
> 2. Push ownership to B (A: sends ownership to B)
>
> That's one message more than you'd get with a one-call push
> migration, but maybe that's okay, especially in bulk. So maybe for
> things like load balancing / repartitioning this is a reasonable
> choice (though I'm pretty sure it'll be slower, and I -think- it
> actually increases the number of functions in the interface...
> although ghosting and migration could be distinguished by a flag...)
>
> But in any case, I see the most common scenario for micro-migration as
> being a pull: a process wants to operate on something it owns but needs
> to own a star around it as well. In this scenario, if you don't already
> -have- the ghost, you don't know what to request, so the process would
> look something like:
>
> 1. Request a ghost star (A: sends to B; B: sends data; A: confirms
> local handles to B)
>
> 2. Request ownership change (A: requests; B: confirms (necessary in
> case some other part beats A to it...)).
This is correct - it is a pull.
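Spelling out the message traffic as I read Carl's outline (A is the
requesting part, B the current owner of the entity):

   /* round 1:  A -> B : request the ghost star of entity e
                B -> A : data for the star and its closure
                A -> B : confirm local handles for the new ghosts
      round 2:  A -> B : request ownership of e
                B -> A : confirm or deny (deny if another part won) */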
>
> Whether or not you agree that my outline is capable of handling complex
> scenarios, I don't see how the two-request version is going to be
> competitive with the one-request version for pull-migration. So the
> syntax I propose below doesn't include this.
Considering that mesh migration is communication dominated even on a Blue
Gene, doubling the number of communication steps is a really bad thing.
>
> Note: I haven't tried to come up with good, compact names at this
> point, nor am I yet proposing syntax for array versions.
>
> A. Request in-migration of an entity (this is a pull migration). This
> entity must be on the part bdry and is identified by local handle, and
> the implementation handles the rest. If include_upward_adj is true,
> then stuff on the remote part also gets migrated (-all-
> higher-dimensional entities). This operation will require multiple
> rounds of communication, and at times certain entities may be
> locked (unavailable for local modification) while info about their
> remote copies is still in question.
>
> void prefix_migrateEntity(iMesh_Instance instance,
>                           const prefix_PartitionHandle partition_handle,
>                           const entity_handle local_entity_handle,
>                           bool include_upward_adj,
>                           int *err);
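To make the pull pattern concrete, here is a minimal calling-side
sketch (the mesh and partition setup and the handle bdry_face are
hypothetical; error handling is abbreviated):

   int err;
   entity_handle bdry_face;  /* local handle for a part-bdry entity,
                                assumed already known */

   /* pull bdry_face plus all higher-dimensional entities above it */
   prefix_migrateEntity(mesh, partition, bdry_face,
                        /* include_upward_adj = */ true, &err);
   if (err != iBase_SUCCESS) { /* handle the failure */ }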
>
> B. Update vertex coordinates. One could argue that we could overload
> the setVtxCoords function to do this, and maybe we should. But that
> obfuscates when communication could occur. The communication here is
> push-and-forget.
>
> void prefix_updateVtxCoords(iMesh_Instance instance,
>                             const prefix_PartitionHandle partition_handle,
>                             const entity_handle local_vertex_handle,
>                             int *err);
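For what it's worth, the intended call sequence would presumably be a
local move followed by the push (sketch; vtx is a hypothetical handle
for a part-bdry vertex with remote copies):

   /* move the vertex locally, then notify the remote copies;
      the second call is push-and-forget, so no reply is expected */
   iMesh_setVtxCoords(mesh, vtx, new_x, new_y, new_z, &err);
   prefix_updateVtxCoords(mesh, partition, vtx, &err);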
>
> C. Poll for messages. The internals of this function are going to have
> to cover a lot of ground. The array in the return is there as a
> placeholder to tell the application that something interesting / useful
> has been done to a handle. This might indicate successful in-migration,
> a recent change in vertex location, or successful completion of handle
> matching.
>
> void prefix_pollForRequests(iMesh_Instance instance,
>                             const prefix_PartitionHandle partition_handle,
>                             entity_handle **handles_available,
>                             int *handles_allocated,
>                             int *handles_size,
>                             int *err);
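The polling pattern I'd expect on the requesting side, assuming the
usual iMesh array-allocation conventions (sketch; when to set done is
up to the application):

   entity_handle *avail = NULL;
   int avail_alloc = 0, avail_size = 0, done = 0;

   while (!done) {
     prefix_pollForRequests(mesh, partition, &avail, &avail_alloc,
                            &avail_size, &err);
     for (int i = 0; i < avail_size; i++) {
       /* avail[i] was in-migrated, moved, or handle-matched; check
          whether the work we were waiting on can now proceed */
     }
     /* ... overlap with other local computation here ... */
   }
   free(avail);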
>
> D. Done with micro-migration. This is a blocking call, to get
> everything up-to-date and back in synch. Essentially, it waits for all
> message traffic to clear, as well as (possibly) rebuilding a bunch of
> ghost info that was allowed to go obsolete.
>
> void prefix_synchParts(iMesh_Instance instance,
>                        const prefix_PartitionHandle partition_handle,
>                        int *err);
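So the end of a modification phase would presumably reduce to a single
collective call (sketch):

   /* blocks until message traffic clears and stale ghost info is
      rebuilt; afterwards all parts see a consistent mesh again */
   prefix_synchParts(mesh, partition, &err);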
>
> E. Replace entities. This refers to changes on the part bdry where the
> application/service is responsible for ensuring that things are done
> identically on both sides and that the args are passed in an order that
> can be matched. (Specifically, matching new entities should appear in
> the same order in the call array.) Communication here could be a
> two-way push-and-forget, or some variant on push-and-confirm.
>
> void prefix_replaceOnPartBdry(iMesh_Instance instance,
>                               const prefix_PartitionHandle partition_handle,
>                               const entity_handle *old_entities,
>                               const int old_entities_size,
>                               const entity_handle *new_entities,
>                               const int new_entities_size,
>                               int *err);
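As an example, a matched split of a shared bdry edge might look like
this on each of the two parts (sketch; the handles are hypothetical,
and the neighboring part must order its new edges identically):

   entity_handle old_ents[1] = { old_edge };
   entity_handle new_ents[2] = { new_edge_a, new_edge_b };

   prefix_replaceOnPartBdry(mesh, partition,
                            old_ents, 1, new_ents, 2, &err);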
>
> F. As Tim suggests, the ability to create and delete ghosts is likely
> to be useful, even though for common topologically-based cases, ghost
> maintenance can (and IMO should) be handled automagically, either at
> migration time or during prefix_synchParts. Communication here is
> push-and-confirm for creation (so that the original knows the IDs of the
> ghosts), and push-and-forget for deletion. I'm assuming here that the
> closure of a new ghost will be pushed automatically as part of the
> underlying communication, and that the remote part will clean up the
> closure as appropriate during deletion. Finally, note that createGhost
> could easily be tweaked to handle a micro-push migration: change the
> name and add a flag.
>
> void prefix_createGhost(iMesh_Instance instance,
>                         const prefix_PartitionHandle partition_handle,
>                         const prefix_PartHandle target_part,
>                         const entity_handle ghost_to_push,
>                         int *err);
>
> void prefix_deleteGhostOf(iMesh_Instance instance,
>                           const prefix_PartitionHandle partition_handle,
>                           const prefix_PartHandle target_part,
>                           const entity_handle ghost_to_purge,
>                           int *err);
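A usage sketch for the pair (nbr_part and my_region are hypothetical):

   /* push a ghost of a locally-owned region to a neighboring part;
      push-and-confirm, so we learn the remote copy's ID */
   prefix_createGhost(mesh, partition, nbr_part, my_region, &err);

   /* ... nbr_part reads, but never modifies, the ghost ... */

   /* push-and-forget removal when the ghost is no longer needed */
   prefix_deleteGhostOf(mesh, partition, nbr_part, my_region, &err);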
>
> I think that'll do for now. At least we've got a starting point for
> iteration. :-)
>
> Carl
>
>