<div class="gmail_quote">On Thu, Nov 24, 2011 at 17:09, Matthew Knepley <span dir="ltr"><<a href="mailto:knepley@gmail.com">knepley@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div>One key operation which has not yet been discussed is the "push forward" of a mapping as Dmitry put it. Here is a scenario:</div><div>We understand a matching of mesh points between processes.</div></blockquote>
<div><br></div><div>Can I assume that ghosting processes know (owner rank, offset) for each point?</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div>
In order to construct a ghost communication (VecScatter), I</div>
<div>need to compose the mapping between mesh points and the mapping of mesh points to data.</div></blockquote><div><br></div><div>Can the mapping of mesh points to data be known through an array (of length num_owned_points) local_offset or, in the case of non-uniform data size, as the array (local_offset, size)?</div>
<div><br></div><div>In that case, the fact that remote processes have (owner rank, offset) means that I can broadcast (local_offset, size) with purely local setup (this is the first primitive which can be implemented using MPI_Get()).</div>
<div><br></div><div>Now the ghosting procs know (owner rank, offset, size) for each point, so again, with purely local setup, the scatter is defined (the forward scatter is implemented with MPI_Get(), the uses MPI_Accumulate() which does a specified reduction). Note that the actual communication has no user-visible packing because it uses an MPI_Datatype.</div>
<div><br></div><div>Alternatively, given (owner rank, offset, size), we can literally call VecScatterCreate() after just an MPI_Scan(), which is logarithmic, and local setup. but VecScatterCreate() does lots of unnecessary setup to build the two-way representation.</div>
<div><br></div><div><br></div><div>Redistributing a mesh after partitioning is slightly more demanding. First, senders are enumerated using a fetch-and-add initiated by the sending process which has the side-effect of counting the number of nodes that will be in the new partition and informing the senders of the offsets at which to deposit those nodes. Then we broadcast the (rank, offset) of each node on the sender to the receiver. Then we send connectivity using a non-uniform broadcast. Now, before moving data, we can reorder locally, inform the sender of the new ordering, and then move all the data.</div>
<div><br></div><div>I think these are all simple loops and single calls to the communication primitives.</div></div>