<div class="gmail_quote">On Sat, Nov 26, 2011 at 10:35, Mark F. Adams <span dir="ltr"><<a href="mailto:mark.adams@columbia.edu">mark.adams@columbia.edu</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
> That's what I'm asking.  I assumed that there is some buffering going on
> here, but that is a later question.

I assumed that you would either (a) go in discrete rounds where every
process does as much as it can, or (b) do some predetermined amount of
work, initiate the update, do another work unit, complete the last update
and initiate a new one, do another work unit, and so on.
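For concreteness, here is a rough sketch of what (b) might look like with
fence synchronization.  do_work_unit(), issue_ghost_gets(), and
all_ranks_done() are placeholder names (not an existing API), and "win" is
assumed to be an MPI window exposing each rank's status array:

  #include <mpi.h>

  void do_work_unit(void);         /* placeholder: a fixed chunk of local work     */
  void issue_ghost_gets(MPI_Win);  /* placeholder: the per-neighbor MPI_Get calls  */
  int  all_ranks_done(void);       /* placeholder: collective termination test     */

  static void pipelined_updates(MPI_Win win)
  {
    int done = 0;
    MPI_Win_fence(0, win);         /* open the first access epoch                  */
    while (!done) {
      do_work_unit();              /* predetermined amount of local work           */
      issue_ghost_gets(win);       /* initiate the update                          */
      do_work_unit();              /* another work unit while the gets are in flight */
      MPI_Win_fence(0, win);       /* complete the last update, open a new epoch   */
      done = all_ranks_done();     /* fences are collective, so every rank must
                                      agree here (e.g. via MPI_Allreduce)          */
    }
  }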
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div><div class="im"><blockquote type="cite"><div>Remember that my first primitive was "broadcast", which is essentially global-to-local. The map is as a local-to-global map and the "get" is initiated at the local space.</div>
</blockquote></div></div><div><br></div>I was confused, and perhaps still am, the broadcast is the one sided pull of ghost data to the local space. So a broadcast is passive for the broadcaster whereas it implied an active process to me. The reduce is a push of ghost data with the same (only) map to the global vertex.<div>
<br><div>The way I'm writing the algorithm there is no need for a reduce, I think, but this is a detail, this algorithm broadcasts the selected vertex status (and deleted which I forgot in my psuedo-code) and then it infers the local deleted w/o having to have it sent (reduced).</div>
Sure, either way works.  If the data volume for updates was big enough to
be a performance limitation (instead of the bottleneck being latency), then
we could convert to a message-based model initiated by the owner.  It's
easy to extract this when we want it, but it's simpler to skip it.
> So I'm thinking a good way to implement this MIS with your model is,
> after each vertex loop, to do what you call a "broadcast" of all new
> status.
>
> Now I want to see exactly how that is implemented; here is a guess:

As part of the distributed graph, you have (owner rank, index) for every
ghosted point.  As local setup, the communication API builds two
MPI_Datatypes per neighbor: a local one (where the ghost points reside in
my array) and a remote one (where those points reside in the owner's
array).  Store these in arrays of length num_neighbor_ranks.
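A minimal sketch of that setup, assuming the status is stored as MPI_INT
and the graph already provides, for each neighbor n, local_idx[n] (the
slots of rank n's ghosts in my array) and remote_idx[n] (their indices in
rank n's array).  All names are illustrative, not an existing API:

  #include <mpi.h>

  static void build_ghost_types(int num_neighbor_ranks, const int *nghost,
                                int **local_idx, int **remote_idx,
                                MPI_Datatype *local_type, MPI_Datatype *remote_type)
  {
    for (int n = 0; n < num_neighbor_ranks; n++) {
      /* nghost[n] entries, block length 1, displacements in units of MPI_INT */
      MPI_Type_create_indexed_block(nghost[n], 1, local_idx[n], MPI_INT, &local_type[n]);
      MPI_Type_create_indexed_block(nghost[n], 1, remote_idx[n], MPI_INT, &remote_type[n]);
      MPI_Type_commit(&local_type[n]);
      MPI_Type_commit(&remote_type[n]);
    }
  }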
> MPI_Win_fence
> Loop over all of your boundary vertices that are not "done" yet and do an
> MPI_Get (I assume that these requests are buffered in some way).
Not over boundary vertices: it loops over each neighbor rank and calls one
MPI_Get() to fetch the status of all vertices owned by that rank that are
ghosted on this rank.
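Something like the following, where "status" is my local status array
(assumed int) exposed to the other ranks through the window "win", and
neighbor_rank[], local_type[], and remote_type[] come from the setup
sketched above; again, the names are just illustrative:

  #include <mpi.h>

  static void broadcast_status(int *status, MPI_Win win,
                               int num_neighbor_ranks, const int *neighbor_rank,
                               const MPI_Datatype *local_type,
                               const MPI_Datatype *remote_type)
  {
    for (int n = 0; n < num_neighbor_ranks; n++) {
      MPI_Get(status, 1, local_type[n],   /* where rank n's points land locally */
              neighbor_rank[n],           /* owner of those points              */
              0, 1, remote_type[n],       /* their layout in the owner's array  */
              win);
    }
  }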
> MPI_Win_fence.

The communication only has to be finished when you need the result, so you
can do more work before calling the second MPI_Win_fence().
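So one round of updates might look like this, with do_interior_work()
standing in for whatever does not depend on ghost data and
broadcast_status() being the per-neighbor get loop sketched above:

  MPI_Win_fence(0, win);                    /* open the access epoch            */
  broadcast_status(status, win, num_neighbor_ranks,
                   neighbor_rank, local_type, remote_type);
  do_interior_work();                       /* work that needs no ghost data    */
  MPI_Win_fence(0, win);                    /* only here must the gets be done  */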