<div class="gmail_quote">On Sat, Nov 26, 2011 at 14:25, Mark F. Adams <span dir="ltr"><<a href="mailto:mark.adams@columbia.edu">mark.adams@columbia.edu</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<div>So each MPI_Get initiates a message and you pack up a message with an array of remote pointers or something?</div></blockquote><div><br></div><div>MPI_Get() takes an MPI_Datatype for the remote (target) process that describes where in the remote buffer the values should be read from. This is much like how VecScatter can operate on just part of a Vec.</div>

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div><br></div><div>It sounds like you are saying that you have auxiliary arrays of remote global indices for each processor that you communicate with and your broadcast code looks like:</div>

<div><br></div><div>for all 'proc' that I talk to</div></blockquote><div><br></div><div>for all proc that I _need_ an update from; I don't know or care who needs an update from me</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<div> i = 0</div><div> for all 'v' on proc list</div><div>  if v.state == not-done</div></blockquote><div><br></div><div>I didn't think we would bother with communicating only those values that were actually updated. Since this should converge in just a few rounds, I figured that we would just update all ghost points. Note that communicating the metadata to send only those values that have actually changed may make the algorithm less latency-tolerant.</div>

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div>    data[i++] = [ &v.state, v.id, STATE ]  //  vague code here...</div>

<div>  endif</div><div> endfor</div><div> MPI_Get(proc,i,data)</div><div>endfor</div></blockquote><div><br></div><div><br></div><div>I thought we would skip any loop over vertices and just</div><div><br></div><div>one-time setup:</div>

<div>remotes = {}</div><div>for each ghost vertex v:</div><div>    remotes[v.owner_rank] += 1 # add/increment entry for this rank</div><div>nremotes = remotes.size</div><div>for each rank r in remotes, build two MPI_Datatypes, all in one loop over the ghost vertices:</div><div>    their_datatype[r]: uses the remote index stored for each ghosted vertex</div><div>    my_datatype[r]: uses the sequential locations of the ghost vertices in my array</div>
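<div><br></div><div>A minimal sketch of the setup above in plain Python, with no MPI calls. The function name build_index_lists and the (owner_rank, remote_index) tuple layout are assumptions made up for illustration. In real code, the two index lists for each rank would be the displacement arrays handed to something like MPI_Type_create_indexed_block to create their_datatype[r] and my_datatype[r].</div>

```python
def build_index_lists(ghosts):
    """Group ghost vertices by owning rank in a single pass.

    ghosts[i] = (owner_rank, remote_index) for the ghost stored at local
    ghost slot i. This layout is hypothetical, assumed for the sketch.
    Returns {rank: (their_indices, my_indices)}.
    """
    remotes = {}
    for my_index, (owner_rank, remote_index) in enumerate(ghosts):
        # Add an entry for this rank the first time we see it.
        their, mine = remotes.setdefault(owner_rank, ([], []))
        their.append(remote_index)  # offset in the owner's array
        mine.append(my_index)       # offset in my ghost array
    return remotes


# Four ghost vertices owned by ranks 2 and 0 (made-up data).
ghosts = [(2, 7), (0, 3), (2, 1), (0, 4)]
index_lists = build_index_lists(ghosts)
nremotes = len(index_lists)
```

<div>Here index_lists[2] holds the offsets to read on rank 2 and the ghost slots they land in, so the vertex data itself is never touched during setup.</div>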

<div><br></div><div><br></div><div>communication in each round:</div><div>MPI_Win_fence(0, window)</div><div>for (r,rank) in remotes:</div><div>    MPI_Get(my_ghosted_status_array, 1, my_datatype[r], rank, 0, 1, their_datatype[r], window)</div>

<div>MPI_Win_fence(0, window)</div><div><br></div><div>All the packing and unpacking is done by the MPI implementation and I never traverse vertices in user code.</div></div>
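<div><br></div><div>To illustrate what one round accomplishes, here is a pure-Python stand-in that runs on ordinary lists. It does not use MPI: simulated_get and update_ghosts are hypothetical helpers that only mimic the effect of an MPI_Get with indexed datatypes, and the windows dict is an assumed stand-in for the arrays each rank exposes. It says nothing about how an MPI implementation actually packs or moves the data.</div>

```python
def simulated_get(my_ghosts, my_indices, their_array, their_indices):
    """Mimic the effect of one MPI_Get with indexed datatypes:
    copy their_array[their_indices[k]] into my_ghosts[my_indices[k]]."""
    for mine, theirs in zip(my_indices, their_indices):
        my_ghosts[mine] = their_array[theirs]


def update_ghosts(my_ghosts, index_lists, windows):
    """One communication round: refresh every ghost entry, rank by rank.

    index_lists: {rank: (their_indices, my_indices)} from a one-time setup.
    windows: {rank: list}, standing in for the remotely exposed arrays.
    """
    # In MPI, the opening MPI_Win_fence would go here.
    for rank, (their_indices, my_indices) in index_lists.items():
        simulated_get(my_ghosts, my_indices, windows[rank], their_indices)
    # In MPI, the closing MPI_Win_fence would go here; the fetched
    # values may only be used after it returns.
    return my_ghosts


# Made-up data: rank 0 exposes 5 entries, rank 2 exposes 8.
windows = {
    0: ["a0", "a1", "a2", "a3", "a4"],
    2: ["c0", "c1", "c2", "c3", "c4", "c5", "c6", "c7"],
}
index_lists = {2: ([7, 1], [0, 2]), 0: ([3, 4], [1, 3])}
ghost_status = update_ghosts([None] * 4, index_lists, windows)
```

<div>Every ghost slot is refreshed each round whether or not the owner changed it, which matches the "just update all ghost points" choice discussed above.</div>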