<div class="gmail_quote">On Thu, Nov 24, 2011 at 09:34, Mark F. Adams <span dir="ltr">&lt;<a href="mailto:mark.adams@columbia.edu">mark.adams@columbia.edu</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<div><div class="im"><blockquote type="cite"><div>The primitives I can readily provide are</div><div><br></div><div>broadcast from Y to X</div><div>reduce values on X into Y (min, max, sum, replace, etc)</div><div><br></div>

<div>gather values on X into arrays for each point in Y (in some ordering, rank ordering if you like)</div></blockquote><div><br></div></div><div>this is awful.  you would at least need to have a way of getting the processors that it came from, the system would have to have this info, right?</div>

</div></blockquote><div><br></div><div>You don't need it for the implementation. If the user wants the source ranks, they can get them trivially (gather the ranks over the same communication graph). Sometimes the source doesn't matter, so these primitives won't store it by default. Not storing it also makes graph updates lighter weight because you only need to update on one side.</div>
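<div><br></div><div>A rough sketch of what I mean by gathering the ranks over the same graph. This is a serial Python model, not MPI and not an existing PETSc interface; the <code>edges</code> representation (one (rank, index in Y) pair per point of X) and the name <code>gather</code> are made up for illustration.</div>

```python
# Serial model only: each point of X is described by a hypothetical
# (rank, y_index) pair, where rank is the process holding that point of X
# and y_index is the point of Y it is a copy of.
def gather(edges, x_values, num_y):
    """Collect the values on X into one list per point of Y, in rank order."""
    arrays = [[] for _ in range(num_y)]
    # sorted() is stable, so points on the same rank keep their local order.
    order = sorted(range(len(edges)), key=lambda i: edges[i][0])
    for i in order:
        arrays[edges[i][1]].append(x_values[i])
    return arrays

edges = [(2, 0), (0, 1), (1, 1), (0, 0), (3, 1)]
values = [5.0, 7.0, 9.0, 11.0, 13.0]

# Gather the user data: one array per point of Y.
gathered = gather(edges, values, 2)

# Gather the ranks over the same graph: sources[y][k] is the rank that
# contributed gathered[y][k].
sources = gather(edges, [rank for rank, _ in edges], 2)
```

<div>The second call costs one more gather over a graph that already exists, so users who never ask for the ranks pay nothing for them.</div>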

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div><div><br></div><div>also, i think you are getting too general here. i'm not sure why anyone would want to do this, but i'm sure someone would, and they can just write directly to MPI-3.</div>

</div></blockquote><div><br></div><div>My point is to offer this information in a more user-friendly way. The pointwise broadcast ("global to local") and reduce ("local to global with operation") are all you need for many operations.</div>
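<div><br></div><div>To make the two primitives concrete, here is a minimal serial sketch in Python. It only models the semantics; there is no MPI in it, and the <code>graph</code> list (the index in Y that each point of X refers to) and the function names are assumptions for this example, not a proposed interface.</div>

```python
# Serial model of the two primitives; no communication is performed.
# graph[i] is the index in Y of the point that X[i] is a copy of
# (a hypothetical representation chosen only for this sketch).
import operator

def broadcast(graph, y_values):
    """Global to local: copy each value on Y to every point of X that refers to it."""
    return [y_values[y] for y in graph]

def reduce_into(graph, x_values, y_values, op):
    """Local to global: fold the values on X into their points of Y with a binary op."""
    result = list(y_values)
    for y, value in zip(graph, x_values):
        result[y] = op(result[y], value)
    return result

graph = [0, 1, 1, 2, 1]  # five points in X referring to three points in Y

x = broadcast(graph, [10.0, 20.0, 30.0])
y_sum = reduce_into(graph, [1, 2, 3, 4, 5], [0, 0, 0], operator.add)
y_max = reduce_into(graph, [1, 2, 3, 4, 5], [0, 0, 0], max)
# "replace" is the op that keeps the incoming value, so the last contribution wins.
y_rep = reduce_into(graph, [1, 2, 3, 4, 5], [0, 0, 0], lambda old, new: new)
```

<div>In this model min, max, sum and replace differ only in the binary operation passed to the same reduce.</div>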

<div><br></div><div>My intent is to find an abstraction that is more general than VecScatter, higher level than MPI, and that can be used for all non-synchronizing ("neighbor collective") communication patterns in a library like PETSc. Having pointwise gather/scatter allows operations involving all copies of a point (instead of only those operations expressed as a reduction with a binary operation). I think you usually don't need it, but we can provide it cheaply and without extra complexity using this abstraction.</div>

</div>