<div class="gmail_quote">On Thu, Nov 18, 2010 at 22:50, Dmitry Karpeev <span dir="ltr">&lt;<a href="mailto:karpeev@mcs.anl.gov">karpeev@mcs.anl.gov</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div id=":15v">In general, I don't see how ISAllGather can be any better than ISGetTotalIndices:
the operations underneath are essentially the same.
ISGetTotalIndices can take advantage of any special structure just as ISAllGather could:
by specializing the implementation to those subtypes. This can save some communication
(e.g., ISAllGather_Stride might need to gather the initial offsets, strides and the total sizes only).
The minute the actual indices are requested from the resulting IS, it will have to allocate
the same amount of storage that ISGetTotalIndices requires.<br></div></blockquote><div><br></div><div>Indeed, but there are a lot of operations you can do on an IS without ever getting the indices. Also, gathering strides communicates O(# procs) data instead of O(global problem size), so there is a big communication difference. I'm not saying the difference will be realized immediately for the important use cases, but ISAllGather leaves a straightforward way to optimize, while ISGetTotalIndices offers no such thing.</div>
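<div>To make the communication difference concrete, here is a minimal sketch in plain C (no MPI, no PETSc). It assumes each process's part is a stride described by three integers. The StrideDesc type and the stride_* functions are hypothetical names made up for illustration; they are not PETSc API. Sizes and individual entries come from the O(# procs) descriptors alone, and the O(global problem size) array is only built if explicit indices are requested.</div>

```c
#include <assert.h>

/* Hypothetical per-process descriptor: the only data a stride-aware
 * gather would need to exchange (3 integers per process). */
typedef struct {
    int first; /* first index on this process */
    int step;  /* stride between consecutive indices */
    int n;     /* number of indices on this process */
} StrideDesc;

/* Total number of gathered indices, computed from descriptors only. */
int stride_total_size(const StrideDesc *d, int nranks)
{
    int total = 0;
    for (int r = 0; r < nranks; r++) {
        assert(d[r].n >= 0);
        total += d[r].n;
    }
    return total;
}

/* Look up the k-th entry of the concatenated index list on demand.
 * Returns 1 and stores the entry in *idx, or returns 0 if k is out of range. */
int stride_global_entry(const StrideDesc *d, int nranks, int k, int *idx)
{
    if (k < 0) return 0;
    for (int r = 0; r < nranks; r++) {
        if (k < d[r].n) {
            *idx = d[r].first + k * d[r].step;
            return 1;
        }
        k -= d[r].n;
    }
    return 0;
}

/* Expand to an explicit array. This is the step that costs
 * O(global problem size) storage. Returns the number of entries
 * written, or -1 if the output buffer is too small. */
int stride_expand(const StrideDesc *d, int nranks, int *out, int cap)
{
    int k = 0;
    if (stride_total_size(d, nranks) > cap) return -1;
    for (int r = 0; r < nranks; r++)
        for (int i = 0; i < d[r].n; i++)
            out[k++] = d[r].first + i * d[r].step;
    return k;
}
```

<div>In a real parallel setting the descriptor array would itself be the result of a gather of three integers per process; that step is left out here so the sketch stays self-contained.</div>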
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div id=":15v"><div class="im">
</div>ISAllGather followed by ISDifference is NOT what I need from ISGetNonlocalIndices.
That would eliminate the indices in the range of the local portion of the IS, which might
also wipe out some of the remote indices just gathered: multiple ranks might contain the same
global index. Furthermore, even if all of the local ranges within the IS are disjoint,
ISDifference is a relatively complicated operation, while I can easily excise the local
part after the MPI_Allgatherv occurred, because, as a byproduct, I obtain the offset
of the local part in the gathered indices. Otherwise I'd have to
reconstruct it using a scan.</div></blockquote></div><br><div>Okay, but what is the meaning of a parallel IS in which a given global index shows up in the owned parts of multiple processes? In particular, when do you need the "nonlocal indices" of such an IS? I guess the answer is that they would be needed to produce a "blown up" global matrix whose local pieces overlap. I know I mentioned this as a dirty hack for building overlapping matrices on subcomms, but the thing you are actually looking for is a matrix on a subcomm, where there are no duplicate indices in the IS of the subcomm. If you are doing that by making MatGetSubMatrices() handle (isrow,iscol) on subcomms, then I think you would get the right thing with ISDifference. Note that ISDifference can be implemented very efficiently if the arrays are sorted (I feel like I wrote code for that, or something very similar, not too long ago).</div>
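<div>For the sorted case, a set difference reduces to one merge-style pass over both arrays. The following is a minimal sketch in plain C, assuming both inputs are sorted in ascending order. The function name sorted_difference is hypothetical and this is not the actual ISDifference implementation.</div>

```c
#include <assert.h>
#include <stddef.h>

/* Write to out every entry of a[] that does not occur in b[].
 * Both a[] and b[] must be sorted ascending. out[] must have room
 * for na entries. Runs in O(na + nb) time. Returns the number of
 * entries written. */
size_t sorted_difference(const int *a, size_t na,
                         const int *b, size_t nb, int *out)
{
    size_t j = 0; /* cursor into b, only ever moves forward */
    size_t k = 0; /* number of entries written to out */

    assert(out != NULL || na == 0);
    for (size_t i = 0; i < na; i++) {
        /* Skip entries of b that are smaller than the current a[i]. */
        while (j < nb && b[j] < a[i]) j++;
        /* Keep a[i] unless b contains the same value. */
        if (j == nb || b[j] != a[i]) out[k++] = a[i];
    }
    return k;
}
```

<div>Note that every copy of a value in a[] that also occurs in b[] is dropped. So if the gathered list contains the same global index from several processes, all copies go away, which is why the result is only what you want when the IS has no duplicate indices.</div>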
<div><br></div><div>Jed</div>