[petsc-dev] ISRestoreNonlocalIS?

Thu Nov 18 14:44:42 CST 2010

  Jed,

    Dmitry and I have had a long conversation about this. I think this new route is correct though it may need adjustments and more thought as it evolves.

    You say that all the support is already there in ISAllGather() so there is no need for this stuff.  But we submit that ISAllGather() is a very ill-defined and clunky thing, see for example its use in 

PetscErrorCode MatGetSubMatrix_MPIAIJ(Mat mat,IS isrow,IS iscol,MatReuse call,Mat *newmat)
{
  PetscErrorCode ierr;
  IS             iscol_local;
  PetscInt       csize;

  PetscFunctionBegin;
  ierr = ISGetLocalSize(iscol,&csize);CHKERRQ(ierr);
  if (call == MAT_REUSE_MATRIX) {
    ierr = PetscObjectQuery((PetscObject)*newmat,"ISAllGather",(PetscObject*)&iscol_local);CHKERRQ(ierr);
    if (!iscol_local) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_ARG_WRONGSTATE,"Submatrix passed in was not used before, cannot reuse");
  } else {
    ierr = ISAllGather(iscol,&iscol_local);CHKERRQ(ierr);
  }

Cacheing information about the parallel structure of the IS input into MatGetSubMatrix() inside the gotten matrix is IMHO not a good model (my god, what if someone passed in a different IS next time? just a joke). 

Here is the model we are proposing. An IS is a collection of indices, often stored in parallel. Some operations on IS's only require information about the local part of the indices, other operations require information about parts of the indices on other processes. Ok, that information about remote indices can be obtained through a collective operation. Sometimes one may want that same information about the remote part several different times, well it could do a collective each time or it could cache on the first time and reuse later (whether it caches or recommunicates is internal to the object and merely an optimization and need not be actively visible to the user.) The previous solution to this need for remote information was to use an ISAllGather() and stash that information "somewhere else" for later use (and it is currently stashed elsewhere as a SEQUENTIAL IS), but since that information is ONLY information about that particular IS I submit that any stashing (should it occur) should be inside the IS not inside some other IS object. 

I had Dmitry code it by sticking the additional data directly into the IS instead of having methods for each subclass of IS just because it was simple and could be used for all cases immediately; it could be implemented as methods for the subclasses and maybe will be someday (though I suspect it is not needed).

On Nov 18, 2010, at 2:15 PM, Jed Brown wrote:

> What is going on here?  If IS is caching this nonlocal stuff, then every function causing mutation (e.g. ISGeneralSetIndices, and a few others) needs to be visited and made to discard the gathered and nonlocal caches.

    This is just a coding detail and could be added.  But actually we should probably decide once and for all if IS's are mutable, I tend to think they shouldn't be but we are not completely consistent yet.

>  But I think this whole idea is just dangerous and abuses the IS.  The user can always do ISAllGather if they really want everything, and it's potentially cheaper than this new ISGetTotalIndices because it can maintain ISStride or ISBlock structure.

   Just an implementation issue, not an interface issue, we could provide code for those special cases though I don't think it makes sense.

>  If you want all the indices, use ISAllGather and then ISGetIndices, it will cost the same and then the user has explicit control over when to free this memory.  In principle, ISGetNonlocalIS could be implemented to avoid the intermediate storage of ISAllGather followed by ISDifference, but it's not currently done this way and currently both copies are cached so they have to both exist, and since the IS is managing ownership of this other nonlocal IS, you can't keep the nonlocal part.

   I submit that "keeping the nonlocal part" without the local part is complete nonsense anyways, the "nonlocal information" is meaningless without the context of the original IS.

> 
> I just don't understand the point of all these new functions, they feel like an inferior interface to something that is currently available.

   They are a proper replacement to a previous inferior interface that creates this strange new IS from a given IS. All the new stuff is is a API for accessing nonlocal parts of the IS efficiently and consistently and simply.

    For now ISAllGather() still exists, but in the long run I think we will be able to toss it.

> 
> Same question for ISAllGatherIndices which has been around for a long time.  It looks to me like the user can do ISCreateGeneral(), ISAllGather(), ISGetIndices() to achieve exactly the same thing (no performance difference).  Why even call it IS when no IS (or PETSc object of any kind is used by this function)?

   It is called an IS because it involves the manipulation of sets of indices which is what IS does. It is a static method for the IS class :-)

  Barry

>  It is never used in PETSc.
> 
> Jed