[petsc-users] Why PetscDestroy global collective semantics?

Lawrence Mitchell wence at gmx.li
Mon Oct 25 06:34:36 CDT 2021


Hi all,

(I cc Jack who is doing the implementation in the petsc4py setting)

> On 24 Oct 2021, at 06:51, Stefano Zampini <stefano.zampini at gmail.com> wrote:
> 
> Non-deterministic garbage collection is an issue in Python too, and the Firedrake folks are also working on that.
> 
> We may consider deferring all calls to MPI_Comm_free done on communicators with 1 as ref count (i.e., the call will actually wipe out some internal MPI data) in a collective call that can be either run by the user (on PETSC_COMM_WORLD), or at PetscFinalize() stage.
> I.e., something like this:
> 
> #define MPI_Comm_free(comm) PutCommInAList(comm)
> 
> Comm creation is collective by definition, so a collective order of destruction can easily be enforced.
> I don't see problems with third-party libraries using comms, since we always duplicate the comm we pass them.

> Lawrence, do you think this may help you?

I think that it is not just MPI_Comm_free that is potentially problematic.

Here are some additional areas off the top of my head:

1. PetscSF with -sf_type window: Destroy (when the refcount drops to zero) calls MPI_Win_free (which is collective over the communicator).
2. Deallocation of MUMPS objects is tremendously collective.

In general, the solution of just punting MPI_Comm_free to PetscFinalize (or some user-defined time) is, I think, insufficient, since it requires us to audit the collectiveness of all `XXX_Destroy` functions (including those in third-party packages).
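For concreteness, the deferral idea Stefano sketches above could look something like the following plain-Python model (all names here, including PutCommInAList, are hypothetical; a real implementation would intercept MPI_Comm_free in C):

```python
# Hypothetical sketch: instead of freeing a communicator at the
# (non-deterministic) moment its last reference is dropped, record it
# in a list and free everything later at a collective flush point.

class DeferredCommFrees:
    def __init__(self):
        self._pending = []  # comms whose final reference was dropped

    def put_comm_in_a_list(self, comm):
        # Stand-in for the #define MPI_Comm_free(comm) interception:
        # defer the actual free to a later collective call.
        self._pending.append(comm)

    def flush(self, free):
        # Called collectively (e.g. at PetscFinalize, or by the user on
        # PETSC_COMM_WORLD): every process frees the same comms in the
        # same order, so the underlying collective frees match up.
        for comm in self._pending:
            free(comm)
        self._pending.clear()


freed = []
deferred = DeferredCommFrees()
deferred.put_comm_in_a_list("comm_a")
deferred.put_comm_in_a_list("comm_b")
assert freed == []            # nothing freed at drop time
deferred.flush(freed.append)  # collective flush point
assert freed == ["comm_a", "comm_b"]
```

This captures only the comm-free half of the problem; as noted above, it does nothing for other collective teardown such as MPI_Win_free or MUMPS deallocation.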

Barry's suggestion of resurrecting objects in finalisation using PetscObjectRegisterDestroy, and then collectively clearing that array periodically, is pretty close to the proposal we cooked up, I think.

Jack can correct any missteps I make in explanation, but perhaps this is helpful for Alberto:

1. Each PETSc communicator gets two new attributes "creation_index" [an int64], "resurrected_objects" [a set-like thing]
2. PetscHeaderCreate grabs the next creation_index out of the input communicator and stashes it on the object. Since object creation is collective, this index is guaranteed to agree across processes on any given communicator.
3. When the Python garbage collector tries to destroy PETSc objects we resurrect the _C_ object in finalisation and stash it in "resurrected_objects" on the communicator.
4. Periodically (triggered by user intervention in the first instance), we garbage collect these resurrected objects collectively: compute the set intersection of the creation_indices across the communicator's processes, then call XXXDestroy on the members of that intersection in creation_index order.
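The collective step (4) can be sketched in plain Python by simulating the per-process sets of resurrected creation indices (in reality the sets would be exchanged with an allgather on the communicator; the function name is invented for illustration):

```python
def collect_garbage(resurrected_per_rank):
    """Given, for each rank, the set of creation_indices of resurrected
    objects, return the list of indices every rank agrees to destroy,
    in a deterministic order.

    Hypothetical model of step 4: a real implementation would obtain
    the other ranks' sets via an allgather on the communicator rather
    than taking them as an argument.
    """
    # Only objects resurrected on *every* rank can be destroyed
    # collectively; an object still live on some rank must wait.
    common = set.intersection(*resurrected_per_rank)
    # Destroy in creation order: since creation was collective, the
    # creation_index order is identical on every process, so all ranks
    # enter each collective XXXDestroy in the same sequence (no deadlock).
    return sorted(common)


# Rank 0's Python garbage collector has run more often than rank 1's,
# so their resurrected sets disagree:
rank0 = {3, 5, 7, 9}
rank1 = {5, 7}
assert collect_garbage([rank0, rank1]) == [5, 7]
```

Objects not in the intersection (here 3 and 9) simply stay in the resurrected set until a later collection round, which is why this resolves ordering non-determinism without losing any destruction.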


I think most of this infrastructure is agnostic of the managed language, so Jack was doing the implementation in PETSc itself (rather than in petsc4py).

This wasn't a perfect solution (I recall we could still cook up situations in which objects would never be collected), but it did seem, in theory, to avoid any potential deadlocks.

Lawrence
