[petsc-users] Why PetscDestroy global collective semantics?

Alberto F. Martín amartin at cimne.upc.edu
Tue Oct 26 06:48:58 CDT 2021


Thanks all for this second round of detailed responses. Highly appreciated!


I think that I have enough material to continue exploring a solution in 
our particular context.


Best regards,

  Alberto.


On 25/10/21 11:12 pm, Betteridge, Jack D wrote:
> Hi Everyone,
>
> I cannot fault Lawrence's explanation; that is precisely what I'm 
> implementing. The only difference is that I was adding most of the 
> logic for the "resurrected objects map" to petsc4py rather than to 
> PETSc. Given that this solution is truly Python-agnostic, I will move 
> what I have written to C and merely add the interface to this 
> functionality in petsc4py.
>
> Indeed, this works out better for me as I was not enjoying writing all 
> the code in Cython! I'll post an update once there is a working 
> prototype in my PETSc fork, and the code is ready for testing.
>
> Cheers,
> Jack
>
>
> ------------------------------------------------------------------------
> *From:* Lawrence Mitchell <wence at gmx.li>
> *Sent:* 25 October 2021 12:34
> *To:* Stefano Zampini <stefano.zampini at gmail.com>
> *Cc:* Barry Smith <bsmith at petsc.dev>; "Alberto F. Martín" 
> <amartin at cimne.upc.edu>; PETSc users list <petsc-users at mcs.anl.gov>; 
> Francesc Verdugo <fverdugo at cimne.upc.edu>; Betteridge, Jack D 
> <j.betteridge at imperial.ac.uk>
> *Subject:* Re: [petsc-users] Why PetscDestroy global collective 
> semantics?
>
> Hi all,
>
> (I cc Jack who is doing the implementation in the petsc4py setting)
>
> > On 24 Oct 2021, at 06:51, Stefano Zampini 
> > <stefano.zampini at gmail.com> wrote:
> >
> > Non-deterministic garbage collection is an issue in Python too, 
> > and the Firedrake folks are also working on that.
> >
> > We may consider deferring all calls to MPI_Comm_free made on 
> > communicators with a reference count of 1 (i.e., calls that would 
> > actually wipe out internal MPI data) to a collective call that can 
> > be run either by the user (on PETSC_COMM_WORLD) or at the 
> > PetscFinalize() stage. I.e., something like this:
> >
> > #define MPI_Comm_free(comm) PutCommInAList(comm)
> >
> > Comm creation is collective by definition, so a collective ordering 
> > of the destruction can easily be enforced.
> > I don't see a problem with third-party libraries using comms, since 
> > we always duplicate the comms we pass to them.
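> >
> > A minimal sketch of what that could look like (PutCommInAList and 
> > FlushDeferredCommFrees are hypothetical names here, not existing 
> > PETSc API):
> >
> > #include <mpi.h>
> >
> > #define MAX_DEFERRED 1024
> > static MPI_Comm deferred[MAX_DEFERRED];
> > static int      ndeferred = 0;
> >
> > /* Stands in for MPI_Comm_free(): stash the communicator instead
> >    of freeing it immediately. */
> > static int PutCommInAList(MPI_Comm *comm)
> > {
> >   if (ndeferred == MAX_DEFERRED) return MPI_ERR_INTERN;
> >   deferred[ndeferred++] = *comm;
> >   *comm = MPI_COMM_NULL;
> >   return MPI_SUCCESS;
> > }
> >
> > /* Run collectively (by the user on PETSC_COMM_WORLD, or at
> >    PetscFinalize()). Communicators are created collectively, so the
> >    list has the same contents in the same order on every rank, and
> >    the real (collective) frees match up across processes. */
> > static int FlushDeferredCommFrees(void)
> > {
> >   for (int i = 0; i < ndeferred; i++) {
> >     int err = (MPI_Comm_free)(&deferred[i]); /* parentheses bypass the macro */
> >     if (err != MPI_SUCCESS) return err;
> >   }
> >   ndeferred = 0;
> >   return MPI_SUCCESS;
> > }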
>
> > Lawrence, do you think this may help you?
>
> I think that it is not just MPI_Comm_free that is potentially problematic.
>
> Here are some additional areas off the top of my head:
>
> 1. PetscSF with -sf_type window: destroy (when the refcount drops to 
> zero) calls MPI_Win_free (which is collective over the comm).
> 2. Deallocation of MUMPS objects is tremendously collective.
>
> In general, the solution of just punting MPI_Comm_free to PetscFinalize 
> (or some user-defined time) is, I think, insufficient, since it 
> requires us to audit the collectiveness of all `XXX_Destroy` functions 
> (including those in third-party packages).
>
> Barry's suggestion of resurrecting objects in finalisation using 
> PetscObjectRegisterDestroy and then collectively clearing that array 
> periodically is pretty close to the proposal that we cooked up, I think.
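>
> For reference, a minimal sketch of that pattern using the existing 
> PetscObjectRegisterDestroy (DeferVecDestroy is a hypothetical 
> wrapper-finalizer hook, not PETSc API):
>
> #include <petscvec.h>
>
> /* Called from a managed language's finalizer for a Vec wrapper.
>    Rather than calling the (collective) VecDestroy() here, hand the
>    object to PETSc: everything so registered is destroyed inside
>    PetscFinalize(), a collective point that all ranks reach. */
> static PetscErrorCode DeferVecDestroy(Vec *v)
> {
>   PetscErrorCode ierr;
>
>   PetscFunctionBegin;
>   ierr = PetscObjectRegisterDestroy((PetscObject)*v);CHKERRQ(ierr);
>   *v = NULL; /* the wrapper must not touch the Vec again */
>   PetscFunctionReturn(0);
> }
>
> Between finalizations, PetscObjectRegisterDestroyAll() can be called 
> collectively to clear the registered objects earlier.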
>
> Jack can correct any missteps I make in explanation, but perhaps this 
> is helpful for Alberto:
>
> 1. Each PETSc communicator gets two new attributes: "creation_index" 
> [an int64] and "resurrected_objects" [a set-like thing].
> 2. PetscHeaderCreate grabs the next creation_index out of the input 
> communicator and stashes it on the object. Since object creation is 
> collective, this is guaranteed to agree across processes on any given 
> communicator.
> 3. When the Python garbage collector tries to destroy PETSc objects, 
> we resurrect the _C_ object in finalisation and stash it in 
> "resurrected_objects" on the communicator.
> 4. Periodically (as a result of user intervention, in the first 
> instance), we garbage collect these resurrected objects collectively: 
> we perform a set intersection of the creation_indices across the 
> communicator's processes and then call XXXDestroy, in order, on the 
> intersection sorted by creation_index (see the sketch below).
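>
> To make step 4 concrete, here is a rough sketch (not the eventual 
> implementation; GarbageObject, GarbageCollect, and the resurrected[] 
> array are made-up names) of the collective cleanup:
>
> #include <petscsys.h>
> #include <stdlib.h>
>
> typedef struct {
>   PetscInt64  creation_index; /* stamped by PetscHeaderCreate (step 2) */
>   PetscObject obj;            /* resurrected in the GC callback (step 3) */
> } GarbageObject;
>
> static int CompareByIndex(const void *a, const void *b)
> {
>   PetscInt64 ia = ((const GarbageObject *)a)->creation_index;
>   PetscInt64 ib = ((const GarbageObject *)b)->creation_index;
>   return (ia > ib) - (ia < ib);
> }
>
> /* Collective on comm. resurrected[0..*n) is this rank's stash. */
> static PetscErrorCode GarbageCollect(MPI_Comm comm, GarbageObject *resurrected, PetscInt *n)
> {
>   PetscErrorCode ierr;
>   PetscMPIInt    size, nmine, *counts, *displs;
>   PetscInt64     *mine, *all;
>   PetscInt       i, j, total = 0, kept = 0;
>
>   PetscFunctionBegin;
>   ierr  = MPI_Comm_size(comm, &size);CHKERRQ(ierr);
>   nmine = (PetscMPIInt)*n;
>   ierr  = PetscMalloc2(size, &counts, size, &displs);CHKERRQ(ierr);
>   ierr  = MPI_Allgather(&nmine, 1, MPI_INT, counts, 1, MPI_INT, comm);CHKERRQ(ierr);
>   for (i = 0; i < size; i++) { displs[i] = (PetscMPIInt)total; total += counts[i]; }
>   ierr = PetscMalloc2(nmine, &mine, total, &all);CHKERRQ(ierr);
>   for (i = 0; i < nmine; i++) mine[i] = resurrected[i].creation_index;
>   ierr = MPI_Allgatherv(mine, nmine, MPI_INT64_T, all, counts, displs, MPI_INT64_T, comm);CHKERRQ(ierr);
>
>   /* Sort locally so every rank walks the common indices in the same
>      order; an index appearing size times in all[] is on every rank. */
>   qsort(resurrected, (size_t)nmine, sizeof(GarbageObject), CompareByIndex);
>   for (i = 0; i < nmine; i++) {
>     PetscInt occurrences = 0;
>     for (j = 0; j < total; j++)
>       if (all[j] == resurrected[i].creation_index) occurrences++;
>     if (occurrences == (PetscInt)size) {
>       ierr = PetscObjectDestroy(&resurrected[i].obj);CHKERRQ(ierr); /* collective */
>     } else {
>       resurrected[kept++] = resurrected[i]; /* not everywhere yet; retry later */
>     }
>   }
>   *n = kept;
>   ierr = PetscFree2(mine, all);CHKERRQ(ierr);
>   ierr = PetscFree2(counts, displs);CHKERRQ(ierr);
>   PetscFunctionReturn(0);
> }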
>
>
> I think that most of this infrastructure is agnostic of the managed 
> language, so Jack was doing the implementation in PETSc (rather than 
> in petsc4py).
>
> This wasn't a perfect solution (I recall that we could still cook up 
> situations in which objects would not be collected), but it did seem 
> to (in theory) solve any potential deadlock issues.
>
> Lawrence

