[petsc-users] Why PetscDestroy global collective semantics?
Patrick Sanan
patrick.sanan at gmail.com
Sun Oct 24 01:29:59 CDT 2021
I think Jeremy (cc'd) has also been thinking about this in the context of
PETSc.jl
Stefano Zampini <stefano.zampini at gmail.com> wrote on Sun, Oct 24, 2021 at
07:52:
> Non-deterministic garbage collection is an issue in Python too, and the
> Firedrake folks are also working on that.
>
> We may consider deferring all calls to MPI_Comm_free made on communicators
> with a reference count of 1 (i.e., calls that would actually wipe out some
> internal MPI data) to a collective call that is either run by the user (on
> PETSC_COMM_WORLD) or at the PetscFinalize() stage.
> I.e., something like this:
>
> #define MPI_Comm_free(comm) PutCommInAList(comm)
>
> Comm creation is collective by definition, and thus a collective ordering of
> the destruction can be easily enforced.
> I don't see problems with third-party libraries using comms, since we always
> duplicate the comm we pass to them.
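>
> As a minimal sketch of that idea (the list handling below is hypothetical,
> not existing PETSc code; PutCommInAList() is the name used in the macro
> above, and FreeDeferredComms() stands in for the collective cleanup call):
>
> #include <mpi.h>
>
> #define MAX_DEFERRED 64
> static MPI_Comm deferred_comms[MAX_DEFERRED];
> static int      n_deferred = 0;
>
> /* Called in place of MPI_Comm_free() when the last reference is dropped;
>    purely local, so it is safe even if ranks reach it in different orders */
> void PutCommInAList(MPI_Comm *comm)
> {
>   if (n_deferred < MAX_DEFERRED) deferred_comms[n_deferred++] = *comm;
>   *comm = MPI_COMM_NULL;
> }
>
> /* Collective over PETSC_COMM_WORLD: run by the user at a synchronization
>    point, or at PetscFinalize(), to actually release the parked comms.
>    Since comm creation is collective, all ranks park them in the same order. */
> void FreeDeferredComms(void)
> {
>   int i;
>   for (i = 0; i < n_deferred; i++) MPI_Comm_free(&deferred_comms[i]);
>   n_deferred = 0;
> }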
>
> Lawrence, do you think this may help you?
>
> Thanks
> Stefano
>
> On Sun, Oct 24, 2021 at 05:58 Barry Smith <bsmith at petsc.dev>
> wrote:
>
>>
>> Ahh, this makes perfect sense.
>>
>> The code for PetscObjectRegisterDestroy() and the actual destruction
>> (called in PetscFinalize()) is very simple and can be found in
>> src/sys/objects/destroy.c: PetscObjectRegisterDestroy() and PetscObjectRegisterDestroyAll().
>>
>> You could easily maintain a new array
>> like PetscObjectRegisterGCDestroy_Objects[], add objects
>> with PetscObjectRegisterGCDestroy(), and then destroy them
>> with PetscObjectRegisterDestroyGCAll(). The only tricky part is that, in
>> the context of your Julia MPI usage, you have to make sure
>> that PetscObjectRegisterDestroyGCAll() is called collectively over all the
>> MPI ranks that have registered objects to destroy (that is, it has to be
>> called at a point where all those ranks have made the same progress on MPI
>> communication), generally over PETSC_COMM_WORLD. We would be happy to
>> incorporate such a system into the PETSc source with a merge request.
>>
>> Barry
>>
>> On Oct 23, 2021, at 10:40 PM, Alberto F. Martín <amartin at cimne.upc.edu>
>> wrote:
>>
>> Thanks all for your very insightful answers.
>>
>> We are leveraging PETSc from Julia in a parallel distributed-memory
>> context (several MPI tasks, each running the Julia REPL).
>>
>> Julia uses Garbage Collection (GC), and we would like to destroy the
>> PETSc objects automatically when the GC decides to do so during the simulation.
>>
>> In this context, we cannot guarantee deterministic destruction on all MPI
>> tasks, as the GC decisions are local to each task, with no global semantics
>> guaranteed.
>>
>> As far as I understand from your answers, there seems to be the
>> possibility of deferring the destruction of objects until points in the parallel
>> program at which you can guarantee collective semantics, correct? If so, I
>> guess that this may occur at any point in the simulation, not necessarily
>> at shutdown via PetscFinalize(), right?
>>
>> Best regards,
>>
>> Alberto.
>>
>>
>> On 24/10/21 1:10 am, Jacob Faibussowitsch wrote:
>>
>> Depending on the use case you may also find PetscObjectRegisterDestroy()
>> useful. If you can’t guarantee your PetscObjectDestroy() calls are
>> collective, but do have some other collective section, you may call it there to
>> punt the destruction of your object to PetscFinalize(), which is guaranteed
>> to be collective.
>>
>>
>> https://petsc.org/main/docs/manualpages/Sys/PetscObjectRegisterDestroy.html
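>>
>> A minimal usage sketch (the Mat here just stands in for any PETSc object
>> whose destroy call you cannot make collective):
>>
>>   Mat A;
>>   MatCreate(PETSC_COMM_WORLD, &A);
>>   /* ... use A ... */
>>   /* Instead of MatDestroy(&A) at a point that may not be collective,
>>      defer the destruction to PetscFinalize(), which is collective */
>>   PetscObjectRegisterDestroy((PetscObject)A);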
>>
>> Best regards,
>>
>> Jacob Faibussowitsch
>> (Jacob Fai - booss - oh - vitch)
>>
>> On Oct 22, 2021, at 23:33, Jed Brown <jed at jedbrown.org> wrote:
>>
>> Junchao Zhang <junchao.zhang at gmail.com> writes:
>>
>> On Fri, Oct 22, 2021 at 9:13 PM Barry Smith <bsmith at petsc.dev> wrote:
>>
>>
>> One technical reason is that PetscHeaderDestroy_Private() may call
>> PetscCommDestroy(), which may call MPI_Comm_free(), which is defined by the
>> MPI standard to be collective. Though PETSc tries to limit its use of new MPI
>> communicators (for example, many objects generally share the same
>> communicator), if we did not free the ones we no longer need when destroying
>> objects we could run out.
>>
>> PetscCommDestroy() might call MPI_Comm_free(), but it is very unlikely.
>> PETSc uses reference counting on communicators, so PetscCommDestroy()
>> likely just decreases the count. In other words, PetscCommDestroy() is
>> cheap and in effect not collective.
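>>
>> In outline (a simplified sketch of the reference-counting idea, not the
>> actual PetscCommDestroy() source):
>>
>>   refcount--;                   /* the usual case: cheap, purely local */
>>   if (refcount == 0) {
>>     MPI_Comm_free(&comm);       /* last reference: collective per the MPI standard */
>>   }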
>>
>>
>> Unless it's the last reference to a given communicator, which is a
>> risky/difficult thing for a user to guarantee, and the consequences are
>> potentially dire (deadlock being way worse than a crash) when the user's
>> intent is to relax ordering for destruction.
>>
>> Alberto, what is the use case in which deterministic destruction is
>> problematic? If you relax it for individual objects, is there a place you
>> can be collective to collect any stale communicators?
>>
>>
>> --
>> Alberto F. Martín-Huertas
>> Senior Researcher, PhD. Computational Science
>> Centre Internacional de Mètodes Numèrics a l'Enginyeria (CIMNE)
>> Parc Mediterrani de la Tecnologia, UPC
>> Esteve Terradas 5, Building C3, Office 215,
>> 08860 Castelldefels (Barcelona, Spain)
>> Tel.: (+34) 9341 34223
>> e-mail: amartin at cimne.upc.edu
>>
>> FEMPAR project co-founder
>> web: http://www.fempar.org
>>
>>
>>
>>
>
> --
> Stefano
>