<div dir="auto">I think Jeremy (cc‘d) has also been thinking about this in the context of PETSc.jl </div><div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">Stefano Zampini <<a href="mailto:stefano.zampini@gmail.com">stefano.zampini@gmail.com</a>> schrieb am So. 24. Okt. 2021 um 07:52:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;padding-left:1ex;border-left-color:rgb(204,204,204)"><div dir="ltr"><div>Non-deterministic garbage collection is an issue from Python too, and firedrake folks are also working on that.</div><div><br></div><div>We may consider deferring all calls to MPI_Comm_free done on communicators with 1 as ref count (i.e., the call will actually wipe out some internal MPI data) in a collective call that can be either run by the user (on PETSC_COMM_WORLD), or at PetscFinalize() stage.</div><div>I.e., something like that</div><div><br></div><div>#define MPI_Comm_free(comm) PutCommInAList(comm)<br></div><div><br></div><div>Comm creation is collective by definition, and thus collectiveness of the order of the destruction can be easily enforced.</div><div>I don't see problems with 3rd party libraries using comms, since we always duplicate the comm we passed them</div><br><div>Lawrence, do you think this may help you?</div><div><br></div><div>Thanks</div><div>Stefano<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">Il giorno dom 24 ott 2021 alle ore 05:58 Barry Smith <<a href="mailto:bsmith@petsc.dev" target="_blank">bsmith@petsc.dev</a>> ha scritto:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;padding-left:1ex;border-left-color:rgb(204,204,204)"><div><div><br></div> Ahh, this makes perfect sense.<div><br></div><div> The code for PetscObjectRegisterDestroy() and the actual destruction (called in PetscFinalize()) is very simply and can be found in src/sys/objects/destroy.c PetscObjectRegisterDestroy(), PetscObjectRegisterDestroyAll(). </div><div><br></div><div> You could easily maintain a new array like PetscObjectRegisterGCDestroy_Objects[] and add objects with PetscObjectRegisterGCDestroy() and then destroy them with PetscObjectRegisterDestroyGCAll(). The only tricky part is that you have to have, in the context of your Julia MPI, make sure that PetscObjectRegisterDestroyGCAll() is called collectively over all the MPI ranks (that is it has to be called where all the ranks have made the same progress on MPI communication) that have registered objects to destroy, generally PETSC_COMM_ALL. We would be happy to incorporate such a system into the PETSc source with a merge request.</div><div><br></div><div> Barry<br><div><br><blockquote type="cite"><div>On Oct 23, 2021, at 10:40 PM, Alberto F. Martín <<a href="mailto:amartin@cimne.upc.edu" target="_blank">amartin@cimne.upc.edu</a>> wrote:</div><br><div>
<div><p>Thanks all for your very insightful answers.<br>
</p><p>We are leveraging PETSc from Julia in a parallel distributed-memory context (several MPI tasks, each running the Julia REPL).</p><p>Julia uses garbage collection (GC), and we would like to destroy the PETSc objects automatically whenever the GC decides to do so during the simulation.<br>
</p><p>In this context, we cannot guarantee deterministic destruction on all MPI tasks, as the GC decisions are local to each task, with no global semantics guaranteed.</p><p>As far as I understand from your answers, there seems to be the possibility of deferring the destruction of objects until points in the parallel program at which collective semantics can be guaranteed, correct? If so, I guess this may occur at any point in the simulation, not necessarily at shutdown via PetscFinalize(), right?<br>
</p><p>Best regards,</p><p> Alberto.<br>
</p>
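<p>(For illustration, a minimal sketch of the registration mechanism Barry outlines above, using the proposed, not-yet-existing names PetscObjectRegisterGCDestroy() and PetscObjectRegisterDestroyGCAll(); the collective flush can be invoked at any synchronization point of the simulation, not only inside PetscFinalize():)</p>
<pre>
#include <petscsys.h>

#define MAXGCOBJS 1024
static PetscObject gc_objs[MAXGCOBJS];
static PetscInt    gc_count = 0;

/* Local (non-collective): safe to call from a GC finalizer at any time. */
PetscErrorCode PetscObjectRegisterGCDestroy(PetscObject obj)
{
  gc_objs[gc_count++] = obj;
  return 0;
}

/* Collective over the ranks holding registered objects; all ranks must
   have registered the same objects in the same order before calling. */
PetscErrorCode PetscObjectRegisterDestroyGCAll(void)
{
  PetscErrorCode ierr;
  for (PetscInt i = 0; i < gc_count; i++) {
    ierr = PetscObjectDestroy(&gc_objs[i]);CHKERRQ(ierr);
  }
  gc_count = 0;
  return 0;
}
</pre>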
<div>On 24/10/21 1:10 am, Jacob
Faibussowitsch wrote:<br>
</div>
<blockquote type="cite">
Depending on the use case you may also find
PetscObjectRegisterDestroy() useful. If you can't guarantee that your
PetscObjectDestroy() calls are collective but you do have some other
collective section, you may call it there to punt the destruction of
your object to PetscFinalize(), which is guaranteed to be
collective.
<div><br>
</div>
<div><a href="https://petsc.org/main/docs/manualpages/Sys/PetscObjectRegisterDestroy.html" target="_blank">https://petsc.org/main/docs/manualpages/Sys/PetscObjectRegisterDestroy.html</a></div>
<div><br>
<div>
<div dir="auto" style="letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none">
<div dir="auto" style="letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none">
<div>Best regards,<br>
<br>
Jacob Faibussowitsch<br>
(Jacob Fai - booss - oh - vitch)<br>
</div>
</div>
</div>
</div>
<div><br>
<blockquote type="cite">
<div>On Oct 22, 2021, at 23:33, Jed Brown <<a href="mailto:jed@jedbrown.org" target="_blank">jed@jedbrown.org</a>> wrote:</div>
<br>
<div>
<span style="font-family:Menlo-Regular;font-size:11px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none;float:none;display:inline">Junchao Zhang <</span><a href="mailto:junchao.zhang@gmail.com" style="font-family:Menlo-Regular;font-size:11px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" target="_blank">junchao.zhang@gmail.com</a><span style="font-family:Menlo-Regular;font-size:11px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none;float:none;display:inline">> writes:</span><br style="font-family:Menlo-Regular;font-size:11px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none">
<br style="font-family:Menlo-Regular;font-size:11px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none">
<blockquote type="cite" style="font-family:Menlo-Regular;font-size:11px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none">On Fri, Oct 22, 2021 at 9:13 PM Barry Smith
<<a href="mailto:bsmith@petsc.dev" target="_blank" style="font-family:Menlo-Regular">bsmith@petsc.dev</a>> wrote:<br>
<br>
<blockquote type="cite" style="font-family:Menlo-Regular"><br>
One technical reason is that
PetscHeaderDestroy_Private() may call<br>
PetscCommDestroy() which may call MPI_Comm_free()
which is defined by the<br>
standard to be collective. Though PETSc tries to limit
its use of new MPI<br>
communicators (for example generally many objects
shared the same<br>
communicator) if we did not free those we no longer
need when destroying<br>
objects we could run out.<br>
<br>
</blockquote>
PetscCommDestroy() might call MPI_Comm_free(), but it is very unlikely. PETSc uses reference counting on communicators, so in PetscCommDestroy() it likely just decrements the count. In other words, PetscCommDestroy() is cheap and, in effect, not collective.<br>
</blockquote>
<br style="font-family:Menlo-Regular;font-size:11px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none">
<span style="font-family:Menlo-Regular;font-size:11px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none;float:none;display:inline">Unless it's the last reference to
a given communicator, which is a risky/difficult thing
for a user to guarantee and the consequences are
potentially dire (deadlock being way worse than a crash)
when the user's intent is to relax ordering for
destruction.</span><br style="font-family:Menlo-Regular;font-size:11px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none">
<br style="font-family:Menlo-Regular;font-size:11px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none">
<span style="font-family:Menlo-Regular;font-size:11px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none;float:none;display:inline">Alberto, what is the use case in
which deterministic destruction is problematic? If you
relax it for individual objects, is there a place you
can be collective to collect any stale communicators?</span></div>
</blockquote>
</div>
<br>
</div>
</blockquote>
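<p>(To make the reference-counting behavior discussed above concrete, a minimal sketch: two objects created on PETSC_COMM_WORLD share PETSc's inner communicator, so only the destruction of the last one can actually reach MPI_Comm_free().)</p>
<pre>
#include <petscvec.h>

int main(int argc, char **argv)
{
  Vec            a, b;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  ierr = VecCreateMPI(PETSC_COMM_WORLD, PETSC_DECIDE, 10, &a);CHKERRQ(ierr);
  ierr = VecCreateMPI(PETSC_COMM_WORLD, PETSC_DECIDE, 10, &b);CHKERRQ(ierr); /* reuses the inner comm */
  ierr = VecDestroy(&a);CHKERRQ(ierr); /* just decrements the comm reference count: cheap */
  ierr = VecDestroy(&b);CHKERRQ(ierr); /* last reference: the inner comm may actually be freed */
  ierr = PetscFinalize();
  return ierr;
}
</pre>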
<pre cols="72" style="font-family:monospace">--
Alberto F. Martín-Huertas
Senior Researcher, PhD. Computational Science
Centre Internacional de Mètodes Numèrics a l'Enginyeria (CIMNE)
Parc Mediterrani de la Tecnologia, UPC
<a href="https://www.google.com/maps/search/Esteve+Terradas+5,+Building+C3,+Office+215?entry=gmail&source=g" style="font-family:monospace">Esteve Terradas 5, Building C3, Office 215</a>,
08860 Castelldefels (Barcelona, Spain)
Tel.: (+34) 9341 34223
<a href="mailto:e-mail:amartin@cimne.upc.edu" target="_blank" style="font-family:monospace">e-mail:amartin@cimne.upc.edu</a>
FEMPAR project co-founder
web: <a href="http://www.fempar.org/" target="_blank" style="font-family:monospace">http://www.fempar.org</a>
</pre>
</div>
</div></blockquote></div><br></div></div></blockquote></div><br clear="all"><br>-- <br><div dir="ltr">Stefano</div>
</blockquote></div></div>