[petsc-dev] Remove calls to XXXReset() in XXXDestroy()

Barry Smith bsmith at mcs.anl.gov
Thu May 12 17:51:38 CDT 2011


On May 12, 2011, at 4:48 PM, Lisandro Dalcin wrote:

> On 12 May 2011 23:55, Barry Smith <bsmith at mcs.anl.gov> wrote:
>> 
>> On May 12, 2011, at 3:47 PM, Lisandro Dalcin wrote:
>> 
>>> I would like to remove the calls to XXXReset() in XXXDestroy(),
>>> probably refactoring the code to reuse a XXXReset_Private(). My
>>> intention is to keep the two things separated: object destruction is a
>>> bit delicate and requires some care in handling Python subtypes. I
>>> really prefer that subtype implementors not rely on ops->reset() being
>>> called before ops->destroy(). Any opposition?
>> 
>>    Yes. You don't provide an explanation of why you want to do this, beyond "object destruction is delicate". Please formulate a complete explanation of why this change would be needed and what the alternatives are.
> 
> When you call XXXDestroy(), eventually the reference count drops
> to zero, and it stays zero while XXXDestroy() is executing. In
> petsc4py, I decided long ago to have each Python object own a
> ref to the PETSc handle. Then, in order to let users implement a "def
> destroy(self, mat/pc): ..." for MAT/PC/PYTHON,

   Why MAT/PC/PYTHON? What about Vec, KSP, SNES, ....?

   Why does a user ever need to implement a def destroy()? Doesn't Python have a garbage collection mechanism for each Python object, so that when its count reaches zero you can call XXXDestroy() on the underlying PETSc "handle"? Is this construct only needed to allow users to destroy some objects early to optimize memory? Or is it only relevant when an implementation of the subclass is done in Python?

> I need to create a new
> Python object owning a reference to the PETSc Mat/Vec, and then pass
> that object to the call. This causes double calls to XXXDestroy() while
> the first XXXDestroy() is still executing. So I have to artificially
> increment PETSc refcounts, call the Python method destroy(), then decref,
> etc.
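
The re-entrancy problem described above can be illustrated with a toy Python model. This is not real petsc4py or PETSc code; all names (Handle, incref, user_destroy) are hypothetical stand-ins for the intrusive refcount and the user callback:

```python
# Toy model (NOT petsc4py/PETSc code) of the re-entrancy issue:
# a user destroy(obj) callback is dispatched while the handle's
# refcount is already zero, so the temporary wrapper made for the
# call would trigger a second destroy on decref unless the count
# is artificially bumped first.

class Handle:
    """Hypothetical PETSc-like object with an intrusive refcount."""
    def __init__(self):
        self.refct = 1
        self.destroyed = False
        self.user_destroy = None   # user callback: destroy(handle)

    def incref(self):
        self.refct += 1

    def decref(self):
        self.refct -= 1
        if self.refct == 0:
            self.destroy()

    def destroy(self):
        # refct is 0 here and stays 0 while we run; re-entry would
        # mean a double destroy.
        assert not self.destroyed, "double destroy!"
        if self.user_destroy is not None:
            self.incref()                # artificial bump
            try:
                self.user_destroy(self)  # wrapper increfs/decrefs inside
            finally:
                self.refct -= 1          # undo bump WITHOUT re-dispatching
        self.destroyed = True

calls = []

def user_destroy_cb(obj):
    # the temporary Python wrapper passed to the user owns a ref
    obj.incref()
    calls.append("user destroy ran")
    obj.decref()  # without the artificial bump, this would re-enter destroy()

h = Handle()
h.user_destroy = user_destroy_cb
h.decref()  # count hits zero -> destroy -> callback runs exactly once
print(calls, h.destroyed)  # -> ['user destroy ran'] True
```

With the bump, the wrapper's decref takes the count from 2 to 1 and nothing re-dispatches; without it, the same decref would re-enter destroy().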

   
>  And now I have to do the same refcounting hack for Reset().

   I don't understand. XXXReset() never has anything to do with reference counting: it doesn't check it, doesn't use it, doesn't even care what it is.

> I had the same issue with the MatPreallocated() hack at MatDestroy(),
> which I fortunately fixed in a recent commit.

   So this has nothing to do with XXXReset() in particular? It only has to do with calling a method on an object inside that object's destroy? And is it for any object destroy, or only for an object whose subclass is implemented in Python?

   Why is calling a method inside XXXDestroy() such a problem? It certainly isn't a problem at the C level; what causes the problem for Python?




> Add to all this that I
> already had a lot of trouble figuring out how to trick Python's
> garbage collector into not causing segfaults when reference cycles are
> created at the Python level.
> 
> I'll say it again: object destruction is delicate, really delicate,
> and even more delicate when you add Python to the equation. And this
> stuff is not so easy to explain, though I tried. Perhaps you do not
> agree about how delicate it is because PETSc just does not care about
> error recovery.
> 
>> 
>> Just because it makes your life a little easier, without context on why the change is needed, isn't a reason to change it.
>> 
> 
> However, I think you added these calls to XXXReset() just to make YOUR
> life a little easier, saving code duplication :-). Additionally, don't
> you agree that Reset() and Destroy() have different intentions? Even
> if Python issues were not on the table, why is it good to mix these
> calls? Could you elaborate?

   I am not saying no to your request, I am trying to understand why you made it and I don't.

   Barry

> 
> Finally, grant me a comment: how many times have I asked PETSc to change
> things just to make my life easier? I try hard to keep core PETSc
> ignorant of the existence of the Python wrappers. I even added Python
> support using dynamic loading in order to remove any compile/link time
> dependency on Python!!  IOW, when I ask to change something because of
> Python issues, it is because I really need it.
> 
> Anyway, I have things mostly working in Python. If you decide this
> is not going to change, I'll have to live with that. But please,
> in the future try not to add any other call that dispatches
> ops->something() during the execution of XXXDestroy().
> 
>> 
>>> In some previous cleanup
>>> commits I took care of preparing the code for this change, so I
>>> expect this one to be easy.
>>> 
>>> BTW, I think we should review the way we handle reference counting
>>> regarding DMs and the global & local vector cache. I had to spend some
>>> time to figure out that dm->ops->get[global/local]vector() must
>>> forcibly compose the DM in the vector; otherwise, bad things happen at
>>> DMDestroy().
>> 
>>   Did you need to change something recently with this? It was working fine so I'd like to know what was changed and why.
>> 
> 
> Yes. Some time ago Matt added the IGA stuff from Nathan Collier at
> KAUST. Unfortunately, at the time that happened, I could not review
> the status of the code being pushed. DMIGA is a kind of wrapper around
> a DMDA; to implement get{global,local}vector(), all you need to do is
> ask the DMDA for the vector. BUT if you forget to compose
> the DMIGA in the new Vec, things get broken. Matt did not notice this
> issue when he pushed, and it took me some time to figure out what was
> going on. IMHO, when things like that happen (the implementor didn't
> notice, the committer didn't notice, and a user/developer with 7 years
> of experience like me has trouble figuring it out), there is
> something that smells bad.
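
The DMIGA/DMDA bug described above can be sketched with a toy Python model. This is not the PETSc API; the classes and the `composed_dm` attribute are hypothetical stand-ins for `PetscObjectCompose` attaching the creating DM to the vector:

```python
# Toy model (NOT PETSc code) of the wrapper-DM issue: a DM that
# delegates vector creation to an inner DM must compose *itself*
# into the vector it hands out; otherwise the vector still points
# at the inner DM and restoring/destroying via the wrapper breaks.

class Vec:
    def __init__(self):
        self.composed_dm = None   # models PetscObjectCompose(v, dm)

class DM:
    def __init__(self, name):
        self.name = name

    def get_global_vector(self):
        v = Vec()
        v.composed_dm = self      # the creating DM composes itself
        return v

class WrapperDM(DM):
    """Delegates to an inner DM, like DMIGA wrapping a DMDA."""
    def __init__(self, inner):
        super().__init__("wrapper")
        self.inner = inner

    def get_global_vector(self, *, buggy=False):
        v = self.inner.get_global_vector()
        if not buggy:
            v.composed_dm = self  # the fix: re-compose the wrapper DM
        return v

dm = WrapperDM(DM("inner"))
good = dm.get_global_vector()
bad = dm.get_global_vector(buggy=True)
print(good.composed_dm.name, bad.composed_dm.name)  # -> wrapper inner
```

In the buggy case the vector is still composed with the inner DM, which is exactly the situation where "bad things occur at DMDestroy()": the wrapper's cache and the vector's back-reference disagree about which DM owns the vector.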
> 
> And yes, this was working fine, but perhaps just by accident. Any DM
> subtype implementing GetGlobal/LocalVector should be reviewed for
> this issue. Unfortunately, I didn't have the time when I fixed DMIGA.
> 
>>> Cyclic references do not play well in our refcounted
>>> model, and we cannot implement a garbage collector that works in
>>> parallel :-(
>> 
>>   yes, we all understand this.
>> 
> 
> However, we have been using PetscObjectRegisterDestroyAll() for ages
> to collect cached garbage, and that feature is really dangerous if you
> ever register an object created on a subcommunicator. You see, we all
> think that handling destruction is a piece of cake, but that's not the
> case. I figured out this issue just a few hours ago.
> 
> -- 
> Lisandro Dalcin
> ---------------
> CIMEC (INTEC/CONICET-UNL)
> Predio CONICET-Santa Fe
> Colectora RN 168 Km 472, Paraje El Pozo
> 3000 Santa Fe, Argentina
> Tel: +54-342-4511594 (ext 1011)
> Tel/Fax: +54-342-4511169



