[petsc-dev] [petsc4py] Vec.getArray()

Jed Brown jed at 59A2.org
Sat Aug 28 10:07:44 CDT 2010


On Sat, 28 Aug 2010 13:43:34 +0200, Jed Brown <jed at 59A2.org> wrote:
> On Fri, 27 Aug 2010 15:14:02 -0300, Lisandro Dalcin <dalcinl at gmail.com> wrote:
> > I cannot figure out how to implement a copy-free and safe
> > VecGetArray()/VecRestoreArray() pattern in Python (not even by using
> > the 'with' statement, it leaks the target variable!!!!).
> 
> What exactly leaks?

We discussed this on GChat, thanks to Lisandro for pointing out lots of
gotchas.

  with open('/tmp/tmp.txt','w') as f:
    g = f
  f, g  # Both f and g are in scope, Python's "with" is not scoped like in Lisp or Haskell

This is sort of okay because use of the closed f will raise an
exception, but with numpy arrays, there is no way to invalidate an
array.  A first step would be to nullify the pointer so that invalid
access would seg-fault instead of silently corrupting memory.
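
A minimal, stdlib-only sketch of the scoping point above, using an
in-memory io.StringIO in place of the file in /tmp (that substitution is
mine, only to keep the example self-contained).  It shows that both names
outlive the with block and that the closed file at least complains when
used:

```python
import io

with io.StringIO() as f:
    g = f                 # a second name for the same object
    f.write("hello")

# Both names are still bound here, and both refer to the closed object.
still_same_object = f is g
is_closed = f.closed

try:
    g.write("more")       # use after close
    outcome = "write succeeded"
except ValueError:
    outcome = "ValueError"
```

The closed file protects itself by raising; a plain numpy array that wraps
borrowed memory has no equivalent state to check.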

numpy.ndarray is an extension type: there is one vtable per class (like
C++), not one per object (like PETSc), so it would not be okay to
overwrite methods in the vtable.  But there is still a "type pointer"
(much like a C++ vptr) in each object that could perhaps be overwritten.
This would allow

  with X as x: pass
  x[0] = 1          # x is not valid

to actually raise an exception instead of doing something bad.
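
Here is a rough pure-Python sketch of the type-pointer idea.  Everything in
it (Owner, LiveView, DeadView) is a hypothetical stand-in, not petsc4py or
numpy API; it only relies on the fact that __class__ can be reassigned
between ordinary Python classes.  Whether the same trick is workable for a
real numpy array is exactly the open question.

```python
class LiveView:
    """Hypothetical stand-in for an array wrapping borrowed storage."""

    def __init__(self, storage):
        self._storage = storage

    def __getitem__(self, i):
        return self._storage[i]

    def __setitem__(self, i, value):
        self._storage[i] = value


class DeadView:
    """Same layout, but every access raises."""

    def __getitem__(self, i):
        raise RuntimeError("array used after restore")

    def __setitem__(self, i, value):
        raise RuntimeError("array used after restore")


class Owner:
    """Hypothetical stand-in for a Vec that lends out its storage."""

    def __init__(self, data):
        self._data = list(data)
        self._view = None

    def __enter__(self):
        self._view = LiveView(self._data)
        return self._view

    def __exit__(self, exc_type, exc, tb):
        # Swap the object's type pointer so later use raises, then drop
        # its handle on the storage.
        self._view.__class__ = DeadView
        self._view._storage = None
        self._view = None
        return False


X = Owner([0, 0, 0])
with X as x:
    x[0] = 5              # valid inside the block

try:
    x[0] = 1              # x is not valid any more
    outcome = "silently succeeded"
except RuntimeError:
    outcome = "raised"
```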

Now consider

  with X as x: y = x[1:]
  y[0] = 1

Since we don't have control of y's vtable, we can't invalidate it.  If x
and y are native numpy arrays, then after y=x[1:], y must carry a
reference to x (or rather, to the memory that backs x); numpy exposes
the owning object as y.base.  The gc package is not aware of this
reference, but it is accessible from C as well.
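
The same situation can be reproduced without numpy.  The sketch below uses
the built-in memoryview purely as an analog (my choice, not something from
the discussion): a slice keeps its own handle on the backing memory, so
invalidating the parent does nothing to the slice.  A numpy view keeps a
comparable handle in its base attribute.

```python
backing = bytearray(4)        # the memory being lent out
x = memoryview(backing)
y = x[1:]                     # a slice: a new object over the same memory

x.release()                   # explicitly invalidate x only

try:
    x[0] = 1
    x_state = "usable"
except ValueError:            # released memoryviews refuse access
    x_state = "invalidated"

y[0] = 7                      # the slice is unaffected and still writes
slice_points_at_backing = y.obj is backing
```

After this runs, backing[1] has been modified through y even though x was
invalidated, which is the hazard described above.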

I assume numpy won't hold full backward links, so you couldn't use
gc.get_referrers() to rewrite the vptr of all array views.  But perhaps
it is still possible to verify that only one exclusive reference
remains.  This would cause

  with X as x:
    y = x[1:]
  # Exception in __exit__ because the reference is not exclusive, use of
  # y would be unsafe (could cause silent corruption).

  with X as x:
    y = x[1:]
    # use y
    del y
  # Good, no hanging references
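
A hedged sketch of that policy, assuming CPython reference counting and
using made-up stand-ins (Array, View, Lease) rather than numpy or petsc4py
types.  It compares sys.getrefcount() at __exit__ against a baseline taken
in __enter__, allowing one extra reference for the "as" target:

```python
import sys


class Array:
    """Hypothetical stand-in for an array wrapping borrowed memory."""


class View:
    """Hypothetical stand-in for a slice; it keeps its base alive."""

    def __init__(self, base):
        self.base = base


class Lease:
    """Lends out a fresh Array and insists on exclusivity at exit."""

    def __enter__(self):
        self._array = Array()
        # Same expression as in __exit__, so the temporary references
        # made by the call itself cancel out.
        self._baseline = sys.getrefcount(self._array)
        return self._array

    def __exit__(self, exc_type, exc, tb):
        # One extra reference is expected: the "as" target stays bound.
        extra = sys.getrefcount(self._array) - self._baseline - 1
        self._array = None
        if extra > 0 and exc_type is None:
            raise RuntimeError("%d reference(s) to the array leaked" % extra)
        return False


with Lease() as x:
    y = View(x)
    del y                     # dropped before __exit__ runs
clean_exit = True

try:
    with Lease() as x2:
        y2 = View(x2)         # still alive when __exit__ runs
    leak_detected = False
except RuntimeError:
    leak_detected = True
```

This only counts references; it cannot say who holds them, and it depends
on an implementation detail of CPython.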

Perhaps this is somewhat ugly, but I think it's better than silently
corrupting memory.  Attached is a short example; it outputs

  $ python3 with.py 
  __exit__: [102], nrefs=3, refs=[<frame object at 0x23b16a0>, {'val': [102]}, [[100], [102]]]
  __exit__: [101], nrefs=2, refs=[<frame object at 0x23b16a0>, {'val': [101]}]
  __exit__: [100], nrefs=3, refs=[<frame object at 0x23b16a0>, {'val': [100]}, [[100], [102]]]

The opaque frame object is for the scope containing the with statement,
the next is the reference saved by the Dispenser object, and the third
(present only for [100] and [102]) is the hanging reference, a list
holding just a and c.

Note that gc.get_referrers(), used in this example, is probably not
usable with numpy arrays, so we'd be raising the exception based purely
on a reference count.  Is there a case where a leak is unavoidable or
where the reference count would otherwise be artificially high at
__exit__(), so that it would be unacceptable to raise a usage exception
when a reference is leaked?

>   with Vec.getArrays(X,Y,Z) as (x,y,z):
>     x = y + z  # Numpy vectorized addition

This can be written

  with X as x, Y as y, Z as z:
    x[:] = y + z  # assign in place; "x = y + z" would only rebind the name x

in Python 2.7 and 3.1+.
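
For illustration, a small self-contained sketch of the multi-target form.
The borrowed() helper is hypothetical and plain lists stand in for the
arrays; the point is only that the managers are entered left to right and
exited in reverse order:

```python
from contextlib import contextmanager

events = []


@contextmanager
def borrowed(name):
    """Hypothetical stand-in for a get/restore pair on a Vec."""
    events.append("get " + name)
    try:
        yield [0.0, 0.0]      # a plain list standing in for the array
    finally:
        events.append("restore " + name)


with borrowed("X") as x, borrowed("Y") as y, borrowed("Z") as z:
    y[:] = [1.0, 2.0]
    z[:] = [10.0, 20.0]
    # Assign in place so the borrowed storage itself is updated.
    x[:] = [a + b for a, b in zip(y, z)]
```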

Jed

-------------- next part --------------
A non-text attachment was scrubbed...
Name: with.py
Type: text/x-python
Size: 713 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20100828/7a369a5c/attachment.py>

