Delay GC on object

Hi, it seems Python's GC runs FIFO, which is causing a slight problem with my AutoCAD wrappers:

  • Database is a collection of DbObjects that may be opened for read or write
  • The Database wrapper holds a std::shared_ptr whose pointee is deleted in the dtor
  • The DbObject wrapper holds a std::shared_ptr that calls ptr->close() in the dtor
  • Python GCs the Database first, invalidating all the pointers, so AutoCAD crashes when ptr->close() is called.

IMHO, GC should work like a stack (FILO), but I’m really not that familiar with Python.

Is there an elegant solution besides making the user create a new function just for a scope?

Sample:

import traceback
from pyrx import Db, Ed, Ge, Ap, Rx, Gs


@Ap.Command()
def doit():
    try:
        longest = 0
        for file in Ap.Application.listFilesInPath("E:\\temp", ".dwg"):
            
            # read the file
            sdb = Db.Database(False, True)
            sdb.readDwgFile(file)
            sdb.closeInput(True)
            
            #opened for read
            ms = sdb.modelSpace(Db.OpenMode.kForRead)
            
            #opened for read
            crvs = [
                Db.Curve(id, Db.OpenMode.kForRead) for id in ms.objectIds(Db.Curve.desc())
            ]
            
            for crv in crvs:
                longest = max(longest, crv.getDistAtParam(crv.getEndParam()))
                
            # Database is deleted first, kludge
            dummy = sdb #ugggg
        print(longest)
    except Exception as err:
        traceback.print_exception(err)

I’m unfamiliar with “AutoCAD”, but it seems the problem is with std::shared_ptr use, not CPython’s GC.

Simply don't allow the GC to do that. :slight_smile: A wrapper's destructor shouldn't invalidate resources used by other wrappers.

Assuming destroying the "Database" collection destroys its elements, the element wrappers should reference their collection (or bump its ref count in some other way).
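The element-references-collection approach might look like this minimal sketch (class names are hypothetical, not the real pyrx API; on CPython, refcounting makes the order deterministic):

```python
# Minimal sketch (hypothetical names): each object wrapper holds a strong
# reference to its database wrapper, so the database's ref count cannot
# reach zero while any object wrapper is still alive.
destroyed = []

class DatabaseWrapper:
    def __del__(self):
        destroyed.append("db")

class ObjectWrapper:
    def __init__(self, db):
        self._db = db  # strong reference keeps the database alive

    def __del__(self):
        # safe: self._db still exists at this point, so a real wrapper
        # could call ptr->close() here without crashing
        destroyed.append("obj")

def demo():
    db = DatabaseWrapper()
    obj = ObjectWrapper(db)
    del db   # the database survives via obj._db
    del obj  # now the object dies first, then the database

demo()
print(destroyed)  # on CPython: ['obj', 'db']
```

The strong back-reference guarantees the destruction order regardless of which local name is dropped first.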


Thanks for the reply.

In almost every other programming language, variables are dropped in the reverse order of their initialization. I can't be the first wrapper writer to run into this issue.

import traceback
from pyrx import Db, Ed, Ge, Ap, Rx, Gs

class A:
    def __init__(self, name):
        self.name = name
        print(f"Object {self.name} created")

    def __del__(self):
        print(f"Object {self.name} destroyed")

def my_function():
    obj1 = A("One")
    obj2 = A("Two")
    obj3 = A("Three")

@Ap.Command()
def doit():
    try:
        my_function()
    except Exception as err:
        traceback.print_exception(err)

Result

Command: DOIT
Object One created
Object Two created
Object Three created
Object One destroyed
Object Two destroyed
Object Three destroyed

Are there elegant patterns in Python to simulate the behavior one might get from other languages?
Cheers ~Dan

This is kind of the pattern I plan on putting in my sample, but it looks kludgy

import traceback
from pyrx import Db, Ed, Ge, Ap, Rx, Gs

class DbChild:
    def __init__(self, name):
        self.name = name
        print(f"Object {self.name} created")

    def __del__(self):
        print(f"Object {self.name} destroyed")


class Db:
    def __init__(self, name):
        self.name = name
        print(f"Object {self.name} created")

    def __del__(self):
        print(f"Object {self.name} destroyed")

    def makeChild(self, name):
        return DbChild(name)
    
    
def foo(db):
    c1 = db.makeChild("child1")
    c2 = db.makeChild("child2")

@Ap.Command()
def doit():
    try:
        db = Db("db1")
        foo(db)
    except Exception as err:
        traceback.print_exception(err)
Result

Command: DOIT
Object db1 created
Object child1 created
Object child2 created
Object child1 destroyed
Object child2 destroyed
Object db1 destroyed

What if makeChild actually registered the new object as the caller instance’s child? What’s important here is that the parent holds a reference to its children:

def __init__(self, name):
    self.name = name
    self.children = []
    print(f"Object {self.name} created")

def makeChild(self, name):
    child = DbChild(name)
    self.children.append(child)
    return child

Anyway, you shouldn't depend on anything related to the timing of destructor calls in Python. If you need resource management, the idiom for that is "context managers".
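For illustration, a minimal context-manager sketch (names made up) shows the deterministic cleanup order:

```python
# Minimal sketch using contextlib: cleanup happens deterministically when
# the `with` block exits, regardless of how control leaves it.
from contextlib import contextmanager

events = []

@contextmanager
def open_resource(name):
    events.append(f"open {name}")
    try:
        yield name
    finally:
        events.append(f"close {name}")

with open_resource("db"):
    with open_resource("child"):
        events.append("work")

print(events)
# ['open db', 'open child', 'work', 'close child', 'close db']
```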


Thanks for the reply!
I'm still learning Python; my sample was just to illustrate how the destruction order differs. I think adding the child as a reference to the parent won't work in my case, since the dependency runs in the opposite direction; maybe adding a reference to the parent on the child could work.

Not really, because pretty much everything you’re discussing here is an implementation detail. There’s nothing you can really do to guarantee the behavior you want. Note the following statements from the language reference:

An implementation is allowed to postpone garbage collection or omit it altogether — it is a matter of implementation quality how garbage collection is implemented, as long as no objects are collected that are still reachable.

Do not depend on immediate finalization of objects when they become unreachable

You can’t rely on objects being garbage collected in any particular order. You can ensure that they don’t get collected at all by keeping a reference to them, but beyond that all bets are off. The elegant solution in Python is to write your code such that it doesn’t depend on any details of how garbage collection is done.


Hi, thanks for the reply
This is really an edge case in the project
Python's GC is actually very deterministic; I spent the first six months of the project studying the behavior so I would not end up in the same mess as the .NET wrappers. There's pretty good unit test coverage on the lifetime of AutoCAD's objects, and it works fantastically!

Normally, Databases are not "disposable", for lack of a better term, but there is the edge case where users can create their own Databases, which are disposable. Even in this edge case, though, the GC is still 100% predictable.

This pattern works, so I'll base my unit tests on it.

I was hoping there was something more elegant that I was missing, something analogous to .NET's using statement.

The Answer

Generally, (non-memory) resources are managed using with blocks, which are probably the closest analogue to C# using blocks.

I think that behaviour is specific to a particular Python implementation and version. E.g. if you were to run it on an alternative implementation like PyPy or GraalPython, then it would fail horribly. And the experimental free-threaded build will also defer destruction under some circumstances.

The general rule is that if an object depends on another object still existing then it should either:

  • Hold a reference to that object to keep it alive
  • Have a weak-ref/callback that alerts it to the other object being destroyed and "neutralises" the dependent object.

Most C++ binding libraries implement one or both of those options.
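As a sketch of the second option (all names hypothetical), weakref.finalize can neutralise the dependent wrapper when the database goes away:

```python
# Sketch of the weak-ref/callback option: the dependent wrapper registers a
# finalizer on the database wrapper and flips itself into a "neutralised"
# state when the database is collected, instead of touching freed memory.
import weakref

class DatabaseWrapper:
    pass

class ObjectWrapper:
    def __init__(self, db):
        self._alive = True
        # no strong reference to `db`: just a callback on its destruction
        self._finalizer = weakref.finalize(db, self._neutralise)

    def _neutralise(self):
        self._alive = False  # the database is gone; refuse further use

    def close(self):
        if not self._alive:
            raise RuntimeError("database already destroyed")
        # a real wrapper would call into the C++ side here

db = DatabaseWrapper()
obj = ObjectWrapper(db)
del db             # on CPython this collects the database immediately
print(obj._alive)  # False
```

Unlike the strong-reference option, this does not extend the database's lifetime; it just makes use-after-close fail loudly instead of crashing.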


Hi,

Yes, I'm running in an embedded context, inside AutoCAD's process. I'm linked to 3.12.X, and wxWidgets (wxPython) for the MFC wrappers.
I used boost::python, all hand rolled with love, except I wanted to gouge my eyes out doing the ActiveX wrappers :grin:
Anyway, it’s very fast. Just the best scripting for CAD, … ever!

I am a bit worried about changes to the GC in future versions of Python. I'll probably start making test builds with 3.15 as it progresses.

~Dan


In my opinion, you should definitely codify your dependencies.

Suppose that in your example, where you create A, B, and C, that you create a pointer to C. Now you say that the GC should not delete A and B? How is it supposed to know?


Hi,

Sorry for the confusion, I was trying to illustrate the behavior in the simplest terms

If you read my original post:

Database std::shared_ptr that is deleted in the dtor
Only the Database is deleted, when its ref count reaches zero.

Dbobject is a std::shared_ptr that calls ptr->close() in the dtor

It's not deleted; I'm only managing the state of the object in the dtor. DbObject is an object that can live in multiple states:

  • kClosed = no readers or writers
  • kForRead = the system allows for 256 simultaneous readers.
  • kForWrite = the system allows for 1 simultaneous writer.
  • kForNotify = DbObject can wait for a notification of a change in one of the above states

If you imagine drawing a bunch of lines and circles in CAD, the Database is the drawing, and a DbObject is a line, an arc, or a block of text.
They are never deleted; they can only be flagged as erased, so the user can undo/redo.

If you look at my original sample, I’m opening a Database.
I open some DbObjects (Curve) kForRead, collect some data, then notify the Database the object wants to be closed.
The problem was the Database was already deleted because of the destruction order. (exception time)
Most of the time, existing DbObjects are opened for read; since the system can have 256 readers, dtor/GC timing is not critical. Any writer can just wait for a notification.

As I mentioned, it's an edge case. Most Databases are owned by the Document; that's what's visible in the user's drawing area.
In this case, the Database dtor is a no-op; it's all managed by CAD. However, I want to allow users to scan through .dwg files to collect data (blocks of text), then use the power of Python, pandas and whatnot, to do analysis on the data.

Anyway, the issue has been solved, I hope it explains it a bit better

So DbObjects depend on Database.


If you need deterministic destruction, don’t rely on destructors. Python’s garbage collection approach is largely unspecified, so you must not rely on a particular destruction order.

Instead, use context managers / with-statements in order to get deterministic cleanup when control flow leaves a scope. Context managers are closely related to C++ RAII or C# using-statements.

So instead of this, where you hope that destructors are called in a particular order:

db = create_database()
...

obj1 = db.get(1)
obj2 = db.get(2)
...
# hope that `obj1` and `obj2` are released before `db`

You could design a context-manager based API that guarantees when resources are released:

with create_database() as db:
    ...
    with (
        db.get(1) as obj1,
        db.get(2) as obj2,
    ):
        ...

However, be aware that after a context manager's __exit__, the objects still exist. It probably makes sense to transition these objects into an empty but safe state, and perhaps to raise if any methods are called on them afterwards.

Part of the problem here isn't just the indeterminate destruction order, but also that you're keeping the closed object around. On the C++ level, it might not be sufficient to ask the object managed by a shared pointer to close itself (ptr->close()); it may also be necessary to destroy the pointed-to object: ptr.reset().
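A sketch of that "empty but safe" state (hypothetical wrapper, not the real pyrx API) might look like:

```python
# After __exit__ the wrapper still exists, but it refuses further use
# instead of touching a dead native object.
class DbObjectWrapper:
    def __init__(self, name):
        self._name = name
        self._closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self._closed = True  # a real wrapper would close/reset the C++ object here
        return False         # do not swallow exceptions

    def length(self):
        if self._closed:
            raise RuntimeError(f"{self._name} is closed")
        return 42.0  # placeholder for a real query

with DbObjectWrapper("curve") as crv:
    print(crv.length())  # 42.0

try:
    crv.length()  # the name still exists after the with block...
except RuntimeError as err:
    print(err)    # ...but the wrapper now raises: "curve is closed"
```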

Resources for learning about Python context managers:


Don’t these also get destroyed/called in the order of introduction?


Their __enter__ methods are called in the order the context managers are entered, and their __exit__ methods are called in the reverse (LIFO) order, no matter how the function exits. So you may safely do the cleanup in __exit__ that you would do in a destructor in C++.


So in the case @latk demonstrated, obj2.__exit__ is guaranteed to be called before obj1.__exit__, and lastly db.__exit__ will be called.
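A quick sanity check of that order (toy classes, not pyrx):

```python
# __enter__ runs in the order the managers appear; __exit__ runs in the
# reverse (LIFO) order, like C++ destructors at end of scope.
order = []

class Tracked:
    def __init__(self, name):
        self.name = name

    def __enter__(self):
        order.append(f"enter {self.name}")
        return self

    def __exit__(self, *exc):
        order.append(f"exit {self.name}")
        return False

with Tracked("db"), Tracked("obj1"), Tracked("obj2"):
    pass

print(order)
# ['enter db', 'enter obj1', 'enter obj2', 'exit obj2', 'exit obj1', 'exit db']
```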

Thanks for the reply.

Paul's suggestion works perfectly;
tests pass, and best of all, it's easy from the user's perspective.
I added a link back to this discussion in my sample, just in case someone wants to explore other ideas

Just to clarify, that solution still does not make any guarantees about when any of the objects will be garbage collected. It just makes guarantees about when variables will go out of scope.
