I’ve talked to a couple of people about this at PyCon 2019, and there was more interest than I expected, so I figure I should just post this here:
I’ve (hackishly, as a proof of concept) replaced refcounting with libgc (aka BDW GC, aka Boehm GC, the canonical generic, conservative mark-sweep (non-copying, non-moving) GC):
I initially just disabled pymalloc and linked against a libgc with --enable-replace-malloc, which interposes malloc() and free(). I then commented out the free() call in _PyMem_RawFree, and added a libgc destructor (called when libgc thinks the object is no longer reachable) that warned when the object had a refcount that wasn’t 0. This showed me several things: libgc didn’t try to deallocate anything that shouldn’t have been deallocated, and (comparing against a regular build without pymalloc) the performance difference wasn’t really noticeable.
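To make the refcount check concrete, here is a minimal sketch of a finalizer of that kind. It is not the code from the branch: ToyObject, ToyFinalizerStats and toy_warn_if_referenced are hypothetical names, and the struct is a stand-in for a real object header. The callback has the same shape as libgc's GC_finalization_proc, so in a libgc build it could be registered with GC_register_finalizer; here it is plain C so it can be compiled and called without libgc.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for an object header; only the refcount matters here. */
typedef struct {
    long refcnt;
} ToyObject;

/* Hypothetical bookkeeping, passed to the finalizer as client data. */
typedef struct {
    unsigned long finalized;         /* objects the collector gave up on */
    unsigned long still_referenced;  /* ... whose refcount was not 0 */
} ToyFinalizerStats;

/*
 * Same signature as libgc's GC_finalization_proc: (obj, client_data).
 * With libgc this would be registered per object, roughly as:
 *     GC_register_finalizer(op, toy_warn_if_referenced, &stats, NULL, NULL);
 */
void toy_warn_if_referenced(void *obj, void *client_data)
{
    ToyObject *op = obj;
    ToyFinalizerStats *stats = client_data;

    assert(op != NULL && stats != NULL);
    stats->finalized++;
    if (op->refcnt != 0) {
        /* The collector considers the object unreachable, but refcounting
           still thinks something holds a reference to it. */
        stats->still_referenced++;
        fprintf(stderr, "unreachable object %p still has refcount %ld\n",
                obj, op->refcnt);
    }
}
```

Counting the mismatches instead of only printing them is a choice made for this sketch; it makes the result easy to inspect after a run.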
(Using libgc with --enable-replace-malloc means libgc tracks all allocations, not just ones made with _PyMem_RawMalloc. For libgc to work it needs to be able to see all memory that might potentially contain pointers to its allocations. Because, with refcounting, code can just store a PyObject * anywhere and INCREF it to make it not go away, using --enable-replace-malloc is the easy way to do that.)
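The "needs to see all memory" point can be illustrated with a toy conservative mark pass. This is a sketch under heavy simplifying assumptions, not how libgc is implemented: the heap is a fixed array, objects contain no pointers of their own, and only exact start addresses count as pointers. All names (ToySlot, toy_heap, toy_mark_region and so on) are made up for the example. The idea it shows is that an object only survives if some region handed to the collector contains a word that looks like its address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_HEAP_SLOTS 4

/* One collectable object in the toy heap. */
typedef struct {
    bool marked;
    int payload;
} ToySlot;

static ToySlot toy_heap[TOY_HEAP_SLOTS];

/* Conservative test: does this word look like the address of a slot? */
ToySlot *toy_find_slot(uintptr_t word)
{
    for (size_t i = 0; i < TOY_HEAP_SLOTS; i++) {
        if (word == (uintptr_t)&toy_heap[i]) {
            return &toy_heap[i];
        }
    }
    return NULL;
}

/* Start of a collection: nothing is known to be reachable yet. */
void toy_clear_marks(void)
{
    for (size_t i = 0; i < TOY_HEAP_SLOTS; i++) {
        toy_heap[i].marked = false;
    }
}

/* Scan one memory region the collector knows about, word by word. */
void toy_mark_region(const uintptr_t *words, size_t count)
{
    assert(words != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        ToySlot *slot = toy_find_slot(words[i]);
        if (slot != NULL) {
            slot->marked = true;
        }
    }
}

/* Slots left unmarked are the ones a sweep would reclaim. */
size_t toy_count_unmarked(void)
{
    size_t unmarked = 0;
    for (size_t i = 0; i < TOY_HEAP_SLOTS; i++) {
        if (!toy_heap[i].marked) {
            unmarked++;
        }
    }
    return unmarked;
}
```

If a pointer to a slot lives only in a region that is never passed to toy_mark_region (the analogue of memory from an allocator libgc does not track), the slot stays unmarked and would be reclaimed even though it is still in use.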
I then disabled Py_INCREF and Py_DECREF, disabled the gc module (because libgc already takes care of circular references), and fixed weakrefs to work with libgc. This is the current state of the thing. It works well enough to run setup.py and much of the test suite (and passes a whole bunch of tests), but it has a tendency to segfault randomly while running tests – probably because of not locking correctly in the right places. (I haven’t seen it segfault while running setup.py, so I’m pretty sure it’s to do with threads.) I have not measured the performance of this at all, yet.
Just to be clear: this doesn’t remove the GIL – it doesn’t even remove the ob_refcnt struct field – it just removes refcounting, which is the main (but not only) reason for the GIL. I don’t think this approach is source-level-compatible enough for the real world. It’s just an experiment, and I pushed the branch to GitHub primarily so Larry could look at it for his Gilectomy work.
For a more backward-compatible approach to introducing real GC, I have something in mind like this:
- Introduce a new, completely separate PyGCObject API.
- Leave PyObject to be refcounted as normal. Don’t change the struct at all.
- Provide a proxy PyObject wrapper for PyGCObjects, so that anything taking (or returning) a PyGCObject can be called from PyObject-oriented code.
- Provide a proxy PyGCObject wrapper for PyObjects, so that anything taking (or returning) a PyObject can be called from PyGCObject-oriented code.
- Migrate CPython itself from PyObject to PyGCObject.
- Only require the GIL for PyObject-touching code (and many fine-grained locks for PyGCObjects that need locking).
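One way the two proxy directions in the list above might look, as a rough sketch only. Nothing here is taken from the experimental branch: ToyRefObject stands in for PyObject, ToyGCObject for PyGCObject, and every function name is hypothetical. It also assumes that a GC-side proxy owns one reference to the refcounted object it wraps and gives it back when the collector finalizes the proxy.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for PyObject: lifetime is controlled by a refcount. */
typedef struct {
    long refcnt;
} ToyRefObject;

/* Stand-in for PyGCObject: no refcount, lifetime is up to the collector. */
typedef struct {
    int value;
} ToyGCObject;

/* Refcounted wrapper around a GC object, for PyObject-oriented callers.
   The collector must be able to see this memory to keep target alive. */
typedef struct {
    ToyRefObject base;
    ToyGCObject *target;
} ToyRefProxy;

/* GC-managed wrapper around a refcounted object, for GC-oriented callers. */
typedef struct {
    ToyGCObject base;
    ToyRefObject *target;
} ToyGCProxy;

void toy_ref_proxy_init(ToyRefProxy *proxy, ToyGCObject *target)
{
    assert(proxy != NULL && target != NULL);
    proxy->base.refcnt = 1;   /* the caller owns the new proxy */
    proxy->target = target;   /* no refcount to touch on the GC side */
}

ToyGCObject *toy_ref_proxy_unwrap(const ToyRefProxy *proxy)
{
    assert(proxy != NULL);
    return proxy->target;
}

void toy_gc_proxy_init(ToyGCProxy *proxy, ToyRefObject *target)
{
    assert(proxy != NULL && target != NULL);
    proxy->base.value = 0;
    target->refcnt++;         /* the proxy holds one reference (assumption) */
    proxy->target = target;
}

/* Would be called from the collector's finalizer for the proxy. */
void toy_gc_proxy_finalize(ToyGCProxy *proxy)
{
    assert(proxy != NULL && proxy->target != NULL);
    proxy->target->refcnt--;  /* release the reference taken at init */
    proxy->target = NULL;
}
```

The asymmetry is the point of the sketch: wrapping a GC object for refcounted code needs no bookkeeping on the GC side, while wrapping a refcounted object for GC code has to take a reference and release it from a finalizer.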
I’m not yet sure how feasible this approach is (quite apart from the 2x explosion of exposed APIs) – I have the beginnings of an experiment with this in a different branch, but I won’t know how well it works until I make at least dicts, strings, ints and ceval.c use PyGCObjects… I may get to that in the next couple of weeks.