C-API: confusion when to implement GC—docs vs stdlib

I’m currently implementing an extension library for CPython, and I’m a bit confused about when GC support is required. To my understanding, the docs seem at odds with what I see in the stdlib (datetime, for example).

The GC docs say (emphasis mine):

Python’s support for detecting and collecting garbage which involves circular references requires support from object types which are “containers” for other objects which may also be containers. Types which do not store references to other objects, or which only store references to atomic types (such as numbers or strings), do not need to provide any explicit support for garbage collection.
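
To see the distinction the docs draw from the Python side, gc.is_tracked reports whether the cycle collector is watching a given object. This is only a rough sketch: the Node class is a made-up example, and the results describe CPython behaviour rather than anything the C-API docs promise.

```python
import gc


class Node:
    """Hypothetical container type: instances can reference other objects."""


# Atomic objects cannot take part in a reference cycle, so CPython leaves them untracked.
atomic_tracked = [gc.is_tracked(42), gc.is_tracked("text"), gc.is_tracked(3.14)]

# Objects that can hold references to other containers are tracked.
container_tracked = [gc.is_tracked([]), gc.is_tracked(Node())]

print("atomic:", atomic_tracked)
print("containers:", container_tracked)
```

For a C extension type, the tracked case corresponds to setting Py_TPFLAGS_HAVE_GC and providing tp_traverse (and usually tp_clear).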

However, looking at a module like datetime, I see that PyDateTime_DateTimeType is a container (of tzinfo), but doesn’t have the GC flags set.

Here’s how a circular ref could come about:

from datetime import datetime, tzinfo

class MyTzinfo(tzinfo): pass

tz = MyTzinfo()
dt = datetime(2000, 1, 1, tzinfo=tz)
tz.foo = dt  # circular ref introduced here

One explanation that I can think of is that because MyTzinfo is GC-tracked, this compensates for its omission in datetime.
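
A quick way to check the premise of that explanation is to ask the gc module which of the two objects it tracks. This is just a sketch: it shows the tracking status only, it does not by itself settle whether the cycle gets collected, and the result for dt may well differ between CPython versions.

```python
import gc
from datetime import datetime, tzinfo


class MyTzinfo(tzinfo):
    pass


tz = MyTzinfo()
dt = datetime(2000, 1, 1, tzinfo=tz)

# Instances of classes defined in Python are tracked by the collector.
tz_tracked = gc.is_tracked(tz)
# Whether datetime instances are tracked depends on how the C type is defined.
dt_tracked = gc.is_tracked(dt)

print("tz tracked:", tz_tracked)
print("dt tracked:", dt_tracked)
```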

I’m not familiar enough with the GC implementation to get to a satisfying answer, so I was hoping to find the missing pieces of the puzzle here.

AFAICT, this is a bug in datetime: judging by gc.get_objects, the cycle does indeed leak memory (i.e. the objects are never cleaned up).

from datetime import datetime, tzinfo
import gc, sys

class MyTzinfo(tzinfo): pass

tz = MyTzinfo()
dt = datetime(2000, 1, 1, tzinfo=tz)
tz.foo = dt  # circular ref introduced here
assert sys.getrefcount(tz) == 3  # 1 from dt, 1 variable, 1 extra from passing it in
assert sys.getrefcount(dt) == 3  # 1 from tz, 1 variable, 1 extra from passing it in
assert len([obj for obj in gc.get_objects() if isinstance(obj, MyTzinfo)]) == 1
del tz
assert len([obj for obj in gc.get_objects() if isinstance(obj, MyTzinfo)]) == 1
del dt
gc.collect()
# At this point, dt and tz should be gone
assert len([obj for obj in gc.get_objects() if isinstance(obj, MyTzinfo)]) == 0

If datetime is replaced with a dummy class, the above program works:

class datetime:
    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs
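
For completeness, here is a self-contained version of the dummy-class variant. It is a sketch under a few assumptions: the names FakeDatetime and cycle_is_collected are made up for illustration, and it uses a weakref as the probe instead of scanning gc.get_objects.

```python
import gc
import weakref
from datetime import tzinfo


class FakeDatetime:
    """Hypothetical pure-Python stand-in for datetime; it only stores its arguments."""

    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs


class MyTzinfo(tzinfo):
    pass


def cycle_is_collected():
    tz = MyTzinfo()
    dt = FakeDatetime(2000, 1, 1, tzinfo=tz)
    tz.foo = dt  # circular ref: tz -> dt -> kwargs -> tz
    probe = weakref.ref(tz)  # does not keep tz alive
    del tz, dt
    gc.collect()
    # The weak reference is cleared once the cycle has been collected.
    return probe() is None


print(cycle_is_collected())
```

Since every object in this cycle is a GC-tracked Python object, the collector can find and free it, which matches the observation that the program works with the dummy class.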

Thanks for the quick reply. A bit silly of me to forget that the gc module has these helpers for analyzing the behavior of the example. Your code proves there’s at least something that merits further digging in the datetime module. I’ll move the discussion to GitHub and mark this as “answered”.