I recently discovered that memoizing a class’s methods using functools.cache in the obvious way has the unwanted side effect that instances of that class will never be garbage-collected.
Here is a small example:
import functools
import objgraph  # pip install objgraph
import gc

class Foo:
    # This cache is held somewhere in global memory, and prevents Foo objects from being reaped
    @functools.cache
    def func(self, key):
        return key + key

class Bar:
    def __init__(self):
        # This cache dies when Bar dies, so it doesn't prevent reaping
        self._cache = {}

    def func(self, key):
        if key not in self._cache:
            self._cache[key] = key + key
        return self._cache[key]

for i in range(10):
    Foo().func(i)
    Bar().func(i)

gc.collect()
print(f"Foo: {objgraph.by_type('Foo')}")  # 10 objects
print(f"Bar: {objgraph.by_type('Bar')}")  # 0 objects
I have tested this in Python 3.9 and 3.10 with identical results.
We figured this out at my $work when a “memory leak” was observed (behavior: memory grows without bound as Foo objects are created and destroyed; whether you want to call this a memory leak is debatable, but the end result is the same), and we eventually narrowed it down to the behavior of functools.cache in this case. For us, the result was what you would expect from a typical memory leak: eventually we ran out of memory and crashed. Other unintended effects could happen too, such as finalization not running at the expected time.
Now, to be fair, functools.cache does not advertise that it’s okay to decorate methods with it - it only uses the word “function” in its documentation. So maybe this use case was never expected to work.
But since methods are “just” a specific type of function (one that knows its associated self, IIUC), one could expect that it’s okay to use functools.cache on them, too.
So I would argue that the principle of least surprise, when a method is decorated with functools.cache, is that the cache’s lifespan should be the same as the method’s lifespan, which is the same as the object’s lifespan, and that adding a cache should not affect the object’s lifespan.
This behavior has been observed before, for example in this SO thread: “Python functools lru_cache with instance methods: release object” (Stack Overflow).
I propose that functools.cache be amended so that when it decorates an instance method, it doesn’t have this behavior of altering the lifespan of that instance. How do people feel about that?
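To illustrate the kind of behavior I mean, here is a rough sketch of a decorator that keys a per-instance cache on a weak reference to self. The name per_instance_cache is made up for this post and is not part of functools; it is only meant to show the desired lifespan semantics, not a proposed implementation. It assumes instances are hashable and weak-referenceable, and it only handles hashable positional arguments:

```python
import functools
import gc
import weakref

def per_instance_cache(method):
    """Hypothetical decorator: memoize per instance without keeping instances alive."""
    # Maps instance -> {args: result}; an entry vanishes when its instance dies.
    caches = weakref.WeakKeyDictionary()

    @functools.wraps(method)
    def wrapper(self, *args):
        cache = caches.setdefault(self, {})
        if args not in cache:
            cache[args] = method(self, *args)
        return cache[args]

    return wrapper

class Foo:
    calls = 0  # counts how often the undecorated body actually runs

    @per_instance_cache
    def func(self, key):
        Foo.calls += 1
        return key + key

foo = Foo()
first = foo.func(3)
second = foo.func(3)  # served from the per-instance cache
ref = weakref.ref(foo)
del foo
gc.collect()
print(first, second, Foo.calls, ref() is None)  # 6 6 1 True
```

One caveat with this sketch: the cached values are held strongly, so a result that itself refers back to self would still keep the instance alive.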
I have not attempted to create a solution yet. I’ve only looked briefly at the functools.cache code to see where the cache lives, which I kind of expected to be in some kind of global, but that does not seem to be the case in a straightforward way.
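From a bit of poking, my current understanding is that the cache belongs to the wrapper object that the decorator returns. That wrapper is stored as a class attribute, so it stays reachable for as long as the class does, which for a module-level class is effectively the life of the program. The sketch below is my own experiment rather than anything taken from the functools source. It tracks instances with weak references and shows that clearing the cache through the class attribute releases them:

```python
import functools
import gc
import weakref

class Foo:
    @functools.cache
    def func(self, key):
        return key + key

refs = []
for i in range(10):
    foo = Foo()
    foo.func(i)
    refs.append(weakref.ref(foo))
del foo  # drop the last ordinary reference to any Foo instance
gc.collect()

# The instances are still alive, pinned by the cache keys
alive_before = sum(r() is not None for r in refs)
entries = Foo.func.cache_info().currsize

# Emptying the cache drops the only remaining references
Foo.func.cache_clear()
gc.collect()
alive_after = sum(r() is not None for r in refs)
print(alive_before, entries, alive_after)  # 10 10 0
```

So calling cache_clear() works as a manual workaround, although it throws away the cached results for every instance at once.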