Finding a path forward for `functools.cached_property`

Hello! Herein I briefly tell the sad tale of functools.cached_property, and request your input on next steps to give it a happier future.

Once upon a time, someone invented the clever idea of using a “non-data descriptor” (that is, one defining only __get__, which has lower precedence than an entry in an instance dictionary) to create “cached properties” which compute a value only once per instance and cache it in the instance dict itself. This has some neat properties (it’s very simple, the storage and lifetime of the cache is neatly and inherently tied to the lifetime of the instance, subsequent accesses of the cached value have exactly the same cost as any other normal attribute access) and some downsides (mostly that it only works on objects with a mutable __dict__.)

This technique became very popular, at least in some corners of the Python community (notably web frameworks), and eventually pretty much every popular web framework (and many other projects) had cargo-culted their own basically-identical implementation of it.

Back in the Python 3.8 timeframe, I had the bright idea to add this to functools so everyone wouldn’t have to maintain their own version of it. And I did just that.

Sadly, in implementing the PR, I strayed from the path of righteousness (that is, just port the same simple battle-tested implementation everyone was already using) and succumbed to feature creep. A commenter suggested that cached_property should be thread-safe via locking, and pointed to an existing implementation that included locking. So I added locking in the same manner, and it was reviewed and merged.

Some time later, people actually using it with threads observed that the locking has a critical performance problem: it locks the entire cached property across all instances, not just a single instance. As a result, across the web one can now find warnings to avoid functools.cached_property, and most of the projects that previously maintained their own version have just continued to do so.

How can we rescue cached_property? Some options that have been proposed, with pros and cons (thanks @Jelle for originally framing these 3 options):

------ OPTION 1: give a couple versions’ warning in deprecation docs and then just remove the locking

On the plus side, this allows us to arrive, with minimal churn and within a few versions, at the simplest behavior that empirically worked just fine for most users of cached_property before it was added to stdlib. Normal Python data structures and instance methods don’t inherently protect you from multi-threading data races, you have to provide your own synchronization that meets your specific needs; it’s not clear why cached_property should be any different.

On the minus side, if anyone out there is using stdlib cached_property, and depending on the current locking to guarantee synchronization, and they don’t notice any of the deprecation warnings in the docs, we may be gifting them with an unpleasant debugging puzzle. At this point this is hypothetical (despite some level of internet noise about locking issues with cached_property, nobody has yet turned up to say “I’m relying on locking”), but it does seem possible.

------ OPTION 2: add a lock=True kwarg or add a new cached_property_no_lock

We could add a keyword argument to allow turning off the locking, or just add a separate cached_property_no_lock. This could be done with or without deprecating the current locking behavior.

On the positive side, this avoids any breakage for current users; any behavior change is opt-in.

On the negative side, if we don’t deprecate this means that forever the probably-better-for-most-users version of cached_property has to live under the ugly name cached_property(lock=False) or cached_property_no_lock, and beginners are likely to stumble onto the version that is probably worse for them. (And unless we also do option 3, we forever keep under the most accessible name a basically-broken cached_property that doesn’t seem to meet anyone’s needs.) And if we do deprecate, we create a lot of thrash and churn for anyone who has already adopted cached_property: first we ask them to add lock=False, and then probably a few years down the road we switch the default, and then later we maybe ask them to stop using the lock keyword altogether.

We could also attempt to make the deprecation less noisy by only issuing the warning for users who appear to be relying on the locking (e.g. by issuing the warning only when there is lock contention.) But this is a pretty unreliable and non-deterministic heuristic to base a deprecation warning on.

------ OPTION 3: make the locking per-instance

We could try to update the locking code so that it locks per-instance instead of per-cached-property.

On the positive side, this preserves the guarantee that for a given instance a cached property will execute only once, even in the presence of threads. Someone out there may currently be using cached_property and depending on that.

On the negative side, this will likely be complex to implement and hard to get right (nobody has yet offered an implementation), it will probably continue to cause some issues with pickling, and the value of it is hypothetical, not user-requested. (In the typical use cases for cached_property where it’s just computing some cached value specific to the instance based on other immutable fields of the instance, it doesn’t really matter if on rare occasions it could be computed twice for the same instance.)

Also, this option doesn’t fully preserve compatibility either: it’s also theoretically possible that someone out there is currently implicitly protecting global state with their cached_property lock, and even just changing the lock to per-instance could cause subtle bugs for them.

If we did create a robust implementation of per-instance method locking, it would seem more flexible to put this implementation in its own lock decorator which can be optionally composed with cached_property for those who need it.

------ RFC

What do you think of these options? Which seems the least bad? Are there additional options, or additional pros/cons of the above options, that we should consider?

Thanks for your consideration!

10 Likes

Option 1 or option 2.

2 Likes

Thanks! Do you have a preference between these? And if it’s option 2, do you prefer it with deprecation and an eventually changed default, or with keeping the current default indefinitely (and maybe trying to fix it a la option 3)?

I’d prefer option 1 – every time we keep a good name for suboptimal behavior and we pick a worse name for the right way, I cringe a little. The less we can do of this kind of thing the better.

And 3.7 or 3.8 wasn’t that long ago – if this had been with us since 3.1 or 2.x we might not have a choice.

13 Likes

There is an option 4.

Someone with the time and skill can just fix the implementation. I took at a stab at it and one point but didn’t have the time, energy, or inclination to finish it. Perhaps this would be a good assignment for our developer in residence.

From a user point of view, the best possible outcome is that cached_property() does what is advertised, complete with thread safety. Almost certainly, some users are relying on the feature and it does work. The issue is that the performance can be catastrophic under some loads, but I think that performance issue is 100% fixable.

1 Like

That’s not option 4, it’s the already-listed option 3. Do you have any thoughts about the pros and cons mentioned?

I’m not convinced that “complete with thread safety” is, all things considered, the best eventual outcome. The available evidence suggests that most users don’t need thread safety, it has costs for all users, and if its implementable inside cached_property it would be just as implementable orthogonally so that the users who need it can compose it.

3 Likes

If the implementation had not been published yet, that would make sense. But once a standard library feature has been released, a deprecation and removal are unnecessarily disruptive. It is better for all to just fix the problem.

If I understand the proposal correctly, here is what disruptive means:

  • All current users, 100% of them, will encounter the deprecation warning and have to silence it.
  • None of the users, 0% of them, will get any benefit from this until after two release cycles.
  • In the meantime, the problem will remain unfixed and the voices on the internet will continue to discourage the use of @cached_property.
  • After the deprecation cycles, the small subset of users who needed the feature will be left in the cold with no reasonable alternative.

Other than the effort required, I really don’t understand the objection to making a bug fix.

Normally that would give us some comfort but @cached_property is a very popular feature.

I prefer Option 1. Very little of the Python standard library is thread-safe, so much so that I always find it weird when I stumble over these little corners with built-in locking. Why make single-threaded Python users pay for a feature they don’t need?

If we decide Option 1 isn’t viable, I prefer Option 3 to Option 2. But this is a distant second.

6 Likes

Option 1 is the most sensible as long as the number of impacted users are small or are already pinning their Python version. I think that is likely the case.

Option 2 adds too much bloat to stdlib.

Option 3 without a working implementation plan too speculative.

I vote option 1. If enough people do complain, then consider re-adding locking in a patch branch. But they probably wont and stdlib will be better for it.

1 Like

Option 1+

Step 1)

Implement option 1 as standard non-thread-safe but “simplest” for most users.

Rename current version to:
functools.cached_property_with_lock

Any current users needing the lock can very quickly change their code.

Step 2)
Fix the with_lock version (perhaps with a kwarg to turn it on if current behavior needs to live forever, with the default kwarg behavior changing after a cycle.)

3 Likes

Option 1 is the most sensible.

There are other issues with cached_property.

  1. It is not compatible with __slots__. You need a writable __dict__ to store the cached value, and its eliminates the purpose of __slots__ – saving memory. __dict__ adds 224 bytes of memory.
  2. It affects pickles. The value of the cached property can be evaluated from other attributes, so there is no need to save it in a pickle. And if depends also on some process globals, the saved value can be just wrong.

These issues require solutions which change the code outside the functools module – in the type constructor and in the pickle protocol.

2 Likes

I’d go for option 1, just have this nice feature ready. The thread safe existent implementation should not be used anyway, not only because of the perf issue, also because it is conceptually wrong, cached property is supposed to affect the instance’s property and the lock is affecting all the instances.

4 Likes

Option 1 - make it the same as the oft copy-pasted versions out there. Django uses its own cached_property heavily and I can’t recall any locking issues.

3 Likes

Perhaps a bit sneaky, but since @functools.cached_property() does not advertise being thread-safe in the documentation (unlike for example @functools.cache()), there’s another possible variant of options 2 and 3:

  • Add a lock=False kwarg, managing the locking mechanism. Have it default to False, since locking was never advertised.
  • Document the existing global locking mechanism and its downsides.
  • Work on a locking implementation which uses more fine grained per-instance locking and then advertise using this when done.
  • Finally, use a deprecation phase of two releases to change the lock=False default to lock=True.

What’s potentially sneaky about this ? Well, the locking would be disabled without deprecation period. But given that it was never advertised and apparently not even expected by users, this should well be within range. Users who want the global locking mechanism can still use lock=True to enable this (now that it’s a documented feature together with its known downsides).

9 Likes

Option 1. I would have never assumed that it is thread-safe in the first place. So would have probably implemented a separate lock for it if ever needed. Especially since the locking feature is undocumented.

2 Likes

Is it possible to put lock in instance dict? Maybe use event instead of lock.

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        if self.attrname is None:
            raise TypeError(
                "Cannot use cached_property instance without calling __set_name__ on it."
            )
        try:
            cache = instance.__dict__
        except AttributeError:  # not all objects have __dict__ (e.g. class defines slots)
            msg = (
                f"No '__dict__' attribute on {type(instance).__name__!r} "
                f"instance to cache {self.attrname!r} property."
            )
            raise TypeError(msg) from None
        val = cache.get(self.attrname, _NOT_FOUND)
        if val is _NOT_FOUND:
            sentinel = threading.Event()
            event = cache.setdefault(f"__{self.attrname}_event", sentinel)
            if event is sentinel:
                val = self.func(instance)
                try:
                    cache[self.attrname] = val
                    event.set()
                except TypeError:
                    msg = (
                        f"The '__dict__' attribute on {type(instance).__name__!r} instance "
                        f"does not support item assignment for caching {self.attrname!r} property."
                    )
                    raise TypeError(msg) from None
                finally:
                    cache.pop(f"__{self.attrname}_event")
            else:
                event.wait()
                val = cache[self.attrname]
        return val

I also create a gist: cached_property · GitHub

Given the options, I’d go for option 1. I think this will give users and package authors enough time to tweak their code.

I always found it rather curious that several packages used their own implementation, despite stdlib providing a drop-in replacement a few years ago, until I looked at the implementation, compared it to several others (Django, Werkzeug, Pyramid, Funcy, Boltons etc) and saw the locking mechanism.

One popular package I remember that makes use of functools.cached_property is Celery’s messaging lib, Kombu. The decorator is used by their transporters.
They use the threaded variant of the cached-property package if Python < 3.8, and even extend whatever decorator is available to implement the rest of the descriptor protocol: kombu/utils/objects.py.

Flask also has a virtually identical decorator based on the Werkzeug’s solution. Even though they don’t use the functools solution, it might be interesting to see if the The Pallets Projects and Flask maintainers have noticed or been warned about any issues, as Flask is one of the most popular web frameworks in the ecosystem.

1 Like

If I were an end-user of some library, I would try to “reason” about whether it is safe to share object instances between threads using the following logic:

  1. Is the object immutable? If so, I can share it.

  2. If not, can I put locks around mutating operations? If so, I can share it.

The problem is that a cached property is a hidden mutation that might be non-obvious in the interface. I just ask for invoice.total and little do I know that a mutation is happening behind the scene.

Maybe it still doesn’t matter, as long a the mutation is idempotent. If two threads compute it, perhaps that’s not a problem as long as they both end up writing the same result. Is there a case though where the internal implementation of the cached_property might trip itself up in a multi-threaded environment despite being idempotent?

Presumably so, or nobody would have implemented it with locks in the first place…

My main point is that the creator of the object might not know it is going to be used in a multi-threaded context and consumer of the library might have a reasonable expectation of immutable “behaviour” because fields are only being read, never written.

1 Like

This is indeed a major downside of option 2 (with deprecation.) Fortunately, option 1 seems overwhelmingly favored here, and doesn’t have this problem.

I agree that this is too bad, even with option 1. With option 1 we do have the choice to do things a little faster if we want to; we could even have the fix in 3.12. It’s just a tradeoff between how soon the fix is available vs how much time we give for anyone who might be currently depending on the locking to notice it’s going away.

I don’t see how anyone is left out in the cold. For anyone who was satisfied with the current stdlib behavior, it’s easily copyable. And if anyone wants per-instance locking instead, this is also quite easy to do if you control the specific cached_property – just keep a lock on each instance and acquire it within the cached_property getter function. The difficulty here is not in the locking logic, it’s in doing it generically on an instance method decorator in a non-invasive way. If this per-instance locking turns out to be a widely-shared need, it can make its case for generic stdlib inclusion like any other feature does.

Specific objections have been mentioned. I’ll reiterate them. Any per-instance locking feature inherently will require a lock per instance. Lock objects are not zero-size and zero-cost, and they can’t be pickled. This is completely dead-weight overhead for the vast majority of users who don’t need locking.

Given the implementation complexity and the cost tradeoffs, I don’t think it’s really accurate to describe changing per-class locking to per-instance locking as simply a “bugfix.” It’s more like replacing one (undesirable) feature with a more complex and costly feature, which is better for a few users than the existing feature, but strictly worse for the majority of users who don’t need locking (higher overhead than the per-class locking, with still zero benefit.)

3 Likes