Static-only `frozen` data classes (or other ways to avoid runtime overhead)

Making a Python data class frozen incurs a runtime performance penalty. There is of course an explanation for it, but the overall behaviour is a little counterintuitive – as a frozen data class can do less than its mutable counterpart.
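
For anyone who wants to see the penalty for themselves, here is a rough sketch of a micro-benchmark. The explanation, as far as I understand it from the `dataclasses` documentation, is that the generated `__init__` of a frozen data class cannot use plain attribute assignment and goes through `object.__setattr__` instead. The class names `MutablePoint` and `FrozenPoint` and the helper `time_construction` are made up for illustration, and the numbers will vary by machine and Python version.

```python
import timeit
from dataclasses import FrozenInstanceError, dataclass


@dataclass
class MutablePoint:
    x: int
    y: int


@dataclass(frozen=True)
class FrozenPoint:
    x: int
    y: int


def time_construction(cls, number=50_000):
    """Return the seconds taken to build `number` instances of `cls`."""
    return timeit.timeit(lambda: cls(1, 2), number=number)


# Construction is where the frozen variant pays: its __init__ sets each
# field via object.__setattr__ rather than a simple `self.x = x`.
mutable_seconds = time_construction(MutablePoint)
frozen_seconds = time_construction(FrozenPoint)

if __name__ == "__main__":
    print(f"mutable: {mutable_seconds:.4f}s")
    print(f"frozen:  {frozen_seconds:.4f}s")
```

This only measures instance construction; attribute reads should not be affected by `frozen=True`.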

Immutability is a very useful property in practical software applications, especially now that type checkers like Pyright have started to rely on it to implement more accurate checks. [1]

I’d love to hear the community’s feedback on two ideas:

  1. Add a static-only configuration option to data classes to declare them as immutable. This would be understood by the type checker, but wouldn’t actually prevent any mutations at runtime. It would be similar to [this hack] for making attrs-backed objects “immutable” without a runtime penalty. It also reminds me of, e.g., PEP 705 for TypedDict.

  2. Make it so that declaring a data class as frozen somehow does not incur a runtime penalty.

I understand that both options have significant drawbacks. Whatever the implementation of 1 is, it may certainly lead to confusion with the already existing frozen setting. Similarly, I imagine 2 would be a very significant runtime effort.

However, I wouldn’t underestimate the positive (and ultimately simplifying) effects of bringing immutability to more of the language. Curious what the folks here think of this!

[1] For an example, see changes made in Pyright’s 1.1.328 release. It’s no longer possible to narrow the type of a data class property when inheriting from the class, unless that data class is frozen. Otherwise, a reportIncompatibleVariableOverride type error is raised.
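
To make the footnote concrete, here is a small sketch of the pattern it describes. The class names are hypothetical, and the comments about what Pyright reports are based on the release notes mentioned above rather than on anything verified here; whether the diagnostic shows up also depends on how the checker is configured. At runtime both variants behave the same.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Base:
    value: Optional[int]


@dataclass
class Child(Base):
    # Narrowing a mutable field: per the footnote, Pyright is expected to
    # report reportIncompatibleVariableOverride on this line.
    value: int


@dataclass(frozen=True)
class FrozenBase:
    value: Optional[int]


@dataclass(frozen=True)
class FrozenChild(FrozenBase):
    # Narrowing a read-only field is sound, so this should be accepted.
    value: int


child = Child(1)
frozen_child = FrozenChild(1)
```

The narrowing is unsafe in the mutable case because code holding a `Base` reference could assign `None` to `value` on what is really a `Child`.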


For a practical example, see how making Strawberry’s data classes frozen by default introduces a 30% performance hit to the framework, which makes the change impractical despite its positive effects on the architecture of applications using Strawberry: “Make all transformed data classes frozen by default” by kkom (Pull Request #3397, strawberry-graphql/strawberry).

A third idea comes to my mind. Could Python have a more generic way to mark things as ReadOnly / Immutable?

Perhaps similar to Final (PEP 591)? If this were broader than data classes, the risk of confusion with the existing frozen configuration would be lower (I imagine the two qualifiers could coexist easily) and there would be no need to worry about the runtime implementation.

Edit: I see this was discussed in: PEP 705 – TypedDict: Read-only items | peps.python.org as a possibility – so doesn’t seem off the table at first sight.

This works with pyright, though mypy rejects it:

from typing import TYPE_CHECKING
from dataclasses import dataclass

@dataclass(frozen=TYPE_CHECKING)
class A:
    b: int

a = A(1)
a.b = 3  # error reported by pyright; no error at runtime, where TYPE_CHECKING is False
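
As a quick sanity check that this trick really is static-only, the sketch below confirms that the class is not frozen at runtime. The variable name `runtime_is_frozen` is made up for illustration, and the check assumes that the `dataclass` decorator only installs its own `__setattr__` when `frozen` is true, which matches current CPython behaviour as far as I know.

```python
from dataclasses import dataclass
from typing import TYPE_CHECKING


# TYPE_CHECKING is False at runtime, so this is a plain mutable data class
# for the interpreter and a frozen one for the type checker.
@dataclass(frozen=TYPE_CHECKING)
class A:
    b: int


a = A(1)
a.b = 3  # flagged statically, but succeeds at runtime

# A frozen data class gets its own __setattr__ that raises on assignment;
# a mutable one keeps the default inherited from object.
runtime_is_frozen = A.__setattr__ is not object.__setattr__

if __name__ == "__main__":
    print(a.b, runtime_is_frozen)
```

So the mutation is caught by the checker only; nothing stops it once the program is running.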

However, expanding ReadOnly[] to work outside TypedDict could be interesting. Someone needs to write a PEP with a concrete proposal.

It would also be good to reduce the runtime overhead from frozen dataclasses. Perhaps there’s a change we can make to CPython to make them cheaper.


Here’s a version that should work for all type checkers.

from typing import dataclass_transform, TYPE_CHECKING

if TYPE_CHECKING:
    @dataclass_transform(frozen_default=True)
    def static_frozen_dataclass(__cls): ...
else:
    from dataclasses import dataclass as static_frozen_dataclass

@static_frozen_dataclass
class X:
    x: int

x = X(0)
x.x = 5  # rejected by type checkers; allowed at runtime

Maybe this will be interesting: I recently blogged about how to get statically frozen attrs classes (“attrs iv: Zero-overhead Frozen attrs Classes”). Redowan followed it up with “Statically enforcing frozen data classes in Python” (Redowan's Reflections), though other folks have already mentioned this approach in the thread.