Immutable classes

Hi.

Suggestion

There are many situations where it would be helpful to annotate a class as representing an immutable object, meaning none of its attributes may be modified after the __init__() method completes.

Examples:

  1. many builtins are immutable, like str or tuple
  2. frozen dataclasses
  3. going from the per-attribute Final form (see “Type qualifiers” in the typing documentation)

from typing import Final

class ImmutablePoint:
    x: Final[int]
    y: Final[int]  # Error: final attribute without an initializer

to

@Immutable
class ImmutablePoint:
    x: int
    y: int

There is already @final for classes, but that means the class cannot be subclassed.
I used @Immutable in the example, but other names could be used, like @Frozen.
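A minimal runtime sketch of what such a marker could look like. This is hypothetical: Immutable does not exist in typing; like typing.final, it would be a no-op at runtime whose meaning is interpreted only by type checkers.

```python
from typing import TypeVar

T = TypeVar("T")

def Immutable(cls: type[T]) -> type[T]:
    # Hypothetical marker decorator: a runtime no-op, analogous to
    # typing.final; its meaning would be defined purely for type checkers.
    cls.__immutable__ = True  # optional introspection hook
    return cls

@Immutable
class ImmutablePoint:
    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y

p = ImmutablePoint(1, 2)
assert p.__immutable__ is True  # nothing is enforced at runtime
```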

Use case

The use case is to clearly signal to type checkers that the attributes of an object may not be assigned to, without having to annotate every individual member, e.g.

p = ImmutablePoint(1, 2)
p.x = 1 # Error

In contexts where mutable objects are not recommended, tagging as immutable silences the type checker, e.g.

def my_fn(p: ImmutablePoint = ImmutablePoint(0, 0)):
    # ^ currently, checkers warn about using mutable objects as function defaults,
    # but if it is a frozen/immutable object, that is fine and there should be no warning.
    ...

Semantics

Of course, tagging a class as Immutable does not imply anything regarding the immutability of its members.

The semantics of @Immutable require that neither attributes (setattr/delattr) nor keys (setitem/delitem) may be modified (added, assigned, or deleted), and that any method that would otherwise change the state of the object must return a copy instead, like newobj = dataclasses.replace(obj, **kwds).
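For comparison, frozen dataclasses already follow this copy-instead-of-mutate convention via dataclasses.replace:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Point:
    x: int
    y: int

p = Point(1, 2)
q = replace(p, x=10)  # returns a new object rather than mutating p
assert (p.x, p.y) == (1, 2)
assert (q.x, q.y) == (10, 2)
```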

But obviously, these are static type hints and no runtime checks would be performed. It would be up to developers to decide how much runtime checking they need to implement in the class.

Inheritance

Immutability would not be inherited, e.g.

class MutablePoint(ImmutablePoint):
   ...

here MutablePoint is not decorated, so it can be mutated. Immutability applies only to the class being directly decorated, not to its ancestors or its descendants. This is also why I preferred the @decorator syntax over a mix-in class.
And this is perhaps the main difference from annotating a member as Final[type], which sticks to the attribute even through inheritance. If a whole hierarchy of classes should be immutable, every class in the chain needs to be decorated.


What type checker(s) are you referring to?

For the use case you’ve put forward, immutability can be achieved (I think) with named tuple:

from typing import NamedTuple


class ImmutablePoint(NamedTuple):
    x: int
    y: int

While the example you gave might be a simplified one, I wanted to ask whether this would be enough for you or are you missing something?


ReadOnly is currently restricted to be used only in TypedDict, but there has been some discussion for extending it to areas like this.

class ImmutablePoint:
    x: ReadOnly[int]
    y: ReadOnly[int]

This would mean (and I think it should mean) that x and y are only allowed to be defined in __init__ or __new__ (or to set the default instance value in the class scope).

We just need someone to do the work of specifying it.

In order to be type-safe, it can only be immutable/readonly if the ancestors are immutable/readonly.

example:

import random
from typing import ReadOnly  # Python 3.13+ (typing_extensions on older versions)


class A:
    x: int = 3


class B(A):
    x: ReadOnly[bool] = random.choice((False, True))


def foo(a: A) -> None:
    a.x = 4


b = B()
foo(b)
if b.x:
    assert b.x is True
    print("It's True")
else:
    reveal_type(b.x)  # Literal[False]
    assert not b.x, "Crash!"

I don’t really see any future for this which is beneficial that doesn’t involve also adding runtime semantics. The value of immutability comes from what you can do when it’s actually guaranteed, and python’s type system doesn’t promise the soundness required to leverage this in this way.

I’d suggest looking into immutable structures available both in the standard library and in 3rd party for any uses you have for immutability, rather than relying on the type system to enforce this.

Some examples in the standard library: tuple, frozenset, namedtuple, frozen dataclasses.
Some examples of 3rd party libraries providing stronger immutability: immutables, msgspec

typing can accurately describe these, but typing can’t enforce immutability on arbitrary fields of python objects.
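Concretely, the runtime enforcement these stdlib structures already provide looks like this:

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class FrozenPoint:
    x: int

rejected = []

t = (1, 2)
try:
    t[0] = 9  # tuples reject item assignment at runtime
except TypeError:
    rejected.append("tuple")

fp = FrozenPoint(1)
try:
    fp.x = 2  # frozen dataclasses reject attribute assignment at runtime
except FrozenInstanceError:
    rejected.append("frozen dataclass")

assert rejected == ["tuple", "frozen dataclass"]
```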

Another benefit I did not mention is that immutable objects are safe to share between threads and don’t need internal synchronization.
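A small sketch of that benefit (the Config class here is illustrative): a frozen dataclass can be read from several threads without locks, because no thread can change its state after construction.

```python
from dataclasses import dataclass
from threading import Thread

@dataclass(frozen=True)
class Config:
    retries: int
    timeout: float

cfg = Config(retries=3, timeout=1.5)
results = []

def worker() -> None:
    # read-only access to the shared immutable object; no lock needed
    results.append(cfg.retries * cfg.timeout)

threads = [Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert results == [4.5] * 4
```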

Replies

What type checker(s) are you referring to?

There is a flake8 check that reports this (mutable argument defaults)

While the example you gave might be a simplified one

It is just a simplified example yes. Not everything fits into a NamedTuple.

I wanted to ask whether this would be enough for you

I’m not trying to solve some problem I have. I’m suggesting a feature for the typing module.

In order to be type-safe, it can only be immutable/readonly if the ancestors are immutable/readonly.

Right, because of up-casting.

I don’t really see any future for this which is beneficial that doesn’t involve also adding runtime semantics.

This suggestion is for typing, not collections.abc. For collections.abc, yes, there would need to be some runtime implementation. For the typing module, no, since rules from typing are enforced by type checkers and not by the Python VM.

The value of immutability comes from what you can do when it’s actually guaranteed, and python’s type system doesn’t promise the soundness required to leverage this in this way.

Well, I’m not suggesting adding any feature to the language to make objects immutable. This is a suggestion for the typing module, and therefore the developer has the responsibility of implementing the class properly so that it does not mutate its state. Also, the type checker (e.g. mypy) would warn if the implementation of the class had methods that mutate its state.

typing can’t enforce immutability on arbitrary fields of python objects.

And again that is not the suggestion here.

I agree that an immutability marker will be useful. If we are going to formally bring the concept into part of the specification, I think we also need some consideration on the envisioned programming model.

Indeed, immutability would be most helpful in conjunction with free-threading and functional programming. Right now immutability in Python has the status of a gentleman’s agreement between consenting adults: it is usually stated in the documentation of libraries, warning users not to do nasty stuff to the objects, but it’s essentially undefined behavior if one tries to call {set,del}{attr,item} on them (does the {set,del}{attr,item} attribute exist? Is it None? Does calling it raise NotImplementedError, AttributeError, or TypeError? Or does it just silently do nothing? Or is it actually mutable?)

dataclass

Frozen dataclasses raise a subclass of AttributeError, and I think this is the best approach in terms of runtime behavior, but it can be very hard to introspect, especially in the case of non-dataclasses.
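That exception hierarchy can be checked directly:

```python
from dataclasses import dataclass, FrozenInstanceError

# FrozenInstanceError is a subclass of AttributeError, so generic
# AttributeError handling keeps working for frozen dataclasses.
assert issubclass(FrozenInstanceError, AttributeError)

@dataclass(frozen=True)
class Cell:
    value: int

c = Cell(1)
try:
    c.value = 2
except AttributeError as exc:  # also catches FrozenInstanceError
    assert type(exc) is FrozenInstanceError
```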

Some of these conditions can probably be detected automatically, some cannot, and a marker for immutability will be useful if not just to aid coding and introspection.

The immutability marker as it is currently proposed can serve as a first step to gauge usage, and in my opinion this (along with Final) is one of the few proposed features worth exploring core syntax support for in the future. Runtime guarantees on immutability would likely open up optimization benefits for free-threading / the JIT.

Could you provide an example which doesn’t fit into a NamedTuple, and where your proposal would? I’m failing to see what gap this feature would fill. Thanks

And that puts too much burden on developers to get things right, hence this feature to help with code checking tools.

Mappings are not tuples, sets are not tuples, dataclasses are not tuples, immutable sequences might not have named properties, and generic classes with public and private properties are not tuples.
Also, this is not about language features (e.g. collections.abc). This is about the typing module and type checkers. A proposed feature for collections.abc could follow once one has a prototype or beta implementation on typing.

I don’t think this belongs in typing. The type system should describe python objects accurately. Unless runtime has immutability, the type system shouldn’t care.

The type system will never describe python objects accurately. As far as I can tell having a fully accurate representation of the runtime semantics was never even a goal. The very first typing PEP already declared that int is a subtype of float which might be pragmatic but is clearly not accurate.

The runtime ignores all type hints e.g. this code runs fine:

x: int = 1.0

The type hints should be understood as a statement of intent that is only enforced through a type checker rather than any statement about what is enforced by the runtime itself.

Making immutable, hashable etc classes in Python is a perfectly valid thing to do and the type system needs to understand that. There are plenty of examples of immutable Python classes including in the stdlib (grep for def __hash__). Even the typing module has them!

Maybe I’m misunderstanding the request in this thread, but I am trying to add type hints to immutable Python classes (in SymPy) and it seems to fail because I can’t communicate immutability somehow. I have a superclass Basic and a subclass Expr. Both classes have a ._args attribute, which is a tuple of Basic for Basic and of Expr for Expr:

from __future__ import annotations

class Basic:

    _args: tuple[Basic, ...]

    def __new__(cls, *args: Basic):
        obj = super().__new__(cls)
        obj._args = args
        return obj

    def __repr__(self):
        return f'Basic({", ".join(map(repr, self._args))})'

    @property
    def args(self):
        return self._args

class Expr(Basic):

    _args: tuple[Expr, ...]

    def __new__(cls, *args: Expr):
        return super().__new__(cls, *args)

Although mypy accepts this, it is rejected by pyright because _args is mutable:

$ pyright t.py 
/stuff/current/active/sympy/t.py
  /stuff/current/active/sympy/t.py:23:5 - error: "_args" overrides symbol of same name in class "Basic"
    Variable is mutable so its type is invariant
      Override type "tuple[Expr, ...]" is not the same as base type "tuple[Basic, ...]" (reportIncompatibleVariableOverride)
1 error, 0 warnings, 0 informations 

What is the expected way to declare that the attribute is supposed to be immutable and/or that instances of a subclass have attributes that are subtypes of those of the superclass?

Clearly pyright has a notion of mutability/immutability here, and it apparently affects how the attribute can be overridden in the subclass. There should be some way to declare that the attribute is supposed to be immutable from a typing perspective if that affects the variance.


I 100% agree with you. However, for now, would it work to replace _args with __args, and then expose it privately using a read-only property named _args?

I haven’t tested it, but is __args tested for incompatible override?

We already have args as a read-only property for the private _args so presumably we don’t need a third level. I guess what you are suggesting is:

from __future__ import annotations

class Basic:

    _args: tuple[Basic, ...]

    def __new__(cls, *args: Basic):
        obj = super().__new__(cls)
        obj._args = args
        return obj

    @property
    def args(self) -> tuple[Basic, ...]:
        return self._args

class Expr(Basic):

    def __new__(cls, *args: Expr):
        return super().__new__(cls, *args)

    @property
    def args(self) -> tuple[Expr, ...]:
        # Basic is not assignable to Expr:
        return self._args # type:ignore

e = Expr()
reveal_type(e.args) # tuple[Expr, ...]

Is there a way to do that without duplicating the body of the args property method? I think in a stubs file you could just declare the property method with different types but not provide a body so is there a way to do that with inline annotations?

The type:ignore is needed because the type checker does not understand that the _args attribute of Expr is a tuple of Expr rather than Basic. Declaring that it is of type Expr is rejected by pyright because it thinks that _args is mutable and therefore invariant in the type parameter.
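The stub-file route might look like this (a sketch: a hypothetical t.pyi next to t.py, where declaration bodies are just ... and no property code is duplicated):

```python
from __future__ import annotations

# Hypothetical stub (e.g. t.pyi): only the declared types differ between
# the classes; the single implementation lives in the .py file.
class Basic:
    _args: tuple[Basic, ...]
    @property
    def args(self) -> tuple[Basic, ...]: ...

class Expr(Basic):
    @property
    def args(self) -> tuple[Expr, ...]: ...
```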

For context there are around 1000 Basic subclasses plus many more downstream. We don’t want to duplicate any actual code across the 1000 classes which each inherit around 50 methods from Basic. Each subclass has different invariants for how many args there are and what types (which Basic subclasses) are in the args tuple:

class Add(Expr):
    _args: tuple[Expr, ...]
class Mul(Expr):
    _args: tuple[Expr, ...]
class Pow(Expr):
    _args: tuple[Expr, Expr]

class Boolean(Basic):
    ...
class And(Boolean):
     _args: tuple[Boolean, ...]
class Not(Boolean):
    _args: tuple[Boolean]
class Equation(Boolean):
    _args: tuple[Expr, Expr]

This does not break any invariance rules because all classes are immutable. There doesn’t seem to be a way for a type checker to understand that though.

Apparently a typechecker can understand immutability when a frozen dataclass is used:

from __future__ import annotations

from dataclasses import dataclass

@dataclass(frozen=True)
class Basic:
    args: tuple[Basic, ...]

@dataclass(frozen=True)
class Expr(Basic):
    args: tuple[Expr, ...]

e = Expr(args=())
reveal_type(e.args) # tuple[Expr, ...]

There are many reasons we can’t use dataclass though and type checkers should not special case things like this because dataclass is just a code generator: there should not be any distinction between using dataclass or using the equivalent code.

You need higher kinded types for this, because what you essentially want here is for args to be generic at the type level, rather than the instance level.

It was specifically decided that dataclasses get special casing here, and some of the things built on this (via dataclass_transform) end up with actual immutability rather than bypassable immutability. It’s worth pointing out that the mutability in question is not with respect to the value but to the class field (which is why frozen works).

That also means this proposal should likely work for your use here: Expanding `ReadOnly` to normal classes & protocols

Do you mean something with type parameters?

I might have oversimplified the example a bit too much…

I don’t see any necessity here for “actual immutability” when standard methods have worked just fine for a long time under the consenting adults principle. You can of course do this:

>>> import fractions
>>> F = fractions.Fraction(3, 6)
>>> F
Fraction(1, 2)
>>> F._numerator = 3
>>> F._denominator = 6
>>> F
Fraction(3, 6)

Obviously you should not do that though and ideally a type checker would disallow it. Part of the attraction of using a type checker is that it can enforce things that are not enforced by the runtime.


There are a few equivalent ways that aren’t actually allowed, because all of them result in HKTs either directly or indirectly; type vars with bounds that are themselves type vars, and generic metaclasses, are the two most obvious ways here.

I actually don’t agree with this in the way you have presented it. I want a type checker to reflect what works at runtime, specifically reflecting the consenting adults principle you mentioned above. If I choose to reach into a typed internal, that’s now my problem, but tooling can still show me the invariants that need to be upheld if it is typed correctly. Typing this correctly requires HKTs to be done via generics, or if it’s actually an invariant, even internally, that these are set-once, another way to spell out that this value should never change once set. I don’t think it’s a type checker’s place to tell me not to touch an internal, nor is it the type checker’s place to lie about the type information of that internal.

I do, with free-threading work in progress, but this wasn’t really the main focus of that. I don’t actually mind if there’s a way for users to spell out specific program invariants like “once set, this value should never be changed” in the type system, and am broadly in favor of the linked expansion of ReadOnly; I just think that people would be better served by a runtime that could enforce it, given the various improvements to concurrency in Python on the horizon.

How would that work in this case?

>>> A: int = 'asd'
>>> A
'asd'

Any declaration of a type is not literally enforced by the runtime.

The original example that I had here can be reduced to:

from __future__ import annotations
class A:
    _arg: A
class B(A):
    _arg: B  # pyright rejects because _arg is mutable

The runtime enforces nothing here: I could make instances of A or B that have no _arg attribute or that have it as any type of object. Although nothing is enforced at runtime, pyright accepts the first hint but rejects the second. There needs to be something that allows me to communicate the fact that A is not invariant with respect to the type of _arg because it is supposed to be immutable/read-only.