Adding Deep Immutability

Will type() and isinstance() and repr() on frozen objects reflect the limitation that they can no longer be mutated?
Will users be confused by frozenset vs a frozen regular set? A tuple vs a frozen list?

Both “Why ‘frozenmap’ and not ‘frozendict’” in PEP 603 and the rejection thread of PEP 351 (The freeze protocol) criticized the usability of one-size-fits-all freezing, instead recommending separately designed APIs.

I don’t know if those arguments should stand in the way of safer large-scale data sharing :rocket:; but I think the “this looks like a dict, why is .update() raising an exception?” category of user confusion needs addressing, and a “How to teach this” section will be important. There is a whole “do we now have 2x types?” mental area to cover…

I’m not personally too bothered by this. To give a concrete example, the Rust language tracks mutability independently of type, and it doesn’t seem to cause any issues. And while Python doesn’t have the language mechanisms to track mutability automatically, developers can (and will!) still do so. After all, for most of its history Python didn’t even track types, leaving that to the developer (and even now, type verification is a separate process, outside of the language itself).

“This dict” doesn’t just appear out of nowhere. The developer who passes the dict to a function that calls .update() should be fully aware of the mutability of the dict before passing it to the function already, so if it raises an exception, that’s great–it just means that Python has helped the developer prevent a dict that isn’t supposed to be modified from getting updated, and from getting a mysterious, hard-to-debug misbehavior downstream.

Mutability as proposed is going to be an additional object state that’s tracked independently of type. I don’t think it’s going to be hard to teach because, as I said before, we are already mentally tracking mutability ourselves. The new feature just formalizes and codifies it.
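To make the “state, not type” idea concrete, here is a toy sketch. FreezableDict and its freeze() method are made up for illustration, and raising TypeError is an arbitrary choice; the proposal freezes ordinary objects in place and does not define this class.

```python
class FreezableDict(dict):
    """Toy stand-in (hypothetical): a dict whose mutability is a per-object flag."""

    _frozen = False

    def freeze(self):
        # Flip the flag on this one object; its type is untouched.
        self._frozen = True

    def _check(self):
        if self._frozen:
            raise TypeError("object is frozen")

    def __setitem__(self, key, value):
        self._check()
        super().__setitem__(key, value)

    def update(self, *args, **kwargs):
        self._check()
        super().update(*args, **kwargs)


config = FreezableDict(debug=False)
kind_before = type(config)
config.update(debug=True)      # allowed: not frozen yet
config.freeze()

try:
    config.update(debug=False)  # rejected: same type, different state
except TypeError as exc:
    error = str(exc)
```

Note that type(config) is the same before and after freeze(); only the behaviour of mutating methods changes, which is exactly the case a “How to teach this” section would have to explain.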

C’mon now. Rust forces the use of a type system, and it has essentially the most advanced type system of any mainstream language, and that type system is capable of expressing this. Python isn’t this. Saying “the developers can just use their brains to track this, like they did for types before we had typing” is a half truth. Yes, but types wouldn’t be so prevalent in the industry today if that was the whole story.

I don’t think the PEP should be rejected because the type system cannot express it. Multiprocessing is an example - we cannot express whether objects are pickleable in the type system, and multiprocessing is still useful and no one is arguing for its removal.

However - let’s consider who this feature is for. Is it for advanced users? Folks dabbling with Python, like a high school math teacher trying to read an Excel file, probably aren’t going to know about this or really be capable of using this effectively. Is it for my coworkers? I’m going to disable it via lint rules as soon as it causes a production issue (just like I did multiprocessing :wink: ) which is probably going to be very quickly. The proposed technique of “we’ll just know that the arguments have been frozen” doesn’t work well in large codebases.

I don’t think the authors need to flesh out the typing story here so the PEP doesn’t get rejected. I think the authors need to flesh out the typing story so the PEP provides the maximum amount of value in places where it’s likely to be used. Otherwise it’s kind of a missed opportunity, right?

So, if one freezes a set following this new proposal, will one obtain a frozenset? Or a frozen set that isn’t a frozenset?

I think you may have misunderstood what I was trying to say. I completely support improving how immutability is tracked - whether that’s via types or some other method doesn’t really matter to me, but having some way to flag when an immutable object is passed to a function that expects a mutable object would obviously be a worthwhile improvement.

I do want to push back on people who suggest that without such tracking, the PEP should be rejected. That was the point of my comment - “tracking immutability” doesn’t mean a complete rework of the typing system, it can be added on as a separate attribute to check (like Rust did, although I concede that Rust has a lot of machinery that Python’s type checkers don’t), or as a last resort it can be tracked manually. Tracking manually isn’t ideal, but it’s also not a disaster, in the same way that untyped Python isn’t a disaster.

I do think the PEP should cover immutability tracking (or typing, if you prefer to frame it that way). Although it may be that such tracking has to be left as an open question for the typing community to consider further, with a relatively minimal solution in the initial implementation.

I’m not pushing for rejection at all; I hope that’s been clear. I don’t think @Tinche has been either.

But I think the proto-PEP can be improved. Not everyone in the community maintains libraries,[1] and it gives them blind spots in terms of how we discuss changes, since they think primarily about how they will use a new tool in their application code. Addressing how this will be rolled out to developers and how it will impact the ecosystem is what I’d expect in the Backwards Compatibility section. That section is already very good, with details for native code, etc. But we can spare a sentence or two for how this impacts all packages and contracts between code from different authors.

I also wonder if this could be narrowly targeted for the free threading builds under the same experimental status? I’ve read more now and I see a bit of how the family of proposals fits together. But if there’s a problem halfway through, we’ll be sort of stuck. Free threading having a separate build gives us a unique opportunity.


  1. he says to @pfmoore, of all people! :wink:

Collected thoughts on static typing and immutability

This is an attempt to collect answers to a lot of different points made above in one coherent place. Hope this makes sense!

Regarding static type checking — this is a very interesting question, and several of us have been working on (or are working on) type systems that include deep immutability in a different context. I will update the PEP with a discussion about typing under the Deferred Ideas section. As that suggests, we don’t think that typing is on the critical path for this work (although we’d love to explore this more in follow-up work). Below is an attempt to respond to multiple posts above regarding both typing of immutability and its relation to when things can become immutable and what that means for type-based reasoning.

Safety and when things can become immutable

if any mutable object can become immutable at any time, reasoning about what is safe is going to be real tricky, typing or not

This is a good point, and to some extent part of the very problem we are trying to solve. Types — as was pointed out elsewhere in the discussion — do not capture everything. For example, with Python’s reflective powers it is possible to remove a method from a class or change an object’s type in ways that will break type safety. But more along the lines we were thinking when we started on this PEP — types capture neither thread-safety nor whether objects are shared across threads (etc.), so reasoning about whether a call to a method on a mutable object is safe or not is already very tricky! At least with immutability, mistakes will lead to exceptions — not silent errors, so while it is not perfect that an object can become immutable, we feel it is a step in the right direction.

We envision that an exception thrown by an attempt to mutate an immutable object will show, as part of the exception, the place in the code where the object was frozen. This should help track down inconsistency bugs like this one.
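As a rough illustration of what “show where the object was frozen” could look like, here is a pure-Python sketch. The Freezable class, its freeze() method, and the message format are all assumptions made for this example; the real mechanism would live in the interpreter.

```python
import traceback


class Freezable:
    """Toy object (hypothetical API) that remembers where it was frozen."""

    def __init__(self):
        object.__setattr__(self, "_frozen_at", None)

    def freeze(self):
        # extract_stack(limit=2) returns [caller of freeze(), freeze() itself].
        caller = traceback.extract_stack(limit=2)[0]
        object.__setattr__(self, "_frozen_at", f"{caller.filename}:{caller.lineno}")

    def __setattr__(self, name, value):
        if self._frozen_at is not None:
            raise TypeError(
                f"cannot set {name!r}: object was frozen at {self._frozen_at}"
            )
        object.__setattr__(self, name, value)


point = Freezable()
point.x = 1
point.freeze()

try:
    point.x = 2
except TypeError as exc:
    message = str(exc)  # includes the file and line of the freeze() call
```

The point of the sketch is only that the freeze site is cheap to record at freeze time and can be attached to the later failure.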

Can freeze(x) make a deep copy, or would that cost too much?

Freezing by copy avoids the problem of mutable objects possibly becoming immutable by a non-local operation. However, because freezing objects also freezes their types, there are some challenges with this approach. Consider the following:

>>> class Foo: pass
>>> 
>>> f = Foo()
>>> ff = freeze_by_copy(f)
>>> 
>>> f.__class__ == ff.__class__
False

...

>>> fff = freeze_by_copy(f)
>>> 
>>> ff.__class__ == fff.__class__
???

First, if we are to follow the principle that freezing never turns an existing mutable object immutable, then we have to copy the Foo class before making it immutable above. Maybe this is natural since Foo and ImmutableFoo (or however we might represent them) are different types.

The second time we freeze f, what is the class of the resulting copy fff? We could keep track of the frozen copy of the Foo class, but the code elided by the ... in the example may have altered the Foo type, so there is no guarantee that ff.__class__ and fff.__class__ are the same. Thus, having types such as Foo and ImmutableFoo does not suffice, since the shape of the type is fixed at the time of freezing. This is of course a hard thing to capture in type systems, and maybe that’s fine? I mean, there is no guarantee in Python that a type will not be changed at run-time. A pragmatic solution could be to allow certain objects to be “frozen in-place”, such as type objects for example. Just like the PEP proposes a type that prevents freezing, we could have a type that opts in to supporting freezing in place.
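The “two frozen copies of the same class can differ” problem can be reproduced today without any freezing machinery. In the sketch below, snapshot_class is a hypothetical helper that stands in for “copy the class at freeze time”; it is not part of the proposal.

```python
def snapshot_class(cls):
    """Hypothetical helper: build a new class from cls's current namespace."""
    namespace = {
        key: value
        for key, value in vars(cls).items()
        # These two descriptors belong to the original class; type() makes new ones.
        if key not in ("__dict__", "__weakref__")
    }
    return type(cls.__name__, cls.__bases__, namespace)


class Foo:
    pass


first = snapshot_class(Foo)    # class as seen by the first freeze
Foo.extra = 1                  # the mutable class evolves in between
second = snapshot_class(Foo)   # class as seen by the second freeze

same_shape = hasattr(first, "extra") == hasattr(second, "extra")
```

Both snapshots are named Foo, yet only the second one has the extra attribute, so a single static type ImmutableFoo cannot describe both.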

Freezing by copy would open the door for optimising things. For example, we can move all frozen objects and lay them out nicely — or more compactly perhaps — in memory. For example, all immutable objects in a cycle will have the same life-time so we could make a single allocation for them, rather than individual allocations for the individual objects.

Also, freeze by copy might have unexpected effects for users of id(). The thread about hashing pointed out that the default hash implementation of Python uses the object id. This would mean that hash(x) would return something different from hash(freeze(x)). This might be the correct solution, but could also be confusing to users.
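The id() effect can be seen with an ordinary deep copy, used here purely as a stand-in for a copying freeze (the Config class is made up for the example):

```python
import copy


class Config:
    def __init__(self, values):
        self.values = values


original = Config([1, 2, 3])
frozen_copy = copy.deepcopy(original)  # stand-in for a copying freeze()

# Two live objects never share an id, and the default __eq__ compares identity.
different_identity = id(frozen_copy) != id(original)

registry = {original}
found = frozen_copy in registry  # lookup by the copy misses the original
```

In CPython the default hash is derived from the object's address, so the two hashes differ as well; the set lookup fails either way because default equality is identity.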

Another approach to freeze(x) would be one that ensures that all references — except x — to the objects being frozen come from within the object graph being frozen. If we detect that freezing would make an externally accessible object immutable, we could raise an exception (or possibly try to solve the problem by making a copy, but this can become tricky depending on the shape of the object graph). With this approach, we are never able to freeze a type if it has multiple incoming references (for example, more than one instance). We would have to make type immutable (which is probably good) to be able to freeze types at all.
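A toy model of the isolation check, assuming the object graph is given as a plain adjacency dict (a real implementation would walk actual references inside the interpreter; the function names here are made up):

```python
def reachable(graph, root):
    """Return the set of nodes reachable from root, including root itself."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


def external_references(graph, root):
    """Return edges (src, dst) that point into root's graph from outside it."""
    inside = reachable(graph, root)
    return sorted(
        (src, dst)
        for src, targets in graph.items()
        if src not in inside
        for dst in targets
        if dst in inside
    )


graph = {"a": ["b"], "b": ["c"], "c": [], "other": ["c"]}
blocked_by = external_references(graph, "a")   # "other" still points at "c"

del graph["other"]
blocked_after = external_references(graph, "a")  # now the graph is isolated
```

In this model, freezing "a" would raise (or fall back to copying) while blocked_by is non-empty, and succeed once it is empty.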

One possibility is to support multiple styles of freezing and let the programmer decide. If copying is mostly put in to serve static typing, it seems wrong (IMO) to push this on all programmers that do not use static typing. So maybe there is space for both a version of freezing that is efficient but hard on the static type system and a version which is more easily integrated with static typing?

In summary (albeit a bit vague):

| Kinds of freezing | Performance | Static Typing | Notes |
|---|---|---|---|
| by copy | worst | better | Copies of types? |
| only isolated object graph | worse | better | How to handle types? By copy? Freeze in-place? Fail? |
| in-place (like proposed in the PEP) | better | worse | Solves types problem |

Immutability vs. Read-Only

The point about tracking ”mutability state” of variables vs. objects is essentially immutability vs. read-only — unless we add some extra uniqueness tracking (or similar) to ensure that variables that point to the same object always share the same immutability state. This is possible in e.g. Rust because of how Rust places very strict limits on an object graph, but not in Python because Python’s object graphs are full of cycles and pointer aliasing.

Read-only references are a lot less powerful than immutable objects and most importantly not thread-safe. If we weakened immutability to shallow or read-only, we could no longer safely share immutable objects across threads. A read-only type is therefore not strong enough to capture what we need for this PEP.
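The standard library already has a read-only reference that shows the difference: types.MappingProxyType blocks writes through the view, but the underlying dict can still change through any other reference.

```python
from types import MappingProxyType

settings = {"debug": False}
view = MappingProxyType(settings)  # read-only view, not an immutable object

try:
    view["debug"] = True
except TypeError:
    blocked = True
else:
    blocked = False

settings["debug"] = True   # mutation through the original reference
observed = view["debug"]   # the "read-only" view sees the change
```

A thread holding only view gets no guarantee that the data stays put, which is why read-only is not enough for safe sharing.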

Arguably, if operations on a type T can fail because the T object has been made immutable, that is similar to type systems where T can always be ”null”.

Challenges of typing immutability

There are several challenges when adding immutability to a type system for an object-oriented programming language. First, self typing becomes “more important”: some methods require that self is mutable, some require that self is immutable (e.g. to be thread-safe), and some methods can operate on either self type. We would need a way of expressing this in the type system. Furthermore, deep immutability requires some form of “viewpoint adaptation”, meaning that when x is immutable, x.f is also immutable, regardless of the declared type of f. Neither is (we believe) supported yet in Python’s type system. These challenges for typing are orthogonal to the design considerations above such as whether freezing happens in-place or by copy, on isolated object graphs, and the handling of types.

In conclusion

We believe that freezing objects in-place (like in the PEP) is a good starting point for adding immutability to Python (but it need not be the end point). Freezing in-place does not make type annotations fundamentally less safe than they are currently, although admittedly it does add another foot gun in terms of the ability to freeze objects in-place. However, at the same time, it makes sharing objects between threads safe, in particular avoiding problems which are harder to debug than the potential problems due to in-place freezing.

Down the line, we would be interested in looking into extending Python’s type system with support for immutability and as part of that it may make sense to also look at adding new variants of freeze, e.g. by copy. Maybe by that time we will have made some types and other objects immutable by default (e.g. type and all integers) which might make it easier to add versions of freeze which are more amenable to static typing.

Immutable objects become cyclic because we first create a mutable object graph and then make it immutable. Cycles are pretty common in Python — for example if you freeze the None object, you end up with several cycles.

The SCC additions are an optimisation when you are using free-threaded Python and necessary when you are using subinterpreters (because they are possible, just as you say). The reason for the latter is how subinterpreters do memory management. They assume they are in complete control of all the (non-immortal) objects they can reach, which will no longer be the case when immutable objects become shared between subinterpreters by reference.

We handle this by removing immutable objects from the GCs of their creating subinterpreters. This means that immutable objects are managed completely using reference counting. So if we did not handle cycles, we would leak memory.
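The leak is easy to observe in today's CPython by switching the cycle collector off, which approximates “reference counting only” (the Node class and function name are made up for the example):

```python
import gc
import weakref


class Node:
    pass


def cycle_survives_without_gc():
    """Return (alive after del, alive after an explicit collection)."""
    was_enabled = gc.isenabled()
    gc.disable()  # no automatic cycle collection: refcounting only
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a          # two-object reference cycle
        probe = weakref.ref(a)
        del a, b                         # refcounts never reach zero
        alive_after_del = probe() is not None
        gc.collect()                     # the cycle collector can reclaim it
        alive_after_collect = probe() is not None
        return alive_after_del, alive_after_collect
    finally:
        if was_enabled:
            gc.enable()


result = cycle_survives_without_gc()
```

On CPython this gives (True, False): the cycle stays alive under reference counting alone and only disappears once a cycle-aware pass runs.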

Hope this makes things clear!

Cycles in Python are not specific to freezing. Types in Python generally have cycles.
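A few of those cycles can be checked directly in any recent CPython (Foo is just an example class):

```python
class Foo:
    pass


# object -> type (its class) -> object (its base): a two-step cycle.
object_is_instance_of_type = type(object) is type
type_inherits_from_object = type.__base__ is object

# type is its own class: a one-step cycle.
type_is_its_own_class = type(type) is type

# Every class refers to itself through its own MRO tuple.
class_in_own_mro = Foo.__mro__[0] is Foo
```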

Here is an object graph of None in 3.12 in case someone is interested

If someone wants to do a deep dive into the SCC stuff @stw mentioned, I can recommend: https://www.youtube.com/watch?v=qaWNmdVYi_s&t=1s and the connected paper: https://dl.acm.org/doi/10.1145/3652024.3665507

Edit: This is meant as a reply to this post above.

Super good point! Will do!

Not sure if I understand what you mean, but what we are proposing will work with both free-threading and subinterpreters. There are some more or less subtle details: for example, subinterpreters are fundamentally data-race free, which means that freezing does not have to worry about other threads mutating the object graph while it is being frozen. And as I wrote in my reply to Beni Cherniavsky-Paskin (Adding Deep Immutability - #49 by stw), subinterpreters need additional work to handle memory management of objects shared between subinterpreters.

Re Naming

Several people have made good points about naming – frozen modules, frozenset, etc.

I am not sure how this should be reflected in the PEP. Would a good way be to add a naming section where we discuss appropriate names, or how are things like that usually handled?

I just want to confirm that if an object in some tree is tracked by the GC and can’t be frozen, but the rest of the tree leading to it can be frozen, then traversability in generation 2 is maintained. This will be very important for my cross-language garbage collection proposal.

My reading of the original proposal was that nothing changes and I can feel safe. But deep in the conversation there is discussion of frozen trees being ref counted only (losing GC identity and thus no longer traversable in my GC search). I may just be totally misreading, but I just want to make sure that an object can opt out of freezing and, if it does, that the objects that connect through to it will always maintain their GC identity and be on the candidate list up until the end of the reachability phase (not skipping a traverse).

I know mine is an edge case so please bear with me.

(Re. freeze-by-copy:)

A new type object. In this paradigm, freezing is expensive: if you freeze something, you should hang on to it & reuse it.

Well… a new type object unless the type is an immutable built-in (or extension) type, like dict or type(None). Those can be frozen in-place, keeping their id(), if the __subclasses__ issue is solved. So can int or string instances.

Can’t do that in general; this would need to freeze the class attributes in-place, recursively.

It has the same caveats as __deepcopy__, or pickling/unpickling. Those have their edges, but aren’t extremely surprising.

I meant that this feature could be included first in the free-threaded build of CPython – the ones with the t – without being in the main one. So, imagine the PEP is written and accepted for 3.15 (an ambitious timeline, but possible!): we could then have Python 3.15.0 without immutability, and 3.15.0t with immutability. The entire feature can be feature-flagged in that way.

The SC would have to agree to that plan, but I think it’s worth considering. This feature will be super-useful for various applications, but it obviously has a relationship with free-threading.

I’m not sure there’s any utility to restricting this to one build–they share the same codebase so it doesn’t reduce effort there, does it? And anything that works on free threading is going to be fine in a single threaded context (I would hope!).

I think it would be a lot nicer if isinstance(x, Hashable) simply gave the correct answer.
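For context, today's check is shallow: it only looks at whether the type defines __hash__, so it can say yes for an object that cannot actually be hashed.

```python
from collections.abc import Hashable

nested = ([],)  # a tuple (hashable type) holding a list (unhashable)

claims_hashable = isinstance(nested, Hashable)

try:
    hash(nested)
except TypeError:
    actually_hashable = False
else:
    actually_hashable = True
```

Here the isinstance check and the real hash() call disagree, which is the kind of mismatch a deep notion of immutability could help avoid.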

I am not sure that I follow the above correctly. In this proposal, a tree can only be frozen if all objects in the tree can be frozen. For example, if we have a tree with three objects like this a --> b --> c we can freeze either c, b --> c or all of it a --> b --> c. If c cannot be frozen, then neither can a or b unless you e.g. sever the connection between b and c first.

Let me know if that answered your question!

In the proposal, opting out of freezing is possible by either subclassing NotFreezable (which means no instance of the class can be frozen) or by adding a reference to a NotFreezable object.
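A small model of the “all or nothing” rule for a --> b --> c. The NotFreezable class below is a local stand-in for the one named in the proposal, and can_freeze is a made-up helper that only follows instance attributes, so treat it as a sketch of the rule rather than of the implementation.

```python
class NotFreezable:
    """Local stand-in for the proposal's opt-out marker class."""


class Node:
    def __init__(self, child=None):
        self.child = child


class Pinned(NotFreezable):
    """Instances of this class opt out of freezing."""


def can_freeze(obj, _seen=None):
    """Hypothetical check: True if nothing reachable from obj opts out."""
    seen = set() if _seen is None else _seen
    if id(obj) in seen:          # already visited (handles cycles)
        return True
    seen.add(id(obj))
    if isinstance(obj, NotFreezable):
        return False
    attributes = getattr(obj, "__dict__", {})
    return all(can_freeze(value, seen) for value in attributes.values())


c = Pinned()
b = Node(c)
a = Node(b)

whole_tree = can_freeze(a)       # c opts out, so neither a nor b can be frozen
b.child = None                   # sever the connection between b and c
after_severing = can_freeze(a)
```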

To share immutable objects between subinterpreters (which this PEP does not do, but a follow-up PEP will propose), we take the objects out of the GC’s responsibility. This is important with subinterpreters because each subinterpreter assumes it will be the only GC touching any object it has created, which does not work well with sharing. If your GC proposal removes this problem for subinterpreters, then we would not need to do this.

Can you share any more details about your proposal?

Thank you for answering like this. This is great.

Already, the type system doesn’t capture anything like this, right? Why is it important to support this kind of “type surgery”?

That’s good. If they’re in the same set or dictionary, they will suffer from fewer collisions. A copy has a different id, so I don’t see why someone would be confused that hashing-by-id gives a different hash.


Freeze by factory

I want to propose an alternative to freeze-by-copy and freeze-isolated-object-graph. I propose: freeze-by-factory.

The idea is that you would apply freeze to a factory:

@frozen_factory
def construct_object(...) -> X: ...

The frozen_factory would somehow change CPython’s state so that every variable that was constructed between when construct_object is called and returned (and has not been deleted) would be frozen. This prevents the extra work of copying since nothing is copied, while maintaining all of the static typing. During construction, the components of X that are being assembled remain mutable. And finally, the returned X can be recognized by static type checkers as being frozen!

Edit: The arguments of such a constructor would have to have already themselves been frozen. The decorator could check that.
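Here is a rough pure-Python approximation of the idea, so the shape of the API is clearer. Everything in it (Freezable, freeze, the TypeError messages) is made up for the sketch. It also cheats: it only freezes what is reachable from the returned object, whereas the real thing would need interpreter support to freeze everything constructed during the call.

```python
import functools


class Freezable:
    """Toy base class (hypothetical): attribute writes fail once frozen."""

    _frozen = False

    def __setattr__(self, name, value):
        if self._frozen:
            raise TypeError("object is frozen")
        super().__setattr__(name, value)


def freeze(obj):
    """Mark obj and every Freezable reachable through its attributes."""
    object.__setattr__(obj, "_frozen", True)
    for value in vars(obj).values():
        if isinstance(value, Freezable) and not value._frozen:
            freeze(value)
    return obj


def frozen_factory(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Arguments must already be frozen, as noted in the edit above.
        for arg in (*args, *kwargs.values()):
            if isinstance(arg, Freezable) and not arg._frozen:
                raise TypeError("factory arguments must already be frozen")
        return freeze(func(*args, **kwargs))
    return wrapper


class Point(Freezable):
    def __init__(self, x, y):
        self.x = x
        self.y = y


@frozen_factory
def construct_point(x, y):
    point = Point(x, y)
    point.x += 1          # still mutable during construction
    return point


p = construct_point(1, 2)
try:
    p.x = 5
except TypeError:
    rejected = True
else:
    rejected = False
```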

| Kinds of freezing | Performance | Static Typing | Notes |
|---|---|---|---|
| by factory | best | better | |

Does this work for your applications?

Hm, maybe my example was bad. The point was not freezing the same object a second time but rather that mutable types can evolve and immutable types cannot. As proposed, freezing will make the type immutable which “solves this problem”. We might end up with multiple versions of the same type frozen at different stages. How to handle that with static typing?

The __subclasses__ issue is solved. This list stays mutable and protected by a lock to ensure thread-safety of e.g. concurrent subclassing.

Is there a specific reason why the class attribute must be frozen in-place? Could that not be copied? Regardless, I am not in favour of the “freeze objects by copy but freeze types in-place” solution myself. I think it is possible to do — if you make a type dependent on a mutable object, it probably makes sense to freeze that mutable object too when you freeze the type. So we would have to keep track of the reason for freezing.

Very good point!