PEP 795: Add deep immutability to Python

Hello, we have just created PEP 795 to add deep immutability to Python:

Abstract

This PEP proposes adding a mechanism for deep immutability to Python. The mechanism requires some changes to the core language, but user-facing functions are delivered in a module called immutable.

[…]

Deep immutability provides strong guarantees against unintended modifications, thereby improving correctness, security, and parallel execution safety.

[…]

Immutability in action:

from immutable import freeze, isfrozen

class Foo:
    pass

f = Foo()
g = Foo()
h = Foo()

f.f = g
g.f = h
h.f = g # cycles are OK!
del g # Remove local ref to g, so g's RC = 1
del h # Remove local ref to h, so h's RC = 1

f.f.x = "African Swallow" # OK: g is still reachable as f.f
freeze(f) # Makes f, g and h immutable
f.f.x = "European Swallow" # Throws an exception "g is immutable"
isfrozen(f.f.f) # returns True (f.f.f is h)
f = None # Cycle detector will eventually find and collect the cycle

The whole PEP can be found online:

We already had a pre-PEP discussion here.

Let us know what you think!

@mjp41 @stw @matajoh @xFrednet


What changes have been made based on feedback in the pre-PEP discussion? Skimming the document, I couldn’t find anything.

I’m thinking especially of the worries that this mechanism can’t guarantee immutability (see my counterexample), that it breaks a very simple assumption (an instance of list can be appended to), and that trying to freeze arbitrary objects could leave your interpreter unusable. I can’t find anything that considers these points.


These are questions that went ignored in the previous thread so I’ll restate here.

When I tested the reference implementation I found that I could not freeze a fractions.Fraction. Is that just a limitation of the reference implementation, or is it expected that the final implementation would not be able to freeze a Fraction? I think the error message was something like “cannot freeze module object”, which seems like a significant limitation. Also, from testing it seems that a class that has a @classmethod results in instances that cannot be frozen, so there are perhaps several things that would make Fraction unfreezable.

Given f = fractions.Fraction(1, 2) is it expected that f would be freezable?

If the answer is “yes” then what is the full set of objects that would be frozen by freeze(f) and what are the implications of all of those things being frozen?

If the answer is “no” then what restrictions would the author of a class like Fraction need to follow if they wanted the instances to be freezable?
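For context on why a module object might be reached at all, here is a minimal sketch in plain Python (no freeze involved) that follows one chain of references from a Fraction instance to module objects. Whether the PEP's traversal follows this exact chain is an assumption; limit_denominator is just one convenient Python-level method to inspect.

```python
import fractions
import types

f = fractions.Fraction(1, 2)

# A method written in Python keeps a reference to its module's globals.
method = type(f).limit_denominator
module_globals = method.__globals__

# Those globals are the namespace of the fractions module itself...
reaches_own_module = module_globals is vars(fractions)

# ...and that namespace references the other modules fractions imports.
imported_modules = sorted(
    name
    for name, value in module_globals.items()
    if isinstance(value, types.ModuleType)
)

print(reaches_own_module)
print(imported_modules)
```

So a transitive freeze that starts at f and follows instance -> class -> function -> globals would have to decide what to do with every module in that list.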


Was there any discussion of an immutable proxy alternative? I see it’s not mentioned in the rejected section.

The basic idea is to wrap an object graph with a new object that provides recursive, synchronised access to the graph and supports moving between contexts. On the inside, we can pull more tricks as internal implementation details than are possible with mutating semantics like these (e.g. it becomes valid to pickle/move/unpickle the graph behind a proxy object, or to simply share the memory, or to use a proxy-level lock rather than individual object-level flags, or to let each object define its own proxy type, or to handle it differently for processes vs. threads vs. interpreters, etc.).
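A rough sketch of the read-only half of that idea, in pure Python. ReadOnlyProxy and _wrap are hypothetical names, and the sketch leaves out synchronisation, moving between contexts, and method forwarding entirely; it only shows that nested values can be wrapped lazily on access without touching the underlying objects.

```python
_ATOMIC = (int, float, complex, str, bytes, bool, type(None))


def _wrap(value):
    """Return atomic values as-is; wrap everything else in a proxy."""
    if isinstance(value, _ATOMIC) or isinstance(value, ReadOnlyProxy):
        return value
    return ReadOnlyProxy(value)


class ReadOnlyProxy:
    """Hypothetical read-only view over an arbitrary object graph."""

    __slots__ = ("_target",)

    def __init__(self, target):
        # Bypass our own __setattr__, which always refuses.
        object.__setattr__(self, "_target", target)

    def __getattr__(self, name):
        value = getattr(self._target, name)
        if callable(value):
            # Methods could mutate the target, so this sketch refuses them.
            raise TypeError("this sketch does not forward calls")
        return _wrap(value)

    def __getitem__(self, key):
        return _wrap(self._target[key])

    def __len__(self):
        return len(self._target)

    def __setattr__(self, name, value):
        raise TypeError("proxy is read-only")

    def __setitem__(self, key, value):
        raise TypeError("proxy is read-only")


class Config:
    def __init__(self):
        self.name = "demo"
        self.tags = ["a", "b"]


cfg = Config()
view = ReadOnlyProxy(cfg)

blocked = False
try:
    view.tags[0] = "z"      # writes through the proxy are rejected
except TypeError:
    blocked = True

cfg.tags.append("c")        # the owner's object is untouched and still mutable
print(view.name, view.tags[0], len(view.tags), blocked)
```

The point of the design is that nothing about cfg or its class changes; only holders of the proxy are restricted.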

It seems unavoidable that programmers under this approach will have to deal with very infectious failures - passing any user-defined object into a library that uses freeze could result in my entire program being frozen and hence broken. Having to opt out of someone else’s optimization is a pretty terrible experience.

(Also, I suspect the module will probably be pushed to collections.immutable, based on current precedent. Not saying you should, just be aware of that possibility. The name is the least of my concerns right now :wink: )


First of all, impressive work everyone!

High-level, my two main complaints are:

  1. Instinctively, the uses of del and = None feel really foreign and “non-Pythonic”. Part of Python’s appeal to many is that you don’t have to think about garbage collection (either cyclic or with reference counting), and I feel like the current proposal hurts that idea. I’m not sure what the best way around this is yet.
  2. To freeze() an object, you have to inherently mutate it, which is a little odd in something that’s supposed to make something immutable. I think this leads to a lot of additional complexity, such as deep recursion that causes random things to become immutable. Thinking out loud, an alternative could be to allow objects to opt-in to themselves being immutable in __new__ or __init__ (think super().__new__(frozen=True)), so any state that needs to be frozen can be validated right then and there.
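A minimal sketch of what opting in at construction time could look like, emulated in today's Python. The frozen= keyword and the Point class are hypothetical, and this is only shallow immutability enforced through __setattr__; it is not how the PEP or super().__new__(frozen=True) would actually work.

```python
class Point:
    """Hypothetical class whose instances can opt in to being frozen."""

    def __init__(self, x, y, *, frozen=False):
        # Validate state here, while the object is still being built.
        if not all(isinstance(v, (int, float)) for v in (x, y)):
            raise TypeError("coordinates must be numbers")
        # Bypass the guard below during construction.
        object.__setattr__(self, "x", x)
        object.__setattr__(self, "y", y)
        object.__setattr__(self, "_frozen", frozen)

    def __setattr__(self, name, value):
        if self._frozen:
            raise AttributeError(f"cannot set {name!r}: instance is frozen")
        object.__setattr__(self, name, value)

    def __delattr__(self, name):
        if self._frozen:
            raise AttributeError(f"cannot delete {name!r}: instance is frozen")
        object.__delattr__(self, name)


p = Point(1, 2, frozen=True)
q = Point(1, 2)

q.x = 10                    # a regular instance stays mutable
rejected = False
try:
    p.x = 10                # the opted-in instance refuses the write
except AttributeError:
    rejected = True

print(p.x, q.x, rejected)
```

Because the decision is made by the class itself, nothing outside the object is mutated and no recursion into unrelated objects is needed.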

To make its instances freezable, a type that uses C extensions that adds new functionality implemented in C must register themselves using register_freezable(type) .

I would expect this to be a dunder __freezeable__ attribute, rather than some arbitrary function imported from a module. Better yet, this could be a type flag in C. Was this inspired by ABC’s register?

  1. Modify object mutation operations (PyObject_SetAttr, PyDict_SetItem, PyList_SetItem, etc.) to check the flag and raise an error when appropriate.

Hm, won’t this do bad things to performance? Seemingly, there would be an extra load and check for every attribute write, container store, etc. Extensions that want to support this will also have to put these everywhere. This will be even worse on FT, where that check will probably have to be atomic.

As a necessary requirement for the extension Sharing Immutable Data Across Subinterpreters, we will add support for atomic reference counting for immutable objects.

Atomic reference counting has been tried in previous GIL-removal attempts, and failed due to the performance hit. Is the plan to make all reference counting atomic, or only some magical cases?

Yeah, I think this is a much more robust approach, because the wrapped object doesn’t even have to be immutable, or follow strict Py_CHECKWRITE contracts. Immutability is nice sometimes, but you still generally want some shared data between threads. Several months ago, I threw together a POC for this exact idea with subinterpreters via an immortal object proxy. There are definitely some issues with it, but I think this PEP doing something similar would play better with the ecosystem (particularly because you wouldn’t need to modify every C type out there).


If you do copy.deepcopy(..) on an immutable object is the result also immutable?

Edit:

Similarly, if you pickle and unpickle, does it maintain immutability (even if there is a custom __setstate__, etc.)? In particular I’m thinking about multiprocessing and whether immutability would make it to other processes.


This just strikes me as too huge a change to Python semantics. The fact that arbitrary sections of the object graph could be made immutable from any given call to freeze seems like an unacceptable risk. The fact that you can just call freeze(obj) on something and thereby essentially turn it into something quite different (in terms of supported operations) is similarly disruptive (e.g., the example of freezing a list and then being unable to mutate it).

There also appear to be some gaps and inconsistencies in the PEP.

Consider the object graph o1 --> o2 --> o3 where o1 and o3 can be made immutable, but o2 cannot. What are the possible behaviours of freeze(o1)?

  1. Freeze fails partially. All subgraphs which could be made immutable entirely remain immutable. Remaining objects remain mutable. In our example, o3 remains immutable but o1 and o2 remain mutable.

The other numbered items are marked as “rejected”, so apparently this is the accepted one. This suggests that o3 would remain frozen as a side-effect of a failed attempt to freeze o1. But then later:

Following the outcomes of the design decisions discussed just above, the freeze(obj) function works as follows:
[…] If obj cannot be made immutable, the entire freeze operation is aborted without making any object immutable.

Now it says that the freeze will not make any objects immutable.
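To make the difference between the two quoted behaviours concrete, here is a toy sketch of the all-or-nothing variant. Node, its freezable/frozen flags and freeze_all_or_nothing are hypothetical names for illustration, not the PEP's implementation: the graph is collected and checked in full before any flag is set, so a failure leaves o3 mutable.

```python
class Node:
    """Toy object with outgoing references and a pretend 'frozen' flag."""

    def __init__(self, name, freezable=True):
        self.name = name
        self.freezable = freezable
        self.frozen = False
        self.refs = []


def reachable(root):
    """Collect every node reachable from root (iterative, cycle-safe)."""
    seen, order, stack = set(), [], [root]
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        order.append(node)
        stack.extend(node.refs)
    return order


def freeze_all_or_nothing(root):
    graph = reachable(root)
    # Phase 1: check everything before touching any flag.
    blockers = [node.name for node in graph if not node.freezable]
    if blockers:
        raise TypeError("cannot freeze: " + ", ".join(blockers))
    # Phase 2: only now mark the objects.
    for node in graph:
        node.frozen = True


# The graph from the PEP text: o1 --> o2 --> o3, where o2 cannot be frozen.
o1, o2, o3 = Node("o1"), Node("o2", freezable=False), Node("o3")
o1.refs.append(o2)
o2.refs.append(o3)

failure = None
try:
    freeze_all_or_nothing(o1)
except TypeError as exc:
    failure = str(exc)

frozen_after_failure = [n.name for n in (o1, o2, o3) if n.frozen]
print(failure, frozen_after_failure)
```

Under the "fails partially" wording, frozen_after_failure would instead contain o3, which is exactly the inconsistency between the two passages.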

Then there is this:

Although we have not seen this during our later stages of testing, it is possible that freezing an object that references global state (e.g., sys.modules, built-ins) could inadvertently freeze critical parts of the interpreter.

This is not very comforting. :slight_smile: What kind of testing has been done? If I have this:

import sys
from immutable import freeze

class Foo:
    def __init__(self):
        self.x = sys.modules

f = Foo()
freeze(f)

. . . what is the defined behavior? What if it is sys.path instead of sys.modules?

Mitigation: Avoiding accidental freezing is possible by inheriting from (or storing a pointer to) the NotFreezable class.

This mitigation doesn’t seem adequate, since if I have an object that references sys.modules, I don’t know if someone else might wind up trying to freeze my object indirectly via some chain of references.

In general I’m not convinced by the argument that freezing is opt-in so everything will be fine. There could be rarely-used code paths that mutate an object, which under this PEP could then unexpectedly fail due to distant code that freezes some other object that was indirectly linked in the object graph. The problem is compounded if it can affect critical global state (like sys.modules). We don’t want a situation where you install some library and everything appears fine for a long time and then later on you call some function in a seemingly innocuous way and suddenly your interpreter is ice-nined because the freeze made its way to sys.path or the like.


So, the first replies here echo the concerns I stated on the ideas thread, and which are not as of yet addressed.

TL;DR: freezing an instance’s class and its whole hierarchy (superclasses, metaclasses) seems too far-reaching, and may cause severe problems, as stated in the posts by @steve.dower, @ZeroIntensity and @BrenBarn (so far).

So, in Python, from its inception up to now, if I create a class, I can come back “later” and change it by replacing a method or changing an attribute. That may or may not be “good practice”, but it is inherently allowed in Python.

And with this feature as currently stated, if between the two events an instance of my class happens to be in a frozen graph, all of a sudden an exception will be raised in my code.

To keep it visual, let’s say we have these:

Project A: my project; Project B: another third-party lib which makes use of freeze; and Project C: a “final user” project consuming both A and B.

Project C adds an instance of A.Klass to a list and passes that list to B.communicate(), which freezes it and therefore freezes A.Klass.
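The three-project scenario can be simulated without the reference implementation. In this sketch the set _frozen_ids and the metaclass FreezeAwareMeta are hypothetical stand-ins for the PEP's immutability flag, and communicate() only pretends to freeze by recording the class of each item; the names Klass and communicate come from the scenario above.

```python
_frozen_ids = set()  # toy stand-in for the per-object immutability flag


class FreezeAwareMeta(type):
    """Rejects attribute writes on classes that have been 'frozen'."""

    def __setattr__(cls, name, value):
        if id(cls) in _frozen_ids:
            raise TypeError(f"{cls.__name__} is immutable")
        super().__setattr__(name, value)


# --- Project A: defines a class and patches it later ---
class Klass(metaclass=FreezeAwareMeta):
    def greet(self):
        return "hello"


# --- Project B: a library that freezes whatever it receives ---
def communicate(items):
    for item in items:
        # Freezing an instance also freezes its class in this toy model.
        _frozen_ids.add(id(type(item)))


# --- Project C: the end user, combining A and B ---
payload = [Klass()]
communicate(payload)

# --- Back in Project A: a rarely used code path patches the class ---
error = None
try:
    Klass.greet = lambda self: "hi"
except TypeError as exc:
    error = str(exc)

print(error)
```

The exception surfaces in Project A's code even though Project A never called anything freeze-related itself.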

The remedy with the PEP as is is to mark A.Klass as “NotFreezable”, which could be feasible (more like a workaround) for new code. But what about already existing code?

What if the “not good programming practice” A.Klass mutation is restricted to a new subclass being created?

The PEP text supposedly addresses this, but it doesn’t feel complete:

Summary

Because this reference [to subclasses] does not get exposed to the programmer in any dangerous way, we permit immutable classes to be subclassed (by mutable classes).

What is “any dangerous way”? The fact is that type(A).__subclasses__() would allow one to retrieve a “non-frozen” subclass of a frozen class: there is no semantic difference if __subclasses__ were a direct attribute instead of being a callable.
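For reference, this is how the subclass list is exposed in current Python, with no freezing involved. Base and Child are placeholder names; the sketch only shows that a mutable subclass is reachable from its parent and can be modified through that reference.

```python
class Base:
    pass


class Child(Base):
    pass


# Every class exposes its live direct subclasses through a method call.
subclasses = Base.__subclasses__()

# The object obtained that way is the real subclass, not a copy,
# so it can be mutated starting from a reference to the parent alone.
subclasses[0].extra = "added via the parent"

print(subclasses[0] is Child, Child.extra)
```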

If the protocol resorts to raising an error when super(None, type(self).__subclasses__()[0]).mutate_class() is called, it could just as well raise the same error when calling self.mutate_class().
Maybe assuming that classes are not freezable at all is a worthwhile (and required) compromise on the non-strict immutability side.

Other than rejecting class freezing outright, I’d pursue the proxy idea: that is, instead of eagerly freezing everything in a subject graph, guard the attribute/item retrievals on a frozen object so that nested items are either lazily frozen, or retrieved as a “frozen proxy” (which seems suitable for classes). In this scenario, a “frozen proxy” for a class would have the same guards.

(The original object owner could bypass that by holding a reference to an item in the object graph prior to freezing, and then mutating that branch after freezing the root object. But I think “consenting adults” applies to cases like this.)

Have you considered my suggested consenting-adult approach of limiting the recursion to only the subordinates of an object being frozen to make the proposal a lot less prone to freezing random objects?

It’s hard for me to imagine how a proxy approach can work when a proxy object cannot possibly be of the same type as the actual object. It may be able to behave like the actual object, but it will fail if its exact type is checked on or if direct access to a slot of the actual object is needed.
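A small illustration of that concern in plain Python. ListProxy is a hypothetical wrapper written just for this example; it behaves like the list it wraps for reading, but fails exact-type and isinstance checks, and the standard json encoder (which dispatches on the concrete type) refuses it.

```python
import json


class ListProxy:
    """Hypothetical read-only wrapper that duck-types as a sequence."""

    def __init__(self, target):
        self._target = target

    def __getitem__(self, index):
        return self._target[index]

    def __len__(self):
        return len(self._target)

    def __iter__(self):
        return iter(self._target)


data = [1, 2, 3]
proxy = ListProxy(data)

# Duck typing works: iteration and length match the real list.
behaves_alike = list(proxy) == data and len(proxy) == len(data)

# Type-based checks do not.
exact_type_matches = type(proxy) is list
passes_isinstance = isinstance(proxy, list)

# Code that dispatches on the concrete type rejects the proxy.
json_failed = False
try:
    json.dumps(proxy)
except TypeError:
    json_failed = True

print(behaves_alike, exact_type_matches, passes_isinstance, json_failed)
```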