PEP 795 revamped: Deep Immutability for Efficient Sharing and Concurrency Safety

After the previous round of discussions here at DPO, we have completely revamped PEP 795: Deep Immutability for Efficient Sharing and Concurrency Safety. This rewrite:

  • Focuses more on programming model, not just implementation details
  • Aims to be clearer about what the costs are for different parts of the design, both in terms of complexity for programmers and implementers of this PEP
  • Aims to make the semantics of freezing clearer, and provide more examples of how freezing propagates
  • Adds a new design that provides control over freeze propagation
  • Adds escape hatches and clearly points out who implements them
  • Motivates the design by linking it more clearly to sharing objects across sub-interpreters
  • Aims to be clearer about the semantics of immutable functions (and why they are unavoidable)
  • Discusses the role of types and future plans for types
  • Includes direct sharing of immutable objects across sub-interpreters, rather than making a separate PEP for this
  • Removes a lot of the rejected alternatives from the original PEP; this is motivated by this PEP already being very long, and because the inclusion of direct sharing across sub-interpreters motivates many of the design decisions

We are very grateful for the many comments and threads on DPO that have contributed to this PEP, and for the offline discussions with members of the Python community.

A heads up: The link in the PEP to the previous discussion thread appears to be broken.

Thank you!

And now fixed.

The PEP looks good right now, I personally have far fewer concerns than before. But a few notes anyway:

from collections import namedtuple
Person = namedtuple('Person', ['name'])

# Create deeply immutable tuple of named tuples
monty = (Person("Eric"), Person("Graham"), Person("Terry"))
is_frozen(monty) # returns True -- all sub-objects are immutable

Are you proposing modifications to namedtuple? Currently the returned type is mutable, but AFAIK for the examples to work it would need to be changed to be immutable. That would be a breaking change. Making it freezable shouldn’t be an issue, but that’s not what is currently described in the PEP. Also, IIRC I have seen people modifying the returned type in real code. (Also, for parity, anything that applies to namedtuple should also apply to NamedTuple if possible.)

I didn’t check the other types in the stdlib list, there may be others as well.


The two example classes in the “escape hatches” sections (Cell and PrimeFactoriser) are neither marked as freezable nor explicitly frozen within the examples. I suspect this was a simple mistake, but IMO it can cause confusion.


Maybe I missed it, but what is the motivation for providing the Proxy option for modules? The random module is IMO a poor example: it shouldn’t be used directly by properly written code that is meant to be shared; each execution environment should create its own Random object locally.


Is there a good reason/precedent for violating PEP8 by naming the constants like Yes, No in Titlecase instead of SCREAMINGCASE? On my first reading I thought immutable.Proxy was some kind of class.


I think interaction with typing, ABC and enum can’t be postponed to a later Python version, although I guess this could be hashed out after the PEP itself lands since none of those seem like unsolvable issues.

No, we are not. A named tuple is shallowly immutable today, and if the items in a named tuple are all deeply immutable, then so is the named tuple.

Good catch! We have been going back and forth about the defaults for various things and forgot to change these examples to align with the current defaults. I’ll update this. Thanks!

The main motivation is that it simplifies refactoring code to use immutability by treating code whose function objects capture module objects the same as code whose function objects dynamically import module objects. (I personally disagree about random — I think it is a good example of a simple module that needs mutable state and it highlights a challenge for immutability in Python.)
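To make the two cases in that answer concrete, here is a small sketch in plain Python (no PEP 795 API is used; the function names are made up for illustration). The first function reaches the module object through its captured globals, the second imports it dynamically at call time:

```python
import random


def roll_captured():
    # `random` is looked up in this function's globals, so the module object
    # is reachable from the function object via roll_captured.__globals__.
    return random.randint(1, 6)


def roll_dynamic():
    # The module is imported when the function runs; the function object
    # itself holds no reference to the module before the call.
    import random as _random
    return _random.randint(1, 6)


captured_value = roll_captured()
dynamic_value = roll_dynamic()
print(captured_value, dynamic_value)
```

Both functions behave the same from the caller's point of view, which is presumably why treating them uniformly makes refactoring simpler.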

No — thanks for pointing that out!

But its type isn’t:

Person = namedtuple('Person', 'name')
p = Person('Tobias')
Person.foo = "bar"
print(p.foo) # succeeds

Unless I am missing something, this is an issue.
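For the parity point about NamedTuple raised earlier, the same behaviour can be checked on current CPython. This is a minimal sketch that uses no PEP 795 API; the attribute name foo is arbitrary:

```python
from typing import NamedTuple


class Person(NamedTuple):
    name: str


p = Person("Tobias")

# Instances are shallowly immutable: a field cannot be rebound.
try:
    p.name = "Eric"
    instance_is_writable = True
except AttributeError:
    instance_is_writable = False

# The class object is still mutable, and existing instances see the change.
Person.foo = "bar"
class_attr_visible = p.foo == "bar"

print(instance_is_writable, class_attr_visible)
```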

Oh yes, you are correct! Thanks for pointing that out. Indeed we need to explicitly freeze a named tuple’s type in order for the is_frozen() check to return True.

Yes, and this is a breaking change that needs to be carefully considered by checking real-world examples to see whether the types are currently being modified.


Also, discussion of type.__subclasses__ is currently absent outside of it being used as an example motivation for interpreter-local and shared fields, but IMO its semantics should be clearly defined. I suspect this just got overlooked in a rewrite, because the paragraph begins with “we revisit the list of references from a superclass object to its subclasses”.

I don’t see that. Named tuples will not change. However, if you want to share objects directly across sub-interpreters, then you need to make the types of those objects immutable, regardless of whether these types were defined by the namedtuple constructor or just a class.

Thanks! I’ll take a look at this after the weekend.

Aha, I thought you meant that the stdlib function namedtuple would call freeze itself, which would be needed to make the examples currently in the PEP work and which would be a breaking change. I support this not being done, but it means that the examples in the PEP need to be adjusted.

Thanks! I already adjusted the one we were discussing now a couple of minutes ago. I’ll make a consistency pass on Monday when I address your comment on __subclasses__.

It looks great now!
Thanks for all the changes and for taking the several issues we raised here into account!

Just finished reading the updated doc and it’s got great improvements! I feel it explains the semantics and demonstrates the results (which freezing attempts fail, what objects end up frozen after a given call, etc.) much better.

I still have a couple notes on clarity:

  • There are a few places where the terminology gets confusing and I had to rethink my initial interpretation. In particular, the way that user-defined types can be frozen is described as “types are explicitly freezable by default” which in a normal context readers might reasonably parse as always freezable or a special subset of freezable. Really what you mean is “types are explicit-freeze-only”, which is pretty different; IMO different names would make the distinction between implicitly-freezable and explicit-only clearer.
  • Related to the above, perhaps it could be made clearer that e.g. “object d is freezable” often means after the user explicitly freezes its class object; it does not necessarily mean one can call freeze(d) out-of-the-box.
  • Ideally, example code would be provided showing how to use the proposed implementation together with the current concurrent.interpreters (or InterpreterPoolExecutor) API.

As for the technical content itself, some minor thoughts: what if there was a special exception class FreezeError that was always used for exceptions from freeze()? And/or can freeze errors provide information about which sub-object encountered during propagation caused the error? Secondly, you report timing results on the test suite; I believe the pyperformance benchmark suite would be more informative.
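To illustrate what such an error could carry, here is a pure-Python mock. Everything in it is hypothetical: FreezeError, check_freezable and FREEZABLE_LEAVES are names invented for this sketch, and the walk only imitates freeze propagation over built-in containers rather than calling the PEP's freeze():

```python
class FreezeError(TypeError):
    """Hypothetical exception type; not an API defined by the PEP."""

    def __init__(self, path, offender):
        self.path = path
        self.offender = offender
        super().__init__(
            f"cannot freeze {path}: {type(offender).__name__} is not freezable"
        )


# Stand-in for "types that propagation accepts"; purely illustrative.
FREEZABLE_LEAVES = (int, float, str, bytes, bool, type(None))


def check_freezable(obj, path="root", _seen=None):
    """Walk containers and raise FreezeError naming the first offending sub-object."""
    _seen = set() if _seen is None else _seen
    if id(obj) in _seen:  # already visited; also guards against cycles
        return
    _seen.add(id(obj))
    if isinstance(obj, FREEZABLE_LEAVES):
        return
    if isinstance(obj, (tuple, list)):
        for i, item in enumerate(obj):
            check_freezable(item, f"{path}[{i}]", _seen)
    elif isinstance(obj, dict):
        for key, value in obj.items():
            check_freezable(key, f"{path}<key {key!r}>", _seen)
            check_freezable(value, f"{path}[{key!r}]", _seen)
    else:
        raise FreezeError(path, obj)


config = {"name": "demo", "handlers": [1, 2, object()]}
try:
    check_freezable(config)
    failed_path = None
except FreezeError as exc:
    failed_path = exc.path
    print(exc)
```

Here the error points at root['handlers'][2], which is the kind of information that would help when the offending object sits deep inside a graph built by third-party code.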

I also have some more pragmatic reservations about how easily practical adoption could scale at the moment, given that a) sub-interpreter support in extensions is kind of stalled at the moment and b) to freeze an object depends on the types of all the objects within its graph, which in real-world programs will include various libraries the caller does not control and may only be used as a transitive dependency. Those libraries may be slow to make their types freezable or mark them as such.

Overall, I’m excited about the proposal and believe it would be really useful! Both for sub-interpreter usage and as a simple way to enforce safety in multithreading patterns or other cases.

The concept of the state of the type already exists in Python for example, a file cannot be read from or written to when it is closed. Thus, making an object immutable in-place can be viewed as consistent with existing Python mental model.

I think this example is weak.
When a file is closed, most Python programmers treat that object as if the object doesn’t exist anymore.

Unless you expect people to treat immutable objects as if they don’t exist, it’s not a good parallel, and does not show consistency with existing Python mental model.
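For reference, the file behaviour that the quoted passage refers to looks like this on current Python. The sketch uses io.StringIO as a stand-in for a real file so that it needs no filesystem access:

```python
import io

f = io.StringIO("spam")
first_read = f.read()  # works while the object is in its "open" state
f.close()

try:
    f.read()
    read_after_close_failed = False
except ValueError:
    # Same object and same type, but the operation is now rejected.
    read_after_close_failed = True

print(first_read, read_after_close_failed)
```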

That’s a good suggestion. I am making a consistency pass later today so I will include this in my bag of things to do. Thank you!

I see your point, but it is in the nature of freezability that it hinges on everything reachable being either freezable by freeze propagation or frozen already. I will try to make this clearer, but I also want to avoid making the PEP (even) longer. Thanks!

Yes good point. I’ll take that up with the rest of the team.

Thanks for the suggestion. I vaguely recall design discussions we had about precisely this, but that was 6-12 months ago so my memory is hazy. I’ll get back to you after taking this up with the rest of the team.

Thanks for the suggestion! I will look into that.

Thanks for the encouragement (and all the suggestions)!

Hi Doug — irrespective of whether the example is weak or not, shall I read this as a comment on the rejected alternative that is being discussed in that context (“create a new type for the immutable version of each mutable type”) or a statement just about this particular example?

Thank you Joao!

I also like this design much more, but I have a few questions/concerns about the implementation.

First, the PEP doesn’t address atomic reference counting for free-threading, which won’t work as currently proposed (because an object might become immutable after another thread has passed the check for the immutability flag).

Second, Py_CHECKWRITE looks like it will get hairy very quickly, because developers will need to call it every time that something escapes into Python. For example, imagine if a developer wanted to make the following code safe for immutable objects:

static PyObject *
my_method(PyObject *op, PyObject *arg)
{
    MyObject *self = MyObject_CAST(op);
    PyObject *result = PyNumber_Add(arg, _PyLong_GetOne());
    if (result == NULL) {
        return NULL;
    }

    long value = PyLong_AsLong(result);
    Py_DECREF(result);
    if (value == -1 && PyErr_Occurred()) {
        return NULL;
    }

    self->last_result = value;
    Py_RETURN_NONE;
}

In the above code, a single Py_CHECKWRITE call at the top is not sufficient, because the object might be made immutable in another thread (since calling Py_DECREF/PyNumber_Add might release the GIL) or by re-entrant code (such as in the __del__ of self->last_result). Thus, the deeply immutable version of the above would look like this:

static PyObject *
my_method(PyObject *op, PyObject *arg)
{
    if (!Py_CHECKWRITE(op)) {
        PyErr_WriteToImmutable(op);
        return NULL;
    }

    MyObject *self = MyObject_CAST(op);
    PyObject *result = PyNumber_Add(arg, _PyLong_GetOne());
    if (result == NULL) {
        return NULL;
    }

    if (!Py_CHECKWRITE(op)) {
        PyErr_WriteToImmutable(op);
        return NULL;
    }

    long value = PyLong_AsLong(result);
    Py_DECREF(result);
    if (value == -1 && PyErr_Occurred()) {
        return NULL;
    }

    if (!Py_CHECKWRITE(op)) {
        PyErr_WriteToImmutable(op);
        return NULL;
    }

    self->last_result = value;
    Py_RETURN_NONE;
}

Can you see how quickly this can get out of hand?
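The re-entrancy half of this concern can be reproduced in a pure-Python analogy. This is only a toy: the Counter class and its hand-rolled frozen flag are invented for illustration and are not how the PEP implements immutability. It shows a check made once at the top being invalidated by code that runs before the write:

```python
class Counter:
    """Toy object with a hand-rolled 'frozen' flag (illustrative only)."""

    def __init__(self):
        self.frozen = False
        self.last_result = 0

    def update_checked_once(self, compute):
        if self.frozen:  # single check at the top
            raise TypeError("object is immutable")
        value = compute()  # arbitrary code runs here and may freeze self
        self.last_result = value  # write happens without re-checking

    def update_checked_before_write(self, compute):
        value = compute()
        if self.frozen:  # check immediately before the write
            raise TypeError("object is immutable")
        self.last_result = value


def make_freezing_callback(target):
    def compute():
        target.frozen = True  # simulates re-entrant code freezing the object
        return 42
    return compute


a = Counter()
a.update_checked_once(make_freezing_callback(a))
mutated_after_freeze = a.frozen and a.last_result == 42

b = Counter()
try:
    b.update_checked_before_write(make_freezing_callback(b))
    write_rejected = False
except TypeError:
    write_rejected = True

print(mutated_after_freeze, write_rejected)
```

Checking right before the write handles this single-threaded case, but with real threads a window between the check and the write would remain, which is the other half of the concern above.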

Finally, I don’t understand the PEP’s explanation about interpreter lifetimes. It says that objects are removed from the heap of the interpreter that created them, but how is that done, specifically? How does this deal with fields that are also allocated under an interpreter’s allocator? For example, list objects have an ob_item field that’s allocated via PyMem_*alloc, which can’t be deleted from any interpreter.
