PEP 795: Add deep immutability to Python

As a user who has been following this thread (if not comprehensively) and who followed the subinterpreters work before it, I just want to add my opinion: I'm super excited about the potential of this proposal and the overall direction your group is heading, and I think having some version of deep immutability plus sharing across interpreters would be a major improvement to Python.

There's one caveat: while the capability itself is the first piece, making sure it can be used with popular libraries, and getting them to adopt it, is necessary for it to have real impact. For example, if extension types need to opt in to freezing, how will you make it easy for extension authors to do so? Otherwise you will end up in a similar position to where subinterpreters themselves are today, where one generally can't even attempt to use well-known extensions because they (or their binding generators) don't support them.

Beyond that position, I'll add what I see as a key response to some of the counterarguments here. There have been statements like: "you can avoid data race problems if you just don't mutate things". That's true in theory, but in practice we have to consider that most Python code is a highly interconnected graph combining objects and functions from many third-party libraries. If I create val, know that I hold the only references to it, and only touch it with my own code, I can simply not mutate it. But when I pass it to a function from some library, how do I know whether or not that function will mutate it? If I'm using their class, how do I know it doesn't depend on some module-global state which could get updated somewhere I don't notice? What if the class is designed so that mutating instances is how it is normally used? What if I write a library that supports a callback API, so that when it's used I have no idea what user code is doing? My point is that ensuring nothing risky happens across all these libraries, when every object is a reference and you don't know what else references the same object or which threads those references are used in, requires impractical amounts of coordination and developer effort. Language support matters because it enforces guarantees across the whole object graph.

Anyway, I’m looking forward to seeing this materialize further.


And this is precisely the problem with this proposal. We know what an object graph is in computer science theory, but how do we define and specify its meaning in the actual context of (C)Python? No matter how you define it in theory, in practice it looks nothing like what the simple examples in the proposal would suggest.

To do this we need stronger immutability and data-race guarantees than the historic mechanisms (the GIL, for example) have implied, even though the Python community has always tried to avoid writing a full memory model specification. And this would likely mean large, incompatible changes to how (pure Python) libraries are currently written.

Subinterpreter isolation and free-threading by themselves are not yet the Python 4 moment that some may have hyped, but the programming-model changes needed to prefer their use likely will be. Right now they are an implementation detail of CPython (not of the Python language itself), but deep immutability is proposing a language change. It is up to the community to decide whether the benefits are worth the costs. For now we should experiment more with use cases of subinterpreter isolation and free-threading, to see what we really need, before setting off on the biggest changes.


One question too: have you all contacted @eric.snow ? I recall him talking about a plan to expand cross-interpreter sharing.


Thanks! Yes, we have been talking to Eric about this for close to 2 years.


That overview seems to be: "Subinterpreters will be able to share objects."

Yes. PEP 795 will make this possible for immutable objects. In the long run, we aim to make mutable objects sharable as well, using the regions concept.

"And the per-subinterpreter GIL can be removed." (I'm not 100% sure if the second point is true.)

No, the per-subinterpreter GILs will remain. Removing them is not necessary, as parallelism comes from running multiple subinterpreters, each with its own GIL.

How does this approach compare with the approaches that have been discussed during the evolution of the subinterpreter feature (CSP-like mechanisms, queues, etc)? How does this compare with the current intentions for improving sharing with subinterpreters?

To the best of my knowledge, we complement and improve these efforts by permitting them to do what they want more efficiently (for immutable objects in PEP795).

How will this affect C extensions?

Since support for immutability is opt-in, authors of C extensions don't need to do anything unless they want to support immutability. In that case, a C extension needs to either perform immutability checks on its state or ensure that its state can be safely manipulated across multiple subinterpreters and/or threads. A C extension that is stateless just has to declare that it is safe to freeze.

There’s also the questions from this thread which are still unanswered:

Thanks for the reminder. I will answer those in a separate post to avoid burying them here.


I am working through the backlog of unanswered questions. This will likely take some more days. Feel free to raise questions again if you feel I misunderstood something or if you think I have missed a question. (I find it hard to get an overview of this thread, to be honest.)

Let me answer a question about escape hatches for immutability which is related to concerns about the strictness of PEP795.

We propose to support an escape hatch in the form of mutable thread-local (or interpreter-local) data. The Example class below uses thread-local storage in the field mutable. As the Example class demonstrates, freezing an instance e will make e (or more precisely, the object pointed to by e) immutable, but the thread-local storage object (e.mutable) stays mutable. We can keep mutating that object, but we cannot swap it out for anything else, because the field in the instance that points to it cannot be updated.

import threading

class Example:
    def __init__(self, x):
        self.mutable = threading.local()
        self.mutable.x = x

e = Example([1, 2, 3])
freeze(e)
e.mutable.x.append(4)          # OK because e.mutable points to a mutable object
e.mutable.x = [1, 2, 3, 4, 5]  # OK for the same reason
e.mutable = threading.local()  # Not OK – cannot mutate e (raises an exception)

This escape hatch is good for some things, but it will not satisfy all needs. For example, if I wanted a class with an instance counter, I could only get a counter per thread, which is probably not what I want. On the other hand, if I had an object or class with some kind of cache, then I get a nice per-thread cache that does not need any additional locking even when used across threads.
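To make the per-thread cache idea concrete, here is a minimal sketch, assuming the freeze() function proposed by the PEP; the Squares class and its method are made up purely for illustration:

import threading

class Squares:
    def __init__(self):
        self._local = threading.local()  # stays mutable even after freezing

    def square(self, n):
        # Lazily create one cache dict per thread; no locking needed.
        cache = getattr(self._local, "cache", None)
        if cache is None:
            cache = self._local.cache = {}
        if n not in cache:
            cache[n] = n * n
        return cache[n]

s = Squares()
freeze(s)           # proposed PEP 795 API: s itself becomes immutable
print(s.square(7))  # still works; each thread fills its own cache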

The escape hatch is also very useful for dealing with “global” state which I will demonstrate in a separate post. (Note that this is not yet in the prototype, and not in the PEP, but in a PR on the PEP.)


Let me answer some questions regarding the fractions class.

(Ping @oscarbenjamin)

Given f = fractions.Fraction(1, 2) is it expected that f would be freezable?

Yes, but that’s not currently the case because the prototype is not yet finished. I’ll expand on this more later in this post.

If the answer is “yes” then what is the full set of objects that would be frozen by freeze(f) and what are the implications of all of those things being frozen?

This is a little tricky to answer, as it depends on how we address the caching points below, and also because, since we want to support sharing across subinterpreters as part of PEP 795, we may want to mark some de-facto immutable types as frozen at interpreter start. For simplicity, let's assume that there is no caching going on and that we do not consider any of the de-facto immutable classes in Python frozen out of the box. Then a fraction object would indeed be freezable, and freezing it would lead to the following objects being frozen:

  1. The fraction object
  2. The numerator object
  3. The denominator object
  4. The Fraction type object along with all functions
  5. The fractions module object
  6. The numbers.Rational type object (the base class of Fraction), along with all functions
  7. The Real type object (the base class of Rational) with functions
  8. The Complex type object with functions
  9. The Number type object with functions
  10. The numbers module object
  11. The ABCMeta metaclass object with functions
  12. The object class with functions
  13. The type object with functions

The implications of these being frozen:

With respect to the types: the type objects Fraction, Rational, Real, Complex, Number, and object are not technically immutable at the implementation level, but they are effectively immutable in normal Python usage. After freezing, they become "technically immutable" as well, meaning that no amount of metaprogramming should be able to change these classes. (You can still subclass them with a mutable class, for example.) By becoming technically immutable, they also become safely sharable across subinterpreters.

The module objects also become immutable, meaning you are not able to add or change fields in the fractions module object or in the numbers module object. (See the escape hatches in a post that's coming up soon.)

After freezing, the fraction object in f is also “technically immutable” which means that it can be shared with another subinterpreter by reference. Because its entire class hierarchy is also frozen, all subinterpreters will have a consistent view of the fraction object.
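As a tiny sketch of what this would look like once the prototype supports it (see the next section): freeze() is the function proposed by the PEP, and the exact exception raised on mutation is whatever the PEP specifies.

from fractions import Fraction

f = Fraction(1, 2)
freeze(f)           # also freezes everything in the list above
print(f.numerator)  # reading is unaffected
f._numerator = 3    # raises: f is now deeply immutable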

Why the prototype isn’t there yet

The prototype does not yet support fractions because we have not yet made the caches used by the fractions class thread-safe and safe for use with subinterpreters. The reason is simply that we have not gotten around to it yet. Once we have addressed this, fractions will be freezable. For clarity: there is no hidden gotcha or problem – we just haven't had the time to do it.

Let's look at one of these caches: the one in the ABCMeta class, whose caching is implemented in C. If it had been implemented using threading.local, then it would (probably) have worked out of the box. Note that a per-thread cache is not the same as a global cache. There are pros and cons to each: a per-thread cache does not need to block on contention from many threads, but on the other hand one thread cannot take advantage of cached results from another thread. Subinterpreters come with some limitations that we need to respect.

Python currently does not permit objects to be shared between subinterpreters. This PEP will permit immutable objects to be shared. This allows sharing an immutable cache across subinterpreters (i.e. warm it up and then freeze it), but it does not allow a shared mutable cache (i.e. one that keeps accepting new cached values). It is of course possible to drop to C and implement a shared mutable cache (indeed, that is probably what is going to happen in ABCMeta). Note that even if one drops to C to implement a mutable cache, one must take care not to share mutable objects between subinterpreters, as this is not something that Python supports. (That would lead at best to crashes and at worst to silent data corruption and heisenbugs.)
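As a sketch of the "warm it up and then freeze it" pattern, assuming freeze() from the PEP; expensive() here is just a stand-in for real work:

_cache = {}

def expensive(n):
    return n * n  # stand-in for real work

def warm_cache(keys):
    # Populate the cache while it is still mutable, in one interpreter.
    for k in keys:
        _cache[k] = expensive(k)

warm_cache(range(100))
freeze(_cache)           # now immutable and, per PEP 795, sharable across subinterpreters
value = _cache[7]        # lookups work everywhere
# _cache[200] = 40000    # would raise: no new entries after freezing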

With free-threading support, rather than subinterpreters, it would be possible to implement a global mutable cache, since the isolation imposed by subinterpreters does not apply. (One way to think about the thread-local escape hatch is as an interpreter-local escape hatch: there will be only one interpreter when using free-threading, in that case.)

See also below for a follow-up to this.

How do I write a class that is freezable?

In our original proposal, any Python class would be freezable out of the box as long as it did not use C extensions that did not explicitly declare support for freezing. We think this is the right design, but there have been concerns on this thread about how far freezing can propagate. For that reason, we are considering one more requirement for a class to be freezable: that it explicitly opts in to being freezable. This could be done, for example, by declaring a __freezable__ field in the class (leaving the decision up to the person implementing the class) or by calling register_freezable(my_class) (allowing users of a class to turn on this support).
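As a sketch of what that opt-in could look like (both spellings are taken from the text above, but neither is finalized in the PEP, and the example classes are made up):

class Ratio:
    __freezable__ = True      # the implementer opts the class in

    def __init__(self, num, den):
        self.num = num
        self.den = den

class ThirdPartyThing:
    pass

register_freezable(ThirdPartyThing)   # a user of the class opts it in

r = Ratio(1, 2)
freeze(r)   # allowed because Ratio opted in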

Since an object cannot be frozen if it points to an object that cannot be frozen, it is easy to protect an individual object, or an entire class, against being frozen.

Going back to the Fraction class and the question "what does an author of a class like Fraction need to do to support immutability?" (my rephrasing): you will have to make sure that all the building blocks you rely on support freezing. If they don't, you may be able to use the thread-local escape hatch, but you will be limited by what subinterpreters can do, including our extension that supports sharing immutable data. That means, for example, that you cannot maintain a global, mutable cache of mutable objects inside an immutable object. If you drop to C, you could implement a global mutable cache of immutable objects.

Can I call a logger from a frozen function? Etc.

Yes, thanks to the thread-local escape hatch discussed above. Consider the following code:

import logger

def foo(a, b):
    logger.log("Message")
    return a + b

freeze(foo)
foo(40, 2)

Using standard Python, we can create an immutable wrapper for the logger module that internally uses threading.local() to point to the unfrozen logger module of the current subinterpreter. By installing this wrapper in sys.modules, the import above would pick up the immutable wrapper automatically, so existing code would not need to be updated.
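A rough sketch of such a wrapper, assuming freeze() from the PEP and the placeholder logger module used in the example above (the proxy class and its names are made up for illustration):

import sys
import threading

import logger  # the ordinary, mutable logger module

class _LoggerProxy:
    def __init__(self):
        self._local = threading.local()  # escape hatch: stays mutable after freezing

    def _backend(self):
        # Each thread/subinterpreter lazily binds its own mutable logger module.
        if not hasattr(self._local, "module"):
            self._local.module = logger
        return self._local.module

    def __getattr__(self, name):
        return getattr(self._backend(), name)

proxy = _LoggerProxy()
freeze(proxy)                  # the proxy itself is immutable and sharable
sys.modules["logger"] = proxy  # later `import logger` statements get the proxy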

Another option is to write a logger module with two parts: an immutable module which can be shared, and which forwards all log messages, through message passing, to some subinterpreter that has imported the mutable part.

A third option is to create a similar wrapper for a specific function:

import threading
import logger  # imports the normal (mutable) logger module

escape_hatch = threading.local()
escape_hatch.log = logger.log

def foo(a, b):
    escape_hatch.log("Message")
    return a + b

freeze(foo)
del escape_hatch  # for illustrative purposes
foo(40, 2)        # logs as expected

When foo is frozen, we detect that foo captures escape_hatch. We create a new cell that points to escape_hatch, and freeze that cell along with the foo function. However, since the cell contains a threading.local() object, freezing the cell only ensures that the variable cannot be reassigned. When we call the frozen foo, it will find the thread-local storage object in escape_hatch, and then call the logger.log method from the current subinterpreter.

So, what happens if I copy foo and send it to another subinterpreter? Great question! In the other subinterpreter, escape_hatch.log may or may not have been assigned a log function. If it hasn't, the call will fail.

If one does not want a logger per subinterpreter, then simply implement the wrapper such that all calls made in the wrong subinterpreter send the log message to the correct subinterpreter. Naturally, this message must either be pickled (because of subinterpreter isolation) or frozen.

Questions about regions etc.

Let's open a new thread on the Ideas side to talk about this. I am happy to try to answer questions or engage in discussions about that, but not in this thread, since I don't want anyone reading this thread to believe that regions are part of PEP 795, or that regions are necessary for PEP 795 to work. Regions and sharing mutable objects across subinterpreters are an extension of PEP 795. It felt important to paint the bigger picture, especially since we hadn't originally included the sharing of immutable objects between subinterpreters in the PEP itself but only in that big picture.

When you say “we can”, who is “we”? If this is something that will be part of the stdlib, then that’s extra implementation complexity that you need to explicitly note in the PEP. Or, to put it another way, if it’s not mentioned in the PEP, then “we” (the core devs) won’t do it, so it will not be possible to freeze a function that uses logging.

If by “we”, you mean “the person implementing foo”, then that’s a bunch of extra complexity that people aren’t going to expect. Again, it needs to be in the PEP.

Hang on, you’ve just said that the Fraction class isn’t freezable unless the internal caching is changed to be thread local (assuming the proposed change to the PEP - in your original proposal, there was no way to make a class with an internal cache freezable). It’s really hard to follow your explanations when you contradict yourself like this :slightly_frowning_face:


The issue here seems to be that Fraction does use a C extension: namely, _abc. Fraction inherits from numbers.Rational, which (half a dozen base classes later) has metaclass=ABCMeta. And that uses functionality in the _abc extension module.

Sorry, where is the contradiction? Can you be clearer?

By “we can” I meant “it is possible to”.

As I mentioned: this is not yet in the PEP. We are working on amending the PEP based on the discussions here. But thanks for helping clarify what you expect from a PEP!

I think that’s a different issue (although it’s related). We still need clarity on which stdlib modules will be freeze-safe, and if any aren’t, then that’s going to have wide-ranging implications. For example, if ABCs aren’t freezable (because of _abc) then a lot of Python code will not be freeze-safe.

IMO, people expect “pure Python” code to be code that just uses the language and stdlib. So I’d expect all of the stdlib to be made freeze-safe if broad statements like “any Python class would be freezable out of the box” are being made.

But as I say, this is separate. The _hash_algorithm function in fractions.py is cached. It uses functools.lru_cache, which is written in C, but that’s not the point - what matters is that it’s a global cache, and for the Fraction class to be freezable, this needs to be made into a thread-local cache. (Although it shouldn’t need to be - a global cache is correct, as nothing about the hash algorithm depends on the thread. All that’s actually needed is for the cache to be correctly synchronised across threads).
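For illustration of the thread-local rewrite under discussion (which, as noted above, shouldn't really be necessary), one could sketch it like this; this is not what CPython does, and per_thread_lru_cache is a made-up helper wrapping only real functools and threading APIs:

import functools
import threading

_tls = threading.local()

def per_thread_lru_cache(maxsize=1 << 14):
    # Like functools.lru_cache, but each thread gets its own cache instance.
    def decorator(func):
        def wrapper(*args):
            cached = getattr(_tls, func.__name__, None)
            if cached is None:
                cached = functools.lru_cache(maxsize=maxsize)(func)
                setattr(_tls, func.__name__, cached)
            return cached(*args)
        return wrapper
    return decorator

@per_thread_lru_cache()
def _hash_algorithm(numerator, denominator):
    return hash((numerator, denominator))  # stand-in for the real computation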

Specifically,

In this section, you said that in order to be freezable, the caching strategy of the fractions class needs to be modified. However, you say in the later post that I quoted:

But this contradicts your point that maintaining a global shared cache prevents freezing.

The best I can interpret this is that you mean "any Python class can be modified to be freezable…". Which is very different from claiming that any Python class is freezable "out of the box".

I’m still not clear whether it’s something that will be done in the stdlib, so the end user doesn’t have to, or if the responsibility will be on the end user to do this.

No problem. To be even more explicit, I expect the PEP to:

  1. List all the places in the core and in the stdlib where changes are needed.
  2. List all the core and stdlib classes that are not going to be freeze-safe under the proposal (I’d like that to be “none”, but it’s more important that they are listed).
  3. List any coding techniques (such as global shared caches) that will prevent a class that just uses Python code and the stdlib working when frozen, along with recommended ways of rewriting that code to be freeze-safe (and in this case, I won’t believe you if you say “none”, because of the cache example - that’s the crux of my comment about contradicting yourself above).

At the moment, things keep cropping up in the discussion where people expect one thing, and you correct their assumptions after the fact. It would make the discussion more productive if people could just refer to the PEP, so that you're not the bottleneck for understanding the implications of the proposal.

I appreciate that it’s hard trying to manage a discussion while still updating the PEP. But conversely, it’s hard trying to discuss things when responses are being given that don’t match any published version of the PEP.

Maybe the discussion should be put on hold until a revised PEP is ready? I don’t think there’s any urgency here, after all.


Having the cache be correctly synchronised across threads is sufficient with free-threading but not with subinterpreters. The problem is that interpreter1 can insert objects into the cache which are then reachable by interpreter2, which breaks the isolation of the object graphs. Once interpreter1 and interpreter2 share a non-frozen object, the per-interpreter GILs cannot protect that object any more, the reference counts can be corrupted, etc., so we lose memory safety.

It might be possible to have a version of lru_cache that, once frozen, only allows frozen objects to be returned. That would not corrupt the memory model or break the isolation of the object graphs. If different threads can store different values in the cache, then execution order affects the results in each thread, but I'm not sure whether that counts as a "data race" for the purposes of this discussion.
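One way to read that idea as code (a sketch only: FreezableCache and its API are made up, and it assumes freeze() plus the thread-local escape hatch described earlier in the thread):

import threading

class FreezableCache:
    def __init__(self):
        self._shared = {}                # becomes immutable when the cache is frozen
        self._local = threading.local()  # escape hatch: stays mutable after freezing

    def warm(self, key, compute):
        # Only works before the cache object is frozen.
        self._shared[key] = compute(key)

    def get(self, key, compute):
        # Entries cached before freezing are shared (and frozen) everywhere.
        if key in self._shared:
            return self._shared[key]
        # Entries computed after freezing live in a per-thread dict, so results
        # can differ between threads depending on execution order.
        extra = getattr(self._local, "data", None)
        if extra is None:
            extra = self._local.data = {}
        if key not in extra:
            extra[key] = compute(key)
        return extra[key]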


I don't think it's a good idea to have register_freezable() available at all. To know whether freezing a class would work and not break things, you need knowledge of its implementation. Right now, if I maintained some library, it would be perfectly backwards-compatible to add a cache somewhere, aside from the increased lifetime of objects, which is rarely noticeable. Or even to do something like add a cache for the hash value, which is normally entirely undetectable. But if users were freezing my classes, suddenly this sort of change would break the library without warning.

It seems better to me to require the author to verify that their code supports freezing and then opt in, making a commitment to keep supporting it in the future. If you really need to freeze some class, you could do it by monkey-patching the flag in - which is fairly clearly messing with the internals of a class, where the patcher takes all the responsibility for keeping the code working.


This, and

refer to my concern about separation of responsibility. Making user classes work with freezing is largely an implementer's (and thus a library author's) problem. For users it would be quite difficult to figure out which of the classes they use in their scenarios are freezable.

As a library writer, while I could live with using thread-/interpreter-local caches, I would personally just make __setattr__ and __delattr__ raise an exception, disable inheritance for my class and declare it final, and freeze every instance as soon as it is created, just to save users (myself :smiling_face_with_tear:) the trouble of finding out which instances are frozen and which ones aren't, and of trying to express that information in the type system. Allowing instances of a user class to be individually freezable, versus freezing all or none of the instances, just does not seem very useful: if an author intends to provide mutable and immutable versions of a class, it's better to make them separate classes with a shared superclass or protocol (which unfortunately has its own problems).
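A sketch of that style, assuming freeze() from the PEP (the Point class is made up, and typing.final is only enforced by type checkers, not at runtime):

from typing import final

@final
class Point:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        # Bypass the guards below while constructing the instance.
        object.__setattr__(self, "x", x)
        object.__setattr__(self, "y", y)
        freeze(self)  # every instance is frozen as soon as it is created

    def __setattr__(self, name, value):
        raise AttributeError("Point instances are immutable")

    def __delattr__(self, name):
        raise AttributeError("Point instances are immutable")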

(Speaking of which — do type annotations get frozen too via __annotations__?)


It is still unclear to me what the total set of frozen objects is. If we freeze the fractions module, does that mean that we have to freeze all of these modules as well:

Does that also mean freezing every class and function in those modules as well and every module that they import recursively?

Is there a big difference here between doing

import math

class Fraction:
    def func(self, a, b):
        return math.gcd(a, b)

rather than using

from math import gcd

Does the former freeze the whole math module while the latter only freezes the gcd function? Or does freezing the gcd function have to freeze the math module anyway?

When I asked which objects would be frozen when freezing a Fraction, I was not only looking for an answer in high-level terms. I want to see how far freezing reaches, so I really want to know concretely the full set of modules, classes, functions etc. to see how large it is. Is there some way to enumerate that programmatically?
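One can approximate that enumeration today with the gc module, which at least shows the scale of the reachable graph; this is only an approximation, since the PEP's actual traversal rules (for example around C-level state) may differ:

import gc
from fractions import Fraction

def reachable(root):
    # Transitively collect everything reachable via gc.get_referents.
    seen = {}
    stack = [root]
    while stack:
        obj = stack.pop()
        if id(obj) in seen:
            continue
        seen[id(obj)] = obj
        stack.extend(gc.get_referents(obj))
    return list(seen.values())

objs = reachable(Fraction(1, 2))
print(len(objs))  # very large: functions' __globals__ pull in entire module dicts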


This is an important but subtle point.

@stw framed his list in such a way that at first glance, it looks like the idea is that the fraction object itself (along with its components) gets frozen, and then you go “up” the object hierarchy, freezing objects as you go. That seems natural and relatively harmless, to the extent that my first thought was “oh yes, I see - that looks OK”.

But it’s that “along with its components” that’s the killer. It’s 100% necessary, because if the numerator of a frozen fraction is mutable, you’ve lost any meaning to the idea that the fraction object is frozen. And that means, as @oscarbenjamin points out, that when you freeze a module (which you do as you go up the hierarchy) you need to freeze the globals defined in that module. And that takes you back down the hierarchy again.

To be very specific here, what exactly does it mean to freeze a module object? Suppose I have a very simple module, x.py:

val = [42]

If I do import x and then freeze x, what does that mean? Is it allowed to reassign x.val? Is it allowed to modify x.val (for example, appending a new object to it)? Is it allowed to assign new values in x, via something like x.newval = []? Is x allowed to contain functions that modify the module global state? Will those functions be frozen (they can’t be, because they modify global state!)?
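Restating those questions as code, assuming freeze() from the PEP:

import x          # x.py contains just: val = [42]
freeze(x)

x.val = [0]       # reassigning a module global: allowed or not?
x.val.append(43)  # mutating the list x.val points to: presumably not, since freeze reaches it
x.newval = []     # adding a new attribute to the module: allowed or not?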

Now that you’ve answered all those questions (and I’m sure I can come up with more, if needed :slightly_smiling_face:), what if x was actually sys? Because almost no realistic Python code can manage without importing sys. And please don’t suggest that we special case sys, unless you can explain precisely why sys is the only module that needs special casing. Because otherwise, I’m happy to come back with “OK, but that just covers sys, so what about os”, and so on…
