That’s an interesting point; thank you for raising it! We haven’t discussed this hashability in detail yet.
My understanding (please correct me if it’s wrong) is that Python currently uses the presence of the __eq__() and __hash__() methods to determine whether something is hashable. These methods require the object to be immutable, as the hash has to remain constant over the lifetime of the object. Marking currently hashable objects as immutable should just work™️.
However, marking mutable objects hashable might be more complicated. The naive approach would be to add a __eq__() or __hash__() function to the type on freezing. But this would be unsound since it would mark other (potentially mutable) objects of the same type as hashable as well.
A less clean solution would be to use a check isfrozen(self) in __hash__() to dynamically restrict the hash implementation to immutable objects. Or maybe have something like __frozen_hash__()?
It feels like there is definitely some design space here. I wouldn’t pick it up in this PEP since we haven’t explored the design space and it’s not necessary for the feature to be useful. But this could be something to follow up on once this is implemented.
No. __eq__ and __hash__ need to be consistent with each other, i.e. a == b implies hash(a) == hash(b) (just not necessarily the reverse), and hash(a) is not allowed to change during its lifetime. That doesn’t mean that a isn’t allowed to change; any such change just can’t be reflected in the hash. E.g. all objects that don’t override __eq__ are hashable by default by just using their id.
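A quick illustration of that last point: an object that doesn’t override __eq__ stays hashable even though it is mutable, because its identity-based hash never depends on its contents.

```python
class Box:
    """A plain mutable object that does not override __eq__ or __hash__."""
    def __init__(self, value):
        self.value = value

b = Box(1)
h = hash(b)          # default hash, derived from the object's identity
b.value = 2          # mutate the object...
assert hash(b) == h  # ...its hash is unchanged, so dict/set usage stays valid
```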
```python
from immutable import freeze

def example2():
    x = 0
    def foo(a=False):
        if a:
            a = a + 1  # Note: updating local variables work, even in a frozen function
            return a
        else:
            x = x + 1
            return x
    freeze(foo)
    foo(41)  # OK, returns 42
    foo()    # Throws NotWriteableError

example2()
```
Doesn’t the call to foo() fail due to UnboundLocalError (with or without freezing)?
The assignment x = x + 1 makes x local to foo, so evaluating x + 1 fails to find x in foo’s scope before the assignment is ever performed.
The example needs nonlocal x for this to work as expected.
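The scoping issue is independent of freezing and can be reproduced in a few lines; nonlocal is the fix (the function names here are illustrative only):

```python
def broken():
    x = 0
    def foo():
        # This assignment makes x local to foo, so reading x on the
        # right-hand side raises UnboundLocalError.
        x = x + 1
        return x
    return foo()

def fixed():
    x = 0
    def foo():
        nonlocal x  # rebind the x from the enclosing scope instead
        x = x + 1
        return x
    return foo()

try:
    broken()
except UnboundLocalError:
    pass  # fails exactly as described above

assert fixed() == 1
```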
This won’t work because a type is determined to be unhashable only if its __hash__ attribute is None. This means that the __hash__ attribute of an unhashable mutable type can’t be made a dynamic method to make an instance hashable when frozen.
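Concretely (this is current CPython behaviour, not part of the proposal): hash() looks __hash__ up on the type, so neither the None sentinel nor a per-instance attribute can change hashability of a single instance.

```python
class Mutable:
    __hash__ = None  # the standard way to mark a type unhashable

m = Mutable()
try:
    hash(m)
except TypeError:
    pass  # TypeError: unhashable type: 'Mutable'

# Attaching a hash function to the instance doesn't help, because
# hash() bypasses the instance dict and reads type(m).__hash__:
m.__hash__ = lambda: 42
try:
    hash(m)
except TypeError:
    pass  # still unhashable
```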
Yes, I think this is the way to go, and to make isinstance(obj, Hashable) work for frozen objects we can make changes to abc and collections.abc such that:
- abc.ABCMeta offers not just a __subclasshook__ but also an __instancehook__ that is called by its __instancecheck__ before falling back to calling __subclasscheck__.
- collections.abc.Hashable overrides the __instancehook__ to look for type(obj).__frozen_hash__ if isfrozen(obj) is true.
This allows isinstance(obj, Hashable) to take the state of the instance itself into account. Without the change it currently can only check its type.
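A sketch of what that could look like. Everything here is hypothetical: the hook name, isfrozen(), and __frozen_hash__ are stand-ins from the discussion, stubbed so the example runs.

```python
import abc

# Stand-ins for the proposal's API:
_frozen_ids = set()
def isfrozen(obj):
    return id(obj) in _frozen_ids

class HookableABCMeta(abc.ABCMeta):
    def __instancecheck__(cls, instance):
        # Consult an optional instance-level hook first, then fall
        # back to the usual subclass-based check.
        hook = getattr(cls, "__instancehook__", None)
        if hook is not None:
            result = hook(instance)
            if result is not NotImplemented:
                return result
        return super().__instancecheck__(instance)

class Hashable(metaclass=HookableABCMeta):
    @classmethod
    def __subclasshook__(cls, C):
        # Mirrors collections.abc.Hashable: a type is hashable if its
        # __hash__ is callable (i.e. not the None sentinel).
        return callable(getattr(C, "__hash__", None))

    @classmethod
    def __instancehook__(cls, obj):
        # A frozen instance with a __frozen_hash__ counts as hashable
        # even if its type is unhashable.
        if isfrozen(obj) and callable(getattr(type(obj), "__frozen_hash__", None)):
            return True
        return NotImplemented

class Point:
    __hash__ = None  # unhashable while mutable
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __frozen_hash__(self):
        return hash((self.x, self.y))

p = Point(1, 2)
assert not isinstance(p, Hashable)  # mutable: not hashable
_frozen_ids.add(id(p))              # pretend freeze(p) was called
assert isinstance(p, Hashable)      # frozen: hashable via the hook
```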
Thanks for that suggestion; we will move it to a deferred-ideas section. Possibly a discussion of typing can be moved there as well; a post on that will appear here shortly.
Thanks for the question! In our current implementation, we have (broadly) three scenarios as pertain to hashability:
1. An object graph which contains pure Python classes and currently hashable types
2. As above, but containing CPython classes which have been correctly made freezable (e.g. list, dict)
3. As above, but containing Python classes that wrap unfreezable CPython types but are registered as freezable (the wrapper/opt-in option for handling legacy CPython code)
In scenarios 1 and 2, I feel that what you propose is entirely doable. In scenario 3, it is probably not safe. Thankfully, we can detect which of these scenarios we are in fairly trivially, and ideally over time graphs in scenario 3 should decrease in number as various CPython codebases are updated to integrate more cleanly with this work. That said, I concur with Fred that this would be best suited for a follow-up PEP, as it would require some minor changes to the hashing logic.
There are several possible options here I think, again thinking of what a future PEP about this might entail. One can imagine adding an __ishashable__ method and a default __hash__ implementation that uses immutability checks, which could act as an alternative to the current system for testing hashability. Something like that would have to be done carefully given the (potentially) wide-reaching implications. Still, given the potential benefit I for one would love to explore the possibilities here.
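One possible shape for that idea, purely as a sketch: freeze()/isfrozen() below stand in for the proposal’s API, and hashing by instance-dict contents is just one of several choices a real PEP would have to weigh.

```python
# Stand-ins for the proposal's API:
_frozen_ids = set()
def freeze(obj):
    _frozen_ids.add(id(obj))
def isfrozen(obj):
    return id(obj) in _frozen_ids

class FreezeHashable:
    """Default __hash__ that is only valid once the instance is frozen."""

    def __ishashable__(self):
        return isfrozen(self)

    def __hash__(self):
        if not self.__ishashable__():
            raise TypeError(f"mutable {type(self).__name__} is not hashable")
        # Once frozen, the contents can no longer change, so a
        # content-based hash is stable for the rest of the lifetime.
        return hash(tuple(sorted(self.__dict__.items())))

obj = FreezeHashable()
obj.x = 1
try:
    hash(obj)
except TypeError:
    pass  # not hashable while still mutable
freeze(obj)
assert hash(obj) == hash(obj)  # stable once frozen
```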
This question (regarding the scope of freezing) is one which we’ve returned to multiple times during the course of this work. Our current philosophy is that we want, wherever possible, to minimise surprise. It is always possible to create, for lack of a better word, malicious object graphs which result in freezing a large number of things. Simply calling freeze(sys) is going to result in a “good” time being had by all. We have tried to wargame some of these scenarios in our reference implementation to avoid breaking the language. For example, we manually omit some things from freezing to avoid breaking module import. We also make everything mutable during finalisation.
To your larger point, however, there remains this question of how to communicate to the programmer what is happening when they call freeze. The mental model we want for the programmer is that the object they passed will not change, in either content or behaviour, for the remainder of the program’s execution. Furthermore, our internal model is that the object becomes threadsafe, able to be used and interacted with by multiple threads concurrently without the possibility of a data race.
I find the fundamental approach here a bit worrying. Frozen and non-frozen objects should be distinct types. The following code should not be able to fail for any reason except out-of-memory IMO:
```python
if isinstance(l, list):
    l.append(1)
```
Changing this invariant is such a fundamental change in the mental model programmers need to have that it IMO requires equally fundamental justification, if it can ever happen. I have not seen any such justification yet. In Rust, every variable has a tracked mutability state; that is not what is being added in this PEP. What is being added is every object having a tracked mutability state.
There are already frozen modules in Python. Those modules are not immutable at runtime, so they are not frozen in the sense of this proposal. Maybe it would be good if a new name were invented.
```python
class L(list):
    def append(self, item):
        raise ValueError("boom")

l = L()
```
So I’m not sure a frozen list would be that different from a list subclass, conceptually. The fact that a type(l) is list guard won’t work either (IIUC) is perhaps worrisome, though.
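To spell the comparison out: the subclass above already defeats the isinstance guard today, which is the same observable behaviour a frozen list would exhibit (the class is repeated here so the snippet runs on its own):

```python
class L(list):
    def append(self, item):
        raise ValueError("boom")

l = L()
assert isinstance(l, list)  # the isinstance guard passes...
try:
    l.append(1)             # ...yet append still fails
except ValueError:
    pass

# The difference: a stricter type guard catches the subclass today,
# whereas (IIUC) a frozen list would still have type list.
assert type(l) is not list
```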
Yes, but that assumes someone almost maliciously providing a subclass that breaks the Liskov Substitution Principle, and assuming that doesn’t happen is mostly safe in Python (with the very notable exception of hashability, of course).
This is contrasted with this proposal, which adds a builtin way to do this for any object.
I don’t think it would fly if someone suggested adding a new ImmutableList list subtype somewhere in the stdlib, let alone one deeply integrated into the interpreter.
I don’t think it’s going to be an issue. When we call a function we already know full well what the function does, including whether or not it is going to mutate an argument. So if a function does if isinstance(l, list): l.append(1) where l is an argument we already know from the docs not to pass to it an object that we don’t want to be altered. That’s already how it is today. We are already mentally tracking mutability ourselves now. Adding an assurance of immutability to an object at the language level only helps us ensure that it behaves in the way we expect it to.
We will also have further programmatic assurance when we are allowed to type an immutable parameter with a special form of Immutable in a separate proposal.
Good point. The verb “freeze” is already used in Python to describe a type of module packaging.
I suppose we can instead use the invented word “immutablize”, already used to describe the act of making an object immutable by some modules and recipes.
OTOH there is also frozenset denoting a set-like immutable object, so both meanings exist in the language. (If anything, frozenset is likely more commonly known than frozen modules).
I think you have seen the justification, which is “threadsafe programming is really hard”. So hard that people have invented new language models to handle it.
You don’t need to be convinced – I’m undecided about it, for whatever my opinion is worth – but I don’t think you’re going to find a much greater depth of explanation here.
Let’s take your example, list.append. I agree with you that anything today which breaks the safety of that code is, itself, fundamentally broken. If you created a list which can’t append, you shouldn’t have made it a list. Convincing isinstance to agree with you is totally facile – today, if it can’t append, it’s not a list.
This proposal changes that. It says that we’re allowed to “deny list” all mutators on an object. That makes it a new and different type for most people’s (and type systems’) purposes, but Python can’t represent it as such. It will have a new “immutable tag” that we can check at runtime.
IANATT[1], but I’m sure the fancy higher order type systems are able to represent this sort of thing.
We can declare that for a codebase which includes freezing, the following snippet is wrong:

```python
if isinstance(l, list):
    l.append(1)
```

It “should be”:

```python
if not isfrozen(l) and isinstance(l, list):
    l.append(1)
```
Code which was correct has been rendered incorrect. That’s to be avoided where possible, but acceptable if there’s a commensurate benefit.
I’m pretty uncertain about how this could all shape up in terms of DX. In the worst case, it becomes difficult for library maintainers because they have no historical contract about immutability and suddenly need to invent one.
How will programmers keep immutable objects from escaping the walled garden in which they were created and meant to be kept? What happens to libraries which assumed they could do things like sorting a list in-place? Their APIs become “not frozen safe”, but what does that look like in reality for under resourced maintainers who won’t even know about this feature until their users file bug reports?
The only way the existing libraries can break is if someone passes an immutable object to a function that is known to mutate the object. But why would anyone do that? You wouldn’t pass to such a function an object you don’t want to be altered today, so why would you want to do that when you have an object that is guaranteed by the language to be immutable? It should not change anything about how you call these existing library functions.
If anything, having an immutability guarantee from the language helps us prevent accidentally passing objects we don’t want altered to functions that do mutate them so we can catch such errors further upstream.
How do immutable objects become cyclic? And is it common for these use cases?
The PEP section on simplified GC doesn’t elaborate, though the example in the Abstract shows that cycles could be formed before freezing.
Is your stance merely that cycles are possible so must be handled correctly?
But if you’re saying SCC counts are an optimization opportunity, I think it’s worth motivating how common it is / what’s the best-case benefit.
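For what it’s worth, the kind of pre-freeze cycle the Abstract’s example presumably alludes to can be as small as two objects referring to each other (freeze() is the proposal’s API, commented out since it doesn’t exist yet):

```python
# Two nodes that reference each other form a cycle before freezing:
a = {"name": "a"}
b = {"name": "b", "peer": a}
a["peer"] = b  # cycle: a -> b -> a
assert a["peer"]["peer"] is a

# freeze(a)  # would have to freeze the whole cycle (one SCC) atomically
```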