Would frozendict support unions with | (similar to dict)? It’s not in the Mapping protocol, but it is quite a useful utility. I don’t believe it’s mentioned either way in the PEP.
Can there be a mention of possible future plans towards constant folding a dict in a fashion similar to how a large set literal consisting of constants is constant folded as a frozenset?
But now you have made that dictionary frozen for everyone who holds a reference to it, which means side-effects at a distance in a way that could be unexpected (e.g. context switch in a thread and now suddenly you’re going to get an exception trying to mutate what was a dict a microsecond ago but is now frozen). That seems like asking for really nasty debugging issues just to optimize some creation time.
Well, yes, what I’m suggesting is a mutation method in dict which is a mutable data structure. It’s as dangerous as dict.clear() or dict.update(), which Python already has. And it requires the same kind of discipline to use (mutate to setup, don’t share / don’t mutate in a concurrent environment). We already expect Python devs using concurrency to be aware of the risks of shared mutable data and to manage them to avoid the same “nasty debugging issues”, so this wouldn’t be introducing a particularly new problem.
I understand that benefits go beyond performance, but one of the touted benefits is “safely share dictionaries across thread and asynchronous task boundaries”, so I am thinking what can make this proposal more successful in enabling that at a low cost.
And I perfectly understand if the PEP authors don’t want to introduce this early on, but it would be nice if at least the initial implementation didn’t close the door on this possibility.
Agreed, but since building a dict before shipping it for concurrent consumption is a major use case, I think it makes sense to optimize for it by adding a dict.detach method, which creates a frozendict that reuses the hashtable of the dict and makes the original dict point to a new empty hashtable, similar to the one for bytearray described in Add zero-copy conversion of `bytearray` to `bytes` by providing `__bytes__()` - #17 by pf_moore.
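To make the suggested semantics concrete, here is a minimal pure-Python sketch. It is not the proposed implementation: `DetachableDict` is a hypothetical stand-in, and `types.MappingProxyType` is used only as a substitute for frozendict, which is assumed not to be available.

```python
from types import MappingProxyType


class DetachableDict:
    """Hypothetical stand-in for a dict that offers detach()."""

    def __init__(self):
        self._table = {}

    def __setitem__(self, key, value):
        self._table[key] = value

    def __len__(self):
        return len(self._table)

    def detach(self):
        # Hand the existing table to a read-only wrapper without copying it,
        frozen = MappingProxyType(self._table)
        # then point the builder at a fresh, empty table.
        self._table = {}
        return frozen


builder = DetachableDict()
builder["host"] = "localhost"
builder["port"] = 8080

config = builder.detach()   # no items are copied
builder["debug"] = True     # lands in the new table only
```

After `detach()`, nothing holds a mutable reference to the old table, so the read-only view cannot change behind the consumer's back.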
Agreed too.
That’s a good catch. frozenset.copy currently creates a new reference to the same object (unless it’s an instance of a frozenset subclass) too.
EDIT: I think this is because frozendict as proposed does not require keys and values to be immutable, so actual copies of them are needed when copying a frozendict.
Surely, keys need to be immutable to be hashable, so no need to copy them. Values can be mutable, but dict.copy() doesn’t deep copy, so surely frozendict.copy() wouldn’t either.
I might be missing something, but from a cursory glance it seems unnecessary to actually copy the frozendict and not just return itself.
Keys can be both mutable and hashable. All objects are hashable by default unless their types explicitly set __hash__ to None because object.__hash__ hashes an object by the object’s address.
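A short illustration of that point with today's built-ins (the `Tag` class is made up for the example):

```python
class Tag:
    """Plain class: inherits the identity-based object.__hash__ and is mutable."""

    def __init__(self, name):
        self.name = name


tag = Tag("a")
d = {tag: 1}

tag.name = "b"          # mutate the key object in place
still_found = d[tag]    # lookup still works: the hash does not depend on .name

# list opts out explicitly by setting __hash__ to None
try:
    hash([])
    list_hashable = True
except TypeError:
    list_hashable = False
```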
We’re just talking about shallow copies here.
That depends on whether we take a consenting adult stance towards the mutability of items in the “copied” frozendict. For better safety I think it does make more sense to actually copy items for the copy method.
That would be a very odd interpretation of shallow copy. Everywhere else, a shallow copy of a container literally copies the outermost structure only, and for everything “contained” it just copies the object pointer and increments the reference count.
I strongly recommend doing the same thing for frozen dict – otherwise it’s just going to confuse people who compare it to e.g. frozen set.
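For reference, this is how the existing dict.copy() behaves. Only the outer container is new and the contained objects are shared:

```python
original = {"tags": ["a", "b"], "count": 1}
clone = original.copy()

clone["count"] = 2            # rebinding a key affects only the clone
clone["tags"].append("c")     # the list is shared, so both dicts see the change

same_list = clone["tags"] is original["tags"]
```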
Ah yes I totally misworded my sentence when I intended to say what you’re saying. Thanks for the correction!
And if the outermost container is immutable, copying its pointer (with incref) is totally equivalent.
While I agree that having a one-way switch to turn a dict into a frozendict is maybe not the best idea, I don’t think this is a good reason in itself.
Sharing a pointer to a mutable object means unexpected side-effects at a distance are to be expected—so to speak. You can argue exactly the same with dict: suddenly the dict is empty and the key you just verified was there, is there no more.
It may be unsafe because while the outermost container is immutable, the items within may be mutable.
But we currently allow frozenset.copy to copy the container’s pointer even though it can contain mutable items, so yeah we should just be consenting adults and make frozendict.copy also copy the container’s pointer in O(1) time.
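The frozenset behaviour referred to here can be checked directly. This reflects current CPython and may be an implementation detail rather than a language guarantee; `MyFrozenSet` is a made-up subclass.

```python
fs = frozenset({1, 2, 3})
same_object = fs.copy() is fs      # exact frozenset: no new container is built


class MyFrozenSet(frozenset):
    pass


sub = MyFrozenSet({1, 2, 3})
sub_copy = sub.copy()              # subclass instance: a new object is created
```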
Adding to the question about |:
It would be nice if frozendict | dict returns a frozendict, and dict | frozendict returns a dict. This is the behavior of the frozendict library, and it is useful, but I see no reference to | mentioned anywhere in the PEP.
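A rough sketch of how that left-operand rule can fall out of `__or__` and `__ror__`. `FrozenDictSketch` is a hypothetical stand-in written for this example, not the PEP's type and not the frozendict library's code:

```python
from collections.abc import Mapping


class FrozenDictSketch(Mapping):
    """Minimal read-only mapping, just enough to show the operator dispatch."""

    def __init__(self, *args, **kwargs):
        self._data = dict(*args, **kwargs)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __or__(self, other):
        # frozen | anything-mapping -> frozen
        if not isinstance(other, Mapping):
            return NotImplemented
        return FrozenDictSketch({**self._data, **other})

    def __ror__(self, other):
        # dict | frozen: dict.__or__ returns NotImplemented, so Python
        # falls back to this method, which builds a plain dict.
        if not isinstance(other, Mapping):
            return NotImplemented
        return {**other, **self._data}


left = FrozenDictSketch(x=1) | {"y": 2}
right = {"y": 2} | FrozenDictSketch(x=1)
```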
A method or operator to create a copy with certain keys removed would be useful too. The pattern frozendict({k:v for k,v in my_d.items() if k not in my_filter}) is usable, but it adds clutter and reduces readability.
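Until something like that exists, the pattern can at least be wrapped in a helper. The function name `without_keys` is hypothetical, and `MappingProxyType` stands in for frozendict so the sketch runs on current Python:

```python
from types import MappingProxyType


def without_keys(mapping, excluded):
    """Return a read-only copy of mapping without the keys in excluded."""
    return MappingProxyType(
        {k: v for k, v in mapping.items() if k not in excluded}
    )


my_d = {"user": "ann", "password": "hunter2", "lang": "en"}
public = without_keys(my_d, {"password"})
```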
I would love to see frozendict in the standard library.
edit: I replied to the wrong message. I can’t see how to fix that.
Can we get a typing.TypedDict(frozen=True) for this?
Oh, this PEP 603 section is misleading, sorry. You’re correct that frozendict.copy() can return a new reference, I just fixed the implementation for that:
>>> # no copy for frozendict type
>>> s=frozendict(x=1, y=2)
>>> s2=s.copy()
>>> s2 is s
True
>>> # but subclasses still need a copy
>>> class FrozenDict(frozendict): pass
...
>>> s=FrozenDict(x=1, y=2)
>>> s2=s.copy()
>>> s2 is s
False
>>> s2
frozendict({'x': 1, 'y': 2})
In fact, I wanted to highlight in this section that mutating an immutable dictionary has an O(n) complexity with frozendict, but O(1) complexity with PEP 603 frozenmap. I picked the wrong example with the copy() method. A better example would be to add an item to a mapping: frozenmap.including(key, value) versus frozendict | frozendict(key=value). By the way, the proposed frozendict doesn’t have these including(), excluding() and union() methods.
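To show where the O(n) cost comes from, here is a sketch of "adding" an item to a read-only mapping by building a new one. The helper name `including` is borrowed from the frozenmap method mentioned above but is a hypothetical free function here, and `MappingProxyType` again stands in for frozendict:

```python
from types import MappingProxyType


def including(mapping, key, value):
    """Return a new read-only mapping with one extra item.

    Every existing entry is copied into the new dict, so the cost grows
    with the size of the mapping.
    """
    return MappingProxyType({**mapping, key: value})


base = MappingProxyType({"a": 1, "b": 2})
extended = including(base, "c", 3)
```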
Yes. I didn’t realize that adding a new built-in type requires changing so much code
It will be done later.
There is already PyMapping_Check() for that but it’s a different check: it only checks for __getitem__(), whereas frozendict has way more methods.
PyAnyDict_Check() is more efficient since it can check for the Py_TPFLAGS_DICT_SUBCLASS flag for dict.
frozendict supports all dict methods except the ones listed in Differences between dict and frozendict. So yes, it supports the a | b operator:
>>> frozendict(x=1) | frozendict(y=1)
frozendict({'x': 1, 'y': 1})
>>> frozendict(x=1) | dict(y=1) # works with dict as well
frozendict({'x': 1, 'y': 1})
It’s one way to mutate a frozendict.
We can consider such optimization in the future. It has been discussed recently at: Dict constant folding, new frozendict type.
A consequence of the frozendict not being a subclass of dict is that
s = frozendict({1: 2}); isinstance(s, dict) # False
I suspect that packages or modules changing a dict into a frozendict (for efficiency or safety reasons) will lead to some breaking of downstream packages. For example in the reference implementation the JSON encoder had to be modified (Lib/json/encoder.py L326) to handle frozendict objects.
I think this is not a problem (it is similar to when packages decide to use a pypi package with a frozendict/frozenmap implementation), but maybe the example can be added to the PEP.
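A small sketch of the kind of downstream breakage meant here, and of a check that tolerates any mapping. Both function names are made up, and `MappingProxyType` is used as a stand-in for a non-dict mapping such as frozendict:

```python
from collections.abc import Mapping
from types import MappingProxyType


def render_strict(obj):
    # Downstream code written with only dict in mind.
    if isinstance(obj, dict):
        return ", ".join(f"{k}={v}" for k, v in obj.items())
    raise TypeError(f"unsupported type: {type(obj).__name__}")


def render_tolerant(obj):
    # Accepts dict and any other Mapping implementation.
    if isinstance(obj, Mapping):
        return ", ".join(f"{k}={v}" for k, v in obj.items())
    raise TypeError(f"unsupported type: {type(obj).__name__}")


read_only = MappingProxyType({"x": 1})

try:
    render_strict(read_only)
    strict_ok = True
except TypeError:
    strict_ok = False

tolerant_result = render_tolerant(read_only)
```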
Yes, we’ll have to think about how this new builtin type (if it is accepted) interacts with TypedDict. (“We” here doesn’t necessarily mean the authors of PEP 814; this is something we can add separately later.)
Currently, TypedDict is specified as being restricted to instances of exactly dict: no subclasses, no other kinds of Mappings. So if we don’t change anything, frozendicts would not be compatible with TypedDict at all.
A few options:
- class TD(TypedDict, frozen=True): (as Thomas suggests). I assume this would mean that it is like TypedDict, except it must be a frozendict instead of a dict. But probably lots of people will want to write APIs that can accept either a dict or a frozendict, not just one of the two. Another variant could be that dict is acceptable for frozen=True, but not the reverse. This makes some sense because dict mostly only adds functionality to frozendict. Except hashing.
- TypedMapping that can contain any kind of Mapping with fixed keys. This has been suggested before but I don’t know if there’s been a design that is fully fleshed out.
- Frozen[MyTypedDict] that indicates a transformation of an existing TypedDict into a frozen one. Then naturally, MyTypedDict | Frozen[MyTypedDict] means you accept a dict in either thawing state.

We could also wait and see how widely adopted frozendict is going to be before we add type system support. This might create a bit of a chicken-and-egg problem though; people may be hesitant to start using it if they can’t use it with TypedDict.
Looks like a good proposal, but I think that the discussion surrounding dedicated syntax could be moved to a deferred ideas section, since the text itself says the idea is deferred and not rejected.
I give up. You don’t seem to understand the philosophy of shallow vs. deep copy.
I haven’t checked the reference implementation yet, just the PEP.
What would be the cost of line (*) below? O(1) or O(n)?
my_dict = {}
# ... populate my_dict so that it contains n items ...
frozendict(my_dict) # (*)
I’m guessing it requires an actual copy, but I wanted to see if I got it right.
Also, is frozendict deeply immutable or just shallowly immutable? I didn’t quite catch this from the PEP. As in, does it require keys and values to be immutable as well? And would this be a valid frozendict?
frozendict(spam=tuple(list()))
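For comparison, this is what shallow immutability looks like with an existing built-in. It is only an analogy using tuple and makes no claim about what frozendict itself will accept:

```python
shallow = (1, [2, 3])      # immutable container holding a mutable item
shallow[1].append(4)       # allowed: the list itself can still change

try:
    hash(shallow)
    shallow_hashable = True
except TypeError:
    shallow_hashable = False

value = tuple(list())      # the value from the example above: an empty tuple
```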
Anyway, I like this change, it’s definitely going to be helpful in various places.