PEP 705 - TypedDict: Read-only and other keys

alicederyn · October 17, 2023, 12:39pm

Hi all,

I have significantly revised the proposal in PEP 705, reducing its scope to just TypedDict changes. It now proposes two extensions:

read-only entries, via a readonly flag and a ReadOnly type qualifier
preventing subclasses having other keys (more context for this can be found on pyright issue 5254)

The draft of PEP 728 came out late on in the rewrite, and overlaps this second point; I have updated my proposal to be compatible with it (but with slightly different syntax as that was the main point of contention voiced on the PEP 728 pull request). We were strongly motivated to cover both issues in a single PEP, as otherwise the PEP 705 proposal breaks the semantics of the existing solution to other keys (using @final) without providing a space to discuss fixing the issue.

See also the prior discussion on typing-sig.

(Many thanks to @dmoisset, @pradyunsg, @Jelle, @JukkaL and @PIG208 for contributing to this revision of the PEP!)

gwerbin · October 17, 2023, 6:55pm

This is great. I ran into this use case a few times when working with external APIs, and it always felt like something I should be able to express.

One question: was “frozen” considered as a synonym for “readonly”? The word “frozen” is used in a handful of other places already, whereas “readonly” seems new.

alicederyn · October 17, 2023, 7:39pm

I had not specifically considered “frozen”. I believe it is used to mean “an object that cannot be changed by anyone after initialization”, and implies things like “can be used as a key in a dict”, so I don’t think it fits this usage?

loic-simon · October 17, 2023, 10:14pm

I believe there’s a small typo in the second example of Specification > Inheritance section, you surely mean class NamedDict(TypedDict, other_keys=Never) instead of readonly=True

Great proposal aside, it certainly would be something I’d use!

alicederyn · October 18, 2023, 7:32pm

Fixed, thanks!

erictraut · October 18, 2023, 9:56pm

@alicederyn, thanks for all your hard work on this PEP. It has come a long way since the earlier drafts, and I like how it’s shaping up!

I’ve organized my feedback by topic area roughly corresponding to the order in which these topics appear in the PEP.

—

Motivation

I think this section is looking really good. One small nit is that the “type discrimination” example assumes mypy’s current type narrowing logic. Mypy is using a very conservative approach here — one that is arguably inconsistent with the approach it takes with other type narrowing patterns. By contrast, pyright handles type narrowing for the provided example in the manner in which most users would expect, even though it opens up a small type safety hole.

—

Specification for other_keys flag

This feature feels incomplete because it accepts only Never. It defines the behaviors for Never and alludes to additional behaviors that could be added in the future. This is a case where I think it’s important for the feature to be fully fleshed out. If not, we risk getting it wrong. I also think this is a case where generalizing the feature will actually simplify the spec and the implementation rather than complicating them. Special-casing the behavior for Never opens up a bunch of questions and makes it harder to understand the intent of the feature going forward. My preference is that we combine the concepts in PEP 728 and 705 into a single PEP. The resulting PEP will be easier to understand and more coherent. Alternatively, we could separate the ReadOnly functionality from the other_keys functionality and make them “concurrent PEPs” that need to be considered as a “linked pair".

The spec says that other_keys is a flag (implying that it takes a boolean value), but it’s not a flag. Perhaps it should be called a “parameter” instead?

The spec says that other_keys is optional, but it doesn’t say what it defaults to. The challenge here is that Any and None are legitimate types, so there is a danger in using either of them as a default value. We could say that None isn’t a useful type for additional keys, but that seems somewhat arbitrary. We could maybe convince ourselves that a default value of Any is the correct behavior (more on this below). Another option is to use … (the ellipsis singleton) as a sentry value. (I’ll note that this is a good example of a detail that we could get wrong if we don’t consider the feature in its fullness.)

I would think that the other_keys type, if provided, would be inherited by subclasses, but the spec says otherwise. It indicates that if a base class specifies other_keys, a subclass must likewise specify other_keys (and presumably always specify the same value as the superclass). Wouldn’t it be better to inherit this property rather than forcing the user to specify a redundant piece of information? It could be argued that specifying other_keys in a subclass should be considered an error if the parent class already specifies it because it cannot be redefined without breaking type safety.

—

Generalizing other_keys

As I mentioned above, I think the spec could be simplified (and provide additional value in the process) if the other_keys feature was fully fleshed out. I think this can be done with a few additional rules copied from PEP 728. This would make the “type consistency” section much easier to understand. I find the current bulleted list in the “Type consistency” section very difficult to follow, and it’s important for this section of the spec to be crystal clear.

Here are the modified rules I’d propose:

For each key in B that is not in A, the type of the key must be compatible with the other_key type if it is specified for A (or Any if it is not specified).
A TypedDict is consistent with Mapping[str, V] where V is the union of all its value types and the other_key type if specified (or Any if not specified).

Would it make sense to say that other_keys defaults to Any if no class within the class hierarchy defines it? I think that works as long as we allow subclasses to override the other_keys with a value other than Any.

I think it’s important to allow ReadOnly[T] to be used for an other_keys type. This enables various use cases and is required for future type features (like mapping types, which have been discussed in the past).

—

other_keys versus __extra__

PEP 705 specifies an other_keys parameter whereas PEP 728 specifies a dundered __extra__ field. Neither of these formulations is ideal. I see disadvantages to both. Some of these disadvantages have not yet been discussed in either PEP, and I think it’s important for us to consider the full list of pros and cons before deciding.

Disadvantages of other_keys

It’s not clear how this would work with inlined TypedDict type definitions (a feature that is not yet spec’ed but is in high demand).
It’s not clear how to “spell” other_keys in a type synthesized within the type checker (e.g. using the process described in the PEP for the | “merge” operator). How should these types appear in error messages, hover text, etc.? What if two types differ only in their other_keys value, and the error indicates that one is not consistent with the other?
The argument expression for other_keys is evaluated at runtime, which means that forward references will need to be quoted even when deferred annotation evaluation becomes the norm.
It requires special-case handling in the type checker to verify that the expression is a valid static type expression and that Required and NotRequired are not used (although ReadOnly is permitted).

Disadvantages of __extra__

It means that __extra__ cannot be used as a legitimate key name.
It requires special-case handling in the type checker to disallow Required and NotRequired.

Given these tradeoffs, I slightly prefer __extra__, although I’m not crazy about either of them. I wonder if there are other options here that we haven’t yet considered.

readonly

I think the spec should make it very clear that readonly=True is simply a shorthand way to mark all fields as ReadOnly. This is not a property associated with a class. It’s a property of individual entries. I think it would be clearer if the spec first described ReadOnly and only later talk about the readonly parameter (and define it simply as a shorthand). The current spec is confusing because it says at one point that "The readonly flag is equivalent to marking all entries as ReadOnly[]”, but it then spells out rules that contradict this statement. Any place the spec says that the “class” or the “type” is “read only” adds to this confusion. This distinction is especially important if we start to synthesize new classes (e.g. as suggested in the in the PEP for the | operator). I think it’s likely that we’ll add more ways to modify TypedDict definitions in the future, and the notion of a “read-only class” will get in the way of those efforts.

Here are some specific areas of the spec that this affects:

The spec says that “a read-only type cannot extend a non-read-only type". I contend that there’s no such thing as a “read-only type”, only read-only fields. This restriction should be removed. It’s perfectly legitimate to add read-only fields (or redefine existing fields as read-only) when a base class contains non-read-only fields.
The spec says that readonly=True “indicates that no mutator operations will be permitted” and that it “removes mutator operations”. These statements are confusing (and even incorrect) because mutator operations are synthesized on a per-field basis as individual overloads. Mutator methods are not synthesized for the entire class.
The spec uses the phrase "The subclass will not be read-only…”. Once again, this is confusing because classes cannot be “read-only”; this property applies only to individual fields within a class.

The spec says that “it is an error to use both readonly=True and ReadOnly[]. I don’t think this limitation makes sense. It’s inconsistent with the precedent set by total=True and Required. I don’t see any good justification for imposing this restriction because it doesn’t result in a logical conflict, and its meaning is clear. It seems like something that should be handled by a linter, not the runtime or a type checker. It’s akin to redundant subtypes within a union annotation (int | str | int).

—

ReadOnly

As I mentioned above, I think the spec would be clearer and build a better mental model for the reader if it leads with a definition of ReadOnly and then defines readonly=True as simply a shorthand. Currently, these concepts are presented in the opposite order, and this serves to build the wrong mental model for the reader.

I wonder if we should support ReadOnly[X] (where X is a TypedDict) anywhere that a type annotation is allowed? This would allow the TypedDict X to be defined as writable but used in specific situations where you want all of its fields to be ReadOnly. This would be comparable to Readonly<X> in typescript. I see that this was considered in the “Rejected Ideas” section, but I think it’s worth considering.

—

Merge Operations (Union Operation)

The section titled “Union Operation” is confusing. I needed to read it a couple of times before I realized that it’s not talking about “unions” (which is a term that has a well-defined meaning in the type system). Instead, it’s talking about a dictionary “merges” — the operator |, which is implemented with the __or__ magic method. I recommend replacing all use of the term “union” with “merge” to avoid confusion.

I found the wording in this section hard to understand. In particular, it says “This is different from the usual compatibility rules…”. I’m not sure what “compatibility rules” means here. Does it mean “type consistency” rules? I don’t think that’s actually different here. If I understand the spec correctly, it’s saying that “type checkers should treat the | operator specially when the LHS and RHS operands are both TypedDict instances. Rather than applying the typeshed-supplied definition of _TypedDict.__or__, it should apply special-case logic to compute the type of this operator. I presume the same is true of __ior__? If my interpretation is correct, I recommend rephrasing it to avoid confusion.

The section also mentions copy and deepcopy. Is this referring to the copy method on _TypedDict or the copy function exported by the copy module? I presume deepcopy refers to the function exported by the copy module? Is the spec suggesting that type checkers should special-case these functions, effectively overriding the behavior specified in the stubs? I prefer not to do this in pyright unless there’s really strong justification. The spec should be clear about its meaning — and whether this is required or optional for type checkers.

—

Update Operations

I’m not sure what is being proposed in this section. As the spec points out, type checkers are currently more permissive here (practicality over purity). The spec then indicates that it allows type checkers to be less permissive. Won’t this result in more false positives? Or are you proposing that type checkers should be less permissive only under certain circumstances? Or in certain modes (e.g. when “strict” is requested)? It’s unclear what is being required of type checkers here.

I’m not sure what the spec means by “declare a new type”. A type declaration (at least as I tend to use the word) refers to a code statement that declares the presence of a symbol and describes its type. I don’t think that’s what you mean. Perhaps what you mean is “internally synthesize a new type”? I’d also avoid using the term “copy” and instead use “include”.

I was initially confused by “Union this with an iterable of matching key-value pairs”. I wasn’t aware that _TypedDict.update supported an iterable of pairs. Neither mypy nor pyright support this today. Also, typeshed’s _TypedDict.update definition does not include an overload that supports this form. I now understand why you included this statement, but I wonder if it will generate similar confusion for other readers. Perhaps consider deleting it. If you want to keep it, then you may need to specify additional ways in which a type checker needs to accommodate the full range of capabilities of the update method — including support for keyword parameters. I filed this issue to track this in pyright.

—

Nomenclature

Do we consider “readonly” to be one word or two? If it’s one word, it should be readonly and Readonly (consistent with typescript, FWIW). If it’s two words, it should be read_only and ReadOnly. Currently the spec uses readonly and ReadOnly, which are inconsistent.

In general, the text of the PEP is very inconsistent when talking about TypedDict fields. It alternates between “field”, “entry”, and “key”. FWIW, the term used in the Python dict class is “item”. I think “key” is probably the least accurate, since that refers to just the name of an entry, not the entire entry. I don’t have a strong opinion here, but I think the spec should try to be more consistent.

Is other_keys the right name? “Key” doesn’t seem to be the right term here. The value passed to the other_keys parameter is the type of the “value”, not the “key”, so this name strikes me as especially confusing. I don’t have a strong opinion on the name, but I think we could do better than other_keys. Here are some options: other_entries, other_fields, other_items, extra_entries, extra_fields, extra_items, other, extra.

—

Samples

I noticed minor errors in a few of the code samples.

The sample beneath the statement "Subclasses can also require keys that are read-only but not required in the superclass:” does not mark the name field in the subclass as ReadOnly. That means that it is no longer ReadOnly in the subclass, right? I don’t think that’s what the sample is trying to show. So I recommend changing it to name: Required[ReadOnly[str]]. Or maybe I’m misunderstanding what this sample is intending to show.

In the “Keyword argument typing” example, there is a bug in the impl function. It shouldn’t have a self parameter because it’s not a method. This prevents the assignment from working on the last line of the example.

alicederyn · October 19, 2023, 2:54pm

Thanks! There’s a lot of feedback I will attempt to merge straight into the next revision of the PEP, so if I haven’t specifically answered anything here, I probably agree with it.

TL;DR of the below: I think it is best to remove both other_keys and the readonly flag from the PEP, leaving only the ReadOnly type qualifier.

other_keys

I was sure I’d seen pyright complain about that example. Does pyright have any special logic for final TypedDicts, or is that solely a mypy idiom?

This completely changes my understanding of the conversation we were having on pyright issue 5254. I thought we were having a conversation about taking the status quo where both type-checkers treated @final as banning other keys, but only partially implemented it, and deciding as a community whether to replace this with something better-suited to read-only typeddicts, or loosening the rules for type narrowing for both. I’d have written this PEP completely differently if I’d realised sooner. That said…

Given my newly-corrected understanding around type narrowing, and the fact that PEP 728 will be addressing your concern about having this issue addressed formally in a PEP, I’d actually prefer to completely remove other_keys from PEP 705, and leave PEP 728 to fully address this issue by itself. I won’t update PEP 705 yet, and I’m happy for people to comment here about other_keys, because I think that can provide valuable input into PEP 728, but I won’t try to address any notes.

readonly flag

I am following the Zen rule “readability counts”. As per the example in the docs:

class Band(TypedDict, readonly=True):
    name: ReadOnly[str]
    members: Collection[str]

To me, this is not readable, as it is far too easy to see the ReadOnly in the body and logically deduce that members must be mutable, since it makes no sense to declare an item as read-only if it already known to be read-only! Longer examples would only make this worse, as the readonly disappears above the fold. Since it is (a) not necessary to permit it, and (b) trivial to exclude it, I believe it is the right thing to ban it. That said…

I think having readonly=True at the top of the class is a stronger statement than this. I shouldn’t have said the “is equivalent to” thing. If I see a TypedDict with readonly=True, I expect/want any possibility of non-readonly items to be ruled out, just like if I saw frozen=True. I don’t think the absence of a “logical conflict” is sufficient reason to introduce a (to me!) counterintuitive break from expectations, and I don’t think total is a good precedent to lean on for this.

I think the best way to reconcile these two forces tugging in opposite directions is simply to remove the readonly flag entirely, and only add the ReadOnly type qualifier. “There should be one - and preferably only one - obvious way to do it.” I thought it would help write simpler code, but it seems as if there is a lot of hidden and contradictory complexity in how people assume it will work.

Inheritance

Do you mean you would expect this behaviour?

class Parent(TypedDict):
  foo: int

class Child(Parent):
  foo: ReadOnly[int]

child: Child = { "foo": 3 }
parent: Parent = child  # mypy: Error: 'foo' is read-only in Child

I think this would be confusing. Or did you mean it would work like this?

child: Child = { "foo": 3 }
child["foo"] = 4  # Fine: foo isn't actually read-only, because of its declaration in Parent

I think this would be confusing too. Or maybe this?

child: Child = { "foo": 3 }
child["foo"] = 4  # mypy: Error: 'foo' is read-only
parent: Parent = child  # Fine: Child extends Parent
parent["foo"] = 4  # Fine: 'foo' is not read-only in Parent

This also seems confusing. Is there an option I missed? Or did I completely misunderstand what you meant?

Higher-order type function

I think that makes it ambiguous what ReadOnly[X] means when used in another TypedDict. Is this a read-only item holding a mutable X, or is it a mutable item holding a read-only X? I won’t include anything like this in this PEP, anyway, let’s keep the scope narrow.

copy/deepcopy

Yes, this is what I was suggesting. This is the only mechanism available to users to safely change mutability, required-ness, and type constraints on a TypedDict, especially a nested structure. cast is unsafe as it loses knowledge of what might be in the dict already. Writing out a deepcopy longhand is a recipe for bugs (missing data, accidentally copy-pasting the wrong keys…).

Update Operations

The sentences with “should” are intended as requirements, the rest is not, functioning more as “motivating text”. I’m going to need to rewrite this anyway when I drop other_keys, would it be best to keep it confined to “should”-type sentences?

Nomenclature

This is answered in the PEP: “read-only” is hyphenated. I am not aware of any pre-existing examples of how hyphens are treated in Python, and from the limited examples I’ve found elsewhere, the convention appears to be to drop it completely for snake_case (readonly), and treat it as a word separator for CamelCase (ReadOnly). This also appears to be perfectly consistent with how these two conventions are defined on Wikipedia. I would appreciate concrete examples to go on from Python core libraries, or failing that, popular third-party libraries.

I will standardize on this, thanks.

erictraut · October 20, 2023, 5:14pm

Pyright’s type narrowing logic (e.g. for the S in D pattern, where S is a string literal and D is a TypedDict) no longer bases its behavior on the absence or presence of @final. Mypy has an open issue to do the same.

Pyright does still use @final to adjust its behavior when synthesizing the get and pop methods for a TypedDict. I’d like to remove this distinction because I don’t think @final should have any meaning for a TypedDict (since it’s a structural type). I’ve been waiting for the discussions around PEP 705 and 728 to reach a consensus before I make this change because it could be disruptive to pyright users, and I don’t want to make the change more than once.

Yes, if the proposed readonly=True is anything other than a simple shortcut for marking all items as ReadOnly, then I think it’s a good idea to remove the readonly parameter entirely. I’m also fine keeping it around as a shortcut since it’s analogous to the existing total parameter.

My bad. You are correct. It doesn’t make sense to allow existing fields to be redefined as read-only if the base class defines them as writable. The other direction makes sense.

Yes, you’re correct. I missed that. Then it makes sense for this capability to be deferred to some future PEP.

If we add some form of “mapping type” in the future (like the one we were discussing here), it will be possible to create a new type from an existing TypedDict where all of the keys are ReadOnly. But that will require much more discussion.

alicederyn · October 20, 2023, 5:46pm

Oh, so this did change since I last checked! I thought you were going to hold off on any change until the PEP.

alicederyn · October 24, 2023, 9:40pm

I have posted a new draft of PEP-705, removing other_keys and the readonly flag as promised, as well as the section on “union operation”. What’s left (the ReadOnly type qualifier) has been clarified and fleshed out.

Mea culpa: there is in fact a blessed way to do all three in Python.

copy: C = { **a }
merge: C = { **a, **b }

erictraut · October 26, 2023, 3:32am

Thanks for the update @alicederyn!

If any of you would like to experiment with the recent updates to PEP 705, pyright 1.1.333 implements this new functionality as an experimental feature. You need to set enableExperimentalFeatures to “true” in the pyright configuration.

I recently published a new pyright playground web app so you can play with typed code in your browser. This is similar to the popular mypy playground.

Here are some code samples copied directly from the draft PEP 705 so you can play with them in the pyright playground:

tmk · October 26, 2023, 7:18pm

I like the current version of the PEP!

Just FYI, the pyre team has been implementing a ReadOnly marker that works on any variable, such that when you run this through pyre:

from typing_extensions import Self
from pyre_extensions import ReadOnly

class Bar: ...

def foo(x: ReadOnly[Bar]) -> Bar:
    y: Bar = x  # error
    z: ReadOnly[Bar] = y
    return z

def f(s: ReadOnly[str]) -> None:
    y: str = s.capitalize()  # error

class Foo:
    def mutating_method(self) -> None: ...

    def readonly_method(self: ReadOnly[Self]) -> None:
        self.mutating_method()  # error

then you get this output (tested with version 0.9.19):

main.py:7:4 ReadOnly violation - Incompatible variable type [3001]: y is declared to have type `main.Bar` but is used as type `pyre_extensions.ReadOnly[main.Bar]`.
main.py:12:13 ReadOnly violation - Calling mutating method on readonly type [3005]: Method `str.capitalize` may modify its object. Cannot call it on readonly expression `s` of type `pyre_extensions.ReadOnly[str]`.
main.py:18:8 ReadOnly violation - Calling mutating method on readonly type [3005]: Method `main.Foo.mutable_method` may modify its object. Cannot call it on readonly expression `self` of type `pyre_extensions.ReadOnly[Variable[_Self_main_Foo__ (bound to Foo)]]`.

I assume they eventually want to propose this as a PEP(?) but I don’t think this will conflict with what is proposed in PEP 705, so it should be fine.

jamestwebber · October 26, 2023, 7:21pm

Those examples are weird because x and s aren’t mutated but apparently would still error. What’s the point of a “read-only” annotation that gives you all these false positives?

Jelle · October 26, 2023, 7:34pm

Yes, the ReadOnly special form should be implemented at runtime so that it can also be used in other places than TypedDicts. Under PEP 705, type checkers should give an error when ReadOnly is used anywhere but in a TypedDict, but future PEPs (or type checker extensions) may give it a defined meaning elsewhere.

I haven’t looked at Pyre’s proposed extension, and discussion of exactly how it would work is off topic for this discussion.

alicederyn · November 4, 2023, 12:02pm

Discussion moved to PEP list: PEP 705: Read-only TypedDict items