PEP 728: TypedDict with Typed Extra Items

Daverball · February 10, 2024, 7:21am

@PIG208 I think one thing the PEP could illustrate better is how the current open case interacts with a closed TypedDict. Essentially, any subclass of any open TypedDict may be closed, because the open case is more or less equivalent to a closed TypedDict with __extra__: ReadOnly[object]. Under the type consistency rules, that leaves it free to be overridden, since every type is consistent with object and ReadOnly allows it to be Required in addition to NotRequired.

Maybe this also means that at runtime for introspection that attribute should always exist and default to ReadOnly[object]^[1], so runtime analysis has fewer special cases to consider.

There would be a subtle semantic difference however, when it comes to the assignment of a dict literal to a typed dict, since currently mypy will emit an error for unknown keys^[2]. The rest would be consistent though, it errors for trying to write to an unknown key, but using get with an unknown key is fine, which I think matches the desired semantics^[3].

[1] the NotRequired part is implicit
[2] I think we could loosen that restriction, given the new consistency rules from PEP 705 and PEP 728
[3] unless you wanted extra keys to be allowed for __getitem__, just like specific NotRequired keys
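
A minimal sketch of the distinction above, using only the standard library. The Movie class and its keys are made up for illustration, and the checker behaviour mentioned in the comments is the one described in this post, not something the snippet verifies. At runtime a TypedDict value is a plain dict, so none of these lines raise.

```python
from typing import TypedDict


class Movie(TypedDict):
    name: str


# Per the post, mypy reports an error for an unknown key in a dict
# literal assigned to a TypedDict, so only the declared key is used here.
movie: Movie = {"name": "Blade Runner"}

# Reading an unknown key through get() is described above as fine.
year = movie.get("year")

# There is no separate TypedDict instance type at runtime, only dict.
runtime_type = type(movie)
```

The interesting part is therefore entirely static: whether a checker treats the literal assignment differently from later reads and writes.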

alicederyn · February 10, 2024, 7:47am

Isn’t that consistent with them being ReadOnly?

Daverball · February 10, 2024, 8:43am

No, creation of an instance is not the same as writing a value to an existing instance, so in that sense it’s safe for extra keys to be included, because they cannot be modified after the fact^[1]. Otherwise we could already consider TypedDict closed and use more precise types for methods like items(), because there’s no way to include extra keys.

[1] That is, through the reference that carries the TypedDict annotation. If you had a separate reference typed as a plain dict, those keys could be changed, but dict is not consistent with TypedDict or vice versa. The exception is the original assignment, since you know there are no other references to the dict.
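
A small sketch of why an open TypedDict cannot promise that no extra keys exist: a value typed as a subclass is accepted wherever the base is expected. Base, Child and describe are illustrative names, not anything from the PEP.

```python
from typing import TypedDict


class Base(TypedDict):
    name: str


class Child(Base):
    year: int


def describe(b: Base) -> list[str]:
    # The annotation only promises "name", but the value may carry more.
    return sorted(b.keys())


child: Child = {"name": "Blade Runner", "year": 1982}

# Child is consistent with Base, so this call type-checks and the
# extra "year" key travels along with the value.
keys = describe(child)
```

This is the reason methods like items() cannot be given more precise types on an open TypedDict.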

alicederyn · February 10, 2024, 9:23am

Misunderstood what case you were talking about. That makes sense. But I think the existing rule is helpful, as it catches typos in optional key names that otherwise wouldn’t be highlighted.

from typing import NotRequired, TypedDict

class Foo(TypedDict):
    bar: NotRequired[int]

foo: Foo = {
    "baz": 3  # Probably a typo: "baz" is not a key of Foo
}

Daverball · February 10, 2024, 10:03am

That’s a fair point, although that puts the original TypedDict in a weird spot where it behaves like a closed TypedDict as a recipient but like an open one as a provider, which is a strange inconsistency, especially once you consider subtyping rules. The inconsistency is easier to justify if you only have an open version but still want to catch a class of common errors that only a closed one would catch^[1]. After the addition it becomes harder to justify and creates extra complication for type checkers.

Maybe it should be left up to type checkers if they want to consider this case an error or not going forward and if they wish to draw a distinction between closed and open with __extra__: ReadOnly[object].

Actually thinking about it further, there is one more inconsistency and that is Unpack, where our current TypedDict also behaves like a closed one. I suppose that is a strong argument for keeping the original TypedDict interpretation intact and distinct, even if it is not entirely internally consistent.

[1] i.e. the current status quo

mikeshardmind · February 10, 2024, 11:31am

I would suggest against any method or reserved field for this. While it’s currently not possible to have a subclass of a typed dict, there are known things people want to expand in this space, such as combining the fact that typed dicts are structural types with intersections to type things like pandas dataframes. This approach cuts off names that might be valid (such as keys coming from JSON), not only for the current case but also for future extensions.

I don’t see an issue with extra as an argument to the type constructor. The forward reference issue isn’t really an issue, you can get around that with a type alias.

type ExtraValueType = RecursiveTypedDict

class RecursiveTypedDict(TypedDict, extra=ExtraValueType):
    some_field: SomeType

mikeshardmind · February 10, 2024, 11:50am

There’s also an argument that we could do this with another special type form that works in concert with an unpacked typed dict. PartialTypedDict[SomeTypedDict, ExtrasType]

I’m not enamored by the option, I generally don’t want more typeforms than we need, but this might be the least ugly option. Handling mappings is a very common typed data problem, and I’m all for giving users a way to express their needs.

mikeshardmind · February 10, 2024, 12:03pm

Only other note of feedback (split this across messages due to train of thought, sorry)

Supporting intersections in Python’s type system requires a lot of careful considerations, and it can take a long time for the community to reach a consensus on a reasonable design.

Ideally, extra items in TypedDict should not be blocked by work on intersections, nor does it necessarily need to be supported through intersections.

The work-in-progress intersection discussion brought up TypedDicts over here. If both the definitions/specification currently being fleshed out for intersections and this PEP are accepted, there’s a very small area of overlap that requires additional clarity. I agree that neither feature needs to block the other, and my expectation is that this PEP is mostly ready for review now, whereas intersections are not.

I do have a question for you though. The current definitions for intersections would prefer if type checkers are allowed to use the other operands along with a typeddict, do you believe that a typed dict needs to specify extras for this to be allowed? My intuition is that since typed dicts are a structural type, it’s fine for the operand to provide other things not provided by the typed dict if extras is omitted, but other operands must be consistent with extras if it exists. If that tracks with your intent for extras, I can work with that, if it doesn’t, I probably can still work with it, but I’d like to better understand the intent to ensure I don’t clobber what you’re working toward here.

Daverball · February 10, 2024, 12:26pm

Michael H:

I don’t see an issue with extra as an argument to the type constructor. The forward reference issue isn’t really an issue, you can get around that with a type alias.

type ExtraValueType = RecursiveTypedDict

class RecursiveTypedDict(TypedDict, extra=ExtraValueType):
    some_field: SomeType

You can also get around it with a string forward reference, but neither solution is as ergonomic or pretty as the closed proposal, unless you happen to run into the corner case where you need to specify the _ key^[1]. The workaround for that is simple enough that I still think it’s a net win, plus it simplifies the most common case, where you don’t want any extra keys at all.

I think we should try to avoid types as values wherever possible^[2] as long as we don’t have TypeForm and even then the decision should be made with care, since just like with cast, it may encourage always wrapping the type in a string, so you don’t pay the runtime cost unless you inspect the TypedDict at runtime.

[1] or __extra__, if we stick with that
[2] this type of practice is best reserved for actual runtime uses of types
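
To make the "types as values" concern concrete, here is a sketch of how a class keyword argument behaves at runtime. Base is a hypothetical stand-in that merely accepts an extra= keyword; it is not TypedDict and says nothing about how a real implementation would store the type. Keyword arguments in a class statement are ordinary expressions evaluated before the class exists, so an unquoted self-reference fails.

```python
class Base:
    """Hypothetical stand-in for a base class that accepts an extra= keyword."""

    def __init_subclass__(cls, extra=None, **kwargs):
        super().__init_subclass__(**kwargs)
        cls.extra = extra


try:
    # Tree is not bound yet while its own class statement is being evaluated.
    class Tree(Base, extra=Tree):  # noqa: F821
        pass
except NameError:
    unquoted_failed = True
else:
    unquoted_failed = False


# Quoting defers the problem: the value is just a string until someone resolves it.
class QuotedTree(Base, extra="QuotedTree"):
    pass
```

The quoted form works, but it leaves a plain string behind that runtime consumers would have to resolve themselves.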

PIG208 · February 11, 2024, 5:11am

Thanks for all the new insights on the class TD(TypedDict, extra=SomeType) approach (let’s call it option A). To sum up, many of the previously mentioned issues are addressable. While I think option A would work, the closed proposal (option B) still seems better to me because it needs less special handling.

Usability of forward reference

A: As in the functional syntax, using a quoted type or a type alias will be required when SomeType is a forward reference. This is already a requirement for the functional syntax, so implementations can potentially reuse that piece of logic.

B: It doesn’t need special handling.

Concerns about using type as a value

A: Whatever is not allowed as the value type in the functional syntax should not be allowed as the argument for extra either. Type checkers might be able to reuse this check.

B: N/A.

Support for the functional syntax

A: Something like TD = TypedDict("TD", {"foo": str}, extra=SomeType) should be allowed.

B: It would be TD = TypedDict("TD", {"foo": str, "_": int}, closed=True) (or whatever extra key spec’d) instead.

Inlined TypedDict type definition

A: I think variants of future inlined TypedDict proposals need to deal with total=False support, but it is unclear to me whether supporting extra=SomeType will be harder.

B: Similarly, closed=True needs to be somehow supported, but _ lives as an item.

Handling Inheritance

A: It needs to support the inheritance of _ to replicate the behavior of inheriting a regular TypedDict item.

B: It needs special handling when there are both _ the regular item and _ the special extra item.

How to teach

A: Simpler to learn especially when there is the need to restrict extra key’s value type.

B: “closed” and the magical key might seem disjoint at first to people.
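
As a sketch of the forward-reference point for option B: anything written as an item annotation in the class body can already refer to the class itself, as shown below with an ordinary key. Node and its fields are illustrative, and the snippet does not use closed=True or a special extra key; it only shows the annotation mechanism such a key would share.

```python
from typing import Optional, TypedDict, get_type_hints


class Node(TypedDict):
    value: int
    next: Optional["Node"]  # self-reference written as an ordinary annotation


# localns is passed explicitly so that resolution does not depend on how
# the surrounding module was loaded.
hints = get_type_hints(Node, localns={"Node": Node})
```

Under option A the same reference would sit in a call argument instead, where it has to be quoted or routed through an alias.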

PIG208 · February 11, 2024, 5:17am

I guess it makes sense to assume that extra items have a value type of ReadOnly[object] if not specified in the context of intersection.

I think it is great to be explicit. It has the discoverability advantage that people always see the reserved key alongside the “closed” param and know that they are correlated. I think we are motivated to choose a verbose name, like Alice suggested, making it clear that the key is special and related to “closed”.

alicederyn · February 11, 2024, 10:34am

I’ve suggested using an ellipsis for this in the past:

foo: { ...: int } = { "bar": 1 }

Not to start a debate about this here, just to note there are options that aren’t available to the class syntax.

PIG208 · February 18, 2024, 12:37am

The proposal has been updated mainly addressing these topics:

Spec out the closed=True proposal and use __extra_items__ as the special key name
Enhance the extra=Type discussion under the “Rejected Ideas” section
Discuss the interaction between closed vs. non-closed TypedDict.
Note on backwards compatibility between closed and the keyword arguments flavor of using the functional TypedDict alternative (i.e., TD = TypedDict("TD", foo=int, bar=str, closed=SomeType))
Use a more specific name "__extra_items__" for the special key.

I also want to note that support for this in typing_extensions has been merged, and we are expecting to get it released in 4.10.0. This will introduce two attributes allowing runtime introspection:

__closed__
    A boolean flag indicating whether the current TypedDict is considered closed.

    This is not inherited by the TypedDict’s subclasses.

__extra_items__
    The type annotation of the extra items allowed on the TypedDict.

    This attribute defaults to None on a TypedDict that has itself and all its bases non-closed. This default is different from type(None) that represents __extra_items__: None defined on a closed TypedDict.

    If __extra_items__ is not defined or inherited on a closed TypedDict, this defaults to Never.
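
A defensive sketch of reading these two attributes. describe_extras is a hypothetical helper, and the fallbacks (False and None) are assumptions chosen for runtimes whose TypedDict does not define the attributes at all; they are not values promised by the PEP. The example uses a plain standard-library TypedDict, so it only exercises the non-closed path.

```python
from typing import TypedDict


def describe_extras(td: type) -> dict:
    """Read the introspection attributes if present (hypothetical helper)."""
    return {
        # Missing or falsy means "not known to be closed".
        "closed": bool(getattr(td, "__closed__", False)),
        # None is used here as "no extra-items information available".
        "extra_items": getattr(td, "__extra_items__", None),
    }


class Movie(TypedDict):
    name: str


info = describe_extras(Movie)
```

Code that needs to distinguish the None default from type(None), as the description above requires, should compare with `is` and not rely on truthiness.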

alicederyn · February 18, 2024, 10:08am

Adding everything up, it is slightly less favorable than the current proposal.

Might want to say “we think it is…” to make it clear this is the opinion of the PEP author(s) not the consensus of the discourse?

JadenCorr · February 21, 2024, 2:08pm

I want to say that something like this will be really useful for typing, especially if you are working with frameworks like Django, where the list of keyword arguments is fairly arbitrary: you know some of them for sure, but the list is incomplete and can be expected to grow in subclasses of the original class.

I would greatly appreciate it if something like this were implemented.

erictraut · February 21, 2024, 4:53pm

In this section, the spec says that a TypedDict is bidirectionally consistent with a dict[KT, VT] when a specific set of criteria are met. It mentions that no items can be required. I think we also need to add another criterion: that no items can be read-only.
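
A sketch of why the "no required items" criterion exists, using only the standard library. Movie and wipe are illustrative names. A function that takes dict[str, str] may use any dict operation, including ones that remove keys, so letting a TypedDict with a required key pass as such a dict would be unsound.

```python
from typing import TypedDict


class Movie(TypedDict):
    name: str  # required


def wipe(d: dict[str, str]) -> None:
    # Perfectly legal for a dict[str, str].
    d.clear()


m: Movie = {"name": "Blade Runner"}

# A checker should reject this call; at runtime it silently drops "name".
wipe(m)  # type: ignore[arg-type]
```

The same reasoning applies to read-only items: dict[KT, VT] permits assignment to any key, which a read-only item forbids.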

erictraut · February 21, 2024, 8:27pm

I found what I think is a bug in one of the PEP’s code samples.

class MovieBase(TypedDict, closed=True):
    name: str
    __extra_items__: ReadOnly[str | None]

class Movie(MovieBase):
    __extra_items__: str  # A regular key

a: Movie = {"name": "Blade Runner", "__extra_items__": None}  # Not OK. 'None' is incompatible with 'str'
b: Movie = {"name": "Blade Runner", "other_extra_key": None}  # OK

In the last line, this is not OK because __extra_items__ is a required item in the Movie class. This can be fixed by 1) marking Movie as total=False, 2) by changing the annotation for __extra_items__ to be NotRequired, or 3) including an item named __extra_items__ in the dict expression assigned to b.
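
The "required item" part of this can be checked at runtime with a plain standard-library TypedDict. Note the assumption: the classes below are not closed, so __extra_items__ is just an ordinary key here, which mirrors its role in the Movie subclass above but does not exercise the closed=True semantics themselves. LenientMovie is an illustrative name for fix 1.

```python
from typing import TypedDict


class Movie(TypedDict):
    name: str
    __extra_items__: str  # an ordinary key in this non-closed class


class LenientMovie(TypedDict, total=False):
    __extra_items__: str  # total=False makes the key optional


required = Movie.__required_keys__
optional = LenientMovie.__optional_keys__
```

With the key required, a dict literal that omits it cannot be a valid Movie, whatever other keys it contains.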

erictraut · February 22, 2024, 6:41am

One other issue that I encountered when implementing support for PEP 728 in pyright… I noticed that the PEP doesn’t mention anything about constructors for closed TypedDicts.

Type checkers need to synthesize an __init__ or __new__ method for TypedDict classes. Today, this synthesized method includes one parameter for each item defined in the TypedDict. With PEP 728, if a TypedDict is closed and has an __extra_items__: T (where T is a type other than Never), the synthesized constructor must also include a **kwargs: T parameter.

I think it would be useful to mention this in the PEP for the benefit of other type checker developers. This will also inform the conformance tests for the feature.
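
As a rough picture of the signature being described, here is a hand-written function with the same shape, assuming a closed TypedDict with one item name: str and __extra_items__: int. movie_init is illustrative only; a type checker synthesizes the method internally and nothing like this exists at runtime.

```python
import inspect


def movie_init(*, name: str, **kwargs: int) -> dict:
    """Hand-written stand-in for a synthesized constructor (illustrative only)."""
    return {"name": name, **kwargs}


# "year" is not a declared item, so it is matched by **kwargs: int.
movie = movie_init(name="Blade Runner", year=1982)

# One keyword parameter per declared item, then a catch-all for extra items.
kinds = [p.kind.name for p in inspect.signature(movie_init).parameters.values()]
```

When the extra item type is Never, the catch-all parameter would be omitted, which matches the behaviour for a fully closed TypedDict.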

alicederyn · February 23, 2024, 6:48pm

class MovieBase(TypedDict, closed=True):
    name: str
    __extra_items__: ReadOnly[str | None]

class Movie(MovieBase):
    __extra_items__: str  # A regular key

This code sample really makes me favour putting the type into a keyword argument. I had to read it three times to figure it out even knowing the PEP and with the comment.

erictraut · February 23, 2024, 7:16pm

I think we should avoid mixing up type expressions and value expressions. The few places where this has been done in the past have contributed to user confusion, ambiguities in the spec, inconsistencies between runtime and type checking behaviors, issues for libraries that introspect types, and inconsistencies between tools. Type expressions follow different rules than value expressions (e.g. they can be quoted for forward references, deferred evaluation, special forms are interpreted differently).

For these reasons, I’m not in favor of specifying a type expression with a keyword argument.

I think the most common use of this PEP (by far) is going to be for closed TypedDicts with no extra items allowed. The use of __extra_items__ is a more advanced and less-common usage. If we were to use the keyword argument to specify the extra items, it would presumably require that you specify Never if you want the TypedDict to be truly “closed”. That would make the common use case more difficult to understand (instead of the easy-to-understand closed=True).

I just finished implementing support for the current proposal in pyright. It will be included in the next release (1.1.352). I’ll post here once it’s released so folks can try it out. I find that I can’t always predict how I’m going to like a new proposal until I’ve had a chance to actually use it in a real use case with the real tooling support. Let’s see how you feel about the current proposal once you’ve had a chance to use it firsthand.
