Not a fully thought-out idea, but what about a typing import like extra_items, where setting __getitem__=extra_items[str | bytes] in the functional form behaves as if you had set str | bytes as the extra items type and provides valid __getitem__ behavior at runtime?
This PEP has languished unresolved for a while and we need to get it finished. We need to figure out the spelling for the concept.
Current PEP proposal
# Must contain exactly one key
class Movie(TypedDict, closed=True):
    name: str

# May contain arbitrary extra keys of type `bool`
class Movie(TypedDict, closed=True):
    name: str
    __extra_items__: bool

# As above, but all the extra items are read-only
class Movie(TypedDict, closed=True):
    name: str
    __extra_items__: ReadOnly[bool]

# Contains a key `__extra_items__` of type bool
# (type checkers could warn about this)
class Movie(TypedDict):
    name: str
    __extra_items__: bool
- Con: The __extra_items__ key becomes special. It is easy to make a mistake and forget closed=True when using __extra_items__. Type checkers could warn about this, but then what if you actually want __extra_items__ as a key?
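To make the "regular key" case concrete, here is a minimal sketch using only the standard-library TypedDict, which gives __extra_items__ no special meaning. The movie title is just an illustrative value.

```python
from typing import TypedDict


# Without closed=True, __extra_items__ is simply another declared key.
class Movie(TypedDict):
    name: str
    __extra_items__: bool


# Both keys are required, like any other item in a total TypedDict.
movie: Movie = {"name": "Blade Runner", "__extra_items__": True}
print(sorted(Movie.__required_keys__))
```

This is the situation a type checker warning would have to distinguish from a forgotten closed=True.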
Shantanu’s proposal
(In a few posts above. I extended some edge cases with what seemed to me the intuitive behavior.)
# Must contain exactly one key
class Movie(TypedDict, closed=True):
    name: str

# May contain arbitrary extra keys of type `bool`
class Movie(TypedDict, closed=True):
    name: str
    def __getitem__(self, key: str) -> bool: ...

# As above, but keys are read-only
class Movie(TypedDict, closed=True):
    name: str
    def __getitem__(self, key: str) -> ReadOnly[bool]: ...

# Type checker error: Cannot use __getitem__ on a non-closed TypedDict
class Movie(TypedDict):
    name: str
    def __getitem__(self, key: str) -> bool: ...

# Contains a key "__getitem__" of type str and other keys of type bool
class TD(TypedDict, closed=True):
    __getitem__: str
    def __getitem__(self, key: str) -> bool: ...
- Con: Creates a new, special-case concept that doesn’t have a lot of parallels elsewhere. Several odd edge cases: If __getitem__ is used as a key, annotations in the class body look like they conflict with the method name. It works at runtime but still looks odd. Similarly, returning ReadOnly[] from a method looks odd and lacks parallels elsewhere.
- Con: The presence of the __getitem__ method would also affect how type checkers interpret other operations (e.g., __setitem__).
- Observation: We’ll have to think about more edge cases. For example, what if the argument to __getitem__ is annotated as Literal["some", "strings"]? Or a subclass of str? Or an int?
Based on the above, I think I’d prefer to stick with the existing syntax and submit the PEP to the Typing Council. However, if someone is interested in championing Shantanu’s suggestion and resolving all the edge cases, we can still consider it too.
In the interest of completeness, let me enumerate a full list of options.
Option 1: Use __extra_items__ and closed=True as proposed in the current draft of the PEP.
class Movie(TypedDict, closed=True):
    name: str
    __extra_items__: ReadOnly[bool]
Option 2: Use a custom __getitem__ override to specify additional key values as suggested by Shantanu.
class Movie(TypedDict, closed=True):
    name: str
    def __getitem__(self, key: str) -> ReadOnly[bool]: ...
Option 3: For now, drop the idea of supporting arbitrary additional keys and introduce only closed=True. This would imply that no additional keys are present. It’s a subset of the current PEP. This covers most of the use cases that motivated this PEP.
class Movie(TypedDict, closed=True):
    name: str
Option 4: Specify the extra value type as a TypedDict keyword argument with a name like extra_values. If unspecified, extra_values would default to Never if the TypedDict is closed and object if it isn’t closed.
class Movie(TypedDict, closed=True, extra_values=ReadOnly[bool]):
    name: str
I previously pushed back on option 4 because I didn’t want to see us add more places in the Python type system where a value expression was treated like a type expression. However, since that time we’ve made good progress in clarifying the concept of a type expression and specifying where they can appear in the grammar, so I’m more comfortable with that proposal now. The other concern with option 4 was that it would be problematic for some future short-cut syntax for describing TypedDict types, but I think this objection applies to all four options.
Of these four options, I’m slightly negative on option 2 for the reasons that Jelle mentions above. The other three options (1, 3 and 4) all seem reasonable to me.
Option 3 is the most conservative because it tackles only part of the problem, but it leaves open the possibility of a future extension. If we think that this will cover the vast majority of use cases that prompted this PEP, maybe we should start here and defer adding the additional functionality.
One issue with Option 4 is that the type for the extra items would be eagerly evaluated even in 3.14. Therefore, if you were to include a forward reference in the type, you’d still have to quote the type.
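A minimal sketch of why a class keyword argument forces quoting: keyword arguments in a class statement are evaluated when the statement runs. Base, Early, Quoted, and Rating are hypothetical names for illustration; Base is a plain class that accepts a keyword through __init_subclass__, not TypedDict itself.

```python
class Base:
    """Hypothetical stand-in for a class that takes a type as a class keyword."""

    def __init_subclass__(cls, extra_values=object, **kwargs):
        super().__init_subclass__(**kwargs)
        cls.extra_values = extra_values


eager_error = None
try:
    # The keyword argument is evaluated immediately, so the not-yet-defined
    # name Rating raises NameError while the class statement executes.
    class Early(Base, extra_values=Rating):
        pass
except NameError as exc:
    eager_error = str(exc)


# Quoting the forward reference avoids the NameError; resolving the string
# later is then up to the type checker or runtime library.
class Quoted(Base, extra_values="Rating"):
    pass


class Rating:
    pass
```

By contrast, an annotation in the class body can be deferred, which is why the class-body spellings do not share this problem.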
I do like Option 4 the best conceptually; the idea is that you modify the TypedDict type, and an argument to the class constructor feels like the best place for that.
I’m not sure it is worthwhile to add support only for closed=True. The early adopters [1] [2] of the experimental implementations, while a small sample, all seem to use __extra_keys__ already. This contradicts what I assumed in Supporting TypedDict(extra=type).
I would prefer going with Option 4 if using a type expression there is no longer the main concern. With Option 1, the issues seem rooted in the inelegance of making an otherwise regular key special. In contrast, the issues with Option 4 seem to be evolving in a positive direction (glad to see all the progress on the typing spec!).
Regarding the still-present issue with forward references, I believe the concern is that it adds a burden on type checkers, right? String literals need to be special-cased for extra_values, and this behavior (PEP 563) will not be dropped until the EOL of Python 3.13 in 2029-10.
I’m open to exploring Option 2, but at the moment Option 4 seems more practical.
Of these, options 2 and 4 seem the most viable, and 4 seems the most straightforward. Option 2 invites overloads on __getitem__ with literals, which I would rather not let users define at this time. I think it would be better in the long term for function composition if, at a later date when more important things are handled, type checkers synthesized this and a few other TypedDict methods appropriately, so that passing around bound methods as callbacks is visibly type-safe to type checkers.
Of the options Eric presented, I find Option 4 the nicest-looking, but I think we’d need to think more about what the various combinations of closed and extra_values mean, and which one(s) preserve the existing sort-of-closed-sort-of-not TypedDict behavior (which I find quite ugly, but I assume we need to keep it around for backwards compatibility).
With the __extra_items__ option, the expected behavior is pretty clear:
Definition | Behavior |
---|---|
closed=True, no __extra_items__ | Fully closed TypedDict |
closed=True, __extra_items__ present | Closed TypedDict with typed extra items |
closed=False | Existing TypedDict behavior |
With extra_values, there are two options that feel sensible to me:
- We could allow extra_values only when closed=True. Then Option 4 maps in an obvious way onto Option 1.
- We could get rid of closed altogether, and have extra_values behave as follows:
class Movie(TypedDict, extra_values=Never):
    # Fully closed TypedDict
    ...

class Movie(TypedDict, extra_values=SomeType):
    # Closed TypedDict with typed extra items
    ...

class Movie(TypedDict):
    # Existing TypedDict behavior
    ...
Thanks! That would be reasonable. The closed idea was proposed earlier in this thread. Let’s recount some of the issues closed was intended to resolve.
It allows us to use the special __extra__ key on a TypedDict; that’s no longer an issue if we use extra_items/extra_values instead.
Discoverability was one of the motivations for closed too; that would be a non-issue for extra_items/extra_values.
Another concern that closed attempted to resolve was covering the simple use case with a simpler syntax. This still applies to extra_items/extra_values.
For use cases where no extra values need to be specified, closed=True might look simpler than extra_values=Never to some. We don’t have enough data to tell whether the “simple” case will be common enough to make closed=True more favorable, but I agree that combining closed and extra_values comes with its own flavor of mental overhead.
I will work on a revision later this week spec’ing Option 4 without closed, because most of the issues closed addressed aren’t there anymore.
The extra_values= option is my least favorite, especially if we believe that extra_values=Never is the most common case (which sounds likely to me).
First, it requires an extra import of Never from typing, so the common case looks like this:
from typing import TypedDict, Never

class Movie(TypedDict, extra_values=Never):
    ...
I think this is significantly more verbose and less clean-looking compared to closed=True:
from typing import TypedDict

class Movie(TypedDict, closed=True):
    ...
Second, closed=True aligns well with the existing total=False flag, which arguably improves consistency and makes this easier to learn and remember:
class Movie(TypedDict, total=False):
    ...
Third, the use of Never feels a bit too clever. Never is a fairly specialized concept, and I bet most typing users rarely use it. Also, I’d argue that in this use case it’s used in a somewhat unusual way, so looking up the current definition of Never in the docs might not clear things up. We might need to explain the use case in the documentation entry for Never, which feels less than ideal.
I think __extra_items__ would be okay, even if verbose, since this is the less common use case. I think I’d slightly prefer using _: <type> due to not needing to invent a new name, and its similarity with the match statement.
I agree that closed=True is likely the most common use case and requiring the use of Never for it is a bit obscure.
What if we use the following behavior?
- closed=True: no extra items allowed
- extra_items=T: extra items allowed but must be of type T
- closed=True, extra_items=T: type checker error: cannot combine closed=True with extra_items=
- Neither: arbitrary extra keys may be present (current behavior)
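The four cases above can be written down as a small decision function. This is only a sketch of the proposed rules, not anything a type checker or the typing module implements; extra_items_policy and the _UNSET sentinel are hypothetical names.

```python
_UNSET = object()  # hypothetical sentinel meaning "extra_items was not passed"


def extra_items_policy(closed=False, extra_items=_UNSET):
    """Describe which extra keys the proposed rules would allow."""
    if closed and extra_items is not _UNSET:
        # The combination is rejected outright under this proposal.
        raise TypeError("cannot combine closed=True with extra_items=")
    if closed:
        return "no extra items allowed"
    if extra_items is not _UNSET:
        type_name = getattr(extra_items, "__name__", repr(extra_items))
        return f"extra items allowed but must be of type {type_name}"
    return "arbitrary extra keys may be present"


print(extra_items_policy(closed=True))
print(extra_items_policy(extra_items=bool))
print(extra_items_policy())
```

A sentinel is used instead of None so that a caller could still pass None as a type.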
This would conflict with TypedDicts that use _ as a key (which feels a lot more likely than __extra_items__).
I feel that the impact of this potential conflict is overstated. It seems rather unlikely that you’d actually both allow extra keys and need there to be a _ key. And there is a workaround in the rare case that you actually do need it.
The workaround doesn’t seem any worse to me than being forced to use the functional syntax for keys you can’t spell with the class syntax, like the far more commonly used class key. I’d rather have a clear and concise syntax that has a parallel in the match statement than sacrifice ergonomics for some edge case.
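For reference, this is what the functional-syntax workaround looks like today with the standard-library TypedDict. The key names are illustrative; the point is that keys such as class or _ that are awkward or impossible in the class body are plain strings here.

```python
from typing import TypedDict

# Functional syntax: keys are ordinary strings, so keywords and
# non-identifier names are allowed.
Foo = TypedDict(
    "Foo",
    {
        "name": str,
        "_": int,               # literal "_" as a key
        "class": str,           # Python keyword as a key
        "tricky.name?": float,  # not a valid identifier
    },
)

foo: Foo = {"name": "x", "_": 1, "class": "A", "tricky.name?": 1.5}
print(sorted(Foo.__required_keys__))
```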
That being said, I like your proposal as well. I prefer your proposal over the __extra_items__ key, but I would slightly prefer the _ key over the extra_items class parameter.
This seems pretty reasonable to me. I agree that closed=True is more readable than extra_items=Never.
As a side note, the more I look at it, the less I like the combination of closed=True with extra items, simply because it conflicts with my intuition of what a “closed” class should be. (Surely if a class is closed, all its attributes should be statically known?) So I like that closed and extra_items are mutually exclusive in this proposal.
Another possibility there: if closed=True then you can’t add extra items, so their type is irrelevant. In that case, writing the combination could just be a linter rule (“this argument is pointless”) rather than an error.
That’s fair, though _ as a key is still probably quite rare. Your proposal seems pretty reasonable.
The issue with key conflicts got me thinking about a more general solution to work around key naming limitations. We already don’t support all possible keys when using the class-based syntax, and thus we probably can’t deprecate the functional syntax. What if we’d add a new way of specifying arbitrary string keys using the class-based-syntax? Here’s one idea:
class Foo(TypedDict):
    name: str  # Regular item
    _: bool  # Type of extra items
    __items__ = {
        "_": int,  # Literal "_" as a key
        "class": str,  # Keyword as a key
        "tricky.name?": float,  # Arbitrary str key
    }
(The name __items__ is just the first thing that came to my mind; the specifics of the name aren’t important.)
This may go beyond the original goals of the PEP, but this would have some nice properties:
- Arbitrary key names can be supported using the class-based syntax.
- We can hopefully deprecate the functional TypedDict syntax.
- We can support forward references in the extra item type without escaping.
- All TypedDict items (including the extra ones) are defined using a similar syntax in the common case where __items__ is not needed.
- We have the option of adding arbitrary magic key names such as _, since name conflicts can be worked around easily.
This would make the proposal a bit bigger, but on the other hand, we could deprecate the functional syntax, so arguably this would simplify the overall non-deprecated TypedDict functionality.
I like the __items__ concept and its potential uses, such as replacing the functional syntax. But I feel that feature probably deserves a separate discussion thread and a PEP.
Making closed and extra_items incompatible as class parameters seems most viable right now.
Couldn’t __items__ be used for extra typing information too?
class Foo(TypedDict):
    name: str  # Regular item
    __items__ = {
        str: bool,  # like "fallback to dict[str, bool] for extra keys"
        "__items__": str,  # as key in dict
        "class": str,  # Keyword as a key
        "tricky.name?": float,  # Arbitrary str key
    }
The PEP has been updated to specify the closed and extra_items=T proposal.
closed works a bit like total, as it only allows a literal True or False. The value of closed itself is not inherited, but it does implicitly set extra_items=Never. This makes it an error to subclass a closed=True TypedDict without explicitly setting closed=True again.
extra_items is quite similar to __extra_items__, except that it is not compatible with closed on the same TypedDict definition.
The revised proposal has an open issues section because I think there is some interest in the __items__ idea, or some other ones, but there are some concerns around them:
Quoting Jelle’s comment:
I feel this isn’t a strong argument; if this PEP is accepted, we’ll be stuck with its behavior for at least many years, so we need to make sure we get it right the first time.
While this proposal is nice because it also unlocks some other things that are currently awkward or impossible (keys that aren’t valid identifiers), it has a few disadvantages:
- It's less apparent to a reader that _: bool makes the TypedDict special, relative to adding a class argument like extra_items=bool.
- It's backwards incompatible with existing TypedDicts using the _: bool key. While such users have a way to get around the issue, it's still a problem for them if they upgrade Python (or typing-extensions).
- The types don't appear in an annotation context, so their evaluation will not be deferred.
I agree that _: bool might not be apparent. Its similarity to match statements and its brevity are appealing, but TypedDict lacks a case keyword that would make it stand out from regular keys.
I think reintroducing the closed class argument from the __extra_items__ proposal would help with the backwards compatibility issue, but that seems less nice when you can already write _ as a regular key with __items__.
I’ve published pyright 1.1.386, which contains support for the revised draft of PEP 728.
You’ll need to set “enableExperimentalFeatures” to true in the pyright config.
Here’s an example in the pyright playground.
I have a different concern about PEP 728. [1] I think that this section does not adequately consider the soundness reasons for the restrictions on index signatures in TypeScript that the PEP proposes to discard.
Unlike index signatures, the PEP proposes that the “extra items” type does not need to be assignable from the types of known keys. But this makes __setitem__ on a TypedDict unsound: [2]
class Movie(TypedDict):
    name: str
    __extra_items__: bool

def set_movie_metadata(movie: Movie, key: str, value: bool):
    movie[key] = value

movie = Movie({"name": "Monty Python's Life of Brian"})
set_movie_metadata(movie, "name", False)  # no type error!
# oops! now movie["name"] is a bool, not a string
The problem here is that Literal["name"] is assignable to str, which means we can have a variable typed as str whose runtime value is actually "name". This means if we are setting a value for a key that is typed as an arbitrary string, we may be setting a value for an extra item, or we may be setting a value for a known item; a type checker has no way to know.
The PEP links this issue on the TypeScript issue tracker to suggest that this limitation was a mistake in TypeScript, and thus we should not copy it. But I think this is a misreading of that issue. There doesn’t appear to be serious consideration in that issue of simply lifting the limitation and allowing the unsoundness. The latest proposal instead is to use negation types to allow setting an “extra” key only when the key is of type str - Literal["name"]; that is, a string that the type checker knows cannot be the string "name". (You could imagine that an if key != "name": narrowing check were added to set_movie_metadata above, to make it safe.) But without subtraction types in the Python type system, it isn’t possible for type checkers to enforce this.
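A runtime-only sketch of that guard is shown here. It uses a plain dict[str, object] as a stand-in for the TypedDict, and the boolean return value is an addition for illustration. The check protects the known key at runtime, but a type checker without subtraction types cannot verify that callers rely on it.

```python
def set_movie_metadata(movie: dict[str, object], key: str, value: bool) -> bool:
    """Set an extra item, refusing to overwrite the known "name" key."""
    if key != "name":
        movie[key] = value
        return True
    return False


movie: dict[str, object] = {"name": "Monty Python's Life of Brian"}

set_movie_metadata(movie, "name", False)      # rejected: "name" keeps its str value
set_movie_metadata(movie, "is_comedy", True)  # accepted: stored as an extra item
print(movie)
```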
I don’t feel comfortable with introducing this hole. I would prefer if the PEP did require, like TypeScript, that the type of __extra_items__ must be assignable from the types of all known keys.
EDIT: Oops, that restriction still doesn’t close the hole; if known keys can have a narrower type than extra-items, the hole demonstrated above still exists. The TypeScript restriction is only sufficient to make __getitem__ safe from this problem, not __setitem__. I’m not actually sure how the extra-items feature can be made safe at all, without intersection and subtraction types.
I saw some discussion suggesting that the most useful feature in the PEP is closed; perhaps it would be more advisable to add only that feature, for now?
[1] Apologies if this has been previously discussed; prior discussions of the PEP are quite lengthy to read through in full; I’m relying here on the intended property of PEP discussions that outcomes of substantial discussion and concerns raised should be reflected in the PEP text!
[2] Hat tip to @samwgoldman for bringing this issue to my attention.
Could the type checkers synthesize a number of overloads for __setitem__?
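As a rough idea of what such synthesized overloads might look like if written by hand, here is a sketch. MovieLike and store are hypothetical names, and MovieLike is an ordinary dict subclass rather than a TypedDict. Note that a key typed as plain str is not assignable to Literal["name"], so it would select the second overload, which suggests overloads alone would not prevent the "name" key from being overwritten.

```python
from typing import Literal, overload


class MovieLike(dict):
    """Hypothetical stand-in for the overloads a checker might synthesize."""

    @overload
    def __setitem__(self, key: Literal["name"], value: str) -> None: ...
    @overload
    def __setitem__(self, key: str, value: bool) -> None: ...
    def __setitem__(self, key, value):
        # Runtime behavior is unchanged; the overloads only affect checking.
        super().__setitem__(key, value)


def store(movie: MovieLike, key: str, value: bool) -> None:
    # key is a plain str here, so the (str, bool) overload applies.
    movie[key] = value


movie = MovieLike(name="Monty Python's Life of Brian")
store(movie, "name", False)  # accepted by the overloads, yet overwrites "name"
print(movie)
```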