Not a fully thought-out idea, but what about a typing import like extra_items, where setting __getitem__=extra_items[str | bytes] in the functional form behaves as if you had set str | bytes as the extra items type and provides valid __getitem__ behavior at runtime?
This PEP has languished unresolved for a while and we need to get it finished. We need to figure out the spelling for the concept.
Current PEP proposal
# Must contain exactly one key
class Movie(TypedDict, closed=True):
    name: str

# May contain arbitrary extra keys of type `bool`
class Movie(TypedDict, closed=True):
    name: str
    __extra_items__: bool

# As above, but all the extra items are read-only
class Movie(TypedDict, closed=True):
    name: str
    __extra_items__: ReadOnly[bool]

# Contains a key `__extra_items__` of type bool
# (type checkers could warn about this)
class Movie(TypedDict):
    name: str
    __extra_items__: bool
- Con: The __extra_items__ key becomes special. It is easy to make a mistake and forget closed=True when using __extra_items__. Type checkers could warn about this, but then what if you actually want __extra_items__ as a key?
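For reference, here is a small sketch of how the last case above behaves at runtime today, assuming a standard-library typing.TypedDict with no support for this PEP. It only uses existing introspection attributes and does not show any of the proposed behavior: __extra_items__ is simply another annotated key.

```python
from typing import TypedDict


class Movie(TypedDict):
    name: str
    __extra_items__: bool  # no closed=True, so this is an ordinary key


# Names that start and end with double underscores are not name-mangled,
# so the key is stored verbatim in the class annotations.
print(sorted(Movie.__annotations__))
print("__extra_items__" in Movie.__required_keys__)

# At runtime a TypedDict value is a plain dict.
movie: Movie = {"name": "Blade Runner", "__extra_items__": True}
print(type(movie) is dict)
```
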
Shantanu’s proposal
(In a few posts above. I extended some edge cases with what seemed to me the intuitive behavior.)
# Must contain exactly one key
class Movie(TypedDict, closed=True):
    name: str

# May contain arbitrary extra keys of type `bool`
class Movie(TypedDict, closed=True):
    name: str
    def __getitem__(self, key: str) -> bool: ...

# As above, but keys are read-only
class Movie(TypedDict, closed=True):
    name: str
    def __getitem__(self, key: str) -> ReadOnly[bool]: ...

# Type checker error: Cannot use __getitem__ on a non-closed TypedDict
class Movie(TypedDict):
    name: str
    def __getitem__(self, key: str) -> bool: ...

# Contains a key "__getitem__" of type str and other keys of type bool
class TD(TypedDict, closed=True):
    __getitem__: str
    def __getitem__(self, key: str) -> bool: ...
- Con: Creates a new, special-case concept that doesn’t have a lot of parallels elsewhere. Several odd edge cases: If __getitem__ is used as a key, annotations in the class body look like they conflict with the method name. It works at runtime but still looks odd. Similarly, returning ReadOnly[] from a method looks odd and lacks parallels elsewhere.
- Con: The presence of the __getitem__ method would also affect how type checkers interpret other operations (e.g., __setitem__).
- Observation: We’ll have to think about more edge cases. For example, what if the argument to __getitem__ is annotated as Literal["some", "strings"]? Or a subclass of str? Or an int?
Based on the above, I think I’d prefer to stick with the existing syntax and submit the PEP to the Typing Council. However, if someone is interested in championing Shantanu’s suggestion and resolving all the edge cases, we can still consider it too.
In the interest of completeness, let me enumerate a full list of options.
Option 1: Use __extra_items__ and closed=True as proposed in the current draft of the PEP.
class Movie(TypedDict, closed=True):
    name: str
    __extra_items__: ReadOnly[bool]
Option 2: Use a custom __getitem__ override to specify additional key values as suggested by Shantanu.
class Movie(TypedDict, closed=True):
    name: str
    def __getitem__(self, key: str) -> ReadOnly[bool]: ...
Option 3: For now, drop the idea of supporting arbitrary additional keys and introduce only closed=True. This would imply that no additional keys are present. It’s a subset of the current PEP. This covers most of the use cases that motivated this PEP.
class Movie(TypedDict, closed=True):
    name: str
Option 4: Specify the extra value type as a TypedDict keyword argument with a name like extra_values. If unspecified, extra_values would default to Never if the TypedDict is closed and object if it isn’t closed.
class Movie(TypedDict, closed=True, extra_values=ReadOnly[bool]):
    name: str
I previously pushed back on option 4 because I didn’t want to see us add more places in the Python type system where a value expression was treated like a type expression. However, since that time we’ve made good progress in clarifying the concept of a type expression and specifying where they can appear in the grammar, so I’m more comfortable with that proposal now. The other concern with option 4 was that it would be problematic for some future short-cut syntax for describing TypedDict types, but I think this objection applies to all four options.
Of these four options, I’m slightly negative on option 2 for the reasons that Jelle mentions above. The other three options (1, 3 and 4) all seem reasonable to me.
Option 3 is the most conservative because it tackles only part of the problem, but it leaves open the possibility of a future extension. If we think that this will cover the vast majority of use cases that prompted this PEP, maybe we should start here and defer adding the additional functionality.
One issue with Option 4 is that the type for the extra items would be eagerly evaluated only in 3.14. Therefore, if you were to include a forward reference in the type, you’d still have to quote the type.
I do like Option 4 the best conceptually; the idea is that you modify the TypedDict type, and an argument to the class constructor feels like the best place for that.
I’m not sure it is worthwhile if we only add support for closed=True. The early adopters [1] [2] of the experimental implementations, while a small sample, all seem to use __extra_keys__ already. This contradicts what I assumed in Supporting TypedDict(extra=type).
I would prefer going with Option 4 if using a type expression there is no longer the main concern. The issues with Option 1 seem to be rooted in the inelegance of making an otherwise regular key special. In contrast, the issues with Option 4 seem to be evolving in a positive direction (glad to see all the progress on the typing spec!)
Regarding the still-present issue with forward references, I believe the concern is that it adds a burden on type checkers, right? String literals need to be special-cased for extra_values, and this behavior (PEP 563) will not be dropped until the EOL of Python 3.13 in 2029-10.
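A rough illustration of the forward-reference concern, using a plain class rather than TypedDict: class keyword arguments are ordinary expressions evaluated when the class statement runs, so a name that is not yet defined fails immediately unless it is quoted. Base and the way it stores extra_values are hypothetical stand-ins, not the proposed TypedDict behavior.

```python
class Base:
    """Hypothetical stand-in for a class that accepts an extra_values keyword."""

    def __init_subclass__(cls, extra_values=object, **kwargs):
        super().__init_subclass__(**kwargs)
        cls.extra_values = extra_values  # stored as given, never resolved here


try:
    # Node is not defined yet, and class keywords are evaluated eagerly.
    class Tree(Base, extra_values=Node):  # noqa: F821
        pass
except NameError as exc:
    error = exc
else:
    error = None


# Quoting the name defers resolution to whoever later interprets the string.
class QuotedTree(Base, extra_values="Node"):
    pass


class Node:
    pass


print(type(error).__name__)
print(repr(QuotedTree.extra_values))
```

In this sketch the quoted form is just a string at runtime, which is why a type checker would have to special-case string literals in that position.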
I’m open to exploring Option 2, but at the moment Option 4 seems more practical.
Of these, options 2 and 4 seem the most viable. Option 4 seems the most straightforward. Option 2 invites overloads on __getitem__ with literals, which I would rather not let users define at this time. I think it would be better for function composition in the long term if, at a later date once more important things are handled, type checkers synthesized this and a few other TypedDict methods themselves, so that passing bound methods around as callbacks is visibly type-safe to type checkers.
Of the options Eric presented, I find Option 4 the nicest-looking, but I think we’d need to think more about what the various combinations of closed and extra_values mean, and which one(s) preserve the existing sort-of-closed-sort-of-not TypedDict behavior (which I find quite ugly, but I assume we need to keep it around for backwards compatibility).
With the __extra_items__ option, the expected behavior is pretty clear:
Definition | Behavior |
---|---|
closed=True, no __extra_items__ | Fully closed TypedDict |
closed=True, __extra_items__ present | Closed TypedDict with typed extra items |
closed=False | Existing TypedDict behavior |
With extra_values, there are two options that feel sensible to me:
- We could allow extra_values only when closed=True. Then Option 4 maps in an obvious way onto Option 1.
- We could get rid of closed altogether, and have extra_values behave as follows:
class Movie(TypedDict, extra_values=Never):
    ...  # Fully closed TypedDict

class Movie(TypedDict, extra_values=SomeType):
    ...  # Closed TypedDict with typed extra items

class Movie(TypedDict):
    ...  # Existing TypedDict behavior
Thanks! That would be reasonable. The closed idea was proposed earlier in this thread. Let’s recount some of the issues closed was intended to resolve.
It allows us to use the special __extra__ key on a TypedDict; that’s no longer an issue if we use extra_items/extra_values instead.
Discoverability was one of the motivations for closed too; that would be a non-issue for extra_items/extra_values.
Another previous concern that closed attempted to resolve was covering the simple use case with a simpler syntax. This still applies to extra_items/extra_values.
For use cases where no extra values need to be specified, closed=True might look simpler than extra_values=Never to some. We don’t have enough data to show whether the “simple” case will be common enough to make closed=True more favorable, but I agree that combining closed and extra_values comes with its own flavor of mental overhead.
I will work on a revision later this week spec’ing Option 4 without closed, because most of the issues closed addressed aren’t there anymore.
I find the extra_values= option my least favorite, especially if we believe that extra_values=Never is the most common case (which sounds likely to me).
First, it requires an extra import of Never from typing, so the common case looks like this:
from typing import TypedDict, Never

class Movie(TypedDict, extra_values=Never):
    ...
I think this is significantly more verbose and less clean-looking compared to closed=True:
from typing import TypedDict

class Movie(TypedDict, closed=True):
    ...
Second, closed=True aligns well with the existing total=False flag, which arguably improves consistency and makes this easier to learn and remember:
class Movie(TypedDict, total=False):
    ...
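For comparison, total=False is an existing class keyword in the standard library, and its effect is visible through introspection. A short sketch (the key names are only illustrative):

```python
from typing import TypedDict


class Movie(TypedDict, total=False):
    name: str
    year: int


# total=False marks every key declared in this class body as optional.
print(Movie.__total__)
print(sorted(Movie.__optional_keys__))
print(sorted(Movie.__required_keys__))
```
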
Third, the use of Never feels a bit too clever. Never is a fairly specialized concept, and I bet most typing users rarely use it. Also, I’d argue that in this use case it’s used in a somewhat unusual way, so looking up the current definition of Never in the docs might not clear things up. We might need to explain the use case in the documentation entry for Never, which feels less than ideal.
I think __extra_items__ would be okay, even if verbose, since this is the less common use case. I think I’d slightly prefer using _: <type> due to not needing to invent a new name, and similarity with the match statement.
I agree that closed=True is likely the most common use case and requiring the use of Never for it is a bit obscure.
What if we use the following behavior?
- closed=True: no extra items allowed
- extra_items=T: extra items allowed but must be of type T
- closed=True, extra_items=T: type checker error: cannot combine closed=True with extra_items=
- Neither: arbitrary extra keys may be present (current behavior)
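The four cases above can be sketched as a small runtime check. This is only an analogy for what a static type checker would decide, and check_extra_items, its parameters, and the _UNSET sentinel are hypothetical names invented for illustration. It also simplifies "of type T" to an isinstance test, which is cruder than a real type checker.

```python
from typing import Any, Mapping

_UNSET = object()  # hypothetical sentinel meaning "extra_items was not given"


def check_extra_items(
    data: Mapping[str, Any],
    declared: Mapping[str, type],
    *,
    closed: bool = False,
    extra_items: Any = _UNSET,
) -> bool:
    """Return True if the undeclared keys in data satisfy the rules above."""
    if closed and extra_items is not _UNSET:
        raise TypeError("cannot combine closed=True with extra_items=")
    extras = {k: v for k, v in data.items() if k not in declared}
    if closed:
        return not extras  # no extra items allowed
    if extra_items is not _UNSET:
        # Extra items are allowed but must match the given type.
        return all(isinstance(v, extra_items) for v in extras.values())
    return True  # neither: arbitrary extra keys may be present


declared = {"name": str}
movie = {"name": "Blade Runner", "year": 1982}

print(check_extra_items(movie, declared, closed=True))
print(check_extra_items(movie, declared, extra_items=int))
print(check_extra_items(movie, declared, extra_items=str))
print(check_extra_items(movie, declared))
```
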
This would conflict with TypedDicts that use _ as a key (which feels a lot more likely than __extra_items__).
I feel that the impact of this potential conflict is overstated. It seems rather unlikely that you’d actually both allow extra keys and need there to be a _ key. And there is a workaround in the rare case that you actually do need it.
The workaround doesn’t seem any worse to me than being forced to use the functional syntax for keys you can’t spell with the class syntax, like the far more commonly used class key. I’d rather have a clear and concise syntax that has a parallel in the match statement than sacrifice ergonomics for some edge case.
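For readers unfamiliar with the functional syntax mentioned above, a minimal sketch using the existing standard-library TypedDict (the type name and key names are only examples):

```python
from typing import TypedDict

# Keys such as "class" (a keyword) or "tricky.name?" (not an identifier)
# cannot be written as annotations in a class body, but the functional
# syntax accepts any string key.
Foo = TypedDict("Foo", {"name": str, "class": str, "tricky.name?": float})

foo: Foo = {"name": "x", "class": "first", "tricky.name?": 1.5}
print(sorted(Foo.__annotations__))
print(foo["class"])
```
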
That being said, I like your proposal as well. I prefer your proposal over the __extra_items__ key, but I would slightly prefer the _ key over the extra_items class parameter.
This seems pretty reasonable to me. I agree that closed=True is more readable than extra_items=Never.
As a side note, the more I look at it, the less I like the combination of closed=True with extra items, simply because it conflicts with my intuition of what a “closed” class should be. (Surely if a class is closed, all its attributes should be statically known?) So I like that closed and extra_items are mutually exclusive in this proposal.
Another possibility there: if closed=True
then you can’t add extra items, so their type is irrelevant. In that case, writing the combination could just be a linter rule (“this argument is pointless”) rather than an error.
That’s fair, though _ as a key is still probably quite rare. Your proposal seems pretty reasonable.
The issue with key conflicts got me thinking about a more general solution to work around key naming limitations. We already don’t support all possible keys when using the class-based syntax, and thus we probably can’t deprecate the functional syntax. What if we’d add a new way of specifying arbitrary string keys using the class-based-syntax? Here’s one idea:
class Foo(TypedDict):
    name: str  # Regular item
    _: bool  # Type of extra items
    __items__ = {
        "_": int,  # Literal "_" as a key
        "class": str,  # Keyword as a key
        "tricky.name?": float,  # Arbitrary str key
    }
(The name __items__ is just the first thing that came to my mind; the specifics of the name aren’t important.)
This may go beyond the original goals of the PEP, but this would have some nice properties:
- Arbitrary key names can be supported using the class-based syntax.
- We can hopefully deprecate the functional TypedDict syntax.
- We can support forward references in the extra item type without escaping.
- All TypedDict items (including the extra ones) are defined using a similar syntax in the common case where __items__ is not needed.
- We have the option of adding arbitrary magic key names such as _, since name conflicts can be worked around easily.
This would make the proposal a bit bigger, but on the other hand, we could deprecate the functional syntax, so arguably this would simplify the overall non-deprecated TypedDict functionality.
I like the __items__ concept and its potential uses such as replacing the functional syntax. But I feel that feature probably deserves a separate discussion thread and a PEP.
Making closed and extra_items incompatible as class parameters seems most viable right now.
Couldn’t __items__ be used for extra typing information too?
class Foo(TypedDict):
    name: str  # Regular item
    __items__ = {
        str: bool,  # like "fallback to dict[str, bool] for extra keys",
        "__items__": str,  # as key in dict
        "class": str,  # Keyword as a key
        "tricky.name?": float,  # Arbitrary str key
    }