PEP 692: Using TypedDict for more precise **kwargs typing

PEP 692 is posted. Currently, only **kwargs comprising arguments of the same type can be type hinted. The PEP proposes to use TypedDict for typing **kwargs of different types. This feature has gained a lot of interest from the Python community. The PEP also proposes a grammar change and a new dunder __unpack__.
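For readers new to the proposal, here is a minimal sketch of the difference. The proposed `**kwargs: **Movie` syntax is not valid in released Python, so this uses the `Unpack[...]` spelling that the PEP offers for backward compatibility. The `Movie` fields and the function names `total` and `describe` are illustrative assumptions, not taken from the PEP text.

```python
from typing import TypedDict

try:
    from typing import Unpack  # available in the stdlib from Python 3.11
except ImportError:
    from typing_extensions import Unpack  # assumed to be installed on older versions


class Movie(TypedDict):
    name: str
    year: int


# Status quo: the annotation applies to every keyword value, so all share one type.
def total(**kwargs: int) -> int:
    return sum(kwargs.values())


# Proposed meaning: the TypedDict describes the whole kwargs dict, key by key.
def describe(**kwargs: Unpack[Movie]) -> str:
    return f"{kwargs['name']} ({kwargs['year']})"


print(total(a=1, b=2))                            # 3
print(describe(name="Life of Brian", year=1979))  # Life of Brian (1979)
```

Nothing is enforced at runtime here; only a type checker that understands the PEP would reject a call such as `describe(name="x", year="1979")`.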

Please comment and share your thoughts and feedback, especially on the runtime implementation.

Instead of using a new dunder, could you instead use the existing __annotations__? The PEP would need to decide if it then supports any object, or just TypedDict.


Using the new annotation will not have any runtime effect - it should only be taken into account by type checkers.

I’m assuming you are not disallowing the inspection of the type at runtime (ie, the same as the existing type annotations): the wording doesn’t make this clear.

How would using __annotations__ work exactly? Given this code:

def f(**kwargs: **K): pass

What should actually happen at runtime? The PEP’s current answer is that it is equivalent to def f(**kwargs: K.__unpack__()):. Are you suggesting that it become def f(**kwargs: K.__annotations__): instead? I don’t think that would be very intuitive.

Yes, with otherwise the same behaviour

I disagree, but I really don’t feel strongly about it. I just want to make sure it’s considered (and explicitly rejected).

Edit: actually, I think it would become def f(**kwargs: Unpack[K.__annotations__]):

Thanks for your comment!

What would be the advantage of reusing __annotations__ apart from not having to introduce a new dunder?

I think the advantage of introducing __unpack__ is that the name is more intuitive (__annotations__ already has a commonly known meaning) and then __unpack__ and __annotations__ would deal with orthogonal concerns, instead of __annotations__ having to deal with two, sort of related things.

__unpack__ isn’t actually unpacking anything, correct? It just returns the object that will end up in __annotations__?

What are the expectations that the object won’t match what is written in the annotation (e.g. how often will Movie.__unpack__() not return Movie itself)? Is the motivator for __unpack__ to have a special-cased repr?

The only type that will implement __unpack__ in the stdlib will be TypedDict, and it will return typing.Unpack[self], not self.

Returning self would make it impossible for runtime introspection to distinguish between def f(**kwargs: TD) and def f(**kwargs: **TD), but those mean different things.
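The introspection point can be illustrated with the existing `Unpack[...]` spelling standing in for what `__unpack__()` is described as returning. This is only a sketch: `TD` and both function names are made up for the example, and the proposed `**TD` syntax itself cannot be run today.

```python
from typing import TypedDict

try:
    from typing import Unpack  # Python 3.11+
except ImportError:
    from typing_extensions import Unpack  # assumed to be installed on older versions


class TD(TypedDict):
    name: str


def each_value_is_td(**kwargs: TD) -> None: ...


def kwargs_are_td(**kwargs: Unpack[TD]) -> None: ...


plain = each_value_is_td.__annotations__["kwargs"]
unpacked = kwargs_are_td.__annotations__["kwargs"]

print(plain is TD)             # the bare class is stored as-is
print(unpacked is TD)          # the wrapped form is a different object
print(unpacked == Unpack[TD])  # and it can be recognised as the unpacked form
```

If the dunder returned `self`, both annotations would be the same `TD` object and a runtime tool could not tell the two signatures apart.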

Is it too crazy to suggest that ParamSpec be used for this purpose? We already have ParamSpec.kwargs as a way of referring to a bound type’s kwargs.

The trouble as I see it is that a ParamSpec can’t today be assigned a specific callable which it describes.

Disclaimer: I got here by way of an ideas post I just put up (Dynamically building ParamSpecs from callables) and missed the whole mypy thread and past work. I don’t want to disrespect the significant amount of work already done on this topic, but the mypy issue predates PEP 612, so it seems like there might be a missed opportunity to unify the ideas.

I think this is because ParamSpec is a very specialized type variable and not a type.

To be able to use P.args and P.kwargs, P has to be in scope as described in PEP 612.

Because of those, ParamSpec could not be used for this use case.

The way I understand it looking at the example that you proposed in the other thread, I think P would have to be a type that can be created off of a function and would describe just the inputs to the function. We would still need to represent the P.kwargs somehow and it seems like it should be a type as well. Then, we would still have to solve all the problems that this PEP is trying to solve.
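To make the scoping requirement concrete, here is a small sketch of the usual PEP 612 pattern. The decorator `logged` and the function `add` are illustrative names only. `P.args` and `P.kwargs` are usable inside `wrapper` because `P` is bound by the `Callable[P, R]` parameter of the enclosing function; there is no way to write down a standalone `P` that describes one specific function's keywords.

```python
from typing import Callable, TypeVar

try:
    from typing import ParamSpec  # Python 3.10+
except ImportError:
    from typing_extensions import ParamSpec  # assumed to be installed on older versions

P = ParamSpec("P")
R = TypeVar("R")


def logged(func: Callable[P, R]) -> Callable[P, R]:
    # P is in scope here because it appears in the signature of `logged`.
    def wrapper(*args: P.args, **kwargs: P.kwargs) -> R:
        print(f"calling {func.__name__}")
        return func(*args, **kwargs)

    return wrapper


@logged
def add(x: int, *, y: int = 0) -> int:
    return x + y


print(add(1, y=2))  # 3
```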

Could you add that info to the PEP? It’s currently not specified well, and it seems to contradict the claim that **Movie in the example above is the repr of the object that __unpack__() returns. (repr(typing.Unpack[Movie]) is currently just '*Movie'.)

FWIW, reusing the same class for both * and ** unpacking looks like too much of a shortcut to me.
I think the backwards-compatible form is not only useful as a backcompat shim, but also as a way to create the object for run-time introspection, error messages, etc. It could use the extra precision of a dedicated class.

Relatedly, I assume that the example in the Backwards Compatibility section:

def foo(**kwargs: Unpack[Movie]) -> None: ...

was meant to be specified as equivalent to:

def foo(**kwargs: **Movie) -> None: ...

and that should be made explicit, even if the example changes.

One small thought on a piece of the PEP document: the abstract currently states

It also involves introducing a grammar change and a new dunder __unpack__.

When I see references to dunders/special methods, I tend to think of the “big ones” like __iter__, __init__, __reduce__ etc. that control significant kinds of behavior and do something meaningful for a wide variety of types. In many cases these methods are also relevant for users to know because they may want their classes to support some version of that “protocol” (eg iteration or pickling).

In comparison, if I understand right, the __unpack__ method is only called when a function definition is evaluated where a parameter is annotated with a ** expression. And as far as the scope of this PEP goes, there is no intent that it will do anything useful for any types other than TypedDicts. IMO it might be helpful to make it more explicit that the scope of the “new dunder” is really small, because community members might see the text and have objections if they think it’s a larger change to the language’s data model.

Yes, I don’t mean to suggest that using ParamSpec solves any particular problems for this PEP. In fact, it probably introduces more work.

I’m primarily looking at this from the perspective of not having too many different ways to spell the same types.

I don’t believe ParamSpec.kwargs has a meaningful type right now (when I use reveal_type on it, I get Any from mypy). I’m not sure it makes sense to suggest that ParamSpec.kwargs be typed as a TypedDict, since it’s not clear what that would do or mean for type checkers. But it does seem internally consistent, and it may make it easier to describe the possible values for type annotations on **kwargs.

Would it help if the dunder method is named __class_unpack__ instead? This makes it clearer it’s supposed to be called against the class itself, not an instance of it. This is also more analogous to the existing __class_getitem__. We could potentially still make use of the name __unpack__ in the future but that’s out of the scope here.

I really like this PEP. I agree that it’s a little awkward to have a dunder with a “nice” name like __unpack__ for such an esoteric use case, and I like Tzu-ping’s suggestion to rename it __class_unpack__. But I don’t feel strongly about it.

The only other issue I see (other than some copy-editing stuff I won’t bother with here) is that the grammar you specify would disallow a trailing comma after **kwargs: **Movie – this is very minor, but the existing **kwargs and **kwargs: SomeType grammar rules do allow that trailing comma.
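The existing behaviour is easy to confirm with the standard `ast` module. This sketch only checks the current `**kwargs: SomeType,` form; it says nothing about the PEP's proposed rule, and the function name `f` is arbitrary.

```python
import ast

# A trailing comma after an annotated **kwargs parameter parses fine today.
source = "def f(**kwargs: int,): pass"
tree = ast.parse(source)

func = tree.body[0]
print(type(func).__name__)   # FunctionDef
print(func.args.kwarg.arg)   # kwargs
```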

Assuming we can get agreement on the unpack dunder I think this is a good PEP and would welcome it for Python 3.12. I also really appreciate that you added a prototype implementation for mypy!

I think it’s a good idea, but I’m torn when it comes to deciding whether introducing a dedicated class would really be worth the effort. It would mostly behave like Unpack, but the repr would be different (also if we reuse Unpack, we would need to change its repr). I guess it makes sense to introduce a new special form - how about UnpackKwargs?

Makes sense, not sure if I’ll be able to come up with phrasing concise enough to be included in an abstract but will try.

I’m not sure that those should be mixed though. As I mentioned earlier, ParamSpec is a very specialized type variable and in the context of this PEP, a concrete type is required. Reusing ParamSpec would make its specification very convoluted in my opinion.

I think it’s a good suggestion.

Didn’t know about that! Will change the rule to allow for the trailing comma.

Thanks for all the comments and feedback! I’ll try to spend some time tomorrow to include all the suggested changes in the PEP. If anyone has any additional comments, please share!

I’m not a fan of this; it will be somewhat confusing for users who use the backward-compatible Unpack[] form before Python 3.12.

I agree that __unpack__ is too imprecise, but the analogy with __class_getitem__ doesn’t really hold: the new dunder doesn’t have special lookup rules. What about __dict_unpack__ or __kwargs_unpack__?

+1 to __kwargs_unpack__ as that is suggestive of the specific context in which the method is called.

(There are lots of other names we could bikeshed :sweat_smile: but I guess you have to balance wanting to be specifically descriptive against verbosity.)

__kwargs_unpack__ suggests to me a new API that controls how unpacking an instance as **kwargs works. It does not suggest that it’s only about how a type annotation will apply to **kwargs. I anticipate confusion with users wanting to customize how their object is unpacked, and reaching for this new method instead of implementing collections.abc.Mapping.
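For contrast, this is roughly what customizing instance unpacking looks like today, with no new dunder involved. `Settings` and `connect` are hypothetical names used only for this sketch.

```python
from collections.abc import Mapping


class Settings(Mapping):
    """A hypothetical read-only container that can be splatted with **."""

    def __init__(self, **values):
        self._values = dict(values)

    # Mapping requires these three methods; ** unpacking relies on them.
    def __getitem__(self, key):
        return self._values[key]

    def __iter__(self):
        return iter(self._values)

    def __len__(self):
        return len(self._values)


def connect(host: str, port: int) -> str:
    return f"{host}:{port}"


print(connect(**Settings(host="localhost", port=8080)))  # localhost:8080
```

This is a purely runtime protocol on instances, which is a different concern from how a type annotation on `**kwargs` is interpreted.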

Perhaps a better name is __typing_unpack__, or __typing_unpack_map__ to distinguish it from some possible __typing_unpack_seq__.

Alternatively, why implement a new method for this at all? It seems like you could just special-case TypedDict to be a valid type annotation applying to **kwargs as a whole instead of applying to each item as it would now. **kwargs is only ever a dict, it wouldn’t make sense for any other type to be applied to **kwargs in this way.

That’s a good point - I like __typing_unpack__, it suggests that there should be no runtime effect.

Could you elaborate on that? I don’t understand how that would work.

The way I think about it: currently, typing **kwargs with a TypedDict means that each individual keyword argument is itself a TypedDict. Therefore, we need a new syntax, **kwargs: **Movie, or Unpack[Movie], so we can tell the “all keywords are typed dicts” case apart from the “keywords are represented by a typed dict” case. The new syntax will create a new AST node, so static analyzers will be able to differentiate the two. At runtime, anyone who uses __annotations__ should also be able to differentiate the two, so something needs to create an “unpacked” version of the TypedDict to put in __annotations__.
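The two readings look like this at the call site. This is a sketch using the `Unpack[Movie]` spelling; the `Movie` fields and the function names are assumptions for illustration.

```python
from typing import TypedDict

try:
    from typing import Unpack  # Python 3.11+
except ImportError:
    from typing_extensions import Unpack  # assumed to be installed on older versions


class Movie(TypedDict):
    name: str
    year: int


# "All keywords are typed dicts": every keyword's value is a whole Movie.
def count_movies(**kwargs: Movie) -> int:
    return len(kwargs)


# "Keywords are represented by a typed dict": the keywords are Movie's keys.
def title_of(**kwargs: Unpack[Movie]) -> str:
    return kwargs["name"]


print(count_movies(first={"name": "Alien", "year": 1979},
                   second={"name": "Brazil", "year": 1985}))  # 2
print(title_of(name="Alien", year=1979))                       # Alien
```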

After writing this, I realized - are you proposing to change the current behaviour so that whenever **kwargs are typed with a TypedDict instance, it should be interpreted as “keywords are represented by the typed dict” and there would be no way to do “all keywords are typed dicts”?