PEP 692: Using TypedDict for more precise **kwargs typing

brettcannon · September 28, 2022, 7:46pm

Speaking of PEP 646, has that at all motivated __typing_unpack__ as well? There’s no mention of that being a concern previously.

I also now remember why PEP 646 feels a bit different compared to this PEP: the extra syntactic use of * in other indexing contexts. This PEP only affects syntax for type checking while PEP 646 opened things up for non-typing uses.

Jelle · September 28, 2022, 8:58pm

PEP 646 didn’t need a new dunder because there was an obvious runtime meaning for * in that context: iteration. So a *T annotation is approximately equivalent to next(iter(T)) (see PEP 646 – Variadic Generics | peps.python.org for details). For kwargs, there isn’t a similar dunder that works well, which is why we are suggesting a new one.

PEP 646 is certainly much broader than 692, although the non-typing benefits of 646 are fairly marginal.

thomas · November 14, 2022, 5:02pm

(Speaking for myself, not the SC, although these questions did come out of discussions with the rest of the SC).

I think the use-case for TypedDict for **kwargs is sensible, and a logical extension of both PEP 646 and PEP 589, but I have strong reservations about the syntactic change. Specifically, the PEP introduces a new meaning of ** in an expression (but only in one specific context), something that’s in line with the use of ** for parameter lists but very different from ** elsewhere in expressions. So I have a few pointed questions:

How much does using ** actually win here? The alternative is to use Unpack, which doesn’t seem that bad, especially since it’s an established annotation. Is the use of Unpack really confusing enough, or common enough, or error-prone enough to warrant special syntax? Is the use of ** really obvious enough, and discoverable enough, to be less confusing or less error-prone?
How bad would it be to invalidate type annotations on **kwargs that aren’t Unpack? Do people actually want the current behaviour of annotating **kwargs? Regardless of whether they do, is the current behaviour actually desirable?
If the current behaviour isn’t desirable, how bad would it be to transition to using TypedDict semantics by default on **kwargs annotations? How many annotations of **kwargs using the old semantics are in fact correct, rather than misunderstandings or overly broad annotations like Any?
I feel like at some point in the journey to type-annotate all Python code we will end up discovering mistakes of the past or better ways of doing things, so having transition processes is probably a good idea. Something similar in concept, if not design, as __future__ imports. Considering the simplicity of the Unpack alternative to the new syntax, and assuming a transition is warranted, would this not be a good testing ground for such a process?

Just so it’s clear, the SC hasn’t decided yet on the PEP, one way or another, but I would like some clarity on the questions above to help make the decision.

franekmagiera · November 14, 2022, 6:37pm

It is difficult to give an answer that is not subjective. To me, using ** is intuitive and more concise than Unpack. It’s been proposed both on the original GitHub issue as well as during one of the typing meetups and it seemed to get neutral/positive responses.

It is hard to come up with a motivating example for the current behavior off the top of my head. In general it might be useful for very specific cases where a function expects a variable amount of keyword arguments of the same type and doesn’t care about the keyword names. It doesn’t seem to be a very prevalent use case. At the same time, I remember playing with sourcegraph a couple of weeks back, and I recall that there were a lot of examples of, as you’ve put it, overly broad annotations like Any and misunderstandings like **kwargs: Movie (where Movie is a TypedDict - I assume the author did not mean for every keyword argument to be a TypedDict). At the same time I am wondering to what extent those mistakes stem from the fact that trying something like **kwargs: **Movie is illegal.

I think, given the assumptions you’ve mentioned, that it would be a good testing ground for such a process. That said, it would be a major change and definitely would require a lot of effort.

Also, I think it is worth repeating what Guido has mentioned regarding *args and **kwargs

I think this is what makes the current proposal “intuitive”.

Jelle · November 14, 2022, 6:42pm

Personally I don’t think it’s extremely important, and I’d be OK with the SC approving the new type system feature proposed in the PEP while rejecting the syntax change. The biggest argument in favor of the syntax change is consistency: PEP 646 added new syntax for typing *args with *args: *tuple[int, ...], and it would be odd if the analogous construct for **kwargs didn’t have analogous syntax.

I believe it is uncommon for **kwargs to be annotated as anything other than Any or occasionally object. I looked in our internal codebase and found half a dozen annotations as dict[str, Any] or similar (probably incorrect) and three where the annotation was something meaningful like int. I definitely believe it would have been better if the annotation for **kwargs had worked like you suggest from the beginning.

But changing this now would be the worst kind of backward compatibility break, where currently working code would suddenly mean something else. So to change it, we’d need a careful plan.

Also, if we change this, we’d want to make an analogous change to the meaning of *args annotations.

That’s an interesting idea but we’d have to think more about how to approach it. I suppose it could just be a flag to typecheckers that initially defaults to off, then gets switched after a few releases.

Jelle · November 21, 2022, 4:15pm

At last week’s typing-sig meeting, we discussed the idea of changing the meaning of **kwargs annotations and the reception was negative. I am planning to add an entry about this to PEP 692’s “Rejected Ideas” section but haven’t gotten around to it yet.

tmk · December 1, 2022, 11:16am

Why not just use a __future__ import as the flag for typecheckers?

Something like

from __future__ import explicit_args_kwargs

def f(*args: tuple[int, ...], **kwargs: dict[str, float]) -> bool: ...

but the future import wouldn’t actually do anything (other than be a signal for a type checker).

The new syntax introduced in PEP 646 would still be needed for things like

from typing import Generic, TypeVarTuple

Shape = TypeVarTuple('Shape')
class Array(Generic[*Shape]): ...

but the **kwargs: **MyDict syntax wouldn’t be required anymore, because it would just be **kwargs: MyDict. (And for a TypeVarTuple for *args, it would simply be *args: Ts instead of *args: *Ts.)

EDIT: and if someone doesn’t like the __future__ import, they can always just use Unpack[...].

EDIT2: to make this into a concrete proposal:

the future import is added to Python 3.12
- type checkers switch to the new behavior in presence of the import
after Python 3.11 has reached EOL, the next Python version (3.16?) makes the new behavior default
- type checkers usually have a target_version that specifies which Python version they target (I found such a config option in mypy, pyright, pyre (undocumented though), pycharm and pytype)
- if target_version is set to 3.16 (or whatever the version after 3.12 EOL is), then type checkers should use the new semantics; for target versions below that, like 3.12, type checkers should look for the future import

The most potential disruption would then happen when people start to target Python 3.16 with their type checkers.
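As a rough sketch of the decision rule in that proposal (hypothetical names, not any type checker’s actual API), the logic might look like this. The 3.16 cutoff is just the guess from the list above.

```python
# Hypothetical decision rule a type checker might apply; not an existing API.
DEFAULT_FROM = (3, 16)  # placeholder taken from the "(3.16?)" guess above


def uses_explicit_semantics(
    target_version: tuple[int, int], has_future_import: bool
) -> bool:
    """Return True if *args/**kwargs annotations name the whole container."""
    if target_version >= DEFAULT_FROM:
        # New behavior is the default for sufficiently new targets.
        return True
    # Older targets opt in per module via the (hypothetical) future import.
    return has_future_import


print(uses_explicit_semantics((3, 12), has_future_import=False))  # False
print(uses_explicit_semantics((3, 12), has_future_import=True))   # True
print(uses_explicit_semantics((3, 16), has_future_import=False))  # True
```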

Jelle · December 7, 2022, 3:56am

I submitted “PEP 692: Add changing the meaning of **kwargs annotations to rejected ideas” (Pull Request #2916, python/peps on GitHub), adding this to the Rejected Ideas section for PEP 692.

thomas · December 12, 2022, 8:43pm

(Speaking for the outgoing SC, as elections are ongoing, not the next SC.)

Perhaps I’m reading the room badly here, but it doesn’t seem like there’s a strong argument for the syntactic change, and the rest of the PEP is well-contained to just typing and doesn’t seem to lose much without the syntax… If the syntax changes (and associated things, like __typing_unpack__) were left out of this PEP (or moved to a separate PEP), the current SC would be happy to accept this PEP or delegate it. If there is a desire (now or later) to add the syntax, having it as a separate PEP might make the discussion easier.

guido · December 12, 2022, 9:03pm

I’m disappointed – I didn’t speak up since it felt Jelle was presenting all the right arguments, but since there’s talk of “reading the room” maybe that was a mistake. The **kwds syntax felt natural to me and the alternative, using Unpack, a crutch. But of course in the end using Unpack would be better than not having this feature at all, so if you’re willing to delegate the rest back to Jelle and me, we can rework the PEP to remove the new syntax and __typing_unpack__ and then approve it without further SC input (current or future).

franekmagiera · December 12, 2022, 10:33pm

I agree that having Unpack is better than nothing in this case and will respect any decision that SC goes for.

That said, the arguments for the syntax change would be:

Using ** has been proposed both on the original GitHub issue as well as during one of the typing meetups and it seemed to get neutral/positive responses.
We already have a similar mechanism for *args, why not extend it to **kwargs as well?
Given the two points above, it seems to me like this behavior would be quite natural. It has been for me, though that is difficult to judge in general.

I’m also curious, maybe because I can’t come up with any better arguments - what would you consider a strong argument for a syntax change in this case, if not those two above? Why aren’t the arguments above enough to be comfortable with embracing the change?

Also, maybe I’m ignorant, please correct me if I’m wrong, but when it comes to changing the syntax in one specific place (i.e. the type annotations of **kwargs in function signatures) - are the consequences in case that feature is not widely used (which tbh I don’t think will be the case) really that dire?

Melendowski · December 12, 2022, 11:37pm

Was there no motivation in PEP 646 (*) and PEP 604 (|) to reduce the verbosity of using type hints? Wouldn’t that reduction similarly apply to PEP 692’s **?

brettcannon · December 13, 2022, 12:21am

The key difference with the PEP 646 syntax change was it generalized beyond type hints. Thanks to that PEP we now support * unpacking in indexing anywhere in the language where we previously didn’t. This PEP specifically only opens up a new syntactic possibility just for type hints which is a harder sell, especially when we have tried to not have the type hint syntax deviate from the language overall so knowledge in both directions translates.

Sure, in its specific context of type hinting and the history of what **kwargs without the added syntax means, it does make sense. But the SC has to watch out for the entire language, typing or not, which makes any syntactic addition an expensive and tough call to make, especially when it doesn’t apply outside of typing.

Look at it from the perspective of someone coming across typing code for the first time and they see this construct. Will they expect to be able to use ** in other contexts? Going back to PEP 646, that syntactic construct generalized out to any use of [] when indexing on an object. This is why approving this PEP has been a struggle for the SC and Thomas came back to say, “if you drop the syntax you can have the Unpack semantics today, and then do a separate PEP for the syntax”. That gets you the easy win now and lets you formulate how to push for the syntax later without it holding everything up as a total package where it’s all-or-nothing.

cdce8p · December 13, 2022, 12:34am

Dictionary unpacking is part of Python already. That’s one of the reasons why I, and maybe others as well, feel like using it in this typing context is a natural extension.

>>> d = {1: 'Hello', 2: 'World'}
>>> {3: '!', **d}
{3: '!', 1: 'Hello', 2: 'World'}

Maybe the wording in the PEP could be adjusted.

thomas · December 13, 2022, 12:38am

Note that we’re still open to an argument about why the syntax is desirable. The PEP doesn’t make a strong argument, and when I asked, Jelle said “I don’t think it’s extremely important”, and that the biggest argument is consistency. I (and I think the rest of the SC) don’t consider that argument very compelling, since while it’s superficially consistent with the **kwargs parameter definition, it’s very inconsistent with ** in other expression contexts (which type annotations nominally are).

cdce8p · December 13, 2022, 12:59am

That might be true, however I would consider the similarity to *args much more important, which does have the same issue btw. In both cases something is unpacked: for *args a sequence with * (since that’s what’s used in expressions too, as in [1, 2, *(3, 4, 5)]), and for **kwargs a dict with **.

Symmetry to *args
Unpack needs to be imported every time it is used, thus it could quickly become the next most imported symbol from typing after Any.

Over the last few years, there has been a considerable effort to reduce the barrier and make it easier for people to add typehints in Python. One of these aspects was to reduce the amount of names which need to be imported. PEP 585 (list instead of List) and 604 (X | Y) helped enormously and from what I’ve seen are quite popular.

Better typing for *args was a welcome byproduct of PEP 646 but it helped as well. **kwargs was the natural next step.

As a further step, there is also PEP 695 to improve the TypeVar syntax.

layday · December 13, 2022, 9:32am

That’s not quite accurate - allowing * in subscripts could’ve been done without allowing * in function annotations - the two are unrelated.

malemburg · December 13, 2022, 9:33am

This may be an unpopular opinion, but I think that the implicit syntax both for specifying the type of kwargs (implicitly assuming the type refers to the dict value type) and of args (implicitly assuming a tuple of the given type) are not very intuitive to a Python programmer.

The “*” and the “**” are normally seen as “prefix operators” for putting arguments into the variable behind them or to extract this variable and convert its contents to arguments. They are not part of the variable and thus don’t relate to the variable type.

IMO, it would be better to undo this implicit type assumption and be explicit about the container type in both cases.

That way we avoid digging ourselves even deeper into the rabbit hole which was caused by being implicit about the container type.

Yes, this breaks some existing type annotations now, but it’s for the better in the long run. Type checkers should be able to easily spot the implicit use in many cases during the transition.
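To illustrate the implicit container assumption described here, a minimal sketch (collect is a hypothetical function): the annotations name the element types, while the variables themselves are a tuple and a dict.

```python
def collect(*args: int, **kwargs: float):
    # At runtime args is a tuple and kwargs is a dict, whatever the annotations say.
    return args, kwargs


args, kwargs = collect(1, 2, ratio=0.5)
print(type(args).__name__, type(kwargs).__name__)  # tuple dict

# The stored annotations mention only the element types, not the containers.
print(collect.__annotations__)
```

Under the explicit alternative suggested above, the same signature would instead spell out tuple[int, ...] and dict[str, float].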

brettcannon · December 13, 2022, 8:00pm

From a technical perspective, sure, but from an SC-having-to-make-a-decision position they are not. As I said, we have to view all changes from the perspective of the entire language for all users. From that perspective, arguing for just *args in a typing situation isn’t as strong as one that makes sense across the whole language.

franekmagiera · December 27, 2022, 6:29pm

Thank you for all the replies. I think it would make sense to split this PEP into (possibly two) smaller ones. I’ve found some time to prepare the next version of the PEP.