Hmm, but, percentages of what, exactly? Lines of code? Calls? Parameters? (It’s worth noting that the figures my script gives are for parameters, but could easily count calls instead - or as well.) If calls, do you count calls that have even a single parameter that could be done this way?
My personal best guess is “calls, and counting any number of affected parameters”. If you feel other metrics would be useful, pick something quantifiable and we can look. But I’ll whip up some stats. Here’s the stdlib:
Total function calls: 265734
Calls with any kwarg: 21009 7.91%
Calls with any 'x=x': 2849 1.07%
- compared to kwarg: 2849 13.56%
So by that metric, roughly one in seven of all function calls with any kwargs at all would be able to use this syntax. Not sure how meaningful that is, but there you have it.
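For anyone who wants to reproduce the call-level numbers, here is a minimal sketch of how such a count could be done with the `ast` module. It is not the script used above. The names `is_pun` and `count_calls` are made up for illustration, and it assumes that `**kwargs` unpacking does not count as a keyword argument.

```python
import ast


def is_pun(kw: ast.keyword) -> bool:
    """Return True for a keyword argument of the form name=name."""
    return (
        kw.arg is not None  # kw.arg is None for **mapping unpacking
        and isinstance(kw.value, ast.Name)
        and kw.value.id == kw.arg
    )


def count_calls(source: str) -> dict:
    """Count calls, calls with any named kwarg, and calls with any x=x."""
    counts = {"calls": 0, "any_kwarg": 0, "any_pun": 0}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        counts["calls"] += 1
        # Assumption: only explicitly named keywords are counted.
        named = [kw for kw in node.keywords if kw.arg is not None]
        if named:
            counts["any_kwarg"] += 1
        if any(is_pun(kw) for kw in named):
            counts["any_pun"] += 1
    return counts


# Hypothetical sample input; the code is only parsed, never executed.
sample = """
spam(1, 2)
spam(x=x, y=1)
spam(x=y)
spam(**opts)
spam(inner(a=a), b=c)
"""
result = count_calls(sample)
print(result)
```

Note that nested calls such as `inner(a=a)` are counted as calls in their own right, which is one of the choices that changes the percentages.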
I think number of hits per line of code would be a good metric for getting a feel for future PEPs in comparison to other PEPs? Touching many lines doesn’t necessarily make a PEP good, but it gives an idea of how impactful a PEP might be?
Some other Python features that might be easy to adapt your script for might be:
Definitely possible. But I would have to get some indication of how many lines COULD be affected by other proposals, which is harder than simply checking how many times assignment expressions are used or how many match keywords there are (since the stdlib doesn’t accept “pure churn” commits).
I also suspect that nobody will really know what the numbers mean anyway. What counts as “meaningful” or “impactful” for one proposal might be quite meaningless for another; for example, assignment expressions are extremely useful in regular expression matching, but keyword argument matching is going to be more useful in APIs based around lots of different parameters. How do you compare those? What percentage of LOC, or percentage of calls, or any other metric, makes sense? I’m kinda at a loss here.
Thanks for the updated script! I was thinking of all keyword parameters; comparing how many x=x against how many x=y. But even so it’s clear that this has significant impact, as you find that around 10% of the keyword parameters you looked at could be simplified.[1]
[1] if you accept that this proposal makes things simpler
Total function calls: 265734
Calls with any kwarg: 21009 7.91%
Maximum kwargs count: 22
Calls with any 'x=x': 2849 1.07%
- compared to kwarg: 2849 13.56%
Maximum num of 'x=x': 10
Total keyword params: 35221 0.13 per call
Num params where x=x: 3858 10.95%
So when they’re used, they’re used a lot. Somewhere in the stdlib, there’s a call passing 22 kwargs… and yet the average is just 0.13 per call.
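The parameter-level figures (x=x versus x=y, plus the per-call maxima) can be sketched the same way. Again this is an illustration rather than the actual script: `keyword_stats` and its result keys are invented names, and `**kwargs` unpacking is assumed not to count as a keyword parameter.

```python
import ast


def keyword_stats(source: str) -> dict:
    """Tally keyword parameters and how many have the form name=name."""
    total = puns = max_kwargs = max_puns = 0
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        # Skip **mapping unpacking, where kw.arg is None.
        named = [kw for kw in node.keywords if kw.arg is not None]
        same = [
            kw for kw in named
            if isinstance(kw.value, ast.Name) and kw.value.id == kw.arg
        ]
        total += len(named)
        puns += len(same)
        max_kwargs = max(max_kwargs, len(named))
        max_puns = max(max_puns, len(same))
    return {
        "total_params": total,
        "pun_params": puns,
        "max_kwargs": max_kwargs,
        "max_puns": max_puns,
    }


# Hypothetical sample input; the code is only parsed, never executed.
sample = "f(a=a, b=b, c=1)\ng(d=e)\nh(1, 2)\n"
stats = keyword_stats(sample)
share = 100 * stats["pun_params"] / stats["total_params"]
print(stats, f"{share:.2f}% of keyword params are x=x")
```

The denominator here is keyword parameters, not calls, which is why this percentage differs from the call-level ones.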
Sorry, I guess I don’t really understand the KW_NAMES opcode. For some reason, I thought that function calls would have to load both the keyword name and the variable of the same name in two separate opcodes. But dis.dis(lambda:f(second=a, first=b, third=c)) produces:
I expected to see first being loaded separately, and I thought we might be able to combine opcodes if that were the case. I guess the keywords are being loaded by KW_NAMES? So all the keywords use only one operation, and there probably isn’t as big a saving as I thought.
Right, the keyword names are not treated as expressions. In 3.13 (on the main branch) we’re doing it somewhat differently again. Probably not worth pursuing more.
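One way to see this without depending on the exact opcode names, which differ between CPython versions, is to inspect the code object directly. This is a small sketch; it assumes a CPython from 3.6 onward, where, as far as I know, the keyword names of a call are stored together as one constant tuple. The names `f`, `a`, `b` and `c` are never defined because the lambda is never called.

```python
import dis

# The lambda is only compiled, never called, so f, a, b and c
# do not need to exist.
code = (lambda: f(second=a, first=b, third=c)).__code__

# The keyword names are stored together as one constant tuple...
kw_names = ("second", "first", "third")
names_are_one_constant = kw_names in code.co_consts

# ...while the argument values are ordinary name lookups.
values_are_names = {"a", "b", "c"} <= set(code.co_names)

print(names_are_one_constant, values_are_names)

# The full listing; its exact form depends on the Python version.
dis.dis(code)
```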
Yeah - they’re a tuple (a function constant). I’m not sure why your dis.dis didn’t show that; on mine, there’s a parenthesized annotation out the right hand side showing it, and:
I like the idea, but not the syntax the submitter proposed. It seems to invite ambiguity in the language syntax (which will confuse linters, syntax highlighters, and third-party tooling that relies on the existing syntax), and it risks typos causing undesirable behavior if a developer puts a comma in the wrong place.
The benefit of the walrus operator is that it provided a syntactic construct that resembles ordinary assignment, but is different enough to be told apart from other valid variable assignment syntax.
Could a special set of tokens (like :=) be used instead to explicitly note that variable punning is taking place?
It seems to be called a “pun” because it’s about using the same identifier in different namespaces. Apparently this kind of Haskell code is also referred to as punning:
data Point a = Point a a
where the first “Point” is a “type constructor” (something like the name of the type) and the second “Point” the “data constructor” (something like a constructor).
Nice, thanks. I guess to restate what you said, a “pun” is generally where you say something which you intend to be interpreted in two ways, akin to how here the variable should be interpreted both as an argument name and the local variable which gives that argument its value.
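In Python terms, the double reading looks like this in a call written with today’s syntax. The function `connect` and its parameters are hypothetical and exist only for illustration.

```python
def connect(host, port, timeout):
    """Hypothetical function, used only to show the repeated names."""
    return f"{host}:{port} (timeout={timeout})"


host, port, timeout = "localhost", 8080, 30

# In each pair, the left-hand name is the parameter of connect() and the
# right-hand name is the local variable supplying its value. The proposal
# discussed in this thread is about not writing the name twice.
result = connect(host=host, port=port, timeout=timeout)
print(result)
```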