Syntactic sugar to encourage use of named arguments

Rosuav · October 22, 2023, 10:57pm

Hmm, but, percentages of what, exactly? Lines of code? Calls? Parameters? (It’s worth noting that the figures my script gives are for parameters, but could easily count calls instead - or as well.) If calls, do you count calls that have even a single parameter that could be done this way?

My personal best guess is “calls, and counting any number of affected parameters”. If you feel other metrics would be useful, pick something quantifiable and we can look. But I’ll whip up some stats. Here’s the stdlib:

Total function calls: 265734
Calls with any kwarg: 21009 7.91%
Calls with any 'x=x': 2849 1.07%
 - compared to kwarg: 2849 13.56%

Updated script: https://github.com/Rosuav/shed/blob/master/find_kwargs.py

So by that metric, roughly one in seven of all function calls with any kwargs at all would be able to use this syntax. Not sure how meaningful that is, but there you have it.
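For readers who want to see roughly what is being counted, here is a minimal sketch using the standard `ast` module. It is not the linked script: the function name `count_kwarg_calls` and the dictionary keys are made up for illustration, and it only handles the "calls" metric described above.

```python
import ast

def count_kwarg_calls(source: str) -> dict:
    """Count calls, calls with any keyword argument, and calls with any 'x=x'."""
    totals = {"calls": 0, "with_kwarg": 0, "with_same_name": 0}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        totals["calls"] += 1
        # kw.arg is None for **mapping unpacking, which carries no keyword name
        named = [kw for kw in node.keywords if kw.arg is not None]
        if named:
            totals["with_kwarg"] += 1
        # 'x=x' means the value is a bare name identical to the keyword
        if any(isinstance(kw.value, ast.Name) and kw.value.id == kw.arg
               for kw in named):
            totals["with_same_name"] += 1
    return totals

sample = "f(a, b)\ng(x=x, y=1)\nh(z=other)\nk(**opts)\n"
print(count_kwarg_calls(sample))
```

In the sample, only `g(x=x, y=1)` counts as an 'x=x' call, and a call counts once no matter how many of its parameters match.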

NeilGirdhar · October 22, 2023, 11:10pm

I think number of hits per line of code would be a good metric for getting a feel for future PEPs in comparison to other PEPs? Touching many lines doesn’t necessarily make a PEP good, but it gives an idea of how impactful a PEP might be?

Some other Python features that might be easy to adapt your script for might be:

Generic classes (and thus PEP 695)
Assignment expressions (PEP 572)
Match statements (PEP 636)

Rosuav · October 23, 2023, 2:15am

Definitely possible. But I would have to get some indication of how many lines COULD be affected by other proposals, which is harder than simply checking how many times assignment expressions are used or how many match keywords there are (since the stdlib doesn’t accept “pure churn” commits).

I also suspect that nobody will really know what the numbers mean anyway. What counts as “meaningful” or “impactful” for one proposal might be quite meaningless for another; for example, assignment expressions are extremely useful in regular expression matching, but keyword argument matching is going to be more useful in APIs based around lots of different parameters. How do you compare those? What percentage of LOC, or percentage of calls, or any other metric, makes sense? I’m kinda at a loss here.

ntessore · October 23, 2023, 6:43am

Thanks for the updated script! I was thinking of all keyword parameters; comparing how many x=x against how many x=y. But even so it’s clear that this has significant impact, as you find that around 10% of the function calls you looked at could be simplified (if you accept that this proposal makes things simpler).

Rosuav · October 23, 2023, 7:01am

Okay!

Total function calls: 265734
Calls with any kwarg: 21009 7.91%
Maximum kwargs count: 22
Calls with any 'x=x': 2849 1.07%
 - compared to kwarg: 2849 13.56%
Maximum num of 'x=x': 10
Total keyword params: 35221 0.13 per call
Num params where x=x: 3858 10.95%

So when they’re used, they’re used a lot. Somewhere in the stdlib, there’s a function with 22 kwargs… and yet the average is just 0.13 per call.
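Since the question of "percentages of what" came up earlier, this short check recomputes each ratio from the raw counts quoted above, to make the denominators explicit. The variable names are illustrative only; the numbers are copied from the figures in this post.

```python
# Raw counts copied from the stdlib figures quoted above.
total_calls = 265734
calls_with_kwarg = 21009
calls_with_same_name = 2849      # calls with at least one 'x=x'
total_keyword_params = 35221
params_same_name = 3858          # individual 'x=x' parameters

ratios = {
    # share of all calls that pass any keyword argument
    "kwarg calls / all calls": f"{calls_with_kwarg / total_calls:.2%}",
    # share of all calls that contain any 'x=x'
    "x=x calls / all calls": f"{calls_with_same_name / total_calls:.2%}",
    # same numerator, but relative to calls that use keywords at all
    "x=x calls / kwarg calls": f"{calls_with_same_name / calls_with_kwarg:.2%}",
    # average number of keyword parameters per call
    "keyword params per call": f"{total_keyword_params / total_calls:.2f}",
    # share of keyword parameters written as 'x=x'
    "x=x params / keyword params": f"{params_same_name / total_keyword_params:.2%}",
}

for label, value in ratios.items():
    print(f"{label}: {value}")
```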

davidism · October 23, 2023, 3:30pm

9 posts were merged into an existing topic: Syntactic sugar and unpacking of dictionaries

hels15 · October 23, 2023, 7:23pm

Any ideas on what this syntactic sugar should be called?

NeilGirdhar · October 23, 2023, 11:23pm

Is there an opportunity to make this common pattern more efficient by providing a new opcode?

For example, instead of:

  1           0 LOAD_CONST               1 ('keyword')
              2 LOAD_GLOBAL              0 (keyword)

would it be more efficient to have

  1           0 LOAD_CONST_AND_GLOBAL               0 ('keyword')

Since matching named arguments are so common, could this make a noticeable performance difference?

guido · October 23, 2023, 11:54pm

Uh, what Python code would correspond to those two bytecodes?

NeilGirdhar · October 24, 2023, 12:06am

Sorry, I guess I don’t really understand the KW_NAMES opcode. For some reason, I thought that function calls would have to load both the keyword name and the variable of the same name in two separate opcodes. But dis.dis(lambda:f(second=a, first=b, third=c)) produces:

  1           0 RESUME                   0
              2 LOAD_GLOBAL              1 (NULL + f)
             14 LOAD_GLOBAL              2 (a)
             26 LOAD_GLOBAL              4 (b)
             38 LOAD_GLOBAL              6 (c)
             50 KW_NAMES                 1
             52 PRECALL                  3
             56 CALL                     3
             66 RETURN_VALUE

I expected to see first being loaded separately, and I thought we might be able to combine opcodes if that were the case. I guess the keywords are being loaded by KW_NAMES? So all the keywords only use up one operation, so there probably isn’t as big a saving as I thought.

guido · October 24, 2023, 12:33am

Right, the keyword names are not treated as expressions. In 3.13 (on the main branch) we’re doing it somewhat differently again. Probably not worth pursuing more.

Rosuav · October 24, 2023, 12:34am

Yeah - they’re a tuple (a function constant). I’m not sure why your dis.dis didn’t show that; on mine, there’s a parenthesized annotation out the right hand side showing it, and:

>>> (lambda:f(second=a, first=b, third=c)).__code__.co_consts
(None, ('second', 'first', 'third'))
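A small self-contained way to reproduce this is sketched below. The function name `call_with_keywords` is made up for illustration, and the exact opcodes printed by `dis` differ between Python versions, so no disassembly output is shown here. What should hold across recent versions is that the keyword names appear together as one tuple constant.

```python
import dis

def call_with_keywords():
    # f, a, b and c are never defined; this function is only compiled
    # and inspected, never called.
    return f(second=a, first=b, third=c)

code = call_with_keywords.__code__

# The keyword names are stored as a single tuple in the code object's
# constants rather than being loaded one by one as expressions.
keyword_tuples = [c for c in code.co_consts if isinstance(c, tuple)]
print(keyword_tuples)

# The opcode names in this listing depend on the interpreter version.
dis.dis(call_with_keywords)
```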

TomRitchford · October 25, 2023, 2:55pm

What about “implicit named arguments”?

EDIT:

Neil Girdhar elsewhere suggested matching named argument elision.

guido · October 25, 2023, 3:21pm

“Abbreviated keyword arguments”?

joshuabambrick · October 25, 2023, 4:26pm

In other languages, this feature has sometimes been called ‘punning’.

That said, it’s not very clear to me why, and the term seems a little overloaded with type punning.

ngie · October 25, 2023, 5:59pm

This is an interesting proposal.

I like the idea, but not the syntax the submitter proposed. It seems to invite ambiguity in the language syntax (which will confuse linters, syntax highlighters, and third-party tooling that relies on the existing syntax), and it risks typos causing undesirable behavior if a developer puts a comma in the wrong place.

The benefit of the walrus operator is that it provided a syntactic construct which was similar to existing assignment expressions, but different enough that it could be discerned from other valid variable assignment syntax.

Could a special set of tokens (like :=) be used instead to explicitly note that variable punning is taking place?

ngie · October 25, 2023, 6:02pm

Elided keyword arguments?

hels15 · October 25, 2023, 6:42pm

shorthand-keyword-arguments? Similar terminology in JS and Swift.

tmk · October 25, 2023, 10:34pm

It seems to be called a “pun” because it’s about using the same identifier in different namespaces. Apparently this kind of Haskell code is also referred to as punning:

data Point a = Point a a

where the first “Point” is a “type constructor” (something like the name of the type) and the second “Point” the “data constructor” (something like a constructor).

joshuabambrick · October 25, 2023, 11:32pm

Nice, thanks. I guess to restate what you said, a “pun” is generally where you say something which you intend to be interpreted in two ways, akin to how here the variable should be interpreted both as an argument name and the local variable which gives that argument its value.