PEP 736: Shorthand syntax for keyword arguments at invocation

joshuabambrick · January 17, 2024, 2:11pm

This is a discussion thread for PEP 736 which is currently in draft.

Previous threads have generated a lot of conversation already so it would be really appreciated if you could read the PEP and previous discussion before contributing and confirm whether your suggestion has already been addressed satisfactorily.

Thanks so much to everyone who has contributed their thoughts so far. Some minor suggestions have been omitted from this post for brevity but if you feel I’m missing a critical point that was previously made, please send a Discuss message.

Noted PEP feedback

The following feedback on the PEP itself has been noted and I will do my best to address it in a subsequent edit:

Correct ‘named variables’ to ‘named arguments’
Clarify our response to ‘explicit is better than implicit’ objection
Address the balance of coupling semantically distinct variables vs avoiding desynchronising semantically equivalent ones
Explain the impact on editing code and IDEs

Ongoing discussion

The publication of the PEP sparked a few conversations on the previous thread. I’ll mention them and give my own current take here, but if you have substantive contributions which will further the discussion on these points, please do offer them here.

Chosen syntax

Much of the previous discussion has centred on the particular syntax chosen for the PEP. The most common alternative proposal is f(x, *, y). I think this is the strongest alternative that has been proposed as it closely resembles the keyword-only syntax of function definitions and it was the second most popular in the poll.

I’m not personally wedded to the syntax presented in the PEP (indeed that syntax is different from the one I originally proposed). However, so far, I haven’t seen any concrete benefits of any alternative which are not already described in the PEP; instead, most of the emphasis is on stylistic preference. The weight of merits as presented in the PEP still seems stronger for the f(x=) syntax.
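For readers new to the thread, here is a minimal sketch of what the two candidate spellings are meant to stand for. Neither is valid in current Python, so they appear only in comments; `render`, `width` and `height` are names invented for this illustration.

```python
def render(title, *, width, height):
    # Hypothetical function with two keyword-only parameters.
    return f"{title}: {width}x{height}"


width, height = 80, 24

# Today: each keyword argument repeats the local variable name.
current = render("demo", width=width, height=height)

# Spelling proposed in the PEP (not valid syntax today):
#     render("demo", width=, height=)
# Most common alternative from the earlier thread (also not valid today):
#     render("demo", *, width, height)
# Both are intended to mean exactly the call above.
print(current)
```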

Debate on adherence to ‘explicit is better than implicit’

I recognise that the explanation given in the PEP is inadequate and that it is true that, in an obvious sense, the argument value is ‘implicit’ in our proposed syntax. However, I do not think that this is what the Zen of Python is trying to discourage.

In the sense that I take the Zen to be referring to, keyword arguments (for example) are objectively more explicit than positional arguments where the argument name is omitted and impossible to tell from the local context. Conversely, the syntactic sugar for integers x += 1 is not more implicit than x = x + 1 in this sense, even though the variable is omitted from the right hand side, because it is immediately obvious from the local context what it is. The syntax proposed in this PEP is much more analogous to the second example, and is designed in part to encourage use of keyword arguments which are more explicit than positional ones.

I’m unconvinced that we’re going to make any more progress on this so I’d not recommend much more discussion on this area. I will edit the PEP to be clearer.

Overall merits and potential for misuse

As ever, all syntax has the potential for misuse and so should be used judiciously. In this case, if a parameter and its value have the same semantics in both contexts, that may suggest that using this new syntax is appropriate. If not, that may suggest that they should have different names. Our analysis of popular repos showed that the former is at least very common.

The status quo in Python encourages developers (e.g., me, at a minimum) to use shorter and less descriptive names to save keystrokes or use positional arguments to reduce visual clutter. We argue that this new syntax presents a valuable nudge towards use of keyword arguments and will ameliorate the risk of desynchronisation of semantically equivalent variables in different contexts which harms readability. Whether the risk of misuse outweighs the benefits of the proposed syntax enumerated in the PEP (the degree of which is hard to measure, as with any hypothetical change) is a judgement for the SC to make. I’m open to suggestions of objective evidence that could help shed light on this.

Jelle · January 17, 2024, 2:30pm

I like the idea. Functions with many keyword arguments are common in lots of real-world Python code, and this will help make calls a little less verbose and therefore easier for humans to understand.

The proposed syntax is good because it marks each individual keyword argument clearly. The alternative f(*, a, b) syntax would make it harder to see at a glance that a is this new special kind of keyword argument.

I would caution against arguing too much on what exactly the Zen of Python should have to say here. The Zen is useful for each of us to think about as we consider changes to the language, but it’s inherently open to interpretation, and nobody is going to convince anyone by arguing over that interpretation.

pf_moore · January 17, 2024, 3:22pm

Thanks for the updated PEP and new thread. The following are some comments, intended to re-state issues where I remain against the proposal. But as I said on the original thread, I haven’t the time to go back through everything and check if all that I said has been addressed. I remain -1 on the PEP as a whole, regardless.

“Encourages use of named variables”. I’m not clear what “named variables” is intended to mean here, but I don’t see why it’s less readable to say (for example) velocity=distance/time in a function call’s parameter list as opposed to saying velocity= coupled with an assignment velocity=distance/time. In fact, the latter is less readable, because you need to look elsewhere to find the value. And yes, I know you can say “don’t use the syntax in that case, then”. But that just reduces the argument to how often the construct is appropriate, and how frequently it will be misused by people who are, let’s say, “overenthusiastic” in their use of new features.
“Encourages consistent variable names”. This can just as easily be stated as “discourages context-appropriate variable names”. Take the example above again - a function to do a generalised motion calculation might well take a velocity argument. But in my code, vehicle_velocity might be a more meaningful name. Which again puts us back in the “don’t do this if it’s not appropriate” debate, just in a different context.
“Reduces verbosity”. This is highly subjective, IMO. The syntax is less verbose when it’s a direct replacement of var=<some_calculation>; f(var=var), but it’s not less verbose when replacing f(var=<some_calculation>), or local_name=<some_calculation>; f(var=local_name), or many other situations. So it’s hard to see this as an independent advantage, as opposed to “if you are currently typing var=var in function calls, you can save a few keystrokes”. And simply “saving keystrokes” is widely acknowledged as a very weak argument in favour of a new feature.
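To make the cases in these three points concrete, a small sketch follows. The `report` function and the numbers are assumptions made up for illustration, and the PEP spelling is shown only in a comment because it is not valid today.

```python
def report(*, velocity):
    # Hypothetical callee that takes a generic "velocity" parameter.
    return f"{velocity:.1f} m/s"


distance, time = 100.0, 8.0

# 1. Expression passed directly: there is no local name to elide.
a = report(velocity=distance / time)

# 2. Local with a context-specific name: the shorthand does not apply.
vehicle_velocity = distance / time
b = report(velocity=vehicle_velocity)

# 3. Local that happens to share the parameter name: the only case the
#    shorthand targets. Under the PEP it would read report(velocity=).
velocity = distance / time
c = report(velocity=velocity)
```

All three calls pass the same value; only the third would get shorter.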

All of which leaves “Applicability to dictionary construction” as the only benefit of the new syntax that I’d support. And while I do find having to type dict(a=a, b=b) annoying, it mostly only comes up at the REPL, or in throwaway code.

The “Prior Art” does act as a good argument that this is simply following a common trend, and not inventing something unusual. If the PEP’s argument was simply “lots of other languages do this, why don’t we?” then I’d say this section would be compelling. But conversely, I don’t think that “doing something because other languages do it” has been a particularly successful argument for adding features in the past - in cases where we have done it (conditional expressions and assignment expressions come to mind) the features have been controversial, and have usually had strong justifications on their own merit.

The “Applicability” section does establish that this is fairly common, and could immediately be used in quite a lot of places, in spite of the fact that I’m claiming that it’s often an anti-pattern. But I’m very unconvinced by the “lines saved” statistics, as I find that almost universally, collapsing a call into a smaller number of lines with multiple var= constructs per line is less readable^[1]. So the fact that formatters like black will do this without asking is overall (IMO) a net loss for the proposal, rather than a gain.

With regard to the “Objections” section:

“The feature is confusing”. You haven’t addressed my point (from the original thread) that there are two types of confusion - the “initial unfamiliarity” type (which is the one that you address in this section) and the “inherent awkwardness” type (that you don’t address at all). Please re-read my post on this from the original thread, because I don’t think you’ve addressed it properly.
You didn’t address the point that this tightly couples variable names in the caller and parameter names in the callee. Whether that’s a technical coupling (if I change the name of the local variable, I have to change the syntax used, not just rename the variable in the call as well) or “social” coupling (people are encouraged to use over-generic names in their code to match the arguments in APIs they use), it’s not really been addressed. There’s also the broader coupling, in the sense that it’s not possible to use this feature effectively for two APIs that don’t use the same argument names for something (for example, one API uses the name colour for an argument, and another API uses color)^[2].
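The broader coupling can be sketched with two hypothetical drawing functions that disagree on spelling (`draw_circle` and `draw_square` are invented here, not real library calls). Whatever the local variable is called, the shorthand could match at most one of them.

```python
def draw_circle(radius, *, color):
    # Hypothetical API using the American spelling.
    return f"circle r={radius} in {color}"


def draw_square(side, *, colour):
    # Hypothetical API using the British spelling.
    return f"square s={side} in {colour}"


colour = "red"

# The shorthand would fit only the call whose parameter matches the local:
#     draw_square(2, colour=)   # would mean colour=colour
square = draw_square(2, colour=colour)

# The other call has to keep the explicit form, or the local must be renamed.
circle = draw_circle(1, color=colour)
```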

100% agreed. IMO, the reason there’s been a debate over principles from the Zen like explicit vs implicit is because the proposal leans heavily on the Zen for its arguments. In particular, the “Encourages use of named variables” section uses “explicit is better than implicit” as a justification, so it’s hardly surprising if people with different interpretations disagree. Of course, without the Zen quote, that rationale becomes “it’s more readable”, which is clearly very subjective, so the whole argument in that sentence becomes less compelling (as I noted above)…

[1] conceded, readability is subjective
[2] I don’t think this broader form of coupling was mentioned in the original thread, but it’s something that the PEP should address

joshuabambrick · January 17, 2024, 3:43pm

Thanks a lot Paul for your thoughtful reply. I will respond in more detail later but to avoid derailing the conversation here, I wanted to quickly note that you’ve identified an error in the draft where I referred to ‘named variables’ instead of ‘named arguments’ (equivalently ‘keyword arguments’).

I don’t usually like the style that you reasonably understood the PEP to be encouraging (x = 1; f(x=)) unless there’s a good justification that it improves readability in context and did not mean to suggest that we should nudge towards this.

ericvsmith · January 17, 2024, 3:59pm

Is this actually in the PEP somewhere? I didn’t see anything about formatters. I would be surprised if any formatter, and especially black, did such a transformation, especially because black keeps the same AST before and after formatting.

I’m also -1 on the PEP, for the reasons @pf_moore stated. I don’t think it’s adding any expressiveness to the language.

Rosuav · January 17, 2024, 4:13pm

I agree. “Other languages do this” is neither a strong argument for nor a strong argument against a proposal. However, prior art is always extremely useful to be aware of. What do other languages do? Can we get some opinions from people who use them? ESPECIALLY the case if a language used to have a feature but removed it, or if a clearly-derived language dropped the feature despite otherwise being similar.

pf_moore · January 17, 2024, 4:17pm

Not explicitly, no. But in the “Applicability” section, one of the metrics quoted is “lines saved” based (presumably) on the idea that

some_func(
    argument_one=argument_one,
    argument_two=argument_two,
    argument_three=argument_three,
    argument_four=argument_four,
)

could be converted into the “fewer lines” form

some_func(argument_one=, argument_two=, argument_three=, argument_four=)

My point isn’t that a formatter would decide to use the new syntax (which as you say changes the AST), but rather that if I chose to use the new syntax in the form

some_func(
    argument_one=,
    argument_two=,
    argument_three=,
    argument_four=,
)

the formatter might choose to line-wrap it to the “shorter” form based on the implied “fewer lines” preference in the PEP, which I find significantly less readable. Basically, for function calls with many arguments, I prefer one argument per line as that makes the call easier to read (for me). Therefore, the new syntax would actually save no lines in my preferred style. And it might be another area where I find myself having to use # fmt: off to override formatters to get what I want.

Of course, what formatters would do with this new syntax is unknown at this point, they may take the view that if a function call uses PEP 736 syntax, one argument per line is the preferred reformatting. Maybe this is something that the PEP should discuss? Although if the authors prefer not to get into such style matters^[1], I can accept that.

[1] while accepting that my concern is valid in the absence of the PEP making a statement

joshuabambrick · January 17, 2024, 4:29pm

“Encourages use of named variables”. I’m not clear what “named variables” is intended to mean here, but I don’t see why it’s less readable to say (for example) velocity=distance/time in a function call’s parameter list as opposed to saying velocity= coupled with an assignment velocity=distance/time.

This is based on an error in the PEP that you identified. Encouraging use of named arguments (as the PEP should read) is important given the clear merits they offer over positional arguments. Current Python syntax penalises use of keyword arguments by introducing visual clutter.

“Encourages consistent variable names”. This can just as easily be stated as “discourages context-appropriate variable names”.

True, this syntax may be misapplied and should be used judiciously as summarised in the thread description. I’ve added a note in the thread description to expand on the balance between risk of coupling vs benefit of synchronisation. I will consider specifically adding a recommendation in the PEP to use this syntax for variables that are semantically equivalent.

“Reduces verbosity”. This is highly subjective, IMO. The syntax is less verbose when it’s a direct replacement of var=<some_calculation>; f(var=var), but it’s not less verbose when replacing f(var=<some_calculation>), or local_name=<some_calculation>; f(var=local_name), or many other situations.

I think this is also based on the error in the PEP that you identified. f(x=x) will always be more verbose than f(x=). Yes, if you introduce a redundant variable it will become more verbose but the PEP was not intended to suggest that.

So the fact that formatters like black will do this without asking is overall (IMO) a net loss for the proposal, rather than a gain.

I highly discourage formatters from applying this syntax globally by default for the agreed reasons. I’d be happy to add a suggestion in the PEP to think carefully before doing this.

“The feature is confusing”. You haven’t addressed my point (from the original thread) that there are two types of confusion

Thank you, I did miss your point there and have dug into it more closely. Overall, I don’t think I agree that this is ‘confusing’ as opposed to a suggestion that some will find it ‘incongruent’ with the rest of Python syntax. It’s clear that you’re not the only one who feels this way. However, I think this comes down to a matter of the particular syntax selected rather than the feature itself. The PEP describes several proposed syntaxes, of which I’m open to any. That said, while hotly disputed, the weight of the arguments and broad support appears to be behind the syntax described in the PEP.

this tightly couples variable names in the caller and parameter names in the callee

Yes, this feature should be used only where this effect is desirable. I will extend the PEP to emphasise this consideration as in the thread description.

it’s not possible to use this feature effectively for two APIs that don’t use the same argument names for something

That’s fair but I don’t think that’s an issue with the proposal but an issue of inconsistency between the libraries used. If the developer owns the library in question, it indicates that they may want to update them for consistency. If there is a way to tweak the PEP to support this case, I’d love to consider it.

(for example, one API uses the name colour for an argument, and another API uses color)

Haha, I think you need to take that one up with Noah Webster!

ericvsmith · January 17, 2024, 4:31pm

Ah, got it. black, at least, controls this with “magic trailing commas”.

But back on subject: I don’t think the PEP should concern itself with what formatters might do.

Axe319 · January 17, 2024, 5:29pm

There is a small grammatical error in the section “The syntax is ugly”.
You duplicated the word “is” in the first bullet point.

This objection is is subjective and many community members disagree.

pf_moore · January 17, 2024, 6:03pm

I disagree. Named arguments are perfectly well encouraged now. The benefits of clearly linking the argument name and the value are present now - arguably more so, as the PEP x= syntax hides (to an extent) the fact that this refers to the value in the local variable x. Claiming that the existing syntax introduces “visual clutter” is simply restating the (IMO extremely weak) argument that the PEP “reduces verbosity”. But of course there’s a lot of subjectivity to all this, so you may well disagree. I’m stating my objections so you can address them in the PEP, not to try to persuade you.

I strongly dispute the claim that the variable is redundant. Certainly it may sometimes be redundant - but so can any construct in Python (yes, I’m using your “any construct can be misused” argument against you). The point here is that if you have a non-redundant variable, the PEP is either more verbose or more likely inapplicable.

No-one is arguing that not typing the second x in x=x isn’t shorter. The argument is that shortening that case isn’t a worthwhile use of a language feature (i.e., “just because we can do a thing doesn’t mean we should”).

I think you’re misunderstanding my point (as @ericvsmith did). I’m not suggesting formatters convert x=x into x=. That’s a change in the code, not just a formatting change, and IMO it’s unacceptable for any formatter to do this. I was pointing out that there’s no clear “best” way to format calls that use PEP 736 format. But see below for more.

I don’t think you can legitimately criticise two independent 3rd party libraries for not choosing the same argument name for a particular value (see my “color” vs “colour” example). But you can criticise the PEP for ignoring that potential limit on its applicability. By all means say “we don’t care about that issue”, but don’t claim it’s a problem with the libraries.

Agreed. But the PEP suggests (in the “applicability” section) that reducing line count by wrapping arguments is a benefit of the syntax. That’s just as much about what formatters might do (in terms of taking a stance on the “best” format). So let’s remove that as well.

Happy to do so. As someone born and living in England, I can confirm he’s wrong and colour is the correct spelling. Will you raise PRs on all the libraries that use color, or should I?

Rosuav · January 17, 2024, 7:13pm

The vast majority of functions in Python have all of their parameters as keyword-or-positional. The vast majority of function calls in Python use positional arguments. Is this “well encouraged”, or is the extra verbosity of having to label every argument a barrier to usage?

Maybe, but at the same time, there are a huge number of places where they WILL use the same name, simply because it is the single most obvious name for something. When the concept is the same, the name will frequently be the same.

Is it a problem for PEP 634 that match statements have only limited applicability, or is it a benefit that they are applicable in certain situations? Is it a problem for the __future__ directive that only certain features make sense to be governed that way? It’s not a criticism of a proposal that it does not make sweeping changes to every single Python program ever written. It does what it does, and where it doesn’t apply, the existing syntactic form is perfectly fine.

Oh, you go ahead. They’ll fit in just fine alongside all the other drive-by PRs these projects get

MegaIng · January 17, 2024, 7:18pm

Do you have any statistics, for example a survey, that this verbosity is the reason that the majority of programmers are not using keyword arguments? How do you know that this feature is going to change this?

Or is this just based on your own opinions?

I probably would not change my behavior at all based on the presence of this feature, except for rare cases where the IDE suggests it or something.

Oh, actually: Has it been suggested to add this shortcut syntax also to match patterns? That does seem like a decent usecase. That is probably the context where name=name has been the most annoying to me.

Rosuav · January 17, 2024, 8:07pm

No, I don’t have any survey. What I do have is data that disputes your claim that keyword arguments are “perfectly well encouraged now”. So this is based on statistical analysis of the Python standard library. You’re welcome to use the script on your own codebase, or any other large codebase, if you think the stats shown here are non-representative.

Script: shed/find_kwargs.py at master · Rosuav/shed · GitHub
Usage: python3 ~/shed/find_kwargs.py -q --no-test from the CPython source directory (main branch as of today, 20240118).

Result:

Total function calls: 73573
Calls with any kwarg: 5257 7.15%
Maximum kwargs count: 20
Calls with any 'x=x': 1028 1.40%
 - compared to kwarg: 1028 19.55%
Maximum num of 'x=x': 11
Total keyword params: 10288 0.14 per call
Num params where x=x: 1616 15.71%
Total function defns: 17304
Function params: pos: 179 0.51%
Function params: kwd: 1160 3.30%
Function params: any: 32611 92.86%

Out of nearly 75,000 function calls, a mere 5000 use even a single keyword argument. Calls that have a mixture of positional and keyword arguments are counted in that, which means that thirteen out of fourteen function calls use entirely positional arguments. (I don’t have a way of identifying whether the functions being called are implemented in C or Python, though. That MAY make a difference, as it’s more work to implement keyword parameters in C.)

This is true despite the fact that function definitions are, by and large, entirely compatible with keyword arguments. Just half a percent of all function parameters are positional-only, with the overwhelming majority being positional-or-keyword - not at all surprising, since that’s what you get if you don’t explicitly ask for something else.

Does this count as “perfectly well encouraged”?
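For anyone who wants to try this kind of count without fetching the script, a minimal sketch using the standard `ast` module is below. It is not the linked find_kwargs.py and makes its own assumptions (for example, `**mapping` unpacking is not counted as a keyword argument), so its numbers need not match the table above.

```python
import ast


def count_kwargs(source):
    """Return (calls, calls with any keyword, calls with any name=name)."""
    calls = with_kwarg = with_same_name = 0
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        calls += 1
        # kw.arg is None for **mapping unpacking; skip those (an assumption).
        named = [kw for kw in node.keywords if kw.arg is not None]
        if named:
            with_kwarg += 1
        # A keyword counts as "x=x" when its value is a bare name equal to it.
        if any(isinstance(kw.value, ast.Name) and kw.value.id == kw.arg
               for kw in named):
            with_same_name += 1
    return calls, with_kwarg, with_same_name


sample = """
print("hi")
open(path, mode=mode)
sorted(items, key=len, reverse=reverse)
f(1, 2)
g(x, y=1)
"""
result = count_kwargs(sample)
print(result)
```

Running it over a real codebase would mean reading each .py file and summing the three counters.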

Rosuav · January 17, 2024, 8:09pm

I don’t think so, but it would be a perfectly logical extension. Technically that’s not a function call, so it would be its own grammatical change. I’d definitely be in favour of doing it there too though. Again, even though technically this is a completely separate scope, it makes enormous sense to use the same name on both sides.

steve.dower · January 17, 2024, 8:13pm

This is my first time seeing the proposal, and in general it’s an idea that I like (I implemented a similar concept in my own language a decade ago). I have two concerns:

The x=, y=, syntax looks incomplete. I’d prefer to have something after the equals, like x=*, y=*^[1] or even x=..., y=... (though I do appreciate that x= matches the f"{x=}" --> f"x={x}" transform)
It makes editing code, and the implications of those edits, harder. This is on my mind because I watched a livestreamer today spend ages figuring out why her { authToken } object in JavaScript worked, but { authToken2 } did not.^[2] As much as I like the sugar, I like not-confusing-new-users even more.

[1] Or x=_ would be better if it weren’t already meaningful. x=$ feels a bit far-fetched
[2] If you can’t spot it, the second object does not have the .authToken member expected by the callee.
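Both points can be sketched in Python terms. The f-string form below is real (Python 3.8+); the `login` function and the token names are assumptions invented to mirror the JavaScript anecdote, and the PEP spelling appears only in comments.

```python
def login(*, auth_token):
    # Hypothetical callee that expects a parameter named auth_token.
    return f"logged in with {auth_token}"


auth_token = "abc"

# Existing precedent for a bare trailing "=": the f-string debug specifier.
label = f"{auth_token=}"

# Today's spelling; under the PEP this could be written login(auth_token=).
ok = login(auth_token=auth_token)

# After renaming the local, the shorthand login(auth_token2=) would expand
# to the call below, which the callee rejects.
auth_token2 = "abc"
try:
    login(auth_token2=auth_token2)
except TypeError as exc:
    error = str(exc)
```

Unlike the JavaScript object case, the Python call fails immediately with a TypeError naming the unexpected keyword.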

MegaIng · January 17, 2024, 8:16pm

I did not make this claim. But realizing a problem and suggesting something that might be a solution does not solve a problem.

I actually already had run this script and wrote modifications of it. Out of those function calls 65% only use 0 or 1 arguments. Are you going to argue that adding keyword arguments there is any kind of improvement?

(I can’t actually replicate the exact number of 75000 calls locally because I don’t have a CPython clone right now, I am getting 55k. But I am not going to argue that either number is non-representative.)

Just because a call doesn’t use kwargs doesn’t mean that using them there would make sense.

MegaIng · January 17, 2024, 8:21pm

That was my knee jerk reaction as well, but I got used to it. It might not be perfect, but none of the other proposals have looked better.

Not an option because ... is already a valid expression.

Oh, I am also against the f(a,*,x,y) proposal because it makes this: f(a,*x,y) mean something completely different than this: f(a,*,x,y). What * means in call argument lists is generally already established, and this now adds a subtly different definition that is going to confuse people.

jamestwebber · January 17, 2024, 8:24pm

Those mean two different things in a function signature–is that a big problem?

I think omitting the standard whitespace in those examples makes them needlessly confusing.

MegaIng · January 17, 2024, 8:25pm

I don’t think so? In a function signature it’s perfectly valid to replace ,*, with ,*_, (i.e. an unused variable name) and you don’t get any real change in behavior.

Edit: Oh, I guess you do. But it doesn’t affect the later arguments.
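A short sketch of the existing behaviour under discussion, using current syntax only. The function names are made up for the example.

```python
def strict(a, *, x, y):
    # Bare * : x and y are keyword-only, extra positionals are an error.
    return (a, x, y)


def loose(a, *_, x, y):
    # *_ : x and y are still keyword-only, extra positionals are swallowed.
    return (a, x, y)


def collect(*args):
    return args


rest = [2, 3]

# In a call today, *rest unpacks an iterable into positional arguments.
unpacked = collect(1, *rest, 4)

# Both signatures treat the later parameters the same way ...
same = strict(1, x=2, y=3) == loose(1, x=2, y=3)

# ... but they differ on extra positional arguments.
swallowed = loose(1, "extra", x=2, y=3)
try:
    strict(1, "extra", x=2, y=3)
except TypeError:
    rejected = True
else:
    rejected = False
```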