Syntactic sugar to encourage use of named arguments

miraculixx · October 21, 2023, 10:38am

From my experience teaching beginners, and as a senior Python developer with a few 100K lines of Pythonic code under my belt, I think this is not something Python needs. If anything, it will make keyword arguments harder to understand, not easier.

For example, beginners already have a hard time grasping why keyword arguments need default values (and why the default should be None unless there is a good reason otherwise), whereas positional arguments don’t.

This change would require the opposite understanding, i.e. why positional arguments need to be passed by variable name, whereas keyword arguments take just the keyword but no assignment, despite the = sign, which everywhere else necessitates a left=right assignment.

In short, it will be confusing. Also it is at odds with a lot of Pythonic ideas:

it is not obvious,
it is not explicit,
it adds yet another way of doing things,
it is inconsistent with the use of = as the keyword=value syntax,
it adds complexity,
it is bad namespacing (renaming the variable in the calling context will break the code, and the fix is not obvious).

All things considered, I am not in favor of the idea.

pf_moore · October 21, 2023, 10:51am

I agree - I wouldn’t use this feature myself, and I’d reject its use in projects that I maintain, for the reasons you give.

I guess I can live with the “if you don’t like it, don’t use it” argument, although these days Python does seem to be gaining more features I’d prefer not to use than features I’m enthusiastic about.

miraculixx · October 21, 2023, 10:56am

Indeed that’s also my observation. It’s off topic here but since you mention it, I think it has to do with the STC’s PEP sponsor model, whereby the STC seems to become, or at least risks being used as, a courtyard for championing and deciding on ideas instead of, well, steering the language and thus the community into the future.

guido · October 21, 2023, 3:54pm

An astute observation which deserves its own thread. (It’s commonly abbreviated as SC, not STC though.)

hels15 · October 21, 2023, 5:16pm

Well, based on my understanding, it would definitely be ONLY syntactic sugar.

I also think this syntax would make more sense: f(x=, y=).
It might be confusing to deal with something like f(=a+b), where we would have to predict the name of the keyword argument from the expression…

If f(x=, y=) is used, under the hood it just means f(x=x, y=y).
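A minimal sketch of that equivalence, for illustration only. The proposed f(x=, y=) spelling is not accepted by current Python, so it appears only in a comment; the function f and the values are hypothetical.

```python
def f(x, y):
    """Hypothetical function whose parameter names match the caller's locals."""
    return x - y


x = 10
y = 3

# Today's spelling: every keyword is paired with a same-named variable.
result = f(x=x, y=y)

# Under the proposal, the following would be read as exactly the call above:
#     result = f(x=, y=)

print(result)  # 7
```

Because the shorthand would expand to keyword arguments, the order in which the names are written would not matter, just as f(y=y, x=x) gives the same result today.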

How to teach it to beginners, and whether it’s confusing or not, is totally relative.
Coming from this grammar rule

We could just say if the “expression” is not defined then we automatically look for the “NAME” as the name of the “expression”.

Rosuav · October 21, 2023, 7:28pm

Hmm, keyword arguments don’t need default values though.

Python has two ways of passing arguments: by position and by keyword.

Python has three ways of defining parameters: positional-only, positional-or-keyword, and keyword-only.

Any of the three parameter types may have a default or not have a default. The only exception is that, once any positional or pos-or-kwd parameter has a default, all subsequent ones have to as well.
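As a rough illustration of the three parameter kinds and the rule about defaults, here is a small sketch using inspect. The function example and its parameter names are made up for this purpose.

```python
import inspect


def example(pos_only, pos_only_default=2, /, either=3, *, kw_required, kw_default=5):
    """Hypothetical function showing all three parameter kinds."""
    return (pos_only, pos_only_default, either, kw_required, kw_default)


params = inspect.signature(example).parameters
kinds = {name: p.kind.name for name, p in params.items()}
has_default = {name: p.default is not inspect.Parameter.empty for name, p in params.items()}

for name in params:
    print(f"{name:18} {kinds[name]:22} default={has_default[name]}")


def compiles(source):
    """Return True if the source is syntactically valid Python."""
    try:
        compile(source, "<sketch>", "exec")
    except SyntaxError:
        return False
    return True


# A positional parameter without a default may not follow one with a default...
print(compiles("def bad(a=1, b): pass"))     # False
# ...but a keyword-only parameter without a default may.
print(compiles("def ok(a=1, *, b): pass"))   # True
```

Note that kw_required is keyword-only and has no default, so example(1, kw_required=4) is a valid call while example(1) is not.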

But this proposal has nothing to do with defaults. It’s entirely at the call, and allows deduplication of a VERY common call notation. I put together this script to quickly search for them, and found 17 examples in my own shed (which is a pretty small repository), and 3858 - that’s nearly four thousand - in the Python standard library. Allowing this to be deduplicated prevents desynchronization bugs.
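For readers who want to try a similar search, one possible approach is sketched below. This is not the script referred to above, and its counts may differ from the figures quoted; the function name count_same_name_kwargs and the min_length parameter are assumptions made for illustration.

```python
import ast


def count_same_name_kwargs(source, min_length=1):
    """Count call-site keyword arguments written as name=name."""
    count = 0
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        for kw in node.keywords:
            # kw.arg is None for **mapping unpacking, which never matches.
            if (
                kw.arg is not None
                and isinstance(kw.value, ast.Name)
                and kw.value.id == kw.arg
                and len(kw.arg) >= min_length
            ):
                count += 1
    return count


sample = """
f(x=x, y=y, z=other)
g(short_empty_elements=short_empty_elements, method=method, **extra)
h(depth=depth - 1)
"""

print(count_same_name_kwargs(sample))                 # 4
print(count_same_name_kwargs(sample, min_length=10))  # 1
```

To scan a real codebase, the same function could be applied to the text of each .py file and the results summed.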

sirosen · October 21, 2023, 8:02pm

Ruby has this feature, of the form foo(bar:). Their description of it, IIRC, is that this calling pattern “lifts the variable bar from the surrounding scope”. It’s the same idea as this thread.
Rubocop now has a lint which enforces this pattern if you are using matching names like foo(bar: bar). We use Rubocop in the one Ruby project I maintain at work. I strongly dislike this lint (more on why below). I try not to advocate for turning off lints for a variety of reasons, which creates a situation with some undesirable tradeoffs and tension.

Typically, we favor brevity in programs. Shorter code is usually better. But brevity for its own sake misses that brevity is a great proxy for clarity. Shorter code is usually clearer and simpler.

I think you can argue that

x=,
y=,

is as clear as

x=x,
y=y,

But can you argue that it’s more clear?
(I don’t even fully buy that these two are equally clear, FWIW, but I’m open to it.)

Many other recent features pass the test for “shorter and clearer” – I think f-strings and the walrus are good examples. They don’t exist merely to code-golf down the number of lines or characters.

Sometimes the syntax would be handy and save some small amount of typing.

But based on my experience with Ruby, I’m very wary of this. As a language feature, if it were always truly optional, it’s fine and I don’t mind it.
But what if Ruff or flake8 or pyupgrade disagrees? What if my colleagues are split 50/50 over whether or not it’s a good feature?

There’s a lot of risk that this introduces stylistic debates and strife. I’m not seeing a huge benefit in exchange – I think the argument in favor needs to run deeper than just “the code is shorter”.

@Rosuav, thanks for checking this against the stdlib! I think that’s an important input.

My rhetorical question for the thread would be:
Is it enough for a language feature to make the stdlib shorter (in lines or characters)?
I think it’s a given that brevity on its own is not enough.

I used a trivial example above for a case where this is at best equally clear. I wonder if more elaborate real-world cases make a more compelling argument that the “name lifting” syntax is more clear.

EDIT: I just caught that Ruby was mentioned earlier in the thread, so I wasn’t bringing new information with that. Sorry!

hugovk · October 21, 2023, 8:07pm

It is very common. Running it on some other codebases:

5,166 in 29 (forks of) Python Packaging Authority · GitHub repos
4,227 in 24 Python · GitHub repos
1,756 in 77 Jazzband · GitHub repos
836 in 34 pytest-dev · GitHub repos
260 in 10 Pillow · GitHub repos

Rosuav · October 21, 2023, 8:27pm

Stephen Rosen:

I think you can argue that
x=,
y=,
is as clear as
x=x,
y=y,
But can you argue that it’s more clear?
(I don’t even fully buy that these two are equally clear, FWIW, but I’m open to it.)

With names this short? No, I’m not going to argue that it’s more clear. And that’s the problem with toy examples. But what if it’s longer? The standard library contains 525 cases where the x=x notation is used with a parameter name of 10 characters or more. Just in case unit testing produces unfair results, I excluded Lib/test - still found 310 examples with >=10 character parameter names. And I would absolutely argue that there’s clarity to be gained here:

    ElementTree(element).write(stream, encoding,
                               xml_declaration=xml_declaration,
                               default_namespace=default_namespace,
                               method=method,
                               short_empty_elements=short_empty_elements)

# vs #

    ElementTree(element).write(stream, encoding,
                               xml_declaration=,
                               default_namespace=,
                               method=,
                               short_empty_elements=)

Notably, also, I would then change encoding to be a kwarg, as there’s now almost no cost to doing so.

Python, as a general rule, avoids massive amounts of boilerplate. When you write code in some languages, AI code generators can contribute extensively by filling in the boring and obvious details after you’ve typed in a little bit of something. (I’ve seen this in action many times.) Python tends to not have that repetition in the first place, which makes the AI tools look less impressive, but it means programmers have fewer bugs to deal with. So why are we repeating name=name in so many places here, and do we actually gain anything by it?

These are questions that we can’t answer. Ultimately, once a feature is available, everyone has to make that choice, and no matter how non-controversial a feature is, there’ll always be someone who says “not in MY codebase you won’t”. Which is why we have a choice of which linters to use and how to configure them.

Agreed - brevity alone is not a goal. However, deduplication gives more than brevity - it prevents desynchronization. When xml_declaration= is the norm, it’s obvious that xml_declaration=html_declaration must be intentional and cannot possibly be a transcription error (okay, that particular example probably wouldn’t happen, but you get the idea). By eliminating the x=x possibility, we give people an easy way to recognize bugs.
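A small sketch of the kind of desynchronization meant here, in today's syntax. The function make_box and the variable names are hypothetical, chosen only to show how a slip can hide among name=name pairs.

```python
def make_box(width, height):
    """Hypothetical function taking two same-typed arguments."""
    return {"width": width, "height": height}


width, height = 4, 9

# The intended call: each keyword receives the same-named variable.
intended = make_box(width=width, height=height)

# A transcription slip: still valid Python, and easy to overlook
# when every other argument on the line also reads name=name.
slipped = make_box(width=width, height=width)

print(intended)  # {'width': 4, 'height': 9}
print(slipped)   # {'width': 4, 'height': 4}
```

If the shorthand were the norm, the intended call would be written make_box(width=, height=), and an explicit height=width would stand out as something to check.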

hels15 · October 21, 2023, 8:30pm

I would be really interested in starting to write this up as a PEP.

sirosen · October 21, 2023, 8:41pm

I think all of your points are fair. And thanks for pulling out that etree example!

I’m comparing this feature against my experience of more than once having to explain a snippet like the following:

render_subelements(weight:, depth: depth - 1)

Because not all of the names are lifted, people seem to find this one very confusing, and they often ask why the language has this feature. I’ve had that interaction with junior and experienced developers – though the experienced ones often more quickly take it in stride.

So we have two extremes – the trivial cases which don’t improve, and the highly nontrivial cases which (I agree, upon seeing it) do improve. I remain worried about the path that linters will chart through this, and that I’ll be pressured to change to this syntax even in cases where I find it not to be any better (and arguably worse for being that much less beginner friendly).

Rosuav · October 21, 2023, 8:59pm

Stephen Rosen:

I’m comparing this feature against my experience of more than once having to explain a snippet like the following:
render_subelements(weight:, depth: depth - 1)
Because not all of the names are lifted, people seem to find this one very confusing, and they often ask why the language has this feature. I’ve had that interaction with junior and experienced developers – though the experienced ones often more quickly take it in stride.

That’s very fair. But I think the feature’s still valuable; the edge cases like this are much less common than either “weight=weight” or something that’s completely different. And maybe this will look a bit better in Python, since the kwarg recursion call of render(.., depth=depth+1) is very similar to plain old assignment, depth = depth + 1, making a useful parallel.

Agreed, and I hope that linters will be smart about things. But we have no control over that (which is a good thing - the language shouldn’t control the linters!).

Maybe? Hard to judge. Obviously all complexity comes with a cost, but I suspect that this will be a relatively minor concern. It does mean there are several notations that are extremely similar, but since they all do similar things, I don’t think that’s a breaking issue.

But also, even if this IS another thing to be learned, that’s largely unavoidable. A brand new Python programmer, faced with a large codebase, is going to have a daunting task, and there’s no way around that. But a brand new programmer being taught a restricted subset of the language (assuming a competent instructor) can always avoid syntactic constructs that haven’t been learned yet. This shorthand notation is not crucial to the language, so it can be introduced whenever it’s appropriate to do so.

joshuabambrick · October 21, 2023, 9:09pm

Much appreciated! It is well underway. Will gladly send you a draft to review when I can. Again, there’s no rush here

hels15 · October 21, 2023, 9:12pm

I understand. If you are occupied or don’t have enough time, feel free to pass it to me.

tmk · October 22, 2023, 9:33am

This example has convinced me that the equal sign should be on the right, to make it clear that you can only put a single identifier there, instead of an arbitrary expression.

kfdf · October 22, 2023, 3:25pm

If allowed in function declarations it would make it easier to capture variables as default arguments, which might lead to fewer bugs when a closed over variable unexpectedly changes.

hels15 · October 22, 2023, 3:42pm

Could you elaborate?

def my_func(a=, b=):

what does this mean based on your understanding?

Rosuav · October 22, 2023, 3:50pm

This proposal isn’t at the function definition, it’s only for the function call. Any parallel change to function definitions would be an entirely separate proposal.

Presumably though, it would be equivalent to this:

def my_func(a=a, b=b):

which is an idiom far less common than the use in calling functions, but is often seen in tight loops where it speeds up a global lookup, or to allow a global default to be overridden.
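A short sketch of that existing name=name default idiom, using today's syntax only. The names make_callbacks, scale, and FACTOR are hypothetical. It shows both the capture of a loop variable and a global default that can be overridden per call.

```python
def make_callbacks():
    """Build closures with and without the i=i default-argument idiom."""
    late, early = [], []
    for i in range(3):
        late.append(lambda: i)        # looks up i when called, after the loop ends
        early.append(lambda i=i: i)   # default is evaluated now, capturing this i
    return late, early


late, early = make_callbacks()
print([fn() for fn in late])   # [2, 2, 2]
print([fn() for fn in early])  # [0, 1, 2]

FACTOR = 2


def scale(value, factor=FACTOR):
    """The default is evaluated once, when the def statement runs."""
    return value * factor


FACTOR = 10
print(scale(3))                 # 6: still uses the value captured at definition
print(scale(3, factor=FACTOR))  # 30: the caller overrides the default
```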

jamestwebber · October 22, 2023, 10:31pm

I don’t think I would like this version–setting default values should be explicit, and it saves a lot less than the calling version.

I think I’m -0 on the idea overall. I don’t think I’d use it and it feels like extra complexity for little benefit. But I’m not that worked up about it.

ntessore · October 22, 2023, 10:38pm

Is it easy to express these as percentages, so that we may see how prevalent or rare the new syntax might become?