Syntactic sugar to encourage use of named arguments

From my experience teaching beginners, and as a senior Python developer with a few hundred thousand lines of Pythonic code under my belt, I think this is not something Python needs. If anything, it will make keyword arguments harder to understand, not easier.

For example, beginners already have a hard time grasping why keyword arguments need default values (and why the default should be None unless there is a good reason otherwise), whereas positional arguments don’t.

This change would require the opposite understanding, i.e. why positional arguments need to be passed by variable name, whereas keyword arguments take just the keyword but no assignment, despite the = sign, which everywhere else always requires a left=right assignment.

In short, it will be confusing. Also it is at odds with a lot of Pythonic ideas:

  • it is not obvious,
  • it is not explicit,
  • it adds yet another way of doing things,
  • it is inconsistent with the use of = as the keyword=value syntax,
  • it adds complexity,
  • it is bad namespacing (renaming the variable in the calling context will break the code, and the fix is not obvious).

All things considered, I am not in favor of the idea.

31 Likes

I agree - I wouldn’t use this feature myself, and I’d reject its use in projects that I maintain, for the reasons you give.

I guess I can live with the “if you don’t like it, don’t use it” argument, although Python does seem to be gaining features I’m preferring not to use much more than features I’m enthusiastic about, these days :slightly_frowning_face:

23 Likes

Indeed, that’s also my observation. It’s off topic here, but since you mention it, I think it has to do with the STC’s PEP sponsor model, whereby the STC seems to become, or at least risks being used as, a courtyard for championing and deciding on ideas instead of, well, steering the language and thus the community into the future. :woozy_face:

6 Likes

An astute observation which deserves its own thread. (It’s commonly abbreviated as SC, not STC though.)

5 Likes

Well, based on my understanding, it would definitely be ONLY syntactic sugar.

I also think this syntax would make more sense: f(x=, y=).
It might be confusing to deal with something like this: f(=a+b), if we predict the name of the keyword argument from the expression…

If f(x=, y=) is used, under the hood it just means f(x=x, y=y).
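To make the desugaring concrete, here is a minimal sketch using a made-up function f (hypothetical, just for illustration). The shorthand is not valid in current Python, so it only appears in a comment; the executable line is the spelling it would presumably expand to.

```python
# Hypothetical function, used only to illustrate the call syntax.
def f(x, y):
    return x - y

x, y = 10, 3

# Today's spelling: each name is written twice.
result = f(x=x, y=y)

# Under the proposal (NOT valid syntax in current Python), the same call
# would be written as:
#     result = f(x=, y=)
```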

How to teach it to beginners, and whether it’s confusing or not, is totally relative.
Coming from this grammar rule

We could just say if the “expression” is not defined then we automatically look for the “NAME” as the name of the “expression”.

1 Like

Hmm, keyword arguments don’t need default values though.

Python has two ways of passing arguments: by position and by keyword.

Python has three ways of defining parameters: positional-only, positional-or-keyword, and keyword-only.

Any of the three parameter types may have a default or not have a default. The only exception is that, once any positional or pos-or-kwd parameter has a default, all subsequent ones have to as well.
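As a rough sketch of the three parameter kinds, here is a made-up function (demo and all of its parameter names are hypothetical) that has each kind both with and without a default, inspected through the standard inspect module:

```python
import inspect

# Hypothetical function covering all three parameter kinds,
# each with and without a default.
def demo(pos_only, pos_only_default=1, /, either=2, *, kw_required, kw_default=3):
    return (pos_only, pos_only_default, either, kw_required, kw_default)

# Map each parameter name to (kind, has_default).
kinds = {
    name: (param.kind.name, param.default is not inspect.Parameter.empty)
    for name, param in inspect.signature(demo).parameters.items()
}

# Only the parameters without defaults have to be supplied.
minimal = demo(0, kw_required="k")

# Everything supplied: two by position, three by keyword.
explicit = demo(0, 9, either=8, kw_default=7, kw_required="k")
```

Note that kw_required has no default even though it follows parameters that do; the "all subsequent ones need defaults" rule only applies to the positional and positional-or-keyword parameters.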

But this proposal has nothing to do with defaults. It’s entirely at the call, and allows deduplication of a VERY common call notation. I put together this script to quickly search for them, and found 17 examples in my own shed (which is a pretty small repository), and 3858 - that’s nearly four thousand - in the Python standard library. Allowing this to be deduplicated prevents desynchronization bugs.
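For anyone who wants to try a similar count on their own code, here is a minimal sketch of one way to do it with the ast module. This is not the script referred to above; the function name count_same_name_kwargs, the min_length parameter, and the sample source are all assumptions made for illustration.

```python
import ast

def count_same_name_kwargs(source, min_length=1):
    """Count keyword arguments of the form name=name in function calls."""
    count = 0
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        for kw in node.keywords:
            # kw.arg is None for **mapping unpacking, which is not a candidate.
            if (
                kw.arg is not None
                and isinstance(kw.value, ast.Name)
                and kw.value.id == kw.arg
                and len(kw.arg) >= min_length
            ):
                count += 1
    return count

# Made-up sample source; it is only parsed, never executed.
sample = """
write(stream, encoding, method=method, short_empty_elements=short_empty_elements)
render(depth=depth - 1, weight=weight)
call(**options)
"""

total = count_same_name_kwargs(sample)
long_names = count_same_name_kwargs(sample, min_length=10)
```

To scan a real tree you would read each .py file and sum the counts; depth=depth - 1 is deliberately not counted, since the value is an expression rather than a bare name.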

5 Likes

Ruby has this feature, of the form foo(bar:). Their description of it, IIRC, is that this calling pattern “lifts the variable bar from the surrounding scope”. It’s the same idea as this thread.
Rubocop now has a lint which enforces this pattern if you are using matching names like foo(bar: bar). We use Rubocop in the one Ruby project I maintain at work. I strongly dislike this lint (more on why below). I try not to advocate for turning off lints for a variety of reasons, which creates a situation with some undesirable tradeoffs and tension.

Typically, we favor brevity in programs. Shorter code is usually better. But brevity for its own sake misses that brevity is a great proxy for clarity. Shorter code is usually clearer and simpler.

I think you can argue that

x=,
y=,

is as clear as

x=x,
y=y,

But can you argue that it’s more clear?
(I don’t even fully buy that these two are equally clear, FWIW, but I’m open to it.)

Many other recent features pass the test for “shorter and clearer” – I think f-strings and the walrus are good examples. They don’t exist merely to code-golf down the number of lines or characters.


Sometimes the syntax would be handy and save some small amount of typing.

But based on my experience with Ruby, I’m very wary of this. As a language feature, if it were always truly optional, it’s fine and I don’t mind it.
But what if Ruff or flake8 or pyupgrade disagrees? What if my colleagues are split 50/50 over whether or not it’s a good feature?

There’s a lot of risk that this introduces stylistic debates and strife. I’m not seeing a huge benefit in exchange – I think the argument in favor needs to run deeper than just “the code is shorter”.


@Rosuav, thanks for checking this against the stdlib! I think that’s an important input.

My rhetorical question for the thread would be:
Is it enough for a language feature to make the stdlib shorter (in lines or characters)?
I think it’s a given that brevity on its own is not enough.

I used a trivial example above for a case where this is at best equally clear. I wonder if more elaborate real-world cases make a more compelling argument that the “name lifting” syntax is more clear.


EDIT: I just caught that Ruby was mentioned earlier in the thread, so I wasn’t bringing new information with that. Sorry! :slightly_smiling_face:

2 Likes

It is very common. Running it on some other codebases:

3 Likes

With names this short? No, I’m not going to argue that it’s more clear. And that’s the problem with toy examples. But what if it’s longer? The standard library contains 525 cases where the x=x notation is used with a parameter name of 10 characters or more. Just in case unit testing produces unfair results, I excluded Lib/test - still found 310 examples with >=10 character parameter names. And I would absolutely argue that there’s clarity to be gained here:

    ElementTree(element).write(stream, encoding,
                               xml_declaration=xml_declaration,
                               default_namespace=default_namespace,
                               method=method,
                               short_empty_elements=short_empty_elements)

# vs #

    ElementTree(element).write(stream, encoding,
                               xml_declaration=,
                               default_namespace=,
                               method=,
                               short_empty_elements=)

Notably, also, I would then change encoding to be a kwarg, as there’s now almost no cost to doing so.

Python, as a general rule, avoids massive amounts of boilerplate. When you write code in some languages, AI code generators can contribute extensively by filling in the boring and obvious details after you’ve typed in a little bit of something. (I’ve seen this in action many times.) Python tends to not have that repetition in the first place, which makes the AI tools look less impressive, but it means programmers have fewer bugs to deal with. So why are we repeating name=name in so many places here, and do we actually gain anything by it?

These are questions that we can’t answer. Ultimately, once a feature is available, everyone has to make that choice, and no matter how non-controversial a feature is, there’ll always be someone who says “not in MY codebase you won’t”. Which is why we have a choice of which linters to use and how to configure them.

Agreed - brevity alone is not a goal. However, deduplication gives more than brevity - it prevents desynchronization. When xml_declaration= is the norm, it’s obvious that xml_declaration=html_declaration must be intentional and cannot possibly be a transcription error (okay, that particular example probably wouldn’t happen, but you get the idea). By eliminating the x=x possibility, we give people an easy way to recognize bugs.
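A small sketch of the kind of desynchronization meant here, in today's syntax. The write function and the variable values are made up for illustration; only the parameter names come from the example above.

```python
# Hypothetical stand-in for a function taking keyword-only options.
def write(*, xml_declaration, default_namespace):
    return (xml_declaration, default_namespace)

xml_declaration = True
html_declaration = False
default_namespace = "ns"

# The intended call: every keyword is passed the variable of the same name.
intended = write(xml_declaration=xml_declaration,
                 default_namespace=default_namespace)

# A transcription slip: visually close to the call above, still valid code,
# but it silently passes a different variable.
slipped = write(xml_declaration=html_declaration,
                default_namespace=default_namespace)
```

When name=name is the normal spelling, the two calls look almost identical in review. If the matching case were normally written as xml_declaration= instead, the second call would stand out as the only one naming a value explicitly.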

7 Likes

I would be really interested in starting to write this up as a pep.

I think all of your points are fair. And thanks for pulling out that etree example!

I’m comparing this feature against my experience of more than once having to explain a snippet like the following:

render_subelements(weight:, depth: depth - 1)

Because not all of the names are lifted, people seem to find this one very confusing, and they often ask why the language has this feature. I’ve had that interaction with junior and experienced developers – though the experienced ones often more quickly take it in stride.

So we have two extremes – the trivial cases which don’t improve, and the highly nontrivial cases which (I agree, upon seeing it) do improve. I remain worried about the path that linters will chart through this, and that I’ll be pressured to change to this syntax even in cases where I find it not to be any better (and arguably worse for being that much less beginner friendly).

That’s very fair. But I think the feature’s still valuable; the edge cases like this are much less common than either “weight=weight” or something that’s completely different. And maybe this will look a bit better in Python, since the kwarg recursion call of render(.., depth=depth+1) is very similar to plain old assignment, depth = depth + 1, making a useful parallel.

Agreed, and I hope that linters will be smart about things. But we have no control over that (which is a good thing - the language shouldn’t control the linters!).

Maybe? Hard to judge. Obviously all complexity comes with a cost, but I suspect that this will be a relatively minor concern. It does mean there are several notations that are extremely similar, but since they all do similar things, I don’t think that’s a breaking issue.

But also, even if this IS another thing to be learned, that’s largely unavoidable. A brand new Python programmer, faced with a large codebase, is going to have a daunting task, and there’s no way around that. But a brand new programmer being taught a restricted subset of the language (assuming a competent instructor) can always avoid syntactic constructs that haven’t been learned yet. This shorthand notation is not crucial to the language, so it can be introduced whenever it’s appropriate to do so.

1 Like

Much appreciated! It is well underway. Will gladly send you a draft to review when I can. Again, there’s no rush here.

2 Likes

I understand. If you are occupied or don’t have enough time, feel free to pass it to me. :slight_smile:

1 Like

This example has convinced me that the equal sign should be on the right, to make it clear that you can only put a single identifier there, instead of an arbitrary expression.

1 Like

If allowed in function declarations, it would make it easier to capture variables as default arguments, which might lead to fewer bugs when a closed-over variable unexpectedly changes.

Could you elaborate?

def my_func(a=, b=):

what does this mean based on your understanding?

This proposal isn’t at the function definition, it’s only for the function call. Any parallel change to function definitions would be an entirely separate proposal.

Presumably though, it would be equivalent to this:

def my_func(a=a, b=b):

which is an idiom far less common than the use in calling functions, but is often seen in tight loops where it speeds up a global lookup, or to allow a global default to be overridden.
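For readers who haven't met the name=name default idiom, here is a minimal sketch of the capture behaviour being discussed, in today's syntax. The helper names make_late_bound and make_default_bound are hypothetical.

```python
def make_late_bound():
    funcs = []
    for i in range(3):
        # The lambda closes over the variable i itself, so every call
        # sees whatever value i has after the loop finishes.
        funcs.append(lambda: i)
    return funcs

def make_default_bound():
    funcs = []
    for i in range(3):
        # i=i evaluates i now and stores the value as the default,
        # so each lambda keeps the value from its own iteration.
        funcs.append(lambda i=i: i)
    return funcs

late = [func() for func in make_late_bound()]
bound = [func() for func in make_default_bound()]
```

Here late comes out as [2, 2, 2] and bound as [0, 1, 2]. A definition-side shorthand would presumably let the second lambda be written as lambda i=: i, but that is a separate idea from the call-site proposal in this thread.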

1 Like

I don’t think I would like this version–setting default values should be explicit, and it saves a lot less than the calling version.

I think I’m -0 on the idea overall. I don’t think I’d use it and it feels like extra complexity for little benefit. But not that worked up about it.

Is it easy to express these as percentages, so that we may see how prevalent or rare the new syntax might become?

1 Like