I don’t think the function signature is a big problem either, but I was asking because it seems like it should be equivalently-okay in the function call.
Ah, that’s actually something I hadn’t thought to check. Hmm. I’m not sure where the threshold of readability would be. What would you consider the point at which the number of arguments starts to make keywording them beneficial - and would that point change if the syntax were less verbose?
They mean different, but very much related, things. Whereas the proposal to do this at the call site would have them mean quite different things.
In general I’m +1 on the idea. I find that repeating named arguments is a pattern that happens often when you have public APIs with similar signatures that defer to common utilities. And sometimes these functions have a lot of arguments.
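A minimal sketch of the pattern described above, where public functions with similar signatures forward to a common utility. All names here (`_request`, `get`, `post`, and their parameters) are hypothetical and only illustrate the shape of the repetition.

```python
# Hypothetical names throughout: a shared helper plus two public wrappers.
def _request(method, url, *, timeout, retries, verify):
    # Stand-in for a common utility; it just echoes what it received.
    return {"method": method, "url": url, "timeout": timeout,
            "retries": retries, "verify": verify}


def get(url, *, timeout=10.0, retries=3, verify=True):
    # Every keyword is repeated as name=name when forwarding.
    return _request("GET", url, timeout=timeout, retries=retries, verify=verify)


def post(url, *, timeout=10.0, retries=3, verify=True):
    return _request("POST", url, timeout=timeout, retries=retries, verify=verify)


result = get("https://example.invalid/", retries=5)
print(result["retries"])
```

Each forwarding call spells every parameter name twice, which is the repetition the proposed shorthand is meant to remove.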
To the readability argument, I would add that vertical space is very important to me (much of my time is spent reading and browsing through code). That is the reason I oppose the use of Black wherever I can, because Black consumes vertical space like crazy and that makes reviewing code much harder. Besides the obvious advantage of less typing, being able to reduce vertical space through tighter packing of argument lists is a welcome improvement.
I’m not fond of the proposed syntax but I agree there doesn’t seem to be a better alternative on the table. I would be ok with any of the following, or slight variations thereupon:
f(arg_name=)
f(=arg_name)
f(arg_name=<)
(the sigil meaning “look to the left for the argument’s value”)
I also remember that the @ syntax for decorators was widely loathed when it was standardized, and years later everyone came to find it entirely ok.
In any case, 0-argument functions definitely shouldn’t count, and I think most people would agree 1-argument functions shouldn’t count either (unless the function name is very bad, but that’s another issue).
Hm I don’t really see such a big difference in these two situations. In both cases, * has multiple roles, either capturing/unpacking or separating sections.
def f(a, *x, y) means “x captures args, y is keyword-only”, and def f(a, *, x, y) means “x and y are keyword-only”. Certainly related, but the * is doing two rather different things.
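The two definition-side roles of * can be checked with the standard `inspect` module. This is only an illustrative sketch; the function names `capture`, `separator`, and `kinds` are made up for the example.

```python
import inspect


def capture(a, *x, y):
    # *x collects extra positional arguments; y can only be passed by keyword.
    return a, x, y


def separator(a, *, x, y):
    # The bare * collects nothing; it only marks x and y as keyword-only.
    return a, x, y


def kinds(func):
    # Map each parameter name to the name of its kind, e.g. "VAR_POSITIONAL".
    params = inspect.signature(func).parameters
    return {name: p.kind.name for name, p in params.items()}


capture_kinds = kinds(capture)
separator_kinds = kinds(separator)
print(capture_kinds)
print(separator_kinds)
```

In `capture`, x is reported as VAR_POSITIONAL; in `separator`, x is KEYWORD_ONLY, and in both cases y is KEYWORD_ONLY.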
On the call side, f(a, *x, y=y) means “unpack x and pass y by keyword” while f(a, *, x, y) or f(a, *, x=x, y) [1] means “pass x and y by keyword, potentially with shorthand”. Related but again the role of * is pretty different in these two cases.
To be clear, I don’t think any of the above code is that confusing, which is why I have no issue with this version as a possible syntax. More familiarity, combined with IDE support and decent error messages, would make this all work fine IMO.
[1] This is how I’d prefer it to work: explicit keywords are allowed after the *.
As to where the limit of readability lies, I think it’s highly dependent on the context. I don’t think most people (well, developers anyway) would struggle to figure out what the arguments to create_vector3(x, y, z) mean, nor their order, but other three-variable functions may be harder to reason about. A large part of this, I think, is familiarity with the code in question.
I’m strongly against this proposal. I don’t think the repetition is a problem, and if you think writing extra characters is the problem then I suggest creating a tool that autocompletes keyword arguments for you. Maybe make it an extension to a language server?
A remark kinda OT
Just as a side note, I do appreciate that using keyword arguments is better. We had a problem at $work where two C++ libraries that used quaternions used different calling conventions: one used w, x, y, z and the other x, y, z, w. However, had I rendered inlay hints with clangd this wouldn’t have been an issue, so that’s an anecdote in favor of these kinds of problems being solved with editor support.
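The original incident was in C++, but the same mismatch is easy to sketch in Python. The two constructor functions below are hypothetical stand-ins for the two libraries, not real APIs; the point is only that the positional call fails silently while the keyword call does not depend on parameter order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quaternion:
    w: float
    x: float
    y: float
    z: float


def from_wxyz(w, x, y, z):
    # Hypothetical library A: scalar component first.
    return Quaternion(w=w, x=x, y=y, z=z)


def from_xyzw(x, y, z, w):
    # Hypothetical library B: scalar component last.
    return Quaternion(w=w, x=x, y=y, z=z)


identity = (1.0, 0.0, 0.0, 0.0)  # intended order: w, x, y, z

good = from_wxyz(*identity)
swapped = from_xyzw(*identity)  # no error, but 1.0 lands in x instead of w
by_keyword = from_xyzw(w=1.0, x=0.0, y=0.0, z=0.0)  # order is irrelevant

print(good == by_keyword, good == swapped)
```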
Just to bring it to mind, the whole idea started with the common x=x usage, as seen in the pytorch library. Then we moved on to encouraging more keyword argument usage by suggesting a syntax sugar, as if they weren’t being used enough. I am not following anymore.
0-arg (~15%) for sure shouldn’t count.
1-arg almost always shouldn’t count, but sometimes you have functions that take many keyword arguments and you are only overwriting one of them.
But assuming positional-or-keyword, I think 3-arg is about the upper limit (thinking of pygame APIs, for example: pg.draw.line(surface, color, start, end, width) is potentially not readable without keyword arguments).
But it also depends on the function definition, which the script can’t really analyze. (And I am not even sure I could define a good general pattern or statistic to look for.)
Reading through the thread linked there, I realized I did forget about one aspect of call syntax: after unpacking a parameter you can still pass additional positional arguments. So I understand better how the usage of * could be confusing; that wasn’t obvious to me before. I was already thinking that f(a, *b, *, x, y) would be necessary to combine unpacking and keyword-shorthand in one call, but it’s not pretty… [1]
[1] Prettier than x=, y= to me, but opinions vary.
Yeah, it’s at 2-3 parameters that it gets into the grey area. I did a few runs.
(Methodology note: Since *a, **kw in a function call are shown in the AST as a single argument each, and you could have multiple of them, I took the simplest possible approach that seemed reasonably consistent and counted them as 1, and also counted them as 1 in function definitions. Since the last set of stats, I changed it so that *args in a function definition counts as 1 additional positional-only parameter, and **kwargs as 1 additional keyword-only parameter, again because it’s simple and reasonable, if not perfect.)
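For readers who want to reproduce this kind of count, here is a rough sketch of the call-side part using the standard `ast` module. It is not the script that produced the numbers in this post; `call_stats`, its `min_args` parameter, and the sample source are assumptions made for illustration. As in the methodology note, each *a or **kw in a call counts as one argument.

```python
import ast


def call_stats(source, min_args=0):
    """Count calls, calls using any keyword, and calls with a name=name keyword."""
    stats = {"calls": 0, "with_kwarg": 0, "with_x_eq_x": 0}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        # node.args holds positional arguments (a *a entry is one ast.Starred);
        # node.keywords holds keyword arguments (a **kw entry has arg=None).
        if len(node.args) + len(node.keywords) < min_args:
            continue
        stats["calls"] += 1
        if node.keywords:
            stats["with_kwarg"] += 1
        if any(
            kw.arg is not None
            and isinstance(kw.value, ast.Name)
            and kw.value.id == kw.arg
            for kw in node.keywords
        ):
            stats["with_x_eq_x"] += 1
    return stats


sample = '''
print(x, end="")
f(a, b)
g(a, x=x, y=other)
h(*args, **kwargs)
'''
stats = call_stats(sample)
stats_min3 = call_stats(sample, min_args=3)
print(stats)
print(stats_min3)
```

Note that a call is counted under `with_kwarg` if even one argument is passed by keyword, and that the sketch only sees the call site, so it cannot tell how many parameters the callee could have accepted.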
Statistics for functions with at least 2 parameters.
Total function calls: 22284
Calls with any kwarg: 4571 20.51%
Maximum kwargs count: 20
Calls with any 'x=x': 926 4.16%
- compared to kwarg: 926 20.26%
Maximum num of 'x=x': 11
Total keyword params: 9602 0.43 per call
Num params where x=x: 1514 15.77%
Total function defns: 9947
Function params: pos: 748 2.63%
Function params: kwd: 1696 5.96%
Function params: any: 26020 91.41%
Statistics for functions with at least 3 parameters.
Total function calls: 7840
Calls with any kwarg: 2793 35.62%
Maximum kwargs count: 20
Calls with any 'x=x': 643 8.20%
- compared to kwarg: 643 23.02%
Maximum num of 'x=x': 11
Total keyword params: 7549 0.96 per call
Num params where x=x: 1199 15.88%
Total function defns: 4595
Function params: pos: 487 2.74%
Function params: kwd: 1506 8.48%
Function params: any: 15767 88.78%
Statistics for functions with at least 5 parameters.
Total function calls: 1392
Calls with any kwarg: 802 57.61%
Maximum kwargs count: 20
Calls with any 'x=x': 218 15.66%
- compared to kwarg: 218 27.18%
Maximum num of 'x=x': 11
Total keyword params: 3729 2.68 per call
Num params where x=x: 624 16.73%
Total function defns: 900
Function params: pos: 67 1.19%
Function params: kwd: 772 13.66%
Function params: any: 4814 85.16%
Statistics for functions with at least 10 parameters.
Total function calls: 44
Calls with any kwarg: 19 43.18%
Maximum kwargs count: 20
Calls with any 'x=x': 12 27.27%
- compared to kwarg: 12 63.16%
Maximum num of 'x=x': 11
Total keyword params: 201 4.57 per call
Num params where x=x: 61 30.35%
Total function defns: 53
Function params: pos: 1 0.15%
Function params: kwd: 160 24.32%
Function params: any: 497 75.53%
For function definitions, the proportions don’t materially change. I’m going to consider my original statements to be broadly supported by these numbers. The “any” category has now been skewed down a bit, since *a, **kw is adding to the other figures instead of being in their own (unshown) category, but even so, the overwhelming majority of function parameters are keyword-or-positional.
For function calls, though, it’s a LOT less clear. Since there’s no way to know how many parameters the function could have taken, it’s hard to judge. I would still say that, by and large, positional parameters continue to be used. Even at the most extreme end of both sets of statistics (minimum 10), function definitions have 75% of kw/pos parameters, while more than half of all function calls use exclusively positional parameters. I will admit, though, that the Python standard library isn’t a great demonstration of ten-arg functions, given that there are just 53 definitions and 44 calls making up that data set!
Using a more conservative minimum of 2 or 3, the stats remain heavily skewed in favour of positional args. Excluding 0-arg and 1-arg functions raised the kwarg fraction to 20%, but that’s still a pretty small proportion. Remember, this counts a call if even a single argument is passed by keyword, so something like print(x, end="") will count as a call that uses kwargs.
TBH I consider this very much on-topic! This is an excellent example of how proper use of kwargs might have made this a lot easier.
It’s worth noting that, in languages with no kwarg support, it’s very common to pass a single mapping argument called “options” that does the same job. See for example the fetch() function in JavaScript, the Process.Process() constructor in Pike (this is part of a hierarchy, where each level of the hierarchy responds to particular options and passes the rest on - in Python, this would have each one accept **kwargs in addition to its own args, and pass the spares on), and many other examples. It’s a bit less obvious that way, but the intent is still the same.
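A small sketch of the Python version of that hierarchy, where each level consumes its own keyword arguments and passes the spares on via **kwargs. The class and option names are hypothetical and are not modelled on the actual Pike or JavaScript APIs.

```python
class Base:
    def __init__(self, *, name="anon", **kwargs):
        # End of the chain: anything left over is an unknown option.
        if kwargs:
            raise TypeError(f"unexpected options: {sorted(kwargs)}")
        self.name = name


class Process(Base):
    def __init__(self, *, cwd=".", **kwargs):
        # Take the option this level understands and pass the rest on.
        super().__init__(**kwargs)
        self.cwd = cwd


class LoggedProcess(Process):
    def __init__(self, *, log_level="info", **kwargs):
        super().__init__(**kwargs)
        self.log_level = log_level


# An "options" mapping, as used in languages without keyword arguments,
# can be expanded into keyword arguments at the call site.
options = {"cwd": "/tmp", "log_level": "debug"}
proc = LoggedProcess(**options)
print(proc.name, proc.cwd, proc.log_level)
```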
Yeah, that’s why it’s confusing. A function call with *x in the middle of it doesn’t change the significance of subsequent arguments; but a function definition with *x in the middle of it does. Hence I describe them as different, but related. They use the same symbols precisely because their jobs are so similar (and parallel).
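The asymmetry can be seen directly in current Python. The function names `show` and `define` below are made up for the illustration.

```python
def show(*args, **kwargs):
    # Echo back exactly what the call passed.
    return args, kwargs


x = (2, 3)

# In a call, *x unpacks in place; the 4 after it is still positional.
call_result = show(1, *x, 4, y=5)


def define(a, *x, y):
    return a, x, y


# In a definition, everything after *x is keyword-only, so a fourth
# positional value is swallowed by x and y is reported as missing.
error = None
try:
    define(1, 2, 3, 4)
except TypeError as exc:
    error = exc

print(call_result)
print(type(error).__name__)
```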
Ah, I was focused on comparing *x in a function definition with * (separator for keyword-only arguments). In a definition it can do two pretty different things, and that seems fine. I think it could do those two things in a call as well.
While keyword arguments probably would be nice in C++, all competent IDEs/language servers offer inlay hints, which essentially remove the need for keyword arguments as a readability aid.
Quick note on inlay hints for those unaware
Inlay hints for function calls are where the editor adds the name of the parameter in front of the given argument, so
test(variable_a, variable_b)
becomes, with the parts within |pipes| added by the editor, for example
test(|a:| variable_a, |b:| variable_b)
In fact, it might be possible (but I’m not sure how likely it is) that people who use tens of positional arguments in function calls are using inlay hints, in which case the problem is solved for them. That doesn’t help much when reviewing code on GitHub perhaps, but it helps avoid mistakes.
The more I think about this the more I’m thinking that this is a tooling problem. (Even though I don’t think repeated tokens in a function call is a problem at all.)
I frequently edit code in nano because I’m working on a remote server. Even for my regular work, I prefer a much lighter-weight editor than something like VS Code, and I use SciTE, which doesn’t have this feature. Nor does Idle. It strikes me as a bit elitist to demand that all programmers use full-power IDEs, especially since “full-power” is defined somewhat arbitrarily.
I would love to hear from people who use 5+ positional arguments as to whether this one feature makes or breaks it for them, and if so, what they do when unable to use their standard IDE.
My gut instinct is about 4. And no, verbosity of the syntax would make no difference (because my instinct is basically that up to 3 arguments are easy to understand positionally).
Also, many (in my experience, yours may vary) cases with smaller numbers of arguments that use keywords are more about being explicit in what you mean, rather than avoiding confusion over what argument matches which parameter. In something like sorted(seq, key=<something>), the use of key= is to be clear that you’re specifying a key, and you’d never use the positional form in practice. For that sort of usage, I’d argue that key= (picking up a local variable called key) would defeat the object. Again, your opinion may differ.
I’d suggest ignoring any cases of functions called with fewer than 5 parameters when analyzing this data, if you want a “gut feeling” suggestion.
Well, five is one of the samples that I posted above, so that works out nicely! About half of all calls in the stdlib with 5+ arguments pass them all positionally. It’s very “gut feeling” though, and I don’t think there’s any way to precisely identify the useful threshold, beyond that it’s somewhere in that very rough range. It’s like the debates about line length - smart people will have different opinions, but most people agree that 30 characters is too short and 500 is too long.
While I agree that IDEs tend to be cumbersome, language servers are typically not especially heavy to run, but they do require some setup. In any case, I don’t think it’s elitist to mention that this problem can be resolved with tools, especially since pylance, the ‘default’ language server of VS Code, the most popular code editor atm, supports this feature. One could just as easily spin it the other way, why should the language change because some terminal elitists can’t use the most common tool? But I don’t think discussing elitism is contributing to this discussion and will likely only make people upset, so I think we should stop that particular discussion.
I have to say I’m becoming confused as to the point of all this. Are you trying to suggest that all of those calls that pass all 5+ arguments positionally are going to be improved by using keyword arguments? How could you possibly make a broad generalisation like that?