PEP 570: Python Positional-Only Parameters

I think it is a valid point that simple punctuation can be hard to remember when you don’t see it all the time, but as a counter-point, I will say that I almost never see the * marker for “keyword-only”, but as soon as it was explained to me, I retained it forever. I also only had to see / once to remember it, even though it’s not particularly memorable.

I’m not trying to brag about my memory here because I don’t think of myself as having a particularly good memory. I always have to look up the syntax for C function pointers, for example. For me, both / and * are simple enough that you can just remember them.

Also, once you know there’s a marker for positional-only, I think it will almost always be obvious which marker does what just from the context:

def concat(x, y, z=None, /, axis=1, *, inplace=False):
   ...

The names get a lot more meaningful once you are in “positional-or-keyword” territory.

This doesn’t help anything in the “first time you see this” situation, but you’ll get a runtime error anyway if you try to call by name (assuming you try to do this at all - most functions with positional-only parameters are the ones where a huge majority of people won’t be trying to call them by keyword anyway).
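To make that runtime behaviour concrete, here is a minimal sketch. It assumes an interpreter that implements the proposed / marker (Python 3.8 or later); the function body and the describe_call helper are illustrative only and not part of the proposal.

```python
def concat(x, y, z=None, /, axis=1, *, inplace=False):
    # x, y, z: positional-only; axis: positional-or-keyword; inplace: keyword-only.
    return (x, y, z, axis, inplace)


def describe_call(*args, **kwargs):
    """Illustrative helper: return concat's result, or the TypeError message."""
    try:
        return concat(*args, **kwargs)
    except TypeError as exc:
        return f"TypeError: {exc}"


print(describe_call(1, 2))                        # plain positional call works
print(describe_call(1, 2, axis=0, inplace=True))  # keywords after the / work
print(describe_call(x=1, y=2))                    # rejected: x and y are positional-only
print(describe_call(1, 2, 3, 0, True))            # rejected: inplace is keyword-only
```

The last two calls are rejected at call time, not at definition time, which is the "first time you see this" situation described above.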


I would like to point out that I’m very much in favour of language-supported positional-only arguments. In my line of work I have often run into the use case of writing frameworks where entry points (callbacks) take some mandatory objects and then various optional ones. With the current state of the world, there’s no real way to keep your positional argument names out of the API. You may often want to rename these as time progresses, though; here are two use cases that come to mind:

  1. As the language evolves, its built-ins and keywords change too. A positional argument name that was a valid variable name 5 years ago might be a really bad one nowadays. I definitely burned myself with both input (instead of payload) and async.
  2. Sometimes it just does not make sense to have public names for arguments. Probably true whenever you want to perform some operation on two arguments. arg_1, arg_2 are probably two decent fallbacks, but I would not like any of the library users to pass those in as keyword arguments (which the IDE may happily offer up).

The downside of this is still that:

    1. Runtime only (not at the syntax level), so it potentially carries a performance penalty (I really don’t want a performance penalty for passing something positional-only). For example, the following code will fail only at runtime, which is kinda bad IMHO:

def f():
    @magic_pos_arg_decorator
    def g():
        pass
    return f

    2. Python is at the moment kinda asymmetric: you can have keyword-only arguments, but not positional-only arguments. Given that keyword-only arguments are language supported, it would be nice to have positional-only arguments work similarly (symmetrically), with errors raised by the same system/level. Otherwise, it’s difficult to teach, and difficult for newcomers to understand.
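As a rough illustration of the "runtime only" downside, here is a sketch of what a decorator-based approach might look like. The positional_only decorator is a hypothetical stand-in (the magic_pos_arg_decorator above is not shown in this thread), and add is an invented example function.

```python
import functools


def positional_only(*names):
    """Hypothetical decorator: reject the given parameter names when passed by keyword."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # This check runs on every call, which is the per-call overhead in question.
            misused = [name for name in names if name in kwargs]
            if misused:
                raise TypeError(
                    f"{func.__name__}() got positional-only arguments "
                    f"passed as keywords: {', '.join(misused)}"
                )
            return func(*args, **kwargs)
        return wrapper
    return decorate


@positional_only("arg_1", "arg_2")
def add(arg_1, arg_2):
    return arg_1 + arg_2


print(add(1, 2))
try:
    add(arg_1=1, arg_2=2)
except TypeError as exc:
    print(exc)
```

Nothing is checked when the function is defined, and every call pays for the extra wrapper, whereas a syntax-level marker would let the interpreter's own argument handling raise the error.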

I’m overall +1 on this proposal. It’s something that library writers very much need.


Another thing worth mentioning about per-argument syntax is that currently mypy (and likely some other type checkers) use double underscore to indicate positional-only arguments. This is currently used mostly in stub files because one can currently only define positional-only arguments in C modules.

I don’t have a strong preference, but I like this syntax more. The / looks a bit cryptic, while __arg has a precedent as “private” attributes. This is kind of similar to arguments being “anonymous”.
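For readers who have not seen the double-underscore convention, here is a small sketch. The scale function is invented for illustration; the point is that the convention is only a hint to type checkers and the interpreter itself does not enforce it.

```python
def scale(__value: float, factor: float = 2.0) -> float:
    # The leading double underscore marks __value as positional-only
    # for type checkers that follow the convention.
    return __value * factor


print(scale(1.5))          # the intended call style
print(scale(__value=1.5))  # a type checker following the convention should flag this,
                           # but at runtime it is an ordinary keyword argument
```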


I just wanted to propose the same idea!

Using ‘/’ or punctuation as a per-argument mark would require changes in the grammar, AST, compiler, etc. But using single or double underscore could require changes only in the code that unpacks function arguments before starting to execute a function.

This is an interesting idea if I understand you correctly. It will cover all current use cases in Python functions (except MutableMapping.update() which will be covered partially). It will need fewer changes than other solutions. I do not think this will cause any problems, because this should be allowed only when the function has the **kwargs parameter.
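As background on why the **kwargs case matters, here is a sketch using the / marker as the PEP proposes it (Python 3.8 or later). The render and render_legacy functions are invented for illustration; MutableMapping.update() has the same shape of problem.

```python
def render(template, /, **fields):
    # template is positional-only, so the keyword "template" is free to land in **fields.
    return template.format(**fields)


def render_legacy(template, **fields):
    # Without the marker, a "template" keyword collides with the first parameter.
    return template.format(**fields)


print(render("{template} uses {marker}", template="PEP 570", marker="/"))

try:
    render_legacy("{template}", template="PEP 570")
except TypeError as exc:
    print("legacy:", exc)
```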

For solving the problem with signatures of builtin functions generated by Argument Clinic, we could use just a naming convention (double underscored names or the like).

This would just be the parameter names, right? I think that would be a backwards-incompatible change, because it would change the semantics of functions with __ names.

I’m sure @pablogsal can comment on how difficult it was to change the grammar, AST and compiler and what maintenance costs that brings with it, but I think it’s better to pay the up-front cost of a change to the language syntax rather than making a backwards-incompatible change to the language’s semantics.

The AST needs minimal changes (fewer than 10 lines of changes) and the compiler needs none. There is very little maintenance cost there. I find using the __ interesting, but I am not convinced by the per-argument marker, because I still think it is more verbose and there is a lot of value in staying consistent with how keyword-only arguments are specified (using the “*” marker). This will also avoid having to deal with declarations like:

def f(__name, something, *args, __name2):
   pass

Also, technically, even if we only allow names with underscores before *, this is not backwards compatible, as someone could already have a signature like the one listed above. I think there is value in having the same structure for positional-only and keyword-only (the markers), and also in how a static analyzer will fetch that information (checking FunctionDef.args.kwonlyargs and FunctionDef.args.posonlyargs, which return rich AST structures so you can directly check annotations, offsets, line numbers…). I think symmetry is very important here, and this is what, among other things, led us to propose the “/” in this PEP.
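A short sketch of what that static-analysis symmetry looks like, assuming the ast module as it behaves in Python 3.8 or later, where the posonlyargs field exists. The source string reuses the concat signature from earlier in the thread.

```python
import ast

SOURCE = "def concat(x, y, z=None, /, axis=1, *, inplace=False): ..."

func = ast.parse(SOURCE).body[0]

# Each group is its own list of ast.arg nodes, so no name inspection is needed.
positional_only = [a.arg for a in func.args.posonlyargs]
positional_or_keyword = [a.arg for a in func.args.args]
keyword_only = [a.arg for a in func.args.kwonlyargs]

print(positional_only, positional_or_keyword, keyword_only)
```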

The advantage of the leading underscores or decorator approaches is that libraries can use them today (it’s easy to make a fake decorator that does nothing for earlier versions).

Adding syntax means very few libraries besides the stdlib will be able to use this until everyone has moved off earlier versions of Python. There’s no way around a syntax error.


You need to change the AST to add the information about positional-only parameters. You need to change the code that walks the AST: optimizer, unparser, symtable. You need to change the compiler to use this information and produce corresponding bytecode. You need to change the bytecode to encode this information for MAKE_FUNCTION. You need to change the ceval loop to interpret new bytecode. And of course you need to change the code that sets parameters from arguments. Much more than 10 lines.

It is expected that double underscored names are special in Python. They are already used in some projects (jinja2, click), and they are used for marking positional only parameters.

def call(__self, __context, __obj, *args, **kwargs):
def gettext(__context, __string, **variables):
def function(__value, *args, **kwargs):
...

Do you know projects which use dunder parameter names for other purposes?


I think it’s only actually incompatible if people are passing the arguments with leading underscores by name. The definitions should still work, especially if we were to only make it apply to the leading arguments (that is, after a non-underscored name it goes back to normal).


You can check an initial draft for the implementation here:

https://github.com/pablogsal/cpython_positional_only

This is not intended to be the final proposed implementation and serves only as a minimal set of changes to try out the feature. Looking at it, it is true that there are some changes in the compiler and the symtable, but they are minimal.

With that, I was referring to the changes only in ast.c.

This is strictly backwards incompatible: if someone is using leading underscores today (even only on leading arguments), those names acquire an entirely different meaning and pre-existing functions can behave differently, no matter how uncommon such functions are.

I just finished the minimal implementation of the support of positional-only parameters using dunder names.

Branch:

Diffs:

It is simple: fewer than 10 added lines in ceval.c, plus changes in the inspect module to support this syntax and in Argument Clinic to generate signatures in the new syntax. Most of the rest is generated.


I think this approach is worse because:

  • It is strictly backwards incompatible.
  • It is not symmetric with how the keyword-only parameters are marked (the * marker), and as I have said multiple times before, I think this is very important.
  • Querying the AST for positional-only parameters now requires checking the normal arguments and inspecting their names, as opposed to how keyword-only parameters have a property in their own right (FunctionDef.args.kwonlyargs). This further highlights the asymmetry between positional-only and keyword-only.
  • You need to read every parameter to know where the positional-only arguments end, as opposed to the marker, which just marks the frontier.
  • The marker (“/”) requires only a minimal change to a function signature to use the feature, in the same way the “*” does, whereas the per-argument approach forces you to mark every parameter.
  • It is more verbose, as it requires the leading underscores on every parameter (longer function declarations). If you will, using the “/” is O(1), whereas the per-argument marker is O(n) in changes required. :slight_smile:
  • It somewhat conflicts with the fact that, in class bodies, two leading underscores invoke name mangling.
  • The approach for positional-only args using the “/” is not complicated at all (it basically injects posonlyargcount into the code object and handles it in _PyEval_EvalCodeWithName). The rest is auto-generated or just passing down information or initializing the new field. The changes in ceval.c are mainly for good error messages, because the code to manage the arguments is very small.

For these reasons, the PEP is centered around the “/” marker and we have dismissed any per-argument marker.
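To see what the posonlyargcount bookkeeping mentioned in the list looks like from Python code, here is a sketch that assumes the feature as it shipped in Python 3.8, where the code object exposes it as co_posonlyargcount. The concat function is the example from earlier in the thread.

```python
import inspect


def concat(x, y, z=None, /, axis=1, *, inplace=False):
    ...


code = concat.__code__
print(code.co_posonlyargcount)  # parameters before the /
print(code.co_argcount)         # all positional parameters, including the positional-only ones
print(code.co_kwonlyargcount)   # parameters after the *

# inspect reports the same split as parameter kinds.
kinds = {name: p.kind.name for name, p in inspect.signature(concat).parameters.items()}
print(kinds)
```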


As a library maintainer, breaking backwards compatibility is a big no-no for me :thinking: Even making async/await keywords caused plenty of issues, and I would prefer not to go down that path again (even if it’s uncommon, normal applications have tens of dependencies and all you need is for one of those to use it to make your day bad) :slight_smile: Breaking symmetry with keyword-only arguments makes it hard to understand and teach.


It might be worth updating the “rejected ideas” section of the PEP to note the __ syntax for per-argument markers. The __ has going for it that it would be valid syntax in earlier versions of Python (though with a different semantic meaning), and the fact that it’s already used in mypy. These are unique to __ and don’t apply to the . notation, so a few sentences specifically rejecting __ in the section about per-argument parameters is probably prudent. Uniquely against the __ notation is the fact that it would be a backwards-incompatible change to the language.


We will update the PEP soon for some style and structural changes, so I will make sure to update the rejected ideas section with this.


I have used the dunder-prefix convention. It is a nice hack to support this feature which is quite important to static type checkers in some situations. (Which is why it’s in PEP 484.) But I don’t like promoting it for runtime usage, because it clashes with the other meaning of dunder-prefix, class-private variables through name mangling. E.g. it could be interpreted as meaning “callers within the same class can pass this as a keyword arg but others can’t.”
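The clash with name mangling can be seen in a few lines. This is only a sketch of today's behaviour with an invented Greeter class, not of any proposed semantics.

```python
import inspect


class Greeter:
    def greet(self, __name):
        # Inside a class body, __name is mangled to _Greeter__name.
        return "hello " + __name


# The parameter is stored under its mangled name.
print(list(inspect.signature(Greeter.greet).parameters))

g = Greeter()
print(g.greet("world"))                  # positional call is unaffected
print(g.greet(_Greeter__name="world"))   # the mangled keyword works from outside the class

try:
    g.greet(__name="world")              # the unmangled keyword does not
except TypeError as exc:
    print(exc)
```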

I am going to ponder things some more but basically I am ready to accept PEP 570 as it currently stands (with minor edits for clarity).

@pablogsal Can you update the section in PEP 570 with the various alternative proposals that have come up in this thread and a brief (!) explanation of why we’re not doing that? (It doesn’t seem Greg’s proposal of using parentheses is mentioned yet, nor the PEP 484 convention of using dunder-prefix.)

It is also fine to mention that basically all alternatives are also ugly and arbitrary and mysterious until you have been told what they mean.


Unfortunately true. But I really like the leading dunder approach as it can be used today in all language versions without new syntax.

Some data on compatibility concerns - TL;DR exceedingly rare in my initial searches.

I’m doing a simple regex line based search for __[a-zA-Z0-9_]+= on a line and manually investigating that code. This trivial method brings up a lot of false positives.
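A sketch of that line-based search in Python, using the same regex. The sample lines are made up for illustration, apart from the TrackableMultiDict call, which is quoted from later in this post; they show why the method produces false positives.

```python
import re

# Same pattern as described above: a dunder-prefixed name followed directly by "=".
KEYWORD_DUNDER = re.compile(r"__[a-zA-Z0-9_]+=")

SAMPLE_LINES = [
    "TrackableMultiDict(__tracker=self._update_get, __name='GET')",  # keyword use
    "self.__count=0",                      # attribute assignment: a false positive
    "if __name__ == '__main__':",          # no match: a space precedes the "="
    "result = call(value, flag=True)",     # no dunder at all
]

hits = [line for line in SAMPLE_LINES if KEYWORD_DUNDER.search(line)]
for line in hits:
    print(line)
```

Each hit still has to be checked by hand, since the pattern cannot tell a keyword argument from an assignment to a private attribute.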

I found splunk-sdk-python/splunklib/client.py at master · splunk/splunk-sdk-python · GitHub, but that turns out to just be using **kwargs to construct a query string for a URL, not actually taking a __conf= arg directly. False positive.

I also saw this pattern in older versions of the webob library, though I don’t see the same things in the current version on GitHub. A line of code was once: TrackableMultiDict(__tracker=self._update_get, __name='GET'). But that code was refactored, webob.TrackableMultiDict no longer exists, and it was accepting **kwargs anyway rather than specifying those in the parameter list. So never a problem. False positive.

The dunder arg name legitimately occurs only a couple times within our internal codebase at work (many tens of millions of lines of Python code…) in the context of test injection. Some code used __named parameters for test injection purposes by unittests on a few APIs. Uncommon, easy to detect and refactor if needed.

Repurposing the __ prefix probably warrants a from __future__ import positional_only__args cycle for a release or two with a deprecation warning so that existing code gets updated gracefully. But it doesn’t seem to be a common pattern in my searching.

I did not go so far as to try and run that search via BigQuery over GitHub code.
BigQuery could be used with some better constructed regexes to get a better feel for if this pattern actually exists in the open source world.

A better regex looking for def’s; perhaps r"def\ \w+[(]__[a-zA-Z0-9_]+([)]:|)$" does turn up things such as click, jinja2, flask, celery, pluggy, swift lldb libraries, typeshed, OpenMM simtk, logilab, and IPython. All of which are using the idiom for positional only args. That was just from our internal third_party codebase, not a github bigquery. Making that match multiline ending at the ): might find non-positional test injection arg uses in the wild?
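As an alternative to regexes, a search along these lines could be sketched with the ast module, which handles multi-line signatures for free. The leading_dunder_params helper is hypothetical, and it applies the "leading arguments only" rule floated earlier in the thread; the sample source reuses signatures quoted in this discussion plus one invented counterexample.

```python
import ast


def leading_dunder_params(source):
    """Hypothetical helper: map function name -> leading __name parameters."""
    found = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            leading = []
            for arg in node.args.args:
                name = arg.arg
                # Stop at the first parameter that is not dunder-prefixed;
                # names like __init__ style (trailing dunder) are excluded.
                if name.startswith("__") and not name.endswith("__"):
                    leading.append(name)
                else:
                    break
            if leading:
                found[node.name] = leading
    return found


SOURCE = '''
def gettext(__context, __string, **variables): ...
def function(__value, *args, **kwargs): ...
def f(__name, something, *args, __name2): ...
def plain(value, __late): ...
'''

print(leading_dunder_params(SOURCE))
```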


I’ve looked at Numpy, Pandas, Dask, Numba and PyArrow. There’s no usage of double underscore names there that would be broken by making them denote positional-only arguments.

By the way, the command I used is:

$ rg "\b__\w+[a-zA-Z0-9]\b" `find -name "*.py"`