Ambiguities about Positional-only Parameters

Positional-only parameters are discussed in PEP 570, which introduced the / separator to demarcate positional-only parameters from those that can be specified by keyword.

Prior to PEP 570, a convention was established that a parameter whose name starts with a double underscore was to be considered positional-only. Here is the relevant part of the typing spec.

As I’ve added better enforcement of positional-only parameters in pyright, I’ve discovered a couple of ambiguities in PEP 570 and the typing spec. I’d like to clarify the desired behavior and update the typing spec accordingly. I think it’s important for library authors to see consistent behavior across type checkers here.

Ambiguity #1: Should the self parameter in an instance method and the cls parameter in a class method be considered implicitly positional-only even if it is not explicitly marked as such? In other words, should the following signatures all be considered identical for purposes of type checking?

class A:
    def method1(self, x: int) -> None: ...
    def method2(__self, x: int) -> None: ...
    def method3(whatever, x: int) -> None: ...
    def method4(self, /, x: int) -> None: ...

PEP 570 mentions in its background section “…it is undesirable that a caller can bind by keyword to the name self when calling the method from the class”, but the specification section doesn’t indicate whether all self and cls parameters should be implicitly positional-only.
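For reference, here is a minimal sketch of how the runtime currently classifies the first parameter of some of these methods. The helper name first_param_kind is made up for illustration; only the standard inspect module is used. It shows runtime metadata only, not what any type checker does.

```python
import inspect


class A:
    def method1(self, x: int) -> None: ...
    def method3(whatever, x: int) -> None: ...
    def method4(self, /, x: int) -> None: ...


def first_param_kind(func) -> str:
    # Hypothetical helper: name of the kind of the first declared parameter.
    first = next(iter(inspect.signature(func).parameters.values()))
    return first.kind.name


for name in ("method1", "method3", "method4"):
    print(name, first_param_kind(getattr(A, name)))
```

At runtime only method4 reports its first parameter as positional-only; the question above is whether type checkers should treat the others the same way.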

I think the desired behavior here is to treat self and cls as implicitly positional-only in instance methods and class methods. Can anyone think of a reason not to do this?

If we decide not to make self and cls implicitly positional-only, then it creates more ambiguities that we will need to discuss.

Ambiguity #2: When using the backward-compatibility convention for positional-only parameters, how should a type checker interpret the situation where a parameter with a double-underscored name follows a parameter without a double-underscored name?

def func(count: int, __mode: str) -> None: ...

In the example above, should count be considered positional-only because it is followed by a positional-only parameter __mode? Or should __mode be considered not positional-only because it follows a parameter that is not positional-only? Or should this be considered an error on the part of the developer and flagged as such by a type checker (and the resulting behavior unspecified)?
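To make the first two readings concrete, here is a rough sketch of each as a function over parameter names. The function names are hypothetical and this is not how pyright or any other type checker is implemented. It assumes the convention applies to names that begin, but do not end, with a double underscore.

```python
def is_marker(name: str) -> bool:
    # Assumed convention: begins with "__" but does not end with "__".
    return name.startswith("__") and not name.endswith("__")


def positional_only_up_to_last_marker(names: list[str]) -> list[str]:
    # Reading 1: everything up to and including the last marked name
    # is positional-only, so count would be positional-only too.
    last = -1
    for index, name in enumerate(names):
        if is_marker(name):
            last = index
    return names[: last + 1]


def positional_only_leading_run(names: list[str]) -> list[str]:
    # Reading 2: only an unbroken leading run of marked names counts,
    # so a marked name after an unmarked one is not positional-only.
    result = []
    for name in names:
        if not is_marker(name):
            break
        result.append(name)
    return result


params = ["count", "__mode"]
print(positional_only_up_to_last_marker(params))
print(positional_only_leading_run(params))
```

The third reading (flag it as an error) would simply report any marked name that follows an unmarked one.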

This situation came up in a recent bug report for pyright. The issue involves a callback protocol definition in the pydantic library:

class ModelBeforeValidatorWithoutInfo(Protocol):
    def __call__(self, cls: Any, __value: Any) -> Any:
        ...

I don’t feel strongly either way. But I lean towards count being positional-only.
This interpretation allows the __ to still mean something, while interpreting __mode as not positional-only discards the meaning of the __.

One use I could imagine for this is avoiding having to write a lot of __ in a long list of parameters.

def foo(a, b, c, d, e, f, g, h, i, __j, k, l, m): ...

This could specify where the positional-only parameters end, without having to repeat __ a bunch of times.

If we go with that interpretation, it raises the question “how should the following be interpreted?”

def foo(a, *, __b): ...

Is this an error? Is __b simply treated as a keyword-only argument in this context?
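For what it is worth, the runtime has no opinion about the name here. A small sketch of what happens today, written at module level (the function returns its arguments only so the result can be observed):

```python
def foo(a, *, __b):
    return a, __b


# Passing __b by keyword is accepted at runtime.
result = foo(1, __b=2)

# Passing it positionally is rejected, because it follows the bare *.
try:
    foo(1, 2)
except TypeError:
    positional_rejected = True
else:
    positional_rejected = False

print(result, positional_rejected)
```

So at runtime __b is an ordinary keyword-only parameter; whether a type checker should flag the declaration is a separate question.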

It is legal at runtime to do A.method1(self=A()), and your proposal would mean that type checkers would reject such code. However, that seems unlikely to matter in practice. I would support making self/cls implicitly positional-only.
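A minimal sketch of the runtime behavior in question, using only the two spellings from the original example. The return types are changed from None to int purely so the result can be observed.

```python
class A:
    def method1(self, x: int) -> int:
        return x

    def method4(self, /, x: int) -> int:
        return x


# Binding self by keyword works when self is not marked positional-only.
accepted = A.method1(self=A(), x=1)

# With the explicit / separator, the same call fails at runtime.
try:
    A.method4(self=A(), x=1)
except TypeError:
    rejected = True
else:
    rejected = False

print(accepted, rejected)
```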

I think this should be an error. “In the face of ambiguity, refuse the temptation to guess”.

Worth noting that the double underscore syntax is only needed for Python 3.7 and lower, and Python 3.7 is EOL. Therefore, it doesn’t make sense to extend the syntax with additional features.

What’s the benefit of making this implicitly positional-only? Does this catch common errors?

This example leads me to kind of disagree with your position. I kind of like that self= there, as a reminder of what parameter is being filled (because it could be easy to forget about the self parameter).

From the perspective of a typing user, I would say that typing tools should only treat something as positional only if the runtime treats them as positional only. If __x is just a convention and passing __x=y as a keyword does not produce a runtime error, then I would just leave them alone. The API designer has made their preference known and the user can decide whether to abide by it. If that isn’t enough, the designer can switch to explicit positional-only syntax.
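To illustrate the difference being described, here is a short sketch comparing the naming convention with the explicit syntax at runtime. The function names by_convention and by_syntax and the helper raises_type_error are made up for this example.

```python
def by_convention(count: int, __mode: str) -> str:
    # Double underscore is only a naming convention at runtime.
    return f"{count}:{__mode}"


def by_syntax(count: int, mode: str, /) -> str:
    # The / separator is enforced by the interpreter.
    return f"{count}:{mode}"


def raises_type_error(call) -> bool:
    # Hypothetical helper: True if calling `call` raises TypeError.
    try:
        call()
    except TypeError:
        return True
    return False


print(by_convention(count=1, __mode="r"))
print(raises_type_error(lambda: by_syntax(1, mode="r")))
```

The first call succeeds and the second is rejected by the interpreter, which is the gap between convention and syntax that this post is pointing at.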

I also don’t see a good reason to raise an error on A.method1(self=A()). It’s contorted enough that nobody will do that by accident, so what bug will raising a type error catch?

The issue with using the runtime interpretation is that, for a long time, Python 3.7 and earlier were still supported while the positional-only syntax was available only in 3.8+. So stubs adopted a rule (well, typeshed did, and type checkers generally follow typeshed) that the convention had type-system meaning. Without that rule, stubs that needed to support both 3.7 and 3.8 couldn’t use positional-only parameters.

Now 3.7 is EOL, but there are still maintained stubs that follow the __x positional convention. We could deprecate the convention’s special meaning and follow the runtime (no special treatment of __x anymore), but it would lead to a fair bit of maintenance toil at this point. I’d be curious what the impact on mypy/pyright-primer would look like.

The Python ecosystem has grown some pretty good tools (Ruff, in particular) which can be used to safely upgrade code from one style to another when they have the same meaning but the older style is non-preferred (or even deprecated). It’s still work, just a lot less work than doing it by hand.

Is what we’re seeing here a shortcoming of the new / system for positional-only parameters?

Do we have any way in the new / system to say “This parameter must be in this position, and you are allowed to use the parameter name ( foo(x=3) ) if you want to.”?

I think that’s what should (edit: Paul’s example changed my mind) be the case for the A.method1(self=A()) example:

  • self must be the first argument.
  • At the same time, you’re allowed to use self= when calling the function if you want to.

Do we have any way of specifying this?

I think you’re describing positional-or-keyword parameters, which is how parameters work by default.

Eric alluded above to other ambiguities that would arise if we did not make self implicitly positional-only. I’ll let him say what he had in mind, but here’s one.

from typing import Protocol

class HasLower(Protocol):
    def lower(self) -> str: ...

Does str fulfill this protocol? If self isn’t implicitly positional-only, the answer is no, because the self parameter to str.lower is in fact positional-only. So users writing protocols would in effect have to always mark the self parameter as positional-only, which would be annoying and error-prone.
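A small runtime sketch of the mismatch, assuming CPython. The class Lowerable is made up for illustration and stands in for a user-defined class that would match the protocol.

```python
class Lowerable:
    def lower(self) -> str:
        return "lowerable"


# A pure-Python method accepts self by keyword.
python_result = Lowerable.lower(self=Lowerable())

# The built-in str.lower does not accept self by keyword.
try:
    str.lower(self="ABC")
except TypeError:
    str_rejects_keyword = True
else:
    str_rejects_keyword = False

print(python_result, str_rejects_keyword)
```

If a type checker modeled this difference literally, str would not be assignable to HasLower as written above.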

If we’re talking about a runtime change making self/cls implicitly positional-only, then that would break a usage like A.method1(**kw) where kw included self by name. It’s definitely going to be a very unusual edge case, but I could imagine code generators doing something like that. And given that there’s no compelling benefit here for the runtime change, I think it’s better to leave it as it is.

I don’t feel strongly about what type checkers should do, beyond the generic “don’t reject code that’s valid at runtime” principle.

What I’m describing is not how parameters work by default.
By default, the position is not restricted.

>>> def foo(a, b, c):
...     return a + b * c
...

>>> foo(c=3, b=4, a=1)
13

I’m talking about something where the position is restricted.

So you mean allowing A.method(self=some_obj, b=1, c=2) but not A.method(b=1, self=some_obj, c=2)? Given that keyword arguments are passed as a dictionary, which is logically an unordered key-value mapping, this seems difficult to justify as an additional argument passing convention. And that’s quite apart from the implementation difficulties - is the following valid or not?

kw = { "self": some_obj, "b": 1, "c": 2 }
A.method(**kw)

What if kw was a user-defined mapping type that didn’t preserve order?
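As a rough sketch of why ordering cannot carry meaning here: the class ReversedView below is a made-up mapping that iterates its keys in reverse insertion order, and the call behaves the same as with a plain dict. The method body is invented so the result can be observed.

```python
from collections.abc import Mapping


class A:
    def method(self, b, c):
        return (b, c)


class ReversedView(Mapping):
    """Hypothetical mapping that yields keys in reverse insertion order."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return reversed(list(self._data))

    def __len__(self):
        return len(self._data)


some_obj = A()
kw = {"self": some_obj, "b": 1, "c": 2}

# Both calls bind the same arguments, regardless of key order.
from_dict = A.method(**kw)
from_reversed = A.method(**ReversedView(kw))
print(from_dict, from_reversed)
```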

That’s not what positional-OR-keyword means. What you’re thinking of would be positional-AND-keyword, which is not a thing that exists.

positional-or-keyword means you can either pass the argument positionally without specifying a name, or by keyword, i.e. using its name.

Your example has convinced me that the position of self should not be restricted.

But more generally, if we can restrict the position of an argument (PEP 570), I would like a way to specify that the position is restricted, but I’m still allowed to use the name when calling it.

That is why I asked. (Except “positional-AND-keyword” makes it sound like the keyword is required, and I think it should be optional.) I think it’s a shortcoming of PEP 570 that this doesn’t exist.

I’m confused: an argument goes either into *args or into **kwargs. I don’t understand what a positional-and-keyword argument would do or what you would gain from it. Could you provide an example of where this would be useful?


To summarize: semantically, there are the following cases (ignoring default arguments for simplicity):

def foo(
    positional_only,
    /,
    positional_or_keyword,
    *varargs,
    keyword_only,
    **kwargs
): ...

Where would a positional-and-keyword argument fit into this calling convention?
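These five cases correspond to the five parameter kinds that the standard inspect module reports. A quick sketch using the same signature:

```python
import inspect


def foo(
    positional_only,
    /,
    positional_or_keyword,
    *varargs,
    keyword_only,
    **kwargs
): ...


# Map each parameter name to the name of its kind.
kinds = {
    name: param.kind.name
    for name, param in inspect.signature(foo).parameters.items()
}
for name, kind in kinds.items():
    print(f"{name}: {kind}")
```

There is no sixth kind in inspect.Parameter, which matches the point that the language has no calling convention beyond these.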

def foo(
    positional_only,
    positional_and_keyword,
    /,
    positional_or_keyword,
    *varargs,
    keyword_only,
    **kwargs
): ...

Note that if the caller chooses to use a keyword for positional_and_keyword (because, again, it’s optional), that would force everything after to be keywords.
This is similar to the choice the caller could make to use a keyword for positional_or_keyword which means they can no longer use positional for any of the following positional_or_keyword and no longer can use varargs.

As for the motivation for it: Changing the order (or not specifying the order) of the parameters is not the only reason for keyword arguments.

I think you could come up with a few reasons. If you can’t, just ask anyone who has ever chosen to use a keyword for positional_or_keyword without changing the order and without omitting any arguments. It happens a lot.

Then apply those motivations to positional_only.

Let’s stop this discussion of “positional and keyword” parameters. Adding such a feature would be a language change, off topic for this thread about typing. Instead, let’s focus on the questions Eric posted above.

For Eric’s two questions,

  1. Yes, I think self/cls should always be treated as positional-only. This sometimes disagrees with the runtime, but I consider passing self or cls by name to be relying on implementation details. I can think of libraries where the documented name of self/cls is at times inconsistent with the implementation (decorators can make it differ), and that discrepancy hasn’t been noticed in years and millions of downloads (tensorflow does this, as stubtest noticed). I also think replicating the runtime here would make protocol definitions more complex and confusing for most users. Having str not match the HasLower example feels like a bad footgun for typical usage.

  2. I’d lean toward making the behavior of __x unspecified and marking the convention as deprecated. Type checkers may support it for legacy compatibility (or may eventually drop it), but the edge cases are low-value to work out; instead, we should recommend moving to the / syntax. The / syntax should be fully specified.