Reviving the "hybrid keyword-arrow syntax" for callables

One reason why PEP 677 was rejected was that the syntax required backtracking in the parser. The SC recommended:

When proposing a syntax change, low complexity is better. While not always possible, it’s ideal if it could still be described using our old <=3.8 parser. It is important to have in mind that adding syntax that only our modern PEG parser can handle could lead to greater cognitive load and external tooling implementation costs.

This was the proposed syntax:

def flat_map(
    func: (int) -> list[int],
    l: list[int]
) -> list[int]:
    out = []
    for element in l:
        out.extend(func(element))
    return out

But the PEP also mentions another syntax among its Rejected Alternatives, the "Hybrid keyword-arrow Syntax":

def flat_map(
    func: def (int) -> list[int],
    l: list[int]
) -> list[int]: ...

i.e., def is added as a keyword introducing the callable.

I think this would be parsable with the old parser: def would be allowed at the beginning of any expression, but would raise a syntax error if not followed by an opening parenthesis.

This should also be allowed:

type StrTransform = def (str) -> str

Returning a callable:

def f() -> def (int, str) -> bool:
    pass

I admit this looks a bit confusing at first, but I think you’d get used to it quickly. (I think it’s less confusing than the original PEP 677, actually.)

The main objection in the PEP was this:

But we think this might confuse readers into thinking def (A, B) -> C is a lambda, particularly because Javascript’s function keyword is used in both named and anonymous functions.

I agree that this may be a problem, but it should help that this syntax will only appear in type annotations, so it should be clear from context that it's not a lambda.

Generally, I’d expect someone who knows Python but doesn’t use type annotations to understand this:

def flat_map(
    func: def (int) -> list[int],
    l: list[int]
) -> list[int]: ...

much more easily than this:

def flat_map(
    func: Callable[[int], list[int]],
    l: list[int]
) -> list[int]: ...

All the other advantages of PEP 677 remain, like the nicer Concatenate syntax:

f5: def (int, **P) -> bool
f5: Callable[Concatenate[int, P], bool]
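For context, here is a minimal runnable sketch of the canonical pattern behind the Concatenate spelling above; the helper name bind_first_int and the example functions are illustrative, not from the PEP:

from collections.abc import Callable
from typing import Concatenate, ParamSpec

P = ParamSpec("P")

def bind_first_int(func: Callable[Concatenate[int, P], bool], value: int) -> Callable[P, bool]:
    # Partially apply the leading int, keeping whatever parameters P captures.
    def bound(*args: P.args, **kwargs: P.kwargs) -> bool:
        return func(value, *args, **kwargs)
    return bound

def is_longer_than(limit: int, s: str) -> bool:
    return len(s) > limit

longer_than_3 = bind_first_int(is_longer_than, 3)
print(longer_than_3("hello"))  # True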

Perhaps I'm wrong to read def as indicating that a function is about to be defined, but this proposal would change that meaning for me.

Isn’t there type, indicating that a type is about to be defined?


I’m all for reviving PEP 677, but I don’t believe there’s still a need to make it parseable with old non-backtracking parsers. IIRC the key objection was that there didn’t seem to be a useful run-time interpretation of using signatures as values, like

callback = (int, int) -> None

(If I’m wrong, and there are still tools out there that couldn’t handle this but can handle all other Python 3.13 syntax, I’d like to know.)


How about the Extended Syntax Supporting Named and Optional Arguments?

These functions have the same signature, since the names of positional-only arguments don't matter:

def foo(x, /): ...
def bar(y, /): ...

That makes named positional-only arguments in the arrow notation pointless, so we can assume named parameters are positional_or_keyword or keyword_only. If desired, you could still use Mypy's notation:

Function = (__x: int) -> Any
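For comparison, roughly how that positional-only callback is spelled today: a callback protocol, with / (or Mypy's older leading-double-underscore name convention) marking the parameter positional-only. The class name Function here just mirrors the alias above:

from typing import Any, Protocol

class Function(Protocol):
    # x is positional-only thanks to the /; this corresponds to the
    # (__x: int) -> Any example above.
    def __call__(self, x: int, /) -> Any: ...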

Sure, all the proposed syntaxes are forward-compatible with that. But that should perhaps be done as a second step.

Could you clarify what you mean? It’s not in the rejection notice.

It was at least a concern: adding syntax to types means adding syntax to expressions, which can appear anywhere, so it's a pretty big shift in the language. (PEP 695 is a good example of a major syntax change that's much less invasive: the new syntax all happens in statements, so the effects are very limited, and in fact all of the statements with new syntax are explicitly typing-related ones like class/function signatures and type declarations.)

In expressions, there is a bigger question of runtime behavior and for PEP 677 the proposal was that this syntax would only be used for static typing. That’s not necessarily a deal-breaker but I think the bar is / was pretty high to get a change like that accepted.

My impression was that the SC’s primary concern was around the composability of the syntax and how easily the eye can follow it on complicated signatures. Examples like

(int, int) -> str
(int, (float) -> str) -> str

look nice in a presentation but once you drop them into real code (especially the kind of real code where Callable is clunky) it’s less clear how readable they are:

Here’s a type found in typeshed before-and-after:

Callable[[Callable[..., Generator[Any, Any, Any]]], Callable[..., None]]
((...) -> Generator[Any, Any, Any]) -> (...) -> None

It’s not super easy to read either way and the arrow syntax is certainly more compact, but I think the second example isn’t enough more readable that the Steering Council was inclined at the time to take a big syntax change.

Another unfortunate thing is that the -> is overloaded a bit. This is by intent, to make the callable types feel inspired by signatures, but when you look at something like this typeshed example:

    def get_overloads(func: Callable[..., object]) -> Sequence[Callable[..., object]]: ...
    def get_overloads(func: (...) -> object) -> Sequence[(...) -> object]: ...

it’s a little weird to have a chain of 3 arrows show up here where the first and the last are part of an expression but the middle one is part of a statement.

On bigger examples it can also start to get hard to figure out which right parenthesis is even the end of the function signature; this was actually my biggest worry after transforming some example code - reading real function signatures occasionally became pretty hard!

To be fair, consistently formatting top-level functions in the style of

def f(
    x: (int) -> float,
) -> (int) -> float:

does at least eliminate the “where does the parameter list end” issue so it’s not hard to work around, but the same issue pops up to a lesser extent inside of complicated types.


None of this is to suggest we shouldn’t propose callable syntax again, I’m just sharing my impressions of why PEP 677 was rejected.


Thanks, I agree that’s also a concern, although to some extent it ought to be addressed using type aliases. E.g.

type IntFloatConversion = (int) -> float

def f(x: IntFloatConversion) -> IntFloatConversion: ...

looks much better. And hopefully we can agree that by itself (int) -> float looks better than Callable[[int], float].

I also like that the new syntax can be extended to support all other special signature shapes, e.g.

(int, *, angle: float = ..., **float) -> float | None

(I have a specific syntax in mind that allows both x: int and int to be used in parameter positions, the latter implying positional-only.)


After 677, I thought about a syntax with something on the left side other than a bare parenthesis, which I think might help with some of these visual issues. So for example, instead of

(int, *, angle: float = ..., **float) -> float | None

using something like

Fn(int, *, angle: float = ..., **float) -> float | None

With that change

    def get_overloads(func: (...) -> object) -> Sequence[(...) -> object]: ...

becomes

    def get_overloads(func: Fn(...) -> object) -> Sequence[Fn(...) -> object]: ...

which is maybe slightly clearer.

But at this point I actually favor this very strongly for reasons that go beyond syntax: currently we have no way of binding type variables in generic callable types, but they make perfect sense. Using something like Fn gives us an obvious place to apply PEP 695 style syntax to make this explicit:

Fn(T) -> T

would indicate a function type where T is a type variable that should be bound by some containing scope, whereas

Fn[T](T) -> T

would be the identity function as a generic type (in textbook polymorphic lambda calculus style syntax, the former would be T -> T where T has to be a type var in scope whereas the latter would be forall T. T -> T). I think being able to make that distinction by syntactically binding T (or not) would be very valuable, and is a stronger argument than just readability.
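For concreteness, here is roughly how the two cases can be distinguished today (Python 3.12+ syntax; the names apply_twice and IdentityFn are illustrative):

from collections.abc import Callable
from typing import Protocol

# Fn(T) -> T with T bound by the containing scope: Callable[[T], T] inside a
# generic function already expresses this today.
def apply_twice[T](f: Callable[[T], T], x: T) -> T:
    return f(f(x))

# Fn[T](T) -> T, the generic identity-function type itself, has no Callable
# spelling today; it needs a callback protocol with a generic __call__.
class IdentityFn(Protocol):
    def __call__[T](self, x: T, /) -> T: ...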


Agreed on both points!

Although I also think that Łukasz’s idea of allowing functions themselves to represent types with compatible signatures (maybe via a decorator) gives most of the same benefits if people were going to use type aliases most of the time anyway.

I think a closed-form syntax would be nicer for several reasons, but if we try again and get rejected again then I’d really want to move forward with this as an alternative (which I’m almost certain would be accepted).

How about def (...) -> ..., as outlined in the thread description, since def is already a keyword?
Or could we make Fn a soft keyword when it's followed by (...) -> ...?

No strong opinions on the exact name; Fn happens to be what Rust uses. I’m not sure whether def is good because it’s a keyword, or less good because it comes with connotations (such as being suggestive of a lambda) that could be problematic.

I suspect users would get used to it either way, but there might be a little more pushback from less-typing-enthusiastic people about reusing the keyword in a typing-specific syntax.


If you forget the return annotation this can have some problems:

def foo(fn: Fn()): ...  # NameError: name 'Fn' is not defined

With def, on the other hand, that's no problem: fn is simply a callable returning None:

def foo(fn: def()): ...  # OK

This also works outside type annotations.

Yes, you can use from __future__ import annotations, but some people might not want to do that in every file.
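A small runnable illustration of that last point; Fn is deliberately left undefined here:

from __future__ import annotations

# With lazy annotations the expression is never evaluated at definition time,
# so the undefined name Fn only matters to tools that introspect annotations
# (e.g. typing.get_type_hints).
def foo(fn: Fn()): ...

print(foo.__annotations__["fn"])  # prints the string 'Fn()'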

My only minor gripe with the proposed syntax is that while it helps with terseness when defining a Callable with a complex signature (where we would currently be forced to define a callback protocol), it does not help at all with custom generics that use a ParamSpec, so you always have to bind your ParamSpec indirectly through another callable if you want to specify a complex signature.

I wonder if we can come up with a syntax for just the parameter portion of the signature, so it can be used in places other than just Callable, or alternatively a way to reuse this hybrid syntax with other generics.

I think it would make the proposal a lot stronger if it could apply more generally and solve actual expressivity issues, rather than just help with terseness. It would also potentially give us a type expression we can pass to bound/default for ParamSpec, which we currently lack.


Would this be addressed by the

Fn[T](T) -> T

syntax mentioned above?

I’m guessing in your case you’d want something like

Fn[**P](int, **P) -> str

?

Though, can’t this already be expressed with type aliases?

type MyFunc[**P] = Callable[Concatenate[int, P], str]

No, I mean generic classes that take a ParamSpec. The use-case you’ve demonstrated is indeed already possible, just like anything else the syntax currently would enable.

What I am talking about is having a custom generic class that takes a ParamSpec, e.g. gevent.Greenlet.

Currently you usually bind that ParamSpec through some sort of generic constructor that uses the ParamSpec either directly through P.args/P.kwargs or indirectly through another callable you pass in. For simple positional-only arguments you can specify them yourself, but not for keyword/optional/variadic arguments.

I.e. you can write Greenlet[[str, int], None] but not Greenlet[[x: str, y: int = 0, *args: str, **kwargs: int], None]; you have to use an intermediary constructor to create this type, because you can't express every possible signature the type checker can internally represent using a type expression[1].
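A runnable sketch of the situation being described, with Task standing in for a class like gevent.Greenlet (the names here are illustrative):

from collections.abc import Callable
from typing import Generic, ParamSpec, TypeVar

P = ParamSpec("P")
R = TypeVar("R")

class Task(Generic[P, R]):
    # The only way to bind P today is indirectly, via the callable passed in.
    def __init__(self, func: Callable[P, R], *args: P.args, **kwargs: P.kwargs) -> None:
        self.func, self.args, self.kwargs = func, args, kwargs

def greet(name: str, *, excited: bool = False) -> None: ...

t = Task(greet, "world", excited=True)  # P is inferred from greet
simple: Task[[str, int], None]          # expressible: positional-only parameters only
# There is no type expression for Task with greet's full signature
# (keyword-only, defaulted); that parameterization can only arise via inference.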


  1. This makes some things nearly impossible, such as creating a subclass that uses a fixed signature comprising more than positional-only arguments: since you can only create the type through inference, you can't directly write an equivalent type expression. ↩︎

Does the run-time interpretation have to be a signature? Or could it be something like a generic Arrow((int, int), None) instance at run-time?

Ah, I see! Yes, I’ve also wished for such a syntax multiple times in the past.

So, maybe it’s simply a better “signature” syntax we need? For callables, you could then do something like:

from typing import Fn

type MyFunc = Fn[<great new signature syntax>, None]

Yes, that’s essentially what I am proposing. Although in order to be maximally useful, it probably needs to be something that can unambiguously be parsed anywhere as an ordinary expression[1]. But since [int] is already a valid expression, we can’t really use [] unless we’re fine with getting a different runtime value based on whether we’re defining a simple positional-only signature or a complex one (which would only be enabled through new syntax).

Potentially we could use the currently unused <> brackets to define a parameter set[2].

We could even experiment with things like supporting overloads, by allowing a sequence of parameter expressions rather than a singular set of parameters. However, that would not work if one of the other parameters, such as the return type on Callable, were dependent on which set of parameters was matched, so it may not be worth the extra complication. Maybe this case could also just be handled by allowing a union of two separate sets of parameters, i.e. Callable[<int, str> | <str>, None].

Additional advantages to using a different kind of bracket would include a clearer visual separation between a ParamSpec parameter in a generic and a scalar one; I think Callable[<str, int>, None] is a little easier to read than Callable[[str, int], None].


  1. otherwise we can’t pass it to functions that accept an annotation expression or use it for things like bound/default on ParamSpec ↩︎

  2. although there are some potential issues with that: a<b>c is already a valid expression in Python, so we would need to be careful not to change the meaning of expressions that are already valid ↩︎
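For concreteness, the existing meaning footnote 2 is guarding against:

# a<b>c already parses as a chained comparison, i.e. (a < b) and (b > c),
# so reusing <> for parameter sets must not change expressions like this.
a, b, c = 1, 2, 3
print(a < b > c)  # False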

If we had powerful enough notation for signatures, I think overloads would be pretty easy to add as a separate form, OverloadedCallable[<list of signatures>], which I think is the right way to do it since one of the primary use cases is varying return types.

So for now my instinct would be to focus more on the question of whether there’s a sufficiently powerful syntax for the signatures themselves that the community would be happy adopting. I’d really like it to be fully general over at least the non-ParamSpec part of the language (a sketch of what that generality requires today follows the list):

  • able to handle positional, named, and keyword-only args
  • able to handle default vs non-default versions of the above
  • able to handle varargs and varkwargs
  • able to handle binding type vars (this one is my new killer feature; we need new syntax for this)
  • able to handle ParamSpec
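For reference, a minimal sketch of the only fully general spelling available today for the first three bullets, a callback protocol (Handler and its parameter names are illustrative):

from typing import Protocol

class Handler(Protocol):
    # Positional-only, positional-or-keyword, defaulted, keyword-only, varargs
    # and varkwargs parameters all fit in a protocol today; this is the bar an
    # inline callable syntax would need to clear.
    def __call__(
        self,
        request: int,
        /,
        name: str,
        *parts: bytes,
        timeout: float = 1.0,
        **extra: object,
    ) -> bool: ...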