Spec change: ambiguous arguments in overload call evaluation

I’d like to propose two changes to the overload call evaluation algorithm in the typing spec. I’ve opened a PR with the proposal here.

Change #1:

Mypy, pyright, zuban, pyrefly, and ty all implement a rule that isn’t currently in the spec: when deciding whether a call is ambiguous, arguments that have the same parameter type in every candidate overload do not contribute any ambiguity. I’d like to add this rule to the spec.

Here’s a concrete example:

@overload
def f1(x: str, y: Literal['o1']) -> str: ...
@overload
def f1(x: str, y: str) -> int: ...

def g1(x: Any):
    # spec: Any, all type checkers: str
    reveal_type(f1(x, 'o1'))

This matters for overloaded functions like builtins.open, where this rule allows for the mode argument to select an overload even when the file type is unknown.

Here’s the proposed language in the spec (bolding is only for emphasis in this post):

For each of the remaining overloads, determine whether all arguments satisfy at
least one of the following conditions:

  • All possible :term:materializations <materialize> of the argument’s type are
    assignable to the corresponding parameter type, or
  • The parameter types corresponding to this argument in all of the remaining overloads
    are :term:equivalent.

If so, eliminate all of the subsequent remaining overloads.

Change #2:

When a call is ambiguous, the spec says to fall back to a return type of Any. Some type checkers sometimes return a more precise return type instead. This leads to different type checkers giving different results in packages like numpy and scipy-stubs - in particular, spec-compliant type checkers produce a lot of Any instead of precise types.

Here’s a concrete example:

class A[T]:
    x: T
    def f(self) -> T:
        return self.x

@overload
def f2(x: A[None]) -> A[None]: ...
@overload
def f2(x: A[Any]) -> A[Any]: ...

def g2(x: Any):
    reveal_type(f2(x))  # spec, zuban, and ty: Any, pyright: A[None], mypy and pyrefly: A[Any]

I propose to change the spec to say that the most general return type among the candidate overloads should be returned - in this example, A[Any].

Here’s the proposed language (again, bolding only for emphasis in this post):

Once this filtering process is applied for all arguments, examine the return
types of the remaining overloads. If these return types include type variables,
they should be replaced with their solved types. Eliminate every overload for
which there exists a :term:materialization <materialize> of another
overload’s return type that is not assignable to this overload’s return type.

This rule picks the most general return type, if one exists.

A couple more examples:

  • If the candidate return types were bool and int, we would pick int because bool is a subtype of int.
  • If the candidate return types were A[int] and A[str], we would fall back to Any because neither return type is assignable to the other.

Pyrefly already implements this; you can play around with it in the sandbox. (Toggle the spec-compliant-overloads field in pyrefly.toml to compare the proposed behavior to what’s currently in the spec.) We made this the default behavior in Pyrefly version 0.59.0.

Side note: I’m aware that there’s an alternate proposal to use an AnyOf type in ambiguous cases ([1], [2]). I’m proposing this materialization-based selection instead because:

  • It doesn’t require any new concepts or constructs (less complexity in the type system, less work to adopt).
  • I already implemented this in Pyrefly, so I can pretty confidently say that it seems to work well.
  • If we decide at some point in the future that this isn’t good enough, we still have the option to move to something like AnyOf. Personally, I think pinning down the semantics of AnyOf seem harder, and I’m not sure of the incremental benefit, but this change wouldn’t close the door on it.
8 Likes

I think both these changes make sense.

The second change is not mutually exclusive with AnyOf. Even with this new rule, there will remain some ambiguous matched overload sets for which a “most general” return type does not exist, and the spec mandates inferring Any. These cases would still be improved by instead inferring AnyOf the matched overload return types instead.

4 Likes

Even in this case, having A[Any] would be much nicer than just plain Any, no?

Couldn’t one define a “gradual” version of the join operator[1], so that

  • classical join: join(str, int) -> object, and join(list[int], list[str]) -> list[object]
  • gradual join: gradual_join(str, int) -> Any, and gradual_join(list[str], list[int]) -> list[Any]

With this definition, I believe we should have: Any ⪰ gradual_join(A, B) ⪰ AnyOf[A,B], where denotes a “more precise gradual type”, in the sense that A⪯B if and only if the set of materializations of A is a subset of the set of materializations of B.

Hence, a gradual join is a better approximation of the AnyOf than just plain Any.


  1. Define gradual_join(A1, A2, ..., An) as the least upper gradual bound, i.e. the gradual type T such that: A1, ..., An are all materializations of T, and for any gradual type S such that A1, ..., An are all materialization of S, then T is more precise than S, i.e. T⪯S. ↩︎

3 Likes

Interesting idea, thanks. I agree that falling back to Any rather than A[Any] in my last example is the least satisfying part of this proposal. Just to clarify, is the suggestion that:

  • We replace proposed rule #2 with gradual join, or
  • When #2 doesn’t find a most general type, we fall back to a gradual join rather than Any?

As defined above, I don’t think gradual join would work as a replacement, since we’d want gradual_join(bool, int) to be int, but bool is not a materialization of int. As a fallback, it seems promising, but the spec today doesn’t even define a “classical” join operator, so there’s a bit of a gap to get to the point where we could define a gradual join. (And I don’t know that I have the capacity right now to see through something like that, unfortunately.)

1 Like

After thinking through this more and working on an implementation in ty, I think it is not the case that int is the desired result here. That is, int is not the preferable return type for this scenario, encoded in the proposed conformance suite tests:

@overload
def example9(x: str, y: Literal['o1']) -> bool: ...

@overload
def example9(x: bytes, y: Literal['o1', 'o2']) -> bool: ...

@overload
def example9(x: bytes, y: str) -> int: ...

def check_example9(x: Any):
    # All three overloads are candidates. The parameter types corresponding to
    # argument `x` are `str` and `bytes`, which are not equivalent, so none of
    # the overloads can be eliminated. We pick the most general return type.
    ret1 = example9(x, 'o1')
    assert_type(ret1, int)

Inferring int here violates the gradual principles that motivated the original specification to return Any in this situation, which is that a more dynamic input type should not result in a “stricter” output type that may cause false positives. It may be that in reality the input type here is str, and thus bool is the real output type, and subsequent code may depend on this. Having less static knowledge of the input type should not cause false positives in that scenario.

I think the proper result of gradual_join(bool, int) should be the gradual type with lower bound bool and upper bound int, which is the same type as AnyOf[bool, int], or could be represented as (Any & int) | bool. This type “gradually” permits use as a bool without false positives.

I think the proposed new rule here works well when the “most general” return type is gradually “more general” (that is, less precise), but not when it is a statically wider type. Barring introduction of AnyOf or intersection types, the best available rule, I think, would be to consider a return type “most general” only if it is both assignable from every materialization of every other candidate-overload return type (the currently proposed rule), and it is assignable to every other candidate-overload return type (this would rule out int from being selected from candidates bool and int).

1 Like

Thanks for trying this out in ty.

Barring introduction of AnyOf or intersection types, the best available rule, I think, would be to consider a return type “most general” only if it is both assignable from every materialization of every other candidate-overload return type (the currently proposed rule), and it is assignable to every other candidate-overload return type (this would rule out int from being selected from candidates bool and int).

I implemented this in pyrefly to see what the effect would be, and it looks like basically all the overloaded calls in numpy and scipy-stubs that go from an Any return type to non-Any with my original proposal revert back to Any with this change. So it doesn’t seem like we’d really gain anything anymore.

Let me think a bit more about this.

1 Like

I wish I had more availability to flesh out the remaining work on intersections prior to now.

I believe the answer to this is the same as what came up for intersected methods, with a slight difference on the interior overlap as overloads are “one of” where intersections are “each that applies”

The process for intersections is that the application of parameters to a function a & b is based on whether the parameters are in the applicative domain of only a, only b, or both and b. The resulting outcome is either the return of a, the return of b, or the the intersection of each of those.

For overloads, that would turn into the return of a, the return of b, or the union of a and b.

AnyOf is more permissive if we add it to the language, but I don’t think that the conformance suite current behavior is correct even in the absence of it.

I don’t think we would want to think of the answer as a gradual join, each invocation is a discrete outcome.

given this one, the type should be int | bool, not int, even if it appears as if this may be semantically identical, it becomes apparent that it isn’t if the result of example9 is applied to heterogenous data in a list comprehension (unfortunately, treating it as just int creates variance issues in gradual typing use).

Sorry I haven’t had time to come back to this. What I want to do next is put together some more realistic overload examples we can look at, rather than the toy examples I created to show the specific rule I proposed. If this takes me much longer, I might spin change #1 out into its own PR, since it seems to be less controversial.