Draft typing spec chapter for overloads

I’ve written a first draft of an updated “Overloads” chapter for the typing spec.

This update attempts to specify type checker behaviors for “overload matching”. The lack of standardization for overloads has been a pain point for a long time, so my hope is that we can settle on a clear standard and get all major type checkers to adopt it.

I’ve also specified behaviors for overload consistency checks, overlapping overload detection, and implementation consistency checks.

For minor feedback (typos, sample bugs, wording suggestions, etc.), please post comments directly in the GitHub PR. For more significant feedback or questions, please post below in this thread.


Hi Eric – thanks for kicking this off! I’ll probably have more thoughts on this, but the union expansion part jumped out at me, so I’ll start there.

It’s notable to me that TypeScript does not perform the union expansion. For example:

function foo(x: number): number;
function foo(x: string): string;
function foo(x: number | string): number | string {
  return x;
}

declare var x: number | string;
var y = foo(x) // not allowed

TS Playground

I agree that expanding unions is correct and that it theoretically admits more programs. I’m curious if you have any real-world cases in mind where this has come up.

In Flow, we used to support overloaded calls with unions like this, but we removed it recently because of pathological performance with large unions. The performance issues were solvable, but these kinds of calls were rarely needed and it didn’t seem worth the effort to optimize, so we removed the behavior instead.

    I’m curious if you have any real-world cases in mind where this has come up.

Yes, this comes up quite frequently in Python programs. This may be because polymorphic functions are more prevalent in Python than in TypeScript.

FWIW, I prefer TypeScript’s approach here. It’s simpler to explain, it’s principled and doesn’t take you down a slippery slope of exceptions, and it doesn’t have a (potentially large) negative impact on type analysis performance.

As much as I prefer the simplified overload matching model of TypeScript, I don’t think this works in the world of Python. Mypy implemented union expansion years ago and established a precedent. Library and stub authors have taken a dependency on this until-now-unspecified behavior. Removing union expansion at this point would cause major compatibility issues.

Thanks for putting this together. Overloads are an extremely tricky and underspecified part of the type system, and it would be good to have clarity on how they should work.

Step 3: In pyanalyze I implemented something similar in a different way:

    If an overload does not match, but one of the arguments passed was a ``Union``, we try all the components of the ``Union`` separately. If some of them match, we subtract them from the ``Union`` and try the remaining overloads with a narrower ``Union``. In this case, we return a ``Union`` of the return values of all the matching overloads on success.

This may be more efficient because it avoids repeatedly re-evaluating the overloads, but I’m not sure the behavior is the same in all cases.
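
For concreteness, here’s a rough sketch of that subtraction approach. The names (Overload, match_union) are mine, not pyanalyze’s actual API, and issubclass stands in for a real assignability check:

from dataclasses import dataclass

@dataclass
class Overload:
    param: type        # single-parameter signatures, for simplicity
    return_type: type

def match_union(overloads: list[Overload], components: set[type]) -> set[type] | None:
    # Try each overload in order; subtract the union components it accepts.
    remaining = set(components)
    returns: set[type] = set()
    for ov in overloads:
        matched = {c for c in remaining if issubclass(c, ov.param)}
        if matched:
            returns.add(ov.return_type)
            remaining -= matched
        if not remaining:
            return returns  # success: union of the matching overloads' returns
    return None  # some component matched no overload: the call is an error

print(match_union([Overload(int, int), Overload(str, str)], {int, str}))
# {<class 'int'>, <class 'str'>}, i.e. the result is int | str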

Step 4: This rule is new to me. It seems to me that it can only apply in cases where overloads overlap unsafely. For example, your Example 3 has unsafely overlapping overloads. What is the motivation for this rule?

Step 5: I haven’t fully wrapped my head around this rule, and the examples you gave seemed incorrect (I commented on the PR).

Argument type expansion: I think all of cases 1–4 should be covered, but case 5 need not be. The types bool and Literal[True] | Literal[False] are equivalent (they contain the same set of values), so it feels wrong if they behave differently. Similarly, type[A | B] and type[A] | type[B] are equivalent and should behave identically.
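
To make what’s at stake in case 5 concrete, a small hypothetical example:

from typing import Literal, overload

@overload
def f(x: Literal[True]) -> int: ...
@overload
def f(x: Literal[False]) -> str: ...
def f(x: bool) -> int | str:
    return 1 if x else "no"

def g(x: bool) -> None:
    # If case 5 expands bool to Literal[True] | Literal[False], this call
    # matches and evaluates to int | str; without expansion, neither
    # overload accepts a plain bool and the call is an error.
    reveal_type(f(x))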

Use of context: I’m not sure we should specify that type checkers should use context to evaluate a type. It should be allowed, for example, to evaluate {'x': 1} to an internal type that is assignable to both dict[str, int] and a TypedDict with x: int.
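
For illustration (XDict is a hypothetical name), this is the sort of thing being described:

from typing import TypedDict

class XDict(TypedDict):
    x: int

a: dict[str, int] = {'x': 1}  # OK
b: XDict = {'x': 1}           # also OK: the same dict display checks
                              # against both target types, even though
                              # dict[str, int] and XDict are incompatible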

The fact that type-checkers currently don’t do union expansion over constrained typevars (although they do for overloads) recently came up as a practical problem for standard library functions typed with AnyStr: Proposal: relax un-correlated constrained TypeVars
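
A sketch of the kind of call that trips over this today (concat is a hypothetical stand-in for functions like os.path.join):

from typing import AnyStr

def concat(a: AnyStr, b: AnyStr) -> AnyStr:
    return a + b

def use(x: str | bytes) -> None:
    # Without expansion over the constrained TypeVar, AnyStr can't solve to
    # str | bytes, so this is rejected even though both the all-str and the
    # all-bytes calls would succeed.
    concat(x, x)  # error in current type checkers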


There are likely many algorithmic variations that produce equivalent results. I think it’s important for us to document one algorithm (preferably the simplest one to explain), and individual type checkers can then choose to implement the “base” algorithm or an equivalent algorithm that is better optimized.

I think your algorithm is equivalent to what I documented as long as there are no overlapping overloads present. We could say that when overlapping overloads are present, the behavior is indeterminate. I’m OK with that stance. However, I see overlapping overloads appear frequently in libraries and stub files. (Search for # type: ignore[overload-overlap] in typeshed, and you’ll find more than 200 instances.) So it might be preferable not to leave the behavior indeterminate in this case.

This rule comes from mypy. I implemented it in pyright when I was attempting to match mypy’s behaviors. It is not well described in the mypy documentation for overloads. The phrase “taking into consideration both the argument types and arity” leaves some room for interpretation.

I don’t think this rule applies only where overloads overlap unsafely. It comes up in situations like the zip constructor in builtins.pyi. Similar patterns show up frequently in the pandas stubs.

from typing import Iterable

def func(a: list[Iterable[int]]):
    v1 = zip(*a)

    # Without Step 4, this evaluates to zip[tuple[Iterable[int]]]
    # With Step 4, this evaluates to zip[Any]
    reveal_type(v1)

Yes, my examples were incorrect. I’ll work on better examples.

“Need not be” or “should not be”? If we want consistent behavior from type checkers, I think we need to be more prescriptive in the spec. I’m a bit uncomfortable saying that tuples shouldn’t be expanded to their equivalent subtypes.

It’s interesting that you thought “it feels wrong if they behave differently” for bool but not for tuple. After all, tuple[int | str, int | str] is equivalent to tuple[int, int] | tuple[int, str] | tuple[str, int] | tuple[str, str], and the typing spec spells this out in the tuples chapter.
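
A hypothetical example where the outcome hinges on exactly that tuple expansion:

from typing import overload

@overload
def f(x: tuple[int, int]) -> int: ...
@overload
def f(x: tuple[int, str]) -> str: ...
@overload
def f(x: tuple[str, int]) -> str: ...
@overload
def f(x: tuple[str, str]) -> str: ...
def f(x: tuple[int | str, int | str]) -> int | str:
    return x[0]

def g(t: tuple[int | str, int | str]) -> None:
    # With tuple expansion, each of the four expanded components matches an
    # overload and the call evaluates to int | str; without it, no single
    # overload accepts tuple[int | str, int | str] and the call is an error.
    reveal_type(f(t))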

    Use of context: I’m not sure we should specify that type checkers should use context to evaluate a type. It should be allowed, for example, to evaluate {'x': 1} to an internal type that is assignable to both dict[str, int] and a TypedDict with x: int.

Hmm, allowing a type to be assigned to two different (incompatible) types doesn’t sound type safe to me. That would imply that one could assign the same value to both of those types, and that could lead to type violations.

I admit that I’m uneasy bringing up the topic of “context” in the overload chapter because there’s no other mention of it (currently) in the typing spec. However, without mentioning it here, I couldn’t figure out how to make the spec clear enough such that we can achieve consistent results between type checkers.

We can avoid leaving this indeterminate in at least 3 ways.

  1. Formalizing that overlapping overloads are allowed, and the first matching overload should be selected. I dislike this particular one, but it’s the simplest.
  2. Formalizing that all overloads are implicitly disjoint, so in the case of overlapping overloads, only non-overlapping use is described. This would be the most sound, but would likely be too disruptive.
  3. In the case where multiple overlapping overloads apply, the return type is determined based on the co-domain of the overloads as restricted by the actual observed usage. (This is mostly equivalent to the second option, but adds handling for the overlap when determining a type.)

Option 3 closely matches what Elixir is pursuing for its gradual type system and, of the options I’m aware of for this kind of resolution, should produce the fewest false positives.

I think the conceptually simplest understanding of overloads, from a set-theoretic type perspective, is as an intersection of callable types (that is, the overloaded function has all of the listed signatures). In this view, the natural handling of overlapping overloads is that the return type is the intersection of the return types of all overloads matching the call.

I think this might be what @mikeshardmind was describing in option 3; not sure.

That’s just theory, though – I haven’t done an audit of real-world cases where people write overlapping overloads to see how often this would result in a sensible/useful outcome.
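
A sketch of what that view would mean in practice (hypothetical classes; Python has no intersection syntax, so the result is described in a comment):

from typing import overload

class Readable: ...
class Writable: ...
class ReadWrite(Readable, Writable): ...

@overload
def f(x: bool) -> Writable: ...
@overload
def f(x: int) -> Readable: ...
def f(x: int) -> ReadWrite:
    return ReadWrite()

# f(True) matches both overloads, since bool is a subtype of int. Under the
# intersection view its type would be Writable & Readable (satisfied by
# ReadWrite); first-match semantics would instead pick Writable alone.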

Not exactly. With overloads, we don’t have an intersection of callable types; we have a selection of possible callable signatures, and one or more of them must be picked.

In the case of overlapping overloads, i.e. when there are multiple overload signatures with which a usage is consistent, there are a few possible interpretations.

  1. The overloads are not specific enough to pick from, so the result should be a union (not an intersection; only one code path is taken) of return types.
  2. The overloads are not specific enough to pick from, so the overloads themselves should be an error. (This breaks the gradual guarantee and doesn’t work well in a gradually typed codebase.)
  3. The overloads are not specific enough to pick from, so the result should be an intersection of the return types. (This reduces the number of false positives relative to 1, but increases the false negatives.)
  4. If we can observe the usage, we can examine the codomain (the set of possible return values), constrained by whatever we can determine (such as removing overloads that don’t apply); so long as the usage is consistent with the codomain, we don’t need a precise type.
  5. If we can’t make a decision, the type of the implementation should be used.
  6. If we can’t make a decision, Any should be used.

The above bit on codomains in option 3 most closely aligns with interpretation 4, but interpretation 1 is still effectively part of it when the return type escapes to another scope (such as in a return type).

Interpretation 3 is flawed for Python.

There’s a bit of a trick here: overloads are very similar to intersections of callables, but only in that the function can be used in any of those ways. When solving for a return type, it’s a dynamic selection of a union of return types, not an intersection. This can be seen with the many overloads where there is no possible intersection of return types: you can’t extract the return types as an intersection in most cases, and most overloads don’t actually involve code that synthesizes an intersection; they involve code that branches on types/values.
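
For example (convert is hypothetical), an overload whose return types share no common value, so only the union reading is usable:

from typing import overload

@overload
def convert(x: bytes) -> str: ...
@overload
def convert(x: str) -> bytes: ...
def convert(x: bytes | str) -> str | bytes:
    # Only one branch runs per call; str and bytes have no values in common,
    # so the intersection of the return types is uninhabited (Never), while
    # the dynamic selection naturally yields the union str | bytes.
    if isinstance(x, bytes):
        return x.decode()
    return x.encode()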

Interpretation 6 is currently in the proposal in one form, but interpretations 4 and 5 would introduce fewer false negatives.


Yeah, I think I misspoke. I meant to propose your (1): if overloads are an intersection of signatures, the return value should be the union of return types, where overloads that don’t match the call signature contribute Never to that union.
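
In other words, something like this hypothetical sketch:

from typing import overload

@overload
def f(x: int) -> int: ...
@overload
def f(x: str) -> str: ...
def f(x: int | str) -> int | str:
    return x

def g(x: int) -> None:
    # Under this reading, the str overload doesn't match an int argument and
    # contributes Never to the union; int | Never simplifies to int.
    reveal_type(f(x))  # int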
