Inference on Literal Types

I’m not sure in practice how easy it is to take this into account, though. I assume this is what Mike means by analysing the whole program graph, and it implies that on reaching an error the type checker should backtrack and see whether a compatible narrower type could have been chosen somewhere. As far as I know, neither mypy nor pyright ever does this: type propagation only applies in the forward direction. Apart from the runtime and complexity costs of backtracking, I imagine it would be harder for a programmer to understand, because later code would affect types retroactively and the location at which a type checker should report an error would be ambiguous.
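To make the forward-only limitation concrete, here is a minimal sketch (not from the thread): a backtracking checker could revisit the assignment once it sees the later append, but a forward-only checker fixes the element type at the point of assignment.

```python
# Sketch of forward-only inference: the element type of `xs` is fixed
# at the assignment, before any later use has been seen.
xs = [1]            # a checker may eagerly infer list[int] here
xs.append("a")      # ...and may then reject this line, even though choosing
                    # list[int | str] at the assignment would have been valid

# At runtime the code is fine, of course:
assert xs == [1, "a"]
```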

In general, using the inferred type to infer new types reduces the need for annotations and even makes it possible to do things that cannot be expressed with annotations, like:

class B:
    pass

class C(B):
    def call_c_method(self):
        pass

def func(b: B) -> B:
    if isinstance(b, C):
        a = [b] # list[C]
        a[0].call_c_method()
    else:
        a = [b] # list[B]
    return a[0]

In my experience this kind of inference is more useful and avoids a lot of boilerplate, like introducing new variables a1, a2, etc. just so they can have different types in each branch. Here pyright can fully understand func and infer and check all of the types without needing any annotations in the body of the function. On the other hand, there are no annotations that would satisfy mypy in this case: instead, the code has to be rewritten somehow.
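For comparison, here is a sketch of the kind of rewrite alluded to above (one plausible version, using the a1/a2 variable names from the text): each branch gets its own variable so each can carry its own inferred type.

```python
class B:
    pass

class C(B):
    def call_c_method(self) -> None:
        pass

def func(b: B) -> B:
    # Separate variables per branch, so each can have its own inferred type
    if isinstance(b, C):
        a1 = [b]  # list[C]
        a1[0].call_c_method()
        return a1[0]
    else:
        a2 = [b]  # list[B]
        return a2[0]
```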

I find this situation frequently when adding annotations in a large existing codebase: I can annotate function parameters and return values in a few places, and then pyright understands the bodies of functions by inferring the types of all local and intermediate objects. Annotating the signature of a function, though, causes mypy to start checking the body and then to require annotations or code changes.

The issue in this thread is really just that inference with Literal types is ambiguous, because they aren’t really intended to be nominal types like enums, whose type must be strictly preserved. Instead they are supposed to be literal values:

from typing import Literal, TypeAlias

Days: TypeAlias = Literal[0, 1, 2, 3, 4, 5, 6]

def next(day: Days) -> Days:
    return (day + 1) % 7
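At runtime the arithmetic here really does stay within the declared literal values, which is presumably why pyright’s value propagation accepts it; a quick exhaustive check:

```python
# Exhaustive check: (day + 1) % 7 maps every valid Days value
# back into the declared set {0, 1, 2, 3, 4, 5, 6}
days = range(7)
assert all((day + 1) % 7 in days for day in days)
```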

Simply that this is not about correctness but about usefulness. Since there is no spec, it is up to the preferences of the type checker, and the user needs to familiarize themselves with them.
As pointed out by Oscar, the pyright behavior does make sense in quite common scenarios. Ultimately I think this is a failure of the docs for not explaining the implications of having an incremental type system.

In a classically defined gradual type system, type checkers are not supposed to error on things that they cannot definitively say are errors.

  1. The issue above would not happen for pyright if pyright did the obvious thing
  2. There is a valid inference that does not require inserting Any
  3. pyright creates an error.

I would say that pyright is failing to be a well-behaved participant in gradual typing here then, and that this is still something that needs fixing.

pytype and pyre both use this approach (edit: it looks like pyre no longer does for lists here, at least not fully identically to the more permissive inference of pytype, which is only bound by knowns and possibilities), and both have better list inference, only erroring when a program is in an inconsistent state that requires conflicts, rather than eagerly picking something that might not be correct in context. This is absolutely possible to do; pyright just doesn’t, and, as we’ve all agreed, this is a place where there isn’t enough specification. The original post came from a user who was sent here after being told as much by pyright’s maintainer trying to improve the situation, so I’d hope we could figure out a way to improve this rather than just say “it’s not specified”.

Here is the last example I showed:

from typing import Literal, TypeAlias

Days: TypeAlias = Literal[0, 1, 2, 3, 4, 5, 6]

def next(day: Days) -> Days:
    return (day + 1) % 7

While pyright can accept this because it propagates the literal values, pytype, pyre and mypy do not accept it:

$ pytype days.py 
Computing dependencies
Analyzing 1 sources with 0 local dependencies
ninja: Entering directory `.pytype'
[1/1] check days
FAILED: .pytype/pyi/days.pyi 
.venv/bin/python3 -m pytype.main --imports_info .pytype/imports/days.imports --module-name days --platform linux -V 3.12 -o .pytype/pyi/days.pyi --analyze-annotated --nofail --quick days.py
days.py:6:5: error: in next: bad return type [bad-return-type]
           Expected: Literal[0, 1, 2, 3, 4, 5, 6]
  Actually returned: int

    return (day + 1) % 7
    ~~~~~~~~~~~~~~~~~~~~

For more details, see https://google.github.io/pytype/errors.html#bad-return-type
ninja: build stopped: subcommand failed.
Leaving directory '.pytype'

And pyre also fails:

$ pyre check
ƛ Found 1 type error!
tmp/days.py:6:4 Incompatible return type [7]: Expected `Union[typing_extensions.Literal[0], typing_extensions.Literal[1], typing_extensions.Literal[2], typing_extensions.Literal[3], typing_extensions.Literal[4], typing_extensions.Literal[5], typing_extensions.Literal[6]]` but got `int`.

I checked out pytype and pyre, which I hadn’t used before, but after a quick review I have decided not to use them, because neither of them handles __new__ properly. A first test reveals that pytype completely fails to understand __new__:

from __future__ import annotations

class A:
    def __new__(cls) -> A:
        return super().__new__(cls)

reveal_type(A()) # nothing

And pyre has the same bug as mypy, so apparently only pyright actually understands __new__.

This is an interesting example. It shows that the propagation of arithmetic on Literal values is useful.

On the other hand, my colleagues and I found that the propagation of operators done by pyright produces unexpected results:

from typing import Literal


def add_to_five(one: Literal[1], five: Literal[5]):
    x = one + five
    reveal_type(x)


def add_two_lits(one: Literal[2, False], five: Literal[5, True]):
    x = one + five
    reveal_type(x)


def add_two_strs(one: Literal["abc", "def"], two: Literal["ghi", "jkl"]):
    x = one + two
    reveal_type(x)

pyright produces:

  inferliteral2.py:6:17 - information: Type of "x" is "Literal[6]"
  inferliteral2.py:11:17 - information: Type of "x" is "Literal[7, 3, 5, 1]"
  inferliteral2.py:16:17 - information: Type of "x" is "Literal['abcghi', 'abcjkl', 'defghi', 'defjkl']"

mypy produces:

inferliteral2.py:6: note: Revealed type is "builtins.int"
inferliteral2.py:11: note: Revealed type is "builtins.int"
inferliteral2.py:16: note: Revealed type is "builtins.str"

My colleagues and I find the pyright result non-intuitive and unexpected.

If pyright had an option to treat a declared Literal as a first-class type, without inferring new Literal types from calculations done on declared Literal types, I think that would solve the issue I originally brought up.

Going back to something I said earlier, because I’d like to focus on how we can fix the user experience here: I think this needs specification. I don’t want to attempt to fully specify all inference right now; I think that would be unproductive. I do want to specify that:

  • Generator expressions should be treated, for the purposes of type variables, as equivalent to the inlined generator function.
  • Container comprehensions should be treated, for the purposes of type variables, as if the comprehension were passed as a generator to the constructor of that container type.
  • Type checkers are free to infer when no annotation exists for a value, but should respect annotations that do exist, even when they name a supertype and are not as specific, including in situations where this interacts with other inference based on that value.
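As a sketch of the first two points (using hypothetical sample data), the comprehension and its desugared forms below would, under this proposal, be solved identically for type variables:

```python
# Sample data, for illustration only
daylist = [0, 1, 2]
adict = {0: 1, 1: 2, 2: 0}

# The set comprehension...
s1 = {adict[x] for x in daylist}

# ...should solve type variables the same way as passing the equivalent
# generator expression to the constructor of the container type...
s2 = set(adict[x] for x in daylist)

# ...which in turn should behave like the inlined generator function:
def gen():
    for x in daylist:
        yield adict[x]

s3 = set(gen())

assert s1 == s2 == s3 == {0, 1, 2}
```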

I would like to make it clear that I don’t think the narrowed type should be completely ignored - just that the declared type should take priority.

class B:
    pass


class C(B):
    pass


p: B = C()
# At this point, p has declared type B and narrowed type C

# Both declared type and narrowed type should be available for later inferences,
# with the declared type taking priority.

# first try with declared type list[B],
# find that it's not valid,
# so use narrowed type list[C]
q: list[C] = [p]  # ok

mypy doesn’t accept this.
pyright accepts it by accident, not because it’s doing it correctly.

If I had to choose between these based on only this detail [1], I would choose mypy.


  1. I’m not choosing a type checker based on only this detail, that’s why “just use mypy” is not a solution. ↩︎


I think that’s reasonable, but I would want it to result in something like this (inlined type checker messages as comments):

p: B = C()  # "Note: Later use requires this be C"
q: list[C] = [p]

or

p: B = C()  
q: list[C] = [p]  # "Note: Using inferred type for p of C rather than declared type B"

to avoid this being surprising, while still being permissive when the program should be type-safe. I don’t think subverting what the user wrote without informing them serves the users who use annotations expecting them to actually mean something.


Can you say more about what you find unintuitive about the pyright results here? They seem correct and expected to me. Mypy’s results also look correct, but less precise than pyright’s.

The only thing I can see that looks possibly surprising here is that False and True are 0 and 1 in arithmetic with integers, but that’s an accurate reflection of how they behave at runtime.
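A quick runtime demonstration of that point, which also explains the combinations in pyright’s Literal[7, 3, 5, 1] result above:

```python
# bool is a subclass of int; False and True behave as 0 and 1 in arithmetic
assert isinstance(True, int)
assert 2 + 5 == 7          # Literal[2] + Literal[5]
assert 2 + True == 3       # Literal[2] + Literal[True]
assert False + 5 == 5      # Literal[False] + Literal[5]
assert False + True == 1   # Literal[False] + Literal[True]
```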


It’s because we think of Literal as being shorthand for an Enum. The non-intuitive aspect is that pyright created a new type in each case, saying that the type of the result is a new Literal type with a set of different values.
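A sketch of that mental model (the Days/next names follow the earlier example; the IntEnum version here is my illustration, not from the thread): if Days were an Enum, arithmetic would produce a plain int, and the nominal type would have to be restored explicitly.

```python
from enum import IntEnum

class Days(IntEnum):
    MON = 0
    TUE = 1
    WED = 2
    THU = 3
    FRI = 4
    SAT = 5
    SUN = 6

def next_day(day: Days) -> Days:
    # (day + 1) % 7 is a plain int, not a Days: the nominal type is not
    # preserved through arithmetic and must be restored explicitly
    return Days((day + 1) % 7)

assert next_day(Days.SUN) is Days.MON
```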

Ok. I think that’s an overly-restrictive understanding of the meaning of a Literal type, that isn’t supported by the typing specification. I don’t think there’s any issue with pyright’s behavior in this case.


I find it non-intuitive because the result isn’t a literal. It’s a value-constrained type (a refinement type), when the spec has no language saying that this is allowed (I don’t mind it being allowed, but arithmetic on literals producing new concrete literals isn’t specified behavior, as best I can tell). And at least for the string literal example, there is a separate, broader LiteralString type specifically for strings that are the result of composing string literals.


How does that apply to this example?

def func(b: B) -> B:
    if isinstance(b, C):
        a = [b] # list[C]
        a[0].call_c_method()
    else:
        a = [b] # list[B]
    return a[0]

There is an annotation on b but the narrowed type is needed to get the type of a correct in the first branch.


I agree that the specification doesn’t require inferring precise literal types as the result of operations on literal types, but I also don’t think it supports the strict reading that a literal type must always represent a value that is present in the source code as a literal (the spec says simply that “Literal types indicate that a variable has a specific and concrete value.”), and I don’t think this limitation would serve users well.

I also don’t think the existence of the LiteralString type is relevant here, as all Literal string types are a subtype of it, and its motivating use case is as a function parameter annotation to accept any literal string type.

# Days is a type defined somehow. It shouldn't matter how.


def myfunc(y: Days):
    print(y)


def iterate_this(daylist: list[Days], adict: dict[Days, Days]):
    for x in set([adict[x] for x in daylist]):
        myfunc(x)

It shouldn’t matter whether Days is a class or a Literal or a Union or whatever.
This code should type check the same no matter what Days is (as long as it’s something valid in a type annotation).

pyright is inconsistent because it treats it differently depending on what Days is.


This would be B in the outer scope of the function, from the annotation, and C inside the isinstance branch, due to the narrowing behavior of isinstance. Your comments match the types I would expect, and I would expect no type-checking errors.

That said, narrowing a value inside a mutable container is super unsafe: it can cause issues in code with threading (more so in the future with free threading), and even type-safety issues when used with TypeIs. I’d rather forbid narrowing values inside a mutable container.

I’ve brought this up before, and others have said they don’t think the average user would be served well by such a rule, but this can be rewritten more safely, and more obviously in terms of type impact, by extracting the value and then checking it.
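One possible form of that extract-then-check rewrite, sketched using the B/C classes from the earlier example: the narrowing applies to a local binding, not to the contents of the mutable container.

```python
class B:
    pass

class C(B):
    def call_c_method(self) -> None:
        pass

def process(items: list[B]) -> None:
    for item in items:
        # Narrow the extracted local, not the container's contents; the
        # narrowed type of `item` cannot be invalidated by a concurrent
        # mutation of `items`
        if isinstance(item, C):
            item.call_c_method()
```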

Agreed. I’d rather see a note that this is explicitly allowed, though, rather than it being an odd quirk, and have it interact better with developer intent (see the original motivating examples).

In the absence of a statement that type checkers may do string arithmetic on string literals and synthesize a new literal, and given the presence of a dedicated type for exactly that, it seems more in line with the specification to just return LiteralString for the string literal example. But I don’t have a reason to say that pyright is wrong here, where it doesn’t interact poorly with other developer-provided constraints. My primary issue with pyright’s literal handling is the places where it conflicts with other declared information without needing to.

The core problem in this thread (IMO) is that, with generics, an inferred type ends up implicitly treated like a declared type, in that it restricts further use of the generic container. If we have an unannotated x = [y], the user hasn’t clearly declared an intention for the type of x, and if we infer a precise type for the type parameter of x based on the type of y, that (wrongly, IMO) limits what the user can later add to x, even though the user hasn’t requested any such restriction.

I think that debating the exact details of this inference, and how it uses prior declared or inferred types of y, is unproductive, because it doesn’t address the core problem. No matter how this inference is done, it will cause false positives in some cases, because we are guessing about the author’s intent.

One way to address this class of problem is to analyze all uses of the container in the entire scope, construct a set of constraints based on that analysis, and then attempt to choose a type for its type parameter that satisfies all the constraints. The main downside of this approach, IMO (aside from the complexity and potential performance cost of implementing it), is that it leads to retroactive effects, where code much later in a function can change inferred types earlier in the function. I think that this kind of behavior is difficult to understand and debug, and should be avoided if possible.

Another approach, which isn’t used by any current type checker that I know of, but I think may lead to better behavior in practice, is to infer bounded gradual types in this case, based on the information available. So given x = [y], if the inferred type of y is Literal[1], I would infer x as list[Any | Literal[1]], meaning that its intended contained type is unknown but must be at least as large as Literal[1]. Further use of x could then make its type more precise: for example, if we see x.append(some_str), this is not a type error because the user has never declared that strings are outside the intended use of the list, but the type of x is now list[Any | Literal[1] | str].
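Annotated as comments, a small sketch of the hypothetical inference being proposed (no current type checker implements this; the hypothetical types appear only in comments, so the code itself is ordinary Python):

```python
x = [1]          # hypothetically inferred: list[Any | Literal[1]]
                 # (element type unknown, but at least Literal[1])

x.append("a")    # no error: the user never declared str out of bounds;
                 # the type would widen to list[Any | Literal[1] | str]

# Explicitly annotating restores strict checking, as the post notes:
z: list[int] = [1]
# z.append("a")  # this WOULD be an error: list[int] was declared

assert x == [1, "a"]
```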

Of course, if the author wants the type checker to enforce a particular type for the type parameter of x, it’s easy enough to do so by explicitly annotating x.

In many ways this leads to similar behavior as solving a system of constraints, except that we build up the constraints incrementally, and always in a forward direction, avoiding retroactive effects.

I think this leads to better behavior than attempting to eagerly infer a precise type, because it avoids false positives from trying to guess the user’s intent in the absence of clear annotations.

It would work fine for the motivating example in this thread. Note that soundness is preserved because of the lower bound on the gradual type; if the user tries to use x (or values from x) in a way that is not compatible with values that x is known to contain, this will still be a type error.


I don’t see this as guessing about the author’s intent when looking at generator expressions and comprehensions over loop variables.

Especially in the motivating example in the original post, it just clearly cannot be “guessing”: the expression is composed of a getitem into a dict with typed keys and values, from a list whose value type matches the key’s type, to construct a set.

This should clearly result in a set of the dict’s value type; there is no room for guessing here, as all of these are invariant container types.

You’re asking people to write non-idiomatic code to separate out the generator from the loop itself. This is a bad tradeoff that makes typing hard to use, and makes people resent it where that happens already. Why wouldn’t we want to define inference for these cases in a way that works for idiomatic Python code?

I am not. The behavior I’m proposing would work fine for the original example, as originally written, with no extra annotation.

I agree that pyright’s behavior in this case is not ideal, and I don’t like the “never infer a literal type for a type parameter” heuristic. I just don’t think that the right way to address this problem is to try to define ever-more-detailed inference rules based on specific examples; that’s going to lead to a never-ending game of whack-a-mole. I think it’s more productive to address the root cause, which is that (in general) inferring a precise type for the type parameter of an unannotated generic does require guessing author intent, and will lead to false positives.
