Type hints for bool vs int overload

mikeshardmind · January 21, 2025, 6:41pm

Except when you actually encode this correctly for the type system, and don’t rely on behavior that happens when you encode it incorrectly then ignore the error you get from it, here’s what happens:

Code sample in pyright playground

from typing import Protocol, override, Self, reveal_type

class Just[T](Protocol):
    @property
    def __class__(self) -> type[Self]: ...
    @__class__.setter
    def __class__(self, t: type[Self], /) -> None: ...


only_int_tn: Just[int] = 1  # accepted
only_int_tp: Just[int] = True  # accepted

jorenham · January 21, 2025, 6:44pm

Michael H:

Code sample in pyright playground

from typing import Protocol, override, Self, reveal_type

class Just[T](Protocol):
    @property
    def __class__(self) -> type[Self]: ...
    @__class__.setter
    def __class__(self, t: type[Self], /) -> None: ...


only_int_tn: Just[int] = 1  # accepted
only_int_tp: Just[int] = True  # accepted

This T@Just is covariant (bivariant even, as it’s redundant), so this makes sense.
If you wouild be able to define Just[T: Self], then you’d get the same behaviour as the original, invariant, Just.

mikeshardmind · January 21, 2025, 6:49pm

There’s multiple valid reason to not allow T: Self here which really wouldn’t be constructive to the discussion right now. The end result is that currently, there isn’t a way that is supported and not subject to issues of fragility and compatibility to spell what you want to spell here.

I don’t think the type system should allow this, but I’m sympathetic to what it’s supposed to be solving. I think the proper way to fix this is type negation, either automatic and implicit in overloads such that the first overload removes possibilities from latter ones, or explicit and user-expressible.

A way to spell “exactly this type” is likely to have issues where things that didnt use it are either erroneously compatible (See above examples of this when returned by other functions) or require everyone use it everywhere, breaking the intended use of subtyping, but could be accepted (As an actual propsoal in a supported form) even aware of this if there’s enough demand.

jorenham · January 21, 2025, 7:08pm

…

optype.Just[int] works perfectly fine in mypy and pyright at the moment, and I still don’t any reason why that shouldn’t be the case

Why not? It’s type-safe, very useful, intuitive, and easy to implement. I honestly don’t understand why ignored questionable error would make it a bad idea. By that logic, looking at the typeshed stubs for the builtins, we should also remove type, int, dict, set, etc.

There are many problems with overloads, e.g. overloaded methods don’t work with structural typing. As I’m not a big fan removing Protocol, I’d say we should fix @overload, and not lash out at everything that can be overloaded.

Huh?

There’s no subtyping here: it’s structural typing we’re talking about.

Are you suggesting that we should PEP it or something?

MegaIng · January 21, 2025, 7:26pm

It is not type safe:

@overload
def f(a: bool) -> A: ...
@overload
def f(a: Just[int]) -> B: ...

def f(a: bool | Just[int]) -> A | B:
    if isinstance(a, bool):
        return A()
    return B()


def func(arg: int) -> B:
    value: Just[int] = arg
    res: B = f(value)
    return res

outer: int = True

b: B = func(outer)

assert isinstance(b, B) # fails at runtime, but static type checkers can't see this

I would describe the above program as having a static typing issue. Where is it? Even if you don’t want it to be, bool is currently a subtype of int. Which of the above statements are incorrect using the rule for typing so that usage of Just is type-safe?

mikeshardmind · January 21, 2025, 8:38pm

There’s no subtyping in Just. Structural types do participate in subtyping, and what you want here breaks intended uses of subtyping in a type system.

You can see this with return types in examples above that are broken even on the type checkers you say this works with, which is the same problem I was referring to with

You can find the examples of that up here Type hints for bool vs int overload - #26 by MegaIng

What would solve this for overloads in a type safe way is more complex, because we actually want the variance here to remain by parameter type in the functions being overloaded, and not attempt to make classes generic over themselves, while also not allowing subclasses (making Self invariant would require no subclasses, and no subtyping)

A lot of the problems in the type system right now are actually because of such inaccuracies, it’s unwise to build things that rely on those inaccuracies remaining inaccurate.

carljm · January 21, 2025, 10:42pm

It’s often helpful to think about types in terms of the sets of possible runtime objects a type represents.

The Just[int] type is supposed to represent “all instances of the int type, but no instance of any subclass of int, including instances of bool.”

This means that the type int must not be assignable to the type Just[int], because the type int is a wider type (includes more objects). That’s the source of the unsoundness shown by @MegaIng above; the line value: Just[int] = arg (where arg: int) should be a type error, but it is not.

If Just[int] were a consistent/sound implementation of an exact integer type, the only things you could assign to a variable typed as Just[int] (or pass to a parameter typed as Just[int]) would be other expressions also typed as Just[int], and integer literals. This is what @mikeshardmind meant above about an exact type “requiring everyone to use it everywhere.”

jorenham · January 22, 2025, 5:22pm

Cornelius Krupp:

It is not type safe:
@overload
def f(a: bool) -> A: ...
@overload
def f(a: Just[int]) -> B: ...

def f(a: bool | Just[int]) -> A | B:
    if isinstance(a, bool):
        return A()
    return B()


def func(arg: int) -> B:
    value: Just[int] = arg
    res: B = f(value)
    return res

outer: int = True

b: B = func(outer)

assert isinstance(b, B) # fails at runtime, but static type checkers can't see this
I would describe the above program as having a static typing issue. Where is it? Even if you don’t want it to be, bool is currently a subtype of int. Which of the above statements are incorrect using the rule for typing so that usage of Just is type-safe?

Like I said before, I’m pretty convinced that the overloads are behaving incorrectly here, and are what’s causing this type-unsafe situation.

If overloads aren’t used incorrectly, as has been done in this example, then Just is type-safe.

jorenham · January 22, 2025, 5:36pm

You keep saying that, but I still fail to see why you think that. As I’ve explained before, there is no reason why T@Just would be incompatible with Self@Just.

Yes, and as I’ve pointed out several times now, this is because the overloads are used incorrectly here, which shouldn’t be allowed.

See my previous answer

Let’s not make this a discussion about the details of overloads, because that will get us nowhere. Overloads are complex, and their specification isn’t complete. So it’s not useful to use overloads as an example of “why X doesn’t work”, because that’s unfalsifiable. This is, in fact, why I feel like we’re going in circles now.

I believe I proposed a valid solution to this problem. I don’t mind discussing the details of this Just type, and will welcome anyone to do so, but I don’t think that this is the right place for this. We could continue this conversation at e.g. GitHub · Where software is built. I’m open-minded when it comes to alternative solutions, and am even willing to consider writing a PEP if needed. But just know that by nature, I’m a wee bit neurotic, so it might help to keep the conversation constructive, and oriented towards the common goal that we all share.

Anyway, I love you all, and I’m gonna grab myself something to eat before the typing meetup. I’ll try to respond to the remaining comments after that.

MegaIng · January 22, 2025, 5:50pm

Ok, let me construct an example that doesn’t use overloads although that means it’s a bit more contrived:

def foo(a: Just[int]) -> int:
    assert type(a) == int # this should always be true, that is the promise of the `Just` type
    return ~a # An operation that has been considered to emit a warning for bools

def bar(b: int) -> int:
    c: Just[int] = b
    d: int = foo(c)
    return d

bar(True) # triggers the assert

Where does this code violate type safety? You can’t blame overloads anymore.

jorenham · January 22, 2025, 7:39pm

Cornelius Krupp:

def foo(a: Just[int]) -> int:
    assert type(a) == int # this should always be true, that is the promise of the `Just` type
    return ~a # An operation that has been considered to emit a warning for bools

def bar(b: int) -> int:
    c: Just[int] = b
    d: int = foo(c)
    return d

bar(True) # triggers the assert

Where does this code violate type safety? You can’t blame overloads anymore.

That’s certainly a valid point.

In a way, I can see how c: Just[int] = b can be considered valid if we have b: int, since b.__class__.fget: () -> type[int] and b.__class__.fset: (type[int]) -> None are both satisfied.

But on the other hand, b: int itself can be considered covariant, so at runtime it might also assume e.g. True. This means that it’s also possible that b.__class__.fget: () -> type[bool] and b.__class__.fset: (type[bool]) -> None. So in other words, b: int can be viewed as b: int | bool, since int is equivalent to int | bool. And as it turns out, mypy will correctly reject it if we change b: int to b: int | bool. But unfortunately, pyright won’t:

def takes_int(x: int) -> Just[int]:
    y: Just[int] = x  # mypy: no error / pyright: no error
    return y

def takes_int_or_bool(x: int | bool) -> Just[int]:
    y: Just[int] = x  # mypy: error [assignment], pyright: no error
    return y

The fact that mypy rejects Just[int] = b when b: bool | int, but not when b: int, even though these are supposed to be equivalent, looks to me like we’re within unexplored typing territory here, of which the laws aren’t yet fully understood by the type-checkers.

I wasn’t yet aware of this problem, so thanks for showing me — I’ll raise it with mypy and pyright .

MegaIng · January 22, 2025, 7:52pm

Ok, and now that you are on the same page as us, please reread the rest of the thread following your original message.

Part of the point is that even conceptually Just[int] is annoying because of how infective it is, require adoption of it by all downstream typing users since APIs using Just[int] can’t correctly be used if all you have is int (which is going to be far more common).

Also, consider foo(c * 2). Is this valid? How does the static type checker now the result is still Just[int]?

jorenham · January 22, 2025, 7:59pm

Yes exactly! There are some cases where this doesn’t apply unfortunately. For example, consider the overloaded function (A) -> X, (B) -> Y, (A | B) -> Z (which has overlapping overloads with incompatible return types). If you pass it a value of type A | B, then both mypy and pyright infer its return type as Z. This is strange, because the first two overloads already cover the set of possible runtime objects, i.e. {a1, a2, ...} | {b1, b2, b3, ...}. But even so, we landed on the third overload, as that’s the only one that returns Z.

Correct. More formally, you could describe it as x: Just[T] iff. x.__class__ is T. Or at least, that’s how I expect it to behave.

You’re absolutely right. And I don’t (completely) understand why type-checkers don’t (always) accept it either way. And I believe that this is a problem with the type-checkers, and not a problem with Just itself.

That’s indeed the conclusion if a-priori assume that type-checkers are always correct. But in this case, I don’t see why that would be true.

jorenham · January 22, 2025, 8:13pm

There’s no need for that

This, again, assumes that type-checkers are behaving correctly. I don’t think that they are. If this were to be fixed, then this wouldn’t be needed, and you could use Just just once.

I assume that c: Just[int] here? In that case, that’d be valid iff. int.__add__: (Just[int]) -> Self. But currently, this isn’t the case. And because of (another) issue with overloads, we can’t add this as an overload to int.__add__, because that would cause int to not be accepted anymore by Protocols with __add__: (int) -> int method (see pyright#9663).

Now let me ask you a question:
Do you believe that this should be accepted (as pyright does), or rejected (as mypy does)?

def main(x: int | bool) -> Just[int]:
    return x

And should that then also be the case if we change x: int | bool to x: int?

MegaIng · January 22, 2025, 8:17pm

Since you are explicitly asking me: The Just type itself should be rejected, as pyright does. You are intentionally ignoring error messages and are then surprised that the behavior isn’t consistent.

I agree that there are usecases for such a type, but I don’t believe using poorly documented static type checker edge cases (which you are doing) is a viable long term strategy. If you care about this, write a typing documentation page, clarifying the behavior. And that will then give you the ability to control this behavior. You will need buyin from the type checker authors, and I am not sure for which parts exactly you will get one.

jorenham · January 22, 2025, 8:21pm

It’s only pyright that reports an error when defining Just, FYI.
But let’s say that wouldn’t be the case, and for example it wouldn’t require a # pyright: ignore in the next pyright version, then what would your answer be?

MegaIng · January 22, 2025, 8:25pm

That depends on what the actually defined behavior is that was assigned to this construct. It is currently undefined behavior. You can’t just waive away the error with “well, what if the error wasn’t there”. The error in this case means that the type checker doesn’t like it and isn’t sure what to do with it.

I am not sure what it should do. I think Just needs to be a special form, I don’t believe you can get sensible semantics without it. If it were a special form, it should probably be rejected, and so should x: Just[int] = y.

jorenham · January 22, 2025, 8:33pm

That’s why I’m asking you how you would define it.

I have explained why I think it’s a false positive, so I don’t see what’s wrong with this.

I don’t see any need for a special form, but I guess I’ll take this for an answer .

So this means that we fully agree on how Just should behave, but disagree on whether my current implementation of Just is valid. Did I get that right?

erictraut · January 22, 2025, 9:19pm

There is currently no way in the type system to write an annotation that means “the set of all values that are instances of X but not instances of subclasses of X”. You can effectively do this if X is marked @final, which precludes the possibility of subclassing, but this isn’t possible for non-final classes like int.

Your proposed Just protocol is clever, but it makes assumptions that are tenuous because it relies on an unsound type definition (an LSP violation), which leads to type checker behaviors that are undefined. Use it at your own risk.

As @mikeshardmind said, expressing the concept of “an instance of X but not any of its subclasses” would require a new mechanism in the type system that doesn’t exist today.

This is something we could add if there’s sufficient need. I’ve heard this need expressed on a few occasions, but it’s relatively rare. Code that honors OOP principles shouldn’t check for a specific class X but should rather use isinstance and issubclass to determine the compatibility of an object or class. I understand that some existing APIs break this rule and use is rather than isinstance or issubclass, but I think this should be discouraged.

The annotation int | bool describes exactly the same type as the annotation int. The two types are equivalent. If a type checker treats them differently, that’s a clear indication of a bug in the type checker. In this case, pyright treats the two types as equivalent and mypy incorrectly treats them differently, so this is a bug in mypy. That said, I appreciate that detecting equivalency between types is difficult and costly (prohibitively so in some cases). If you search hard enough, I’m sure you would find cases where pyright’s behavior differs between two equivalent types.

You’ve pointed out a case where the overload resolution algorithm appears to treat arguments of type int | bool and int differently. In that example, however, the overloads are overlapping in an unsound way. If you decide to ignore this unsoundness, then uses of that overload may be unsound and produce inconsistent results. I don’t consider that a type checker bug.

The typing spec intentionally doesn’t mandate type checker behaviors based on unsound type definitions. Instead, it mandates that type checkers flag unsound type definitions and relies on users to fix those issues. If you create unsound type definitions (e.g. overrides that violate the LSP or overlapping overloads that have different return types) and ignore the unsoundness reported by the type checker, you are on shaky ground. Any type checking behaviors that you rely on after that point are type-checker-dependent and could change over time.

With your Just class, you’ve created an unsound override of the __class__ symbol — one that violates the LSP, but you’ve chosen to ignore the unsoundness that pyright has reported.

You mentioned that overloads are complex and currently underspecified, which is true. In case you weren’t aware, there’s an effort underway to improve the spec in this area. This draft addition to the typing spec is nearing completion, and it defines behaviors for overloaded call evaluation and overlapping overload detection, among other things.

carljm · January 22, 2025, 9:22pm

No, I don’t think there are any cases where this doesn’t apply. There may be cases where type checkers don’t correctly apply it, though I don’t think the example below is one of those cases.

This is just an invalid overload definition, as type checkers will tell you, and as you point out yourself: it has overlapping overloads with incompatible return types. If it were a valid overload (e.g. if you replace the Z return type with X | Y instead, which makes it valid), then there is no longer anything odd about the behavior. If we have precise information that the argument is within the set A, then we know the return type must be X; if we know for sure the argument inhabits B, we know the return type is Y, otherwise all we can say is that the return value must be somewhere in the set X | Y.