Clarifying the float/int/complex special case

Just going to say it one more time and then I’m bowing out, because at this point we’re just talking past each other

int is currently treated as a subtype of float by all type checkers. Regardless of your opinion on that, it is the status quo and changing that is a major backwards compatibility break that would affect millions of lines of code

The Subtype relationships section of PEP 483 gives 3 examples:

  • Dog is a subtype of Animal
  • int is a subtype of float
  • List[int] is not a subtype of List[float] (because List is invariant)

and has this code example

lucky_number = 3.14    # type: float
lucky_number = 42      # Safe
lucky_number * 2       # This works
lucky_number << 5      # Fails

unlucky_number = 13    # type: int
unlucky_number << 5    # This works
unlucky_number = 2.72  # Unsafe

The numeric tower section of PEP 484 says:

PEP 3141 defines Python’s numeric tower, and the stdlib module numbers implements the corresponding ABCs (Number, Complex, Real, Rational and Integral). There are some issues with these ABCs, but the built-in concrete numeric classes complex, float and int are ubiquitous (especially the latter two :-).

Rather than requiring that users write import numbers and then use numbers.Float etc., this PEP proposes a straightforward shortcut that is almost as effective: when an argument is annotated as having type float, an argument of type int is acceptable; similar, for an argument annotated as having type complex, arguments of type float or int are acceptable. This does not handle classes implementing the corresponding ABCs or the fractions.Fraction class, but we believe those use cases are exceedingly rare.

Both assigning an int to a variable annotated as float and passing an int as an argument to a function parameter annotated as float are unambiguously stated as allowed behavior.
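
For concreteness, both allowed behaviors look like this (a minimal sketch; the function name is illustrative):

def takes_float(x: float) -> float:
    return x * 2.0

value: float = 3      # assigning an int to a variable annotated as float: allowed
takes_float(42)       # passing an int for a parameter annotated as float: allowed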

The discussion was started here because one of the people working on implementing a type checker correctly pointed out that the current language in the typing specification does not clearly and unambiguously state how allowing those behaviors should be implemented.

Changing what behavior should be allowed is a different topic

3 Likes

I believe that most people agree that the current promotion rules were a mistake. But unfortunately, it looks like we’re going to have to play by these suboptimal rules. We’re here to figure out how to do that in the least painful way.

In my previous comment, I have shown that, given the rules that we have, it will be possible to emulate the rules that we want. Dismissing this because you don’t like the current rules is not helpful. It introduces unnecessary negativity into this discussion, lowering the chances of us being able to reach consensus.

6 Likes

I don’t see any reason why we shouldn’t just fix the mistake. We don’t have to stick with these rules. Presupposing that we do implies we should never accept any typing feature until it’s proven “right”, since we would never be allowed to fix blatant errors afterward.

We have a lot of real use cases that are hurt by this, @oscarbenjamin brought up the impacts on array libraries in the competing thread.

Again, this requires everyone to use that approach for it to work. The moment anyone doesn’t, interaction breaks. If we have that level of disruption either way, why not go with the option that actually fixes the underlying problem?

2 Likes

I believe that backwards compatibility is the main reason for this.

But don’t get me wrong; I completely agree with you that we should actually fix this, and any other bug for that matter. It’s just that I’m afraid that the majority disagrees with us.

If we’re not talking about literals, then yes, that would indeed be pretty annoying for downstream users of libraries that would use this. But there will be other situations where it would be a good workaround.

Because I haven’t seen any reason to think it’s an option. If it were, I’d be first in line cheering for it.

I realized there’s a case where the proposed change doesn’t work well: NewTypes.

It is currently possible to write a NewType like this:

MyFloat = NewType("MyFloat", float)

The spec says that:

  • The base type of a NewType is a type expression
  • The base type of a NewType must be a “proper class”. This is not defined but seems to mean something that can be used as a base class.
  • (This proposal) In a type expression, float means the union float | int.

Those things imply that type checkers should reject a NewType with float as the base type: float represents a union, and a union is not an acceptable base type for a NewType. But of course, that’s not a very satisfactory solution, since such NewTypes work today.

Pyright, which already basically implements the proposed spec change, handles such NewTypes unsoundly (playground link):

from typing import NewType

MyFloat = NewType("MyFloat", float)

def f() -> None:
    f = MyFloat(1)
    if isinstance(f, int):  # pyright error: Unnecessary isinstance call; "MyFloat" is never an instance of "int"
        pass

A few possible solutions:

  • Specify that the base type of a NewType is not a type expression, but more like a base class. This is consistent with the way NewType is specified (it’s as if you create a child class), but it would mean that NewTypes over float would not accept ints as arguments; in the above example, type checkers should reject the call MyFloat(1). This feels a bit inconsistent and would likely break some users.
  • Allow NewTypes to have arbitrary other types as base types, including unions (see the sketch after this list). I like this solution as a useful generalization, and I have ideas for how to integrate it into the type system, but it might imply large changes to how existing type checkers implement NewTypes.
  • Add some special case specifically for NewTypes over floats, both in the spec and in type checker implementations. This is unsatisfying since a goal of this proposed change is to make the special case more self-contained.
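
For concreteness, here is a hypothetical sketch of the second option above. This is not valid under the current spec (which requires a proper class as the base type) and type checkers reject it today, although the runtime NewType object does not validate its second argument; the names are illustrative only.

from typing import NewType

# Hypothetical: a NewType over a union, as the second option would allow.
Number = NewType("Number", float | int)

def scale(x: Number) -> Number:
    # Under the generalized rule, Number would wrap any float or int value.
    return Number(x * 2)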

For reference here are some public repos using NewTypes over floats: grep.app link.

You’re going to keep finding more and more places where this doesn’t work well and creates inconsistencies within the type system, because the underlying issue is trying to shove two inconsistent things together and then hide the negative effects of doing so.

Here’s a running list of the ones we already know of, and it keeps growing:

  1. NewType
  2. discrepancy with isinstance
  3. discrepancy between isinstance and cast
  4. discrepancy between isinstance and TypeIs
  5. discrepancies in overloads

The time and effort would be more productively spent removing the special case than endlessly chasing all of the places it creates problems even after an attempt at scoping it narrowly.

It also has to be reconsidered for possible interactions with every new type system addition.

Here’s a question: how should float work with a TypeForm? The proposal here means it should be treated as the union, and this will change the behavior for libraries like pydantic when they adopt TypeForm, possibly in a way that breaks their users if they stay specification compliant.
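
To make the TypeForm question concrete, here is a hedged sketch. It assumes the draft TypeForm from PEP 747 (available experimentally in recent versions of typing_extensions); parse is a hypothetical pydantic-style validator, not a real API.

from typing import TypeVar
from typing_extensions import TypeForm  # draft PEP 747 feature

T = TypeVar("T")

def parse(tp: TypeForm[T], value: object) -> T:
    # Hypothetical validator stub: would convert `value` to the form
    # described by `tp`.
    ...

x = parse(float, "3.14")
# If `float` in a type expression means `float | int`, is x typed as float
# or as float | int? A validator that follows the runtime types could
# return an int here even though callers may expect a float.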

6 Likes

I stopped commenting in this thread because people pointed out (reasonably) that arguing against the special case is off-topic. I’m coming back now to express myself differently: if we keep the special case then the proposal here is the wrong way to try to reduce the inconsistencies. This is a move in the wrong direction whichever way you look at it.

The overloads shown are correct because that is simply the reality of real Python code. Under the proposal here how should the annotations be written?
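
For readers who haven’t followed the other thread, the overloaded converter under discussion looks roughly like this (a reconstruction; the names Int, Float and convert are illustrative, modeled on converters such as sympy.sympify that map Python ints and floats to distinct library types):

from typing import overload

class Int: ...
class Float: ...

@overload
def convert(x: int) -> Int: ...
@overload
def convert(x: float) -> Float: ...
def convert(x: int | float) -> Int | Float:
    # The runtime dispatches on the concrete type: an int never produces
    # a Float and a float never produces an Int.
    return Int() if isinstance(x, int) else Float()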

The proposal here is trying to increase consistency between type checkers and reduce internal inconsistency inside the type checkers. In exchange, it is increasing inconsistency between the type checkers and the type system on the one hand, and the external reality of real Python code on the other.

It is a basic fact that the types int and float are distinct and are mapped to different types in the foundational libraries that are used as the basis for all scientific, mathematical, ML etc code in Python. We need a type system that is compatible with that.

Yes, it is possible in combination with these overloads to construct code that returns one type at runtime but that is inferred by mypy to return another:

def f(x: float) -> Float:
    return convert(x)

reveal_type(f(1)) # Float

That problem is caused by the special case, though, and if you want to resolve it then the way to do so is to remove the special case. If we keep the special case then I think this is a problem that just has to be accepted. Here mypy infers the wrong type, but saying that float means float | int does not provide any way to make this correct either, given the runtime implementation of convert.

Examples of the convert function that I showed above exist in many places. They predate type annotations and are very widely used today and there is nothing wrong with them because it is entirely reasonable to map int and float to different types. We need a type system that is compatible with the convert function.

The special case forces the existence of inconsistencies somewhere. If you don’t like that then let’s remove the special case. If you don’t want to remove the special case then we need a better place to put the inconsistency rather than making the type system incompatible with real Python code.

1 Like

Thanks for sharing your perspective; it’s useful feedback!

My motivation for all this is that the spec currently says:

Python’s numeric types complex, float and int are not subtypes of each other, but to support common use cases, the type system contains a straightforward shortcut: when an argument is annotated as having type float, an argument of type int is acceptable; similar, for an argument annotated as having type complex, arguments of type float or int are acceptable.

But this wording is vague and does not describe how any type checker actually implements the special case (see discussion here). The special case does not just apply to function arguments, and it must have implications for isinstance() narrowing, not just for assignability, because the special case necessarily implies that isinstance() doesn’t behave in the same way as assignability to the float type.

So perhaps we should specify mypy’s behavior. But mypy’s behavior here is complex, hard to describe, and still subject to changes (for example, a change to its behavior was merged just last week). Concretely, mypy’s treatment of isinstance() on float is currently unsound (examples in `int` inconsistently considered a subtype of `float` · Issue #17223 · python/mypy · GitHub), and similarly it treats overloads with float and int unsoundly (as mentioned in some of your previous posts). Mypy treats float | int and float as separate types, at least sometimes, but these two types contain the same values.

The change I proposed, treating float as if it means float | int, does not compromise soundness: if type checkers implement unions soundly in general, then this rule does not add any new cases of unsoundness. It does compromise expressiveness, because it means there is no way to express the runtime type “float”, but that seems like a necessary consequence of the special case and the basic idea that two types that contain the same values are the same. It’s definitely not the most elegant part of the Python type system, no matter how we express the special case.
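
As a small illustration of that expressiveness gap (a sketch of the proposed semantics, not any checker’s current output):

def wants_a_real_float(x: float) -> None:
    # Under the proposal this annotation reads as float | int, so the body
    # has to be prepared to receive an int.
    ...

wants_a_real_float(1)   # accepted; there is no annotation that means
                        # "instances of float only, ints rejected"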

I see lots of claims that the proposed change leads to inconsistencies or discrepancies. I don’t see that (other than the issue with NewType I brought up above); the proposed rule is simple and applies throughout the type system. Any way to implement the float/int special case will lead to surprising outcomes, and it is clear to me that the proposed rule overall leads to more consistent outcomes than any alternative I have seen that preserves the special case.

We’re more than 60 posts into this thread, and I think the other thread (about removing the special case) is even longer. I don’t know if what I’m saying is going to persuade anyone, though hopefully it clarifies why I am pushing for this change. If anyone reading this wants to specify the float/int special case in some other way, I suggest you come up with a proposed wording in the spec that encapsulates how you want things to work.

4 Likes

You can define consistency in specific ways to make this true, but I think that despite the continuous updates to its heuristics, and the bugs in it, mypy’s current behavior makes its type checking of float more consistent with the actual runtime than the proposed change would be. The proposed change makes the type system more internally consistent, with some edge cases, but amplifies the places where it disagrees with the actual runtime compared to what mypy users currently have.

I personally view Python typing as informative and descriptive, not prescriptive, and believe that the moment something can be more accurately expressed, it should be, because that improves things for users. These changes are a step backward there: even though both the current state and the proposed change describe the situation incorrectly, the status quo allows more room for type checkers to have heuristic rules that more accurately reflect reality.

I don’t think formalizing this in a way that pushes type checkers into a corner, unable to even try to do better for users while remaining specification compliant, is a good goal.

A lot of the recent activity in the other thread is agreement that the current special case is bad, but with no clear path forward because there is no documented policy for making breaking changes in typing. The CPython breaking-change policies don’t apply here because it doesn’t break the runtime or change anything at the language level at all. Despite multiple queries about this from multiple users in both threads, no typing council member has responded directly to any question about what the process is for breaking changes in the specification.

3 Likes

The unsoundness is that all the foundational numeric libraries that define the basic numeric types in the Python ecosystem will have to use type: ignore throughout their basic public API. You didn’t answer the question about how to write the annotations for convert presumably because the answer is just that this proposal makes it impossible to do correctly.

For me it is unfortunate that mypy allows assigning an int to a float, because type checkers would be more useful to me if they could pick up on cases where that is done incorrectly. That just means that I have to be careful about using floats as needed, which was already the case before type checkers came along.

From my perspective, though, mypy handles float as I need it to in all the other situations, like overloads and so on, whereas pyright does not. The proposal here is to make mypy behave like pyright, but I would rather have pyright behave like mypy.

I am sure that type checker authors will have much better knowledge than me of the internal details and the edge cases where mypy’s current behaviour is murky and very awkward to maintain. I suspect that the various heuristics that mypy uses have evolved over time based on the reality of Python code though.

Would it be problematic to specify that the special case says just that int is assignable to float but not that float should be handled differently in any other way?

I’m sure that there are plenty more details needed to flesh that out in a specification but it is basically the impression I have of mypy’s current behaviour and it is more useful than what pyright does.

Yes, sorry for missing that. The type system does not allow annotating this function in the way you want.

That would interact very poorly with the behavior of isinstance. If int is assignable to float but we don’t otherwise handle float specially, type checkers will think this code is unreachable:

def func(x: float):
    if isinstance(x, int): ...
func(1)

Mypy patched some cases like this, but not others; I linked some examples in my post above.

Also, a set of overloads like this:

from typing import overload

@overload
def f(x: int) -> str: ...
@overload
def f(x: float) -> bytes: ...

should still be rejected by type checkers if we merely say that int is assignable to float, because the overloads overlap unsafely: an int argument matches both overloads, yet they declare incompatible return types. I feel the fact that mypy doesn’t reject these overloads is a bug.

I think you may be conflating two issues here. If I understand you correctly, you are taking issue with pyright’s reporting of overlapping overloads. This is independent of the handling of float and complex. There is currently no standard for what constitutes an “overlapping overload”. I proposed a standard definition based on type theory. This is what pyright uses in its implementation. We failed to reach consensus on this, so I retracted my proposal. Mypy’s overlapping overload check is currently implemented as a collection of (largely undocumented) heuristics that flag some but not all cases that are unsound.

Maybe a heuristic-based approach is preferable here. It sounds like you prefer it. But heuristics are hard to standardize because the dividing line between an “acceptable overlapping overload” and an “unacceptable overlapping overload” is subjective if it’s not grounded in type theory. As Jelle indicates above, the fact that mypy doesn’t report an overlapping overload in this case doesn’t mean that it’s sound. It just means that it’s allowing you to ignore some unsound overloads. Maybe this is OK because this particular unsoundness doesn’t contribute to many bugs in real-world usage?

In any case, I think the “overlapping overload” issue is unrelated to the proposed clarification to the spec about float and complex. With Jelle’s proposed clarification, mypy is still free to implement overlapping overloads any way it sees fit because that behavior isn’t covered by the typing spec.

Other than overlapping overload reporting, are there any other situations where you prefer mypy’s handling of float and complex over pyright’s?

I agree that heuristics are hard to standardize, but I disagree with even trying to standardize this further. It’s frustrating to try to get to the heart of the matter because all the arguments about this not following from theory ignore that the special case itself is problematic. This comes up for various reasons. @Liz’s list above highlights that the cases where it comes up are primarily places where there are runtime differences and people try to use what should be the appropriate tools to inform the type checker, and places where the type expression is used both statically and at runtime.

Even just shifting this to “in a type expression, float means float | int” continues to have holes in it, because even if the type system itself appears more consistent, it isn’t consistent with the language the type system is meant to model.

This causes real problems for numeric libraries, which you acknowledged as a kind of library that really needs help with improving typing, though if I recall correctly, you were at the time looking into modeling array shapes. TensorFlow will crash in some cases when given an int instead of a float.

You ended up retracting this before I took the time to get into the analysis I’m using for intersections, but there’s a better path forward here. I’d be happy to revive this with a more expressive set of rules that remains grounded in theory and may be easier to reach consensus on, as it is also more compatible with people’s existing overloads. However, I do agree with you that this can be resolved separately from the int-float case, including with a sound interpretation of what overlapping overloads should mean that does not require rejecting them.[1]


  1. Placing this in a footnote for anyone interested, but further discussion should be elsewhere. Treating overloads as a set of functions whose domains map inputs to outputs, a call whose argument falls within the domains of multiple overloads is simply typed as the union of those overloads’ return types (sketched below). This interpretation is grounded in theory and allows overlapping overloads while still modeling the safe interface of the resulting call accurately ↩︎
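
A minimal sketch of the interpretation described in the footnote (an illustration only, not current type checker behavior):

from typing import overload

@overload
def f(x: int) -> str: ...
@overload
def f(x: float) -> bytes: ...
def f(x: float) -> str | bytes:
    return "s" if isinstance(x, int) else b"b"

# Under this reading, an int argument falls within the domain of both
# overloads (int is accepted where float is expected), so the call f(1)
# would be typed as str | bytes instead of being reported as an unsafe
# overlap.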

It’s also a breaking change in the semantic meaning, which was the argument some people implicitly made against just removing the special case. Maybe the crater is smaller short term, but people pushing for this are glossing over that too.

Does the type system not allow it?

Currently mypy allows the convert function as shown but pyright does not.

I haven’t read all the typing specification but I’ve read the part you quoted in the OP and I’ve read your proposed changes to that specification. As far as I can tell the proposal here is to add the text to the specification that would disallow annotating the function in that way.

It is not just a case that I happen to “want” to annotate the function that way. Those overloads are simply a reflection of a lot of real code. We need to have a type system that can express that.

Ideally checkers would error at func(1). The special case literally says that type checkers should allow that, so they do, and that means the bug in this case is missed. I consider that unfortunate, but I think it is better to limit the blast radius of the special case right there: it simply means that func(1) is not recognised as a bug.

I consider this to be a poor interaction with isinstance:

def func(x: float | str) -> str:
    if isinstance(x, float):
        return str(x)
    else:
        return x # pyright errors here

On what basis do you say that they overlap at all, let alone unsafely? There is no overlap between the int and float types at runtime. There is the special case but you are extrapolating some distance from the text in PEP 484 and the typing spec to call this an overlap.

The proposed text here says:

When a reference to the built-in type float appears in a type expression, it is interpreted as if it were a union of the built-in types float and int. Similarly, when a reference to the type complex appears, it is interpreted as a union of the built-in types complex, float and int. These implicit unions behave exactly like the corresponding explicit union types, but type checkers may choose to display them differently in user-visible output for clarity.

Are the annotations in an overload statement “type expressions” or not?

I think it is clear that the intention in this proposal is that it would apply to float in an overload. At some point in the future overloads will be specified more tightly (which is clearly needed) and this text about float will be expected to apply there as well.

The definition of unsafe overlapping might not be well specified yet but would there be any disagreement about this case:

from typing import overload

@overload
def f(x: int) -> str: ...
@overload
def f(x: float | int) -> bytes: ...

The overloads are a big part of it because they affect typing the public API of libraries so that type inference works for users. A library doesn’t get to choose which type checkers downstream users consume its public annotations with. The specification and type checker consistency are important for that, but if the spec itself is incompatible with existing public APIs, that is a big problem.

More generally, though, mypy treats x: float as meaning that x is a float, which is what I want type checkers to do. For checking my own code that actually uses floats, this is better.

Put it like this: I know how to make sure I have a float when I need one, and a type checker that picks up on accidental ints would be nice, but if I do use floats correctly then I don’t want a type checker to reject correct code, as pyright does in the isinstance example above.

3 Likes

Mypy not raising an error is just a false negative (it is not detecting the overlapping overloads). Mypy does not allow the convert function to work the way you want, as evidenced by:

var: float | int
reveal_type(convert(var))  # Revealed type is `Float`, but should be `Float | Int`

I think you might be missing the fact that “is assignable to” is a precisely defined term.

Because int and float are both fully static types, saying “int is assignable to float” is exactly equivalent to saying “int is a subtype of float” (the difference between “assignable” and “supertype/subtype” is only relevant when a gradual form such as Any is involved).

Yes

A type expression is defined as:

An expression that represents a type. The type system requires the use of type expressions within annotation expression and also in several other contexts. See “Type and annotation expression” for details.

The linked section for details says:

The terms type expression and annotation expression denote specific subsets of Python expressions that are used in the type system. All type expressions are also annotation expressions, but not all annotation expressions are type expressions.

A type expression is any expression that validly expresses a type. Type expressions are always acceptable in annotations and also in various other places. Specifically, type expressions are used in the following locations:

  • In a type annotation (always as part of an annotation expression)
  • The first argument to cast()
  • The second argument to assert_type()
  • The bounds and constraints of a TypeVar (whether created through the old syntax or the native syntax in Python 3.12)
  • The definition of a type alias (whether created through the type statement, the old assignment syntax, or the TypeAliasType constructor)
  • The type arguments of a generic class (which may appear in a base class or in a constructor call)
  • The definitions of fields in the functional forms for creating TypedDict and NamedTuple types
  • The base type in the definition of a NewType

An annotation expression is an expression that is acceptable to use in an annotation context (a function parameter annotation, function return annotation, or variable annotation). Generally, an annotation expression is a type expression, optionally surrounded by one or more type qualifiers or by Annotated. Each type qualifier is valid only in some contexts. Note that while annotation expressions are the only expressions valid as type annotations in the type system, the Python language itself makes no such restriction: any expression is allowed.

What mypy is currently doing is saying that int is actually a subclass of float, which is almost certainly not what you want when you actually care about the difference between them at runtime, e.g.

var: float | int

if isinstance(var, float):
    # do something
    ...
elif isinstance(var, int):
    # mypy thinks this block is unreachable
    # and will not warn you about any type errors
    ...

I am using the precise term deliberately.

That is usually what assignability would imply. I am suggesting that the special case could be right there in the definition of assignability as implemented in the type checker though:

def is_assignable_to(lhs: Type, rhs: Type):
    # Obviously needs to be more complicated e.g.
    # list[float], complex, float | str etc:
    if lhs == float and rhs == int:
        return True
    elif has_any(lhs, rhs):
        ...
    else:
        return is_subtype_of(rhs, lhs)

Perhaps is_assignable_to is not the right place if it is used for more things than just deciding whether to report an error, though. A different way of putting it is that the type checker simply does not report an error when checking an assignment that is invalid only because an int is being assigned to a float: the special case is nothing more than an implicit, selective type: ignore.

Okay, well yes, that can be improved. Treating int as a subclass/subtype of float is not good. Treating float as float | int is not good. There is no way to allow the special case without some inconsistency, but we should not let that mean that int and float stop being real, distinct, non-overlapping types when used in annotations. It is never the case that an int is a float or vice versa:

>>> class A(float, int): pass
...
TypeError: multiple bases have instance lay-out conflict

2 Likes

My bad, I misunderstood that you were intentionally proposing the special case be “int is assignable to float, even though int is not a subtype of float”.

Personally, I think that’s a harder rule for a typing user to “wrap their head around” and use correctly (and for a type checker to handle consistently) than this proposed change, which essentially gets rid of any exceptions to the subtype and assignability rules around complex, float, and int by pushing the special case “up” to the type expression syntax level instead.

The missing piece either way remains the ability to annotate “float is assignable and int is not”, but that topic is better suited for a later proposal (perhaps one from the other ongoing thread)

1 Like

To be clear, that is not really my idea: it is basically just what PEP 484 says. I don’t think you can turn it into something more consistent without saying that int is not assignable to float, which directly contradicts the wording of the PEP.

2 Likes

As this discussion has gone on, a typing council member (@Jelle) has published something suggesting not adding type negation to Python: Gradual negation types and the Python type system | Jelle Zijlstra

This was linked in the typing GitHub issue for adding Not, along with the statement:

I conclude that explicit negation types currently shouldn’t be added to the type system, but I discuss how they fit into the system and how they would behave if added.

Given such a stance, the options for fixing this are really limited to just doing it and then telling people to pin their tools. If we change the special case now to treat it as a union, this isn’t something we can patch over and punt to the future by saying negation will fix it, if negation is being ruled out at the same time.


[Collapsed quote from Jelle in this thread, included for contrast]

4 Likes