Options for a long term fix of the special case for float/int/complex

I don’t find any of the non-breaking options viable. Using float & ~int to denote only float is overly complex for a fundamental data type used by everyone. This also seems like just another special case, as complex and float would be the only annotations requiring intersections to represent themselves, and it still relies on users knowing about this special case. Most people will continue using bare : complex and : float out of unawareness.

typing.StrictFloat (or similar) also falls short. It still requires authors to be aware of the special case, meaning most users will likely continue to use bare : complex and : float out of unawareness. Furthermore, people dislike importing typing for basic data types—and rightly so. We’ve recently moved away from typing.List, Dict, Set, etc., and this option would just send us back in that direction for one or two data types, leading to the same confusion that plagued list versus typing.List.

Crucially, neither of these options actually resolves the core issue: the annotations remain ambiguous. As a user, you’ll have no way of knowing whether the author truly intended “only float” or if they were simply unaware of the special case that would require float & ~int or typing.StrictFloat.

So, I strongly advocate for eliminating this inconsistency in the type system entirely. How many type hint users employ a type checker but not a formatter? Tools like Pyupgrade, Ruff, and others have been correcting annotations for a long time, and they are likely to add an autofix for this case as well. A PEP could even require at least one tool to transform existing annotations (e.g., float to float | int), and then MyPy and Pyright could recommend that tool in their error messages. MyPy already did this when implicit Optional was removed; for example, foo: str = None would trigger a warning and link to a tool to automatically transform it to foo: str | None = None. The bytes type also faced a similar issue where a bare bytes annotation implicitly meant bytes | bytearray | memoryview. This is now on the path to being fixed; it’s deprecated, and MyPy provided helpful errors and added a --strict-bytes flag to opt into this future default.

7 Likes
float & ~int

I am not even using typing but I think this should be the way to go.
- Traditionally, we use Python to make things work first and fix edge cases later.
- Duck typing is historically the default paradigm.
- The case where int should be managed differently than float is quite rare.

Furthermore, that convention extends to a wider range of usages: generally, if a user has a class Case and a class BadCase that inherits from Case but should be avoided in some function, they could write

def function(case: Case & ~BadCase): ...
3 Likes

It is true that many libraries, including the stdlib, will accept ints in places where floats are expected. This is not done by allowing the types to be ambiguous everywhere, though. What happens is that the types are coerced at the API boundary. This same pattern can be seen in the stdlib, numpy, other array libraries, sympy, gmpy2, mpmath, and many more:

def public_function(arg: Convertible) -> MyTypes:
    arg = convert(arg)
    return _private_function(arg)

def _private_function(arg: MyTypes) -> MyTypes:
    ...

It is nice to accept a broad range of inputs as Convertible but it would be madness to pass around ambiguous types like int | float everywhere internally or to have ambiguous return types on public API.

The type of Convertible is generally more complicated than just float | int but if it allows floats then it will also allow ints. That does not mean that the library will treat them as equivalent though:

>>> np.array([1])
array([1])
>>> np.array([1.0])
array([1.])

>>> np.sqrt(-1+0j)
np.complex128(1j)
>>> np.sqrt(-1)
RuntimeWarning: invalid value encountered in sqrt
  np.sqrt(-1)
np.float64(nan)

>>> sympy.sqrt(2)
√2
>>> sympy.sqrt(2.0)
1.41421356237310

>>> gmpy2.add(1, 1)
mpz(2)
>>> gmpy2.add(1, 1.0)
mpfr('2.0')
>>> gmpy2.add(1, 1j)
mpc('1.0+1.0j')

>>> gmpy2.sqrt(-4)
mpfr('nan')
>>> gmpy2.sqrt(-4+0j)
mpc('0.0+2.0j')

>>> mpmath.sin(1)
mpf('0.8414709848078965')
>>> mpmath.sin(1j)
mpc(real='0.0', imag='1.1752011936438014')

(All of these are examples of things that cannot be typed properly with @overload because of the special case for int, float, complex.)
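To make the @overload problem concrete, here is a hedged sketch of stubs for a gmpy2-style `add`; the `Mpz`/`Mpfr` classes and the implementation body are invented stand-ins, not the real gmpy2 API:

```python
from typing import overload

# Invented stand-ins for gmpy2's mpz/mpfr result types.
class Mpz: ...
class Mpfr: ...

@overload
def add(x: int, y: int) -> Mpz: ...
@overload
def add(x: float, y: float) -> Mpfr: ...
def add(x, y):
    # Runtime behaviour mirrors the promotion rule: exact in, exact out.
    if isinstance(x, int) and isinstance(y, int):
        return Mpz()
    return Mpfr()

# Under the special case, `float` implicitly accepts int, so the two
# overloads overlap with incompatible return types. A value that is
# statically `float` may really be an int at runtime, so a checker can
# infer Mpfr for a call that actually returns Mpz.
```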

These libraries have different types for int, float and complex for exactly the same reasons that the stdlib does. They allow you to mix the types in arithmetic in the same way the stdlib does, with analogous promotion rules. Those that have integer/exact types will generally want to use those for integer inputs, which means always distinguishing very carefully between exact int and approximate float. They will also distinguish carefully between real- and complex-typed inputs, treating those as indicating not just different types but in some cases also different mathematical domains: there is only one np.sqrt rather than math.sqrt vs cmath.sqrt.

There can be other constraints that don’t apply in the stdlib e.g. NumPy arrays are homogeneous so you cannot just casually store non-integer values in an integer array:

>>> a = np.array([1])
>>> a[0] = 7.5
>>> a
array([7])

If you understand the internal representation then it is clear that it is physically impossible for a to contain the number 7.5. If you used array([1.0]) then it could.

NumPy consistently uses the rule:

int ** int -> int

Hence:

>>> np.array([3.0]) ** -1
array([0.33333333])
>>> np.array([3]) ** -1.0
array([0.33333333])
>>> np.array([3]) ** -1
Traceback (most recent call last):
  File "<python-input-22>", line 1, in <module>
    np.array([3]) ** -1
    ~~~~~~~~~~~~~~^^~~~
ValueError: Integers to negative integer powers are not allowed.

I think this behaviour was always there but NumPy (as of 2.x) is now consistent in many other places as well about having types that depend only on types rather than value-dependent casting.

The int.__pow__ method special cases negative exponents:

>>> 2 ** 2
4
>>> 2 ** -2
0.25

This is impossible to type. In typeshed it is:

_PositiveInteger = Literal[0,1,2,...,25]
_NegativeInteger = Literal[-1,-2,...,-20]

class int:
    @overload
    def __pow__(self, value: _PositiveInteger, mod: None = None, /) -> int: ...
    @overload
    def __pow__(self, value: _NegativeInteger, mod: None = None, /) -> float: ...
    # positive __value -> int; negative __value -> float
    # return type must be Any as `int | float` causes too many false-positive errors
    @overload
    def __pow__(self, value: int, mod: None = None, /) -> Any: ...

Hence:

def f(x: int, n: int):
    reveal_type(x**2)  # int
    reveal_type(x**-2) # float
    reveal_type(x**n)  # Any

NumPy’s behaviour is better here: it isn’t hard to force the float conversion if you want it, and that is a small price to pay for having well-defined types (regardless of whether you use type annotations). It does mean, though, that your code can blow up on int vs float, so there is a need for type checkers to distinguish these.

It works to write e.g. np.cos(1) not just because the function is liberal about input types but precisely because it is also strict about output types: you can give it an int but it won’t return an int. This kind of implicit type promotion often results in code that mixes int and float just happening to end up with the expected float types but a corner case where a function returns an int instead of a float should always be considered a bug.
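The same liberal-in, strict-out shape can be sketched with the stdlib (`hypotenuse` is a made-up example function):

```python
import math

def hypotenuse(a: float, b: float) -> float:
    # math.sqrt is liberal about input (an int is fine at runtime) but
    # strict about output: it always returns a float, never an int, so
    # the return annotation here is exact even when called with ints.
    return math.sqrt(a * a + b * b)
```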

The places where int | float is acceptable as an annotation represent a tiny fraction of any code that uses floats a lot. It is really just the public API boundary of a library where you assume that users who are likely not using a type checker might pass something like [1, 1.0] but those API functions will convert their inputs to well defined types immediately. This is why typeshed or library stubs will massively over-represent examples where float | int seems like a reasonable annotation at least for function parameters.

5 Likes

Unless you annotate the overloads with list[int] -> Array1D[np.int_], list[float] -> Array1D[np.float64], …, which don’t overlap because list is invariant :slight_smile:

It’s this ambiguity that is such a problem. There is no valid automatic upgrade path, because there are places that are annotated float that absolutely are not intended to accept int, and some that are.

This is one of the problems with the StrictFloat option (as well as with “fix it in the future with intersections”). Any use of float remains ambiguous. Was this just not updated? If it wasn’t, do they actually accept both? It would still require people to write int | StrictFloat to remove that ambiguity, because it’s ambiguous in a way that silences type checkers detecting actual incompatibility.

Meanwhile, if we just remove this special case, there’s little ambiguity about whether code has been updated, and certainly less than the ambiguity we currently have. At worst, a library hasn’t yet been updated to declare that it accepts int, and the resulting errors make the library aware that it should change float to either int | float or SupportsFloat if it actually intends to accept more. The errors here actually allow this to be fixed with less ambiguity.

This is only happening at static analysis time, not runtime, so it also doesn’t break anyone’s working code. This also means that people can avoid the new errors in CI by pinning the versions of typecheckers they use[1], or by running multiple versions (at least one from before and one from after[2] the update) and using this to help them migrate.


  1. They should already be doing this if they allow typechecking to fail CI pipelines ↩︎

  2. Set this up to generate a comment, annotation, or some other form of CI output; again, don’t fail CI on this. ↩︎

6 Likes

As an end-user of typing, i.e. not someone writing libraries necessarily, but small-to-medium sized scripts/applications, I do add type annotations to functions so that my IDE can help me.

Essentially always I want int | float. And honestly, I would find it massively annoying if I couldn’t call a function with 0 instead of 0.0, just because the typing complains and I didn’t originally think of writing int | float and just wrote down float instead.

So for the possibility of removing the special case to be considered, IMO there needs to be a solution that is exactly as convenient, meaning no import, no writing beyond a single word, just as intuitive to use as just float. And I don’t think such a solution exists.

1 Like

I’m aware but I still think a tool like that would be helpful for people who don’t care and just want the existing behaviour back. If anything, I hope running such a tool would make people double check their public APIs and adjust some of the float annotations to be more accurate.

Can’t that just be a setting in your IDE if it’s popular enough, not part of the typing specification? I know Pylance (Visual Studio Code’s default language server) doesn’t error by default for all typing errors despite being built on pyright.

I don’t see how it gets any better than just writing float | int if that is the intended type. It is already short and unambiguous. No alias for this would ever express as clearly what is meant.

You would still be able to use math.cos(1) etc so this is only about the annotations you wrote yourself. If you had to choose between writing float | int in various places or just using float consistently I think that you would realise that using float | int is not really that useful and that changing 0 to 0.0 is not that bad or is in fact an improvement. These are just the sort of constraints that type checkers will impose on your code and I’ll bet other things bite you much harder than this.

This is no different from annotating a parameter as list and then trying to pass a tuple. You should either stick rigidly to the expected types or widen the annotation. Different situations call for one approach or the other. It probably doesn’t make sense to just change the annotation to tuple | list though.

2 Likes

These extra constraints are fundamentally opposed to existing design goals in python, and you know this.

I don’t see how it gets any better than just writing float | int if that is the intended type.

Having to write float | int requires additional thought at the moment of writing the annotation, is more verbose, and is easy to forget. The common case (i.e. float | int, which yes, is the common case) should be easy and obvious; this would be an eternal drawback of annotations. It’s better to accidentally get the common case than to accidentally get the uncommon case.

This is no different from annotating a parameter as list and then trying to pass a tuple.

This is a strawman. list and tuple have very noticeably different interfaces[1], and the places where you would use one or the other interchangeably exist, but are not the common case. As you have already acknowledged yourself:

It probably doesn’t make sense to just change the annotation to tuple | list though.

Instead you would want a very noticeably different thing from a union of two types, namely Sequence or Iterable. If proper protocols for the types defined in numbers.py existed, they would mostly be an acceptable alternative to the special case, provided they were moved into the builtins. But those protocols don’t gain you much over a simple union of the two types that dominate 99% of code.
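The widening alluded to here (an abstract type instead of a union of concrete containers) can be sketched as follows; `total` is a made-up example function:

```python
from collections.abc import Sequence

# Sketch of widening an annotation to an abstract type instead of
# writing a union of concrete containers like tuple | list.
def total(xs: Sequence[float]) -> float:
    # Accepts a list, a tuple, or any other sequence of floats.
    return sum(xs)
```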


  1. Mutability vs Hashability and Heterogeneity vs Homogeneity being the most common sticking points. ↩︎

I don’t see how the ambiguity can go away anytime soon without adding some other way to spell “just runtime floats”

If the special case is removed and we say that from here on out, annotation float should just mean runtime float, then the ambiguity will remain and you can never unambiguously say “no really, actually just runtime float” because old code can’t change and you can’t be sure someone is actually following the spec change
(edit: rephrased this in another comment below because it is not quite what I meant)

| Annotation | Runtime meaning |
| --- | --- |
| float \| int | float \| int |
| float | float or float \| int |
| int | int |

If you instead add some other spelling (I’ll keep using typing.StrictFloat for the example, but it doesn’t have to be that) and accept that float will be ambiguous, you would at least be able to to be unambiguous when it matters

| Annotation | Runtime meaning |
| --- | --- |
| float \| int | float \| int |
| typing.StrictFloat \| int | float \| int |
| float | float or float \| int |
| typing.StrictFloat | float |
| int | int |

The only way that I can see to be able to be sure that someone means just float not float | int is to have some other way to say it

I don’t think it is a strawman. Floats and integers also have different interfaces. You might not care about that, but other people do.

You are making assertions about what is and is not “the common case” in a way that I find uncomfortable. I don’t presume to know what “the common case” is because I’m not conversant in all of the domains where Python is used. Numerical libraries like numpy care a lot about the differences between integers and floats.


I don’t really understand why anyone would be that bothered about writing float | int. It’s 6 characters to clarify your intent, and the alternative is that the type system is ambiguous around one of the fundamental primitives in the language. And alternative ideas like StrictFloat introduce more problems than they solve.

Maybe there’s some real harm to this, but so far the pushback seems to be “I don’t want to write it that way”. That’s fine, but I don’t very much like the way that a number of things are written and I’m still happy with the language. You don’t need to like it, but “I don’t like it” isn’t something anyone can respond to. I wish there were a solution which is agreeable to everyone, but there isn’t. So we have to compromise.

11 Likes

That’s unnecessarily pessimistic. The update will take time but past a certain point we’ll treat anyone who uses float to mean int | float as having a bug in their types.

Python types aren’t guaranteed to be perfect. They are sometimes inaccurate and need to be fixed. This isn’t all that different.

4 Likes

That was indeed poor phrasing on my part. Rewriting that to be closer to what I actually intended to mean:

If the special case is removed and we say that from here on out, annotation float should just mean runtime float, then the ambiguity remains.

You would not be able to unambiguously say to downstream consumers “no really, actually just runtime float” because you can’t be sure that they are keeping up with the spec change.

You can’t rely on the annotations from outside code, because old code from before the spec change will still be out there unchanged where a float annotation probably means float | int, but might mean just float, and even for anything written in the transition period after the spec change, you can’t know if the author was keeping up with the spec change.

That’s in the short term. Do you think it looks the same 2 years after type checkers switch from “warning” to “error” on this spec violation?

I think this is just describing what happens in the immediate period after the change. Not what we live with for decades afterwards.

You actually can. Because the post-change version would be more (correctly) restrictive, anyone following the new version will not be passing invalid values, whether they are calling code that is or isn’t updated. The worst effect here would be false-positive errors where int is actually allowed but the annotation hasn’t been updated yet; this “false positive” (accurate to the annotation, though the annotation needs updating) then prompts people to update the type hint involved.

The numeric tower only really defines one method that is actually useful for writing code that can accept general Reals and that method is __float__. The protocol for it exists and has been mentioned many times above: SupportsFloat.
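As a minimal sketch of that SupportsFloat approach (the `as_float` name is illustrative, not from any library):

```python
from typing import SupportsFloat

def as_float(x: SupportsFloat) -> float:
    # Accept anything that defines __float__ (int, float, Fraction,
    # Decimal, many numpy scalars, ...) and coerce it once at the
    # boundary, so internal code only ever sees a real float.
    return float(x)
```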

If you think 99% of code uses float and int rather than something like a NumPy array then we clearly move in different circles.

Static typing itself breaks many of the original design goals of the language. When you use float | int, what you are doing is really just a floating-point calculation with sloppy typing. There is not necessarily anything wrong with that, but it means that your types are sloppy. One of the design goals of Python is that it should be okay to do calculations with sloppy types in simple cases. In more complex cases, though, people have over time come to consider sloppy typing problematic, so they now use static type checkers.

Type annotations are optional though so the question is why are you using type annotations and a type checker if you don’t want it to complain about sloppy typing? If you are using it because you just want the type inference then great but consider that the special case here breaks inference for every library like NumPy that builds over the top of int, float and complex.

The option that you have missed here and that many others have as well is that you don’t have to change your annotations to float | int if you want well typed numeric code. The way to make your float code well typed has always been to use the float type consistently rather than mixing int and float. These days you should be able to have a type checker verify that your floats really are floats.

I did not miss this. In fact, I explicitly addressed it.

I just find it absurd that you think requiring to write 0.0 instead of 0 is an improvement in any way on existing code.

I don’t think this discussion is moving in a productive direction. Whether you believe the special case should go away or not, the crux of the matter is the transition and the churn associated with it. I think it’s disrespectful of people’s time to just assume that people will not be bothered by this change, or that the inconvenience is so small it can be ignored, because the end-state is better than the start-state.

If you deeply care about getting this special case removed, your goal should be to come up with a transition plan where the people that don’t care about whether float means just float or float | int, never even have to know this transition took place. Their code should keep passing static type checking, no new false positives should suddenly pop up on day X prompting them to audit their code for mistakes that aren’t actually mistakes.

The fact of the matter is the float special case has been around for a long time and tools have actually been teaching users to write float instead of float | int, since the latter is redundant. It’s a hard sell to force everyone to unlearn that knowledge and to make everyone aware that these lint rules should now be disabled/changed.

So your best bet is something like the Rust Editions feature (or, more realistically, a project-specific feature flag) that I’ve seen proposed. That way the new behavior is opt-in and the places in the ecosystem where people strongly care about the distinction can be gradually improved, without causing unnecessary churn and disruption for every little fun side project. There will also be no rush this way, and every project can decide to switch to these new semantics at its own leisure.

9 Likes

I think the actual transition would end up being much longer than 2 years, and would be a mess during the entire time.

The current behavior is outlined in both an informational PEP (PEP 483 - Subtype Relationships) and in a standards track PEP (PEP 484 - The numeric tower).

Changing the meaning would then have to be in a new standards track PEP, which I’m sure would take a long time to go through drafting, debate, and acceptance.

This change would fall under the Backwards Compatibility Policy outlined in PEP 387. The basic policy there says:

In general, incompatibilities should have a large benefit to breakage ratio, and the incompatibility should be easy to resolve in affected code.

Any PEP specifying this change would have to demonstrate that the benefit is large enough to justify changing the meaning of every single float annotation out there today (i.e. probably affecting every typed Python codebase in existence).

Part of accepting that new PEP would probably include seeing an implementation in at least two type checkers, at which point you could start opting in. Once the PEP is finally accepted, the transition period has a minimum of two years, during which time, type checkers would probably have to issue a warning, but not error, on every single bare float annotation (that is not marked by some explicit opt-in to the new behavior).

During that period, I would expect to see a flood of complaints here and on issue trackers asking why warnings are suddenly showing up everywhere. There are existing lint rules today that recommend changing float | int to just float, so a user could easily be in a case where conflicting type checker and linter versions means that you get a warning/error either way, which will only add to the confusion.

Most places (following the design intent of Python) should accept an int whenever they allow float, so the proper guidance is not “just turn off the warning”, it should be “change every float annotation to float | int”, so this change then recommends that the vast majority of users make changes across millions of lines of code just to actually not change anything.

How would I know that anyone is following the new version? You could assume that anything that says it takes a float should not be given an int, and that anything that says it returns a float (or has a float, like an attribute or in a container) could actually be an int, but that’s the same scenario that we’re in right now.