My question is more about whether `x = 1.0; x = 2` would continue to be considered correct. Presumably, to ensure `isinstance(x, float)` remains true at all times, even literals and constructors would need to be given the new, float-only type?
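A minimal sketch of the runtime side of the re-assignment in question (the comments about checker inference are my reading of today's behavior):

```python
x = 1.0  # a checker infers float (which today implicitly includes int)
x = 2    # re-assigning an int is currently accepted by checkers

# At runtime there is no special case: x is now a plain int.
print(isinstance(x, float))  # False
print(isinstance(x, int))    # True
```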
I think we can leave it up to type checkers what to do without an annotation, but the annotation should be respected if given. I imagine that for pyright at least, in keeping with its other behaviors, if you intend only float, you'll need to annotate as such for local assignments. I don't want to cut off pytype's behavior of inferring from totally consistent use, and while I think other type checkers should do this as well, changing that for all type checkers would require a much larger discussion.
Can we get discussion on this going again?
If we're allowing this to be expressed, one of these has to happen: either the use of negation, or un-special-casing it.
I think it’s pretty clear that one of these is more natural than the other.
There are a lot of clever proposals in this thread (type differences, intersections, `AnyOf`, more specific float types, …), but I agree with the original poster that removing the special behavior seems like the cleanest and least likely to have subtle issues.
How painful would it really be?
It would be a long transition. The first steps would be:

- Mark the existing special behavior as deprecated in the typing spec and encourage users to write `int | float` instead.
  - Perhaps a type alias in the standard library would be helpful: `type Floating = int | float`, but I think writing `int | float` isn't too terrible, and it has the advantage that it doesn't require an import.
- Add linting rules to all popular linters to flag `int | float` usage.
  - For this it would be useful to have a type alias like `type StrictFloat = float` in the standard library, in order to signal to the linter that you really just meant `float`; I think it's fine if this requires an import, because it should be relatively rare.
- For projects like typeshed, running the autofix of a linter should be able to easily convert everything; some manual `StrictFloat`s might need to be inserted.
As the second step, type checkers would implement strict handling of float annotations behind a flag, like `--strict-float`.
Type checkers would then slowly ramp up the strictness. First make the warning be default and then make the error be default. This would take many years.
Having just read this thread, I see that there are a few mentions of the idea that mixing int and float is intentionally part of Python the language, so it is good that `float` is interpreted as `float | int`. I also see a few mentions of the idea that this only creates problems because of methods like `.hex` or `.fromhex`. I just want to clarify that this significantly underestimates the issues here.
The relationship between `int` and `float`, that they are often allowed to be mixed, is not unique. Many libraries have analogous types that relate in a similar way; for example NumPy has `np.int64`, `np.float64` and many more. There are functions that convert int and float into these:
```python
>>> np.array([1])[0]
np.int64(1)
>>> np.array([1.0])[0]
np.float64(1.0)
```
This is absolutely not something that is unique to NumPy: there are many array libraries, and even the array API standard specifies this behaviour. There are also many other mathematical libraries like sympy, gmpy2, mpmath, and many, many more. These libraries all have their own types that are analogous to `int`, `float` and `complex`, and their own functions for converting different possible inputs into their own types.
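The standard library itself has such a type: `Fraction`, for instance, converts both ints and floats into its own exact representation, much like the third-party types discussed here:

```python
from fractions import Fraction

# Fraction accepts both int and float inputs and converts them
# to its own exact rational representation.
print(Fraction(1))    # 1
print(Fraction(0.5))  # 1/2
```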
It is currently impossible to express using type annotations the fact that a function turns an `int` into one type and a `float` into another:
```python
from typing import overload

class Int: ...
class Float: ...

@overload
def convert(x: int) -> Int: ...
@overload
def convert(x: float) -> Float: ...
def convert(x: int | float) -> Int | Float:
    if isinstance(x, int):
        return Int()
    elif isinstance(x, float):
        return Float()
    else:
        raise TypeError
```
The error from pyright is:
```
$ pyright c.py
c.py
  c.py:7:5 - error: Overload 1 for "convert" overlaps overload 2 and returns an incompatible type (reportOverlappingOverload)
1 error, 0 warnings, 0 informations
```
There is no “overlap” in the runtime types here, but pyright interprets `float` as meaning `float | int` in the overloads. If `Int` were a subclass of `Float` then it would be accepted. Having `Int` be a subclass of `Float` makes just as much sense as having `int` be an actual subclass of `float`, though: it makes no sense at all, because the internal representation of the data is completely different.
The inability to properly type the `convert` function here makes it impossible for type checkers to infer types correctly for users of mathematical libraries that allow passing int or float in place of their own native types. For precisely the same reasons that the stdlib allows `math.cos(1)`, a library such as NumPy will allow users to use `int` or `float` in place of its own native types:
```python
>>> import numpy as np
>>> np.square(2)
np.int64(4)
>>> np.square(2.0)
np.float64(4.0)
>>> np.square(np.int64(2))
np.int64(4)
>>> np.square(np.float64(2))
np.float64(4.0)
>>> np.square(1j)
np.complex128(-1+0j)
```
It is impossible, though, for a type checker to infer the types of any of these things if it cannot distinguish int and float properly, even in overloads. These different types can be mixed just as freely as int and float can. However, it is not possible for a library like NumPy to replicate the weird special-casing of int, float and complex in the type system.
If there were a `StrictFloat` type then that could be used in the `@overload` annotations to avoid the overlap, but then the overloads would still be incompatible with anything annotated as `float`:
```python
def f(x: float):
    y = np.square(x)
    reveal_type(y)  # np.int64 | np.float64
```
If it seems nice that you don't have to write `int | float`, consider that in exchange you now have confused types like `np.int64 | np.float64` everywhere else, even though your actual code is not confused about the types at all. NumPy cannot reach into all the type checkers and hard-code a special rule that says `np.int64 | np.float64` should be treated as being equivalent to `np.float64` in annotations/inference just because users like to mix these types. Allowing `float` to mean “maybe float” is an inconsistency in the base of the type system that breaks everything built on top.
The proper type to use most of the time for a public function parameter that can accept int or float should really be `SupportsFloat`. This is better than `float | int` because it also accepts `Fraction`, `Decimal`, `np.float64`, etc. The good way of allowing the different types is to do the conversion as early as possible, at the public API boundary:
```python
def cos(x: SupportsFloat) -> float:
    return _cos(float(x))

def _cos(x: float) -> float:
    ...
```
Internal numeric code like `_cos()` should be much stricter about the difference between int and float, regardless of whether the public API nicely allows int. Having an int in place of a float (or a 32-bit float in place of a 64-bit float, etc.) is a serious bug in proper numeric code. If you do want to allow `float | int` somewhere then you can write that, but I would not use it for any of the parameter/return type annotations for either the public `cos` function or any internal function like `_cos`.
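As a rough, runnable illustration of this boundary pattern (using `math.cos` as a stand-in for the internal implementation, which is my own assumption):

```python
import math
from typing import SupportsFloat

def cos(x: SupportsFloat) -> float:
    # Coerce once at the public boundary; everything past this
    # point only ever sees a real float.
    return _cos(float(x))

def _cos(x: float) -> float:
    # Internal code: assumes x is strictly a float.
    return math.cos(x)

print(cos(0))    # 1.0 (int accepted at the boundary, converted immediately)
print(cos(0.0))  # 1.0
```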
The fact that both the Python runtime and many Python programmers allow ints and floats to be mixed freely is precisely why it would be useful to have static type checkers enforce the distinction strictly. A single `int` in a large `list[float]` is unacceptable, and static analysis is the way to enforce this properly. Accidentally writing `return 0` in place of e.g. `return 0.0` (or some other kind of zero) is a common bug, and is precisely the kind of thing that a type checker should be able to help with.
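A hypothetical example of the `return 0` bug described above (the `mean` function is my own illustration, not from the thread):

```python
def mean(xs: list[float]) -> float:
    if not xs:
        # Bug: 0 is an int, not a float. Today's special case hides it;
        # a strict checker would require `return 0.0` here.
        return 0
    return sum(xs) / len(xs)

print(type(mean([])))  # <class 'int'>, despite the -> float annotation
```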
Currently type checkers cannot even check `SupportsFloat` properly because of this special case. All type checkers accept this:
```python
import math

class Number:
    def __float__(self):
        return 1

math.cos(Number())
```
At runtime though:
```
$ python c.py
Traceback (most recent call last):
  File "~/c.py", line 7, in <module>
    math.cos(Number())
    ~~~~~~~~^^^^^^^^^^
TypeError: Number.__float__ returned non-float (type int)
```
The `math.cos` function is implemented in C. It can allow non-floats by calling `__float__`, but it needs that call to return the real nominal float type, whose C-level struct has a real 64-bit float as payload in actual bytes in memory. There is no duck-typing or subtyping here: `__float__` must return an actual `float`.
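For contrast, a version of the same class whose `__float__` returns an actual `float` works fine at runtime:

```python
import math

class Number:
    def __float__(self) -> float:
        # Returning a real float (not an int) satisfies math.cos's
        # C-level requirement for a nominal float.
        return 1.0

print(math.cos(Number()) == math.cos(1.0))  # True
```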
The `SupportsFloat` type should be able to capture this requirement perfectly, to support duck-typing with e.g. `math.cos` in a well-typed way. Type checkers wilfully misunderstand its return type though:
```python
class SupportsFloat(Protocol):
    def __float__(self) -> float:
        ...
```
There is quite simply a need, for `float` just like for every other type, to be able to refer to the actual nominal type.
I don’t see why a difference type (or intersections and negations) doesn’t meet this need
If the annotation `float` means “the set of all possible runtime floats and runtime ints” and the annotation `int` means “the set of all possible runtime ints”, then the way to refer to just “the set of all possible runtime floats” is `float - int` or `float & ~int` (which could be added as a special type in the `typing` module without requiring support for arbitrary user-denoted difference, intersection, or negation types).
I don’t disagree that writing
```python
class SupportsFloat(Protocol):
    def __float__(self) -> float & ~int:
        ...
```
or
```python
@overload
def y(a: float & ~int, b: float) -> float & ~int: ...
@overload
def y(a: float, b: float & ~int) -> float & ~int: ...
@overload
def y(a: int, b: int) -> int: ...
def y(a: float, b: float) -> float: ...
```
is more inconvenient than just being able to use `float` directly, but that inconvenience for a minority use-case seems like a worthy tradeoff when compared to breaking backwards compatibility for every single `float` annotation that exists in Python code today.
It has the problem I explained with `StrictFloat`: unless everyone uses `StrictFloat` everywhere, it doesn't work.
Keep in mind what is actually happening here, which is that most “users” don't write type annotations, or at least don't write complicated type annotations. For most users what happens is that the annotations are written in libraries and then interpreted for them by e.g. pylance in VS Code. Think of the `reveal_type` as being someone hovering their mouse over the variable.
It is one thing to say that NumPy contributors will write some ridiculously complicated overloads in the stubs so that pylance can consume them behind the scenes and figure out the types for the user. We can't really expect NumPy users to know that their editor is showing `Any` because they wrote `float` rather than `float & ~int`, though. How should they know about an absurd bug in the type system, that `float` does not literally mean `float`?
That seems like it works exactly as intended to me
```python
In [1]: import numpy as np

In [2]: def f(x: float) -> np.int32 | np.float64:
   ...:     return np.square(x)
   ...:

In [3]: type(f(1.0))
Out[3]: numpy.float64

In [4]: type(f(1))
Out[4]: numpy.int32
```
If we had a `StrictFloat` type then I could say either
```python
@overload
def f(x: StrictFloat) -> np.float64: ...
@overload
def f(x: int) -> np.int32: ...
def f(x: float) -> np.int32 | np.float64:
    return np.square(x)
```
or
```python
def f(x: StrictFloat) -> np.float64:
    return np.square(x)

f(1)     # type checker error

var: float
f(var)   # type checker error
```
I'll concede again that this is inconvenient, but that doesn't mean that a difference type wouldn't work.
What you couldn't do is write annotations for the library function `f` so that `reveal_type` would do what it clearly should do in this situation:
```python
def user_function(x: float, t: int):
    y = f(x)
    z = f(t)
    reveal_type(y)  # should be np.float64
    reveal_type(z)  # should be np.int64
```
That is what the runtime types do and have done since long before type annotations in Python. It is also how a static type checker should infer the types.
Unless everyone, including end users, uses `StrictFloat` everywhere instead of `float`, it doesn't work.
And my counterargument is that putting an annotation of `float` means that a user CAN pass an int at runtime (which is consistent with how Python has been designed to work from the very beginning), so a revealed type of `np.float64` just means that the function annotations are incorrect.
Isn't this an inherent problem with fixing this special case? We currently have a situation where there are a very large number of annotations using `float`, where some use `float` to denote `float | int` and others would prefer to indicate `StrictFloat`. No matter how you slice it, resolving that ambiguity requires going through a period where some type annotations are either too broad or too narrow. The status quo is that some minority of annotations are already too broad, but there exists no other option that is more precise.
If we simply introduce `typing.StrictFloat` and walk away, then yes, it will be largely useless. But, presumably, introducing `typing.StrictFloat` would be paired with updating typeshed to correctly annotate the standard library regarding the output of functions.[1] As more functions are correctly annotated to return `StrictFloat`, more functions would be able to accept `StrictFloat` inputs without issue. And in the meantime, one can include `TypeGuard` checks or type coercion at the boundaries of your own code, when it matters.
The only alternative that I can see would be some extended transition period (perhaps with a `from __future__ import strict_float` or `# type: strict_float` incantation) that would allow the mass of existing annotations to be updated from `float` to `float | int` where appropriate.
Given that choice, the question then becomes balancing disruption to users with having the type system more closely match the runtime and deciding which path to take. I think reasonable people can disagree about how to make that balance, but neither side of that balancing act is impossible.
For my own position (which is likely not worth much), I lean towards adding a `StrictFloat`. That is mostly because, in my experience, the vast majority of uses of `float` in annotations actually mean `float | int`.[2] Forcing the vast majority of these annotations to be mechanically updated from `float` to `float | int` risks turning people away from using type annotations at all. Those instances where it actually matters are much more specialized, and affected users are likely more motivated to move to new syntax that better conveys their intent.
The biggest counter-argument is the discrepancy with runtime behavior. Having `isinstance(x, float)` behave differently from `x: float` is less than desirable. But I suspect that letting this edge case remain is going to be more tolerable to the majority of users than the alternative.
Re-reading my post from a year ago, in the light of the comments made here recently, I no longer agree with my position back then. I don't have an answer that I am comfortable with, but the arguments made by @oscarbenjamin that `float` is the right way to spell an annotation meaning “you should pass a floating point value here” feel compelling to me. I also don't like the idea of doubling down on the “weird special case” for float/int, and making it a permanent part of the language.
Long term, I feel that it's pretty self-evident that `float` should mean “a value of type `float`” and `int` should mean “a value of type `int`”. The types `int` and `float` are unrelated at runtime, so they should be unrelated in the type system.
Getting there is the problem. But staying where we are is just as much of a problem: having to write the type “must be an actual float value” as `float & ~int` feels like a terrible hack (even if we ignore the fact that intersection types and type negation don't even exist yet!), and named aliases like `ExactlyFloat` are just papering over the problem.
The biggest transition issue I see is with literal values. People expect to be able to use integer values like `0` or `1` in floating point contexts; having to write `0.0` or `1.0` feels annoyingly bureaucratic. Needing to fix type errors in statements like `x: float = 0` feels like busy-work of the sort that gives typing a bad name with people like me (“low patience for type hints”). But without giving type hints a runtime effect (auto-conversion of int literals to float values) I don't see how to address this.
So I'm sympathetic to people's frustration with the short-term situation, and I don't have any good answers for the question of how we move away from it, but I do think that in the long term, `float` should mean `float`, not `float | int`.
One issue here is that some array libraries will let arrays of different types mix, while others will hard-crash and strongly expect type consistency. If you try to add a float scalar value to an int scalar value as TensorFlow tensors, it just hard-crashes on you. If you try the same with NumPy, it does the auto-conversion for you and lets them match.
Avoiding basic runtime crashes is a major goal of the type system. And for some popular libraries with millions of installs, mixing 1 vs 1.0 will easily lead to crashes.
edit: In practice, tf type stubs are a work in progress, and currently they don't even try to handle things like checking data types/shapes, as the complexity added there with generics is higher than I (one of the maintainers of those stubs) wanted to invest in.
I've been trying to get my head around the idea that there are Python users who would want to use static typing, along with all of the other difficulties and restrictions imposed by it, but would then find that needing to convert an int to a float was somehow problematic. If you can handle half a page of “Mapping is invariant in KT@Foo but … list[tuple[Foo[V], …]] is not assignable to …” then you can surely handle changing e.g. `0` to `0.0`. The error “int is not a float” is trivially understandable, trivially fixable, and actually has clear reasons for being fixed. Obviously changing things retrospectively is disruptive, but I really can't imagine that if it had always been this way then this would be something that typing users would complain about.
It has never really been the case that you can just mix float and int without caring, because they just are different. On a basic level, for example, they print differently. In Python 2 they used to divide very differently. When you actually want an int it is never acceptable to have a float, because e.g. `range(2.0)` is an error, so any Python programmer needs to have a mental model that keeps the two types separate.
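The `range(2.0)` point is easy to verify at a prompt:

```python
# range() requires a real int; a float is rejected at runtime,
# which is why programmers already keep the two types mentally separate.
try:
    range(2.0)
except TypeError as e:
    print(e)  # 'float' object cannot be interpreted as an integer
```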
I suspect that if the situation came to pass then what you would realise is that you would rarely want to change the annotations from `float` to `float | int`. There are very few situations where that actually makes sense as an annotation. What you would find yourself doing is changing e.g. `2` to `2.0`, or converting inputs with `float(num)`, so that the rest of your code uses float consistently. The most awkward change I can think of is needing to write `sum(nums, 0.0)`.
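The `sum` case is awkward because the default start value is the int `0`; an explicit float start keeps the result strictly a float:

```python
nums = [0.5, 1.5, 2.0]

# sum() defaults to start=0 (an int), so under strict rules the result
# of sum(nums) would be typed as int | float; an explicit float start
# value avoids the union.
total = sum(nums, 0.0)
print(total)  # 4.0
```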
If you had a class like:
```python
class Stuff:
    value: float
```
If this change came along, would you want to change it to `float | int`, or would you want to ensure that the value is consistently of type float? I find it very hard to think of a situation where I would want to change this to `float | int` rather than adding a `float(data)` in the constructor or fixing some code like `s.value = 1`.
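The `float(data)`-in-the-constructor alternative might look like this (the `__init__` is my own addition to the snippet above):

```python
class Stuff:
    value: float

    def __init__(self, value: float) -> None:
        # Coerce once at the boundary so `value` is always a real float,
        # even if a caller passes an int under today's special case.
        self.value = float(value)

print(type(Stuff(1).value))  # <class 'float'>
```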
Of course, if you think about public interfaces then there will be situations where you want to accept both `int` and `float`, but you will still want to be consistent about the return type, and I think that, if you think about it, `SupportsFloat` is more likely to be a better parameter annotation than `float | int` anyway.
I fully agree that there should be an easy way to distinguish a runtime `float` value, and that there are some domains where this is a crucially important difference.
If it were 2014, I'd agree that type annotations shouldn't mix `float` and `int` together, and PEP 484 should just include a note that most functions should prefer `float | int` over `float`, since that's usually fine.
But that's not the case, and there is now a decade's worth of code written that has a `float` annotation implicitly also include `int`, and I don't think there's any realistic transition plan to move to those semantics now that doesn't create even more problems and questions in the very long transition process.
Comparing two potential paths forward:
Option 1: Remove the special case
Ideal: `float` just means `float`, no special casing and no weirdness.
In practice:

- It would probably require an opt-in flag for a multi-year transition.
- `float` means `float | int` if it was written before 2026(?), unless the person who wrote it actually meant just `float`, but there's no way to know for sure.
- If it was written from 2026(?) to 2028(?)+ and includes the opt-in, `float` for sure means `float`.
- If it was written after 2026(?) without the opt-in, `float` might mean `float` and might mean `float | int`, depending on what linters and type checkers do to help the transition, what tools the person writing it was using, how familiar the person writing it is with typing, how much they keep up with changes, etc.
Anytime you see the `float` annotation for the next decade (and maybe longer), you'll have to remember that it could mean two different things, and check when it was written and whether the writer actually knew or cared about the difference.
I don't have any experience maintaining widely used tools like mypy or pyright, but I wouldn't be surprised if they would be stuck with having to support even more fragile heuristics than they currently do, to try to guess what `float` means, for at least the next decade as well.
Option 2: Accept that the special case is here to stay, and add something like `typing.StrictFloat` to mean actually just `float`.
- If you don't care about distinguishing runtime `float` from the fake typing-world `float = float | int`, you don't have to do anything.
- If you do care, you can now clearly and unambiguously declare that difference, and everyone knows exactly what you mean (but they might have to first read a paragraph in the PEP that adds `StrictFloat` to understand why the annotation `float` doesn't mean runtime `float`).
- Support can be backported to older Python versions by `typing_extensions`.
- Most typing novices won't know that the annotation `float` doesn't mean runtime `float`, but that's probably fine, because whatever they are writing will probably work fine with an `int` anyways. If they're using type hints with a library that does care about the difference and only accepts `typing.StrictFloat`, they'll hit a type error they don't understand and then find out about the difference.
- If you see a type error from a library that uses `typing.StrictFloat`, get bit by a bug, or notice a weird artifact like the `np.float64 | np.int64` example in an IDE's hover text, you go search something like “Python typing float shows float | int” in Google and immediately get an AI overview that says “`float` actually means `float | int`; if you want to only allow `float`, use `typing.StrictFloat`”, followed by actual search results with links to the documentation that explains it, the PEP that defined it, a bunch of blog posts complaining about it, etc.
- If you need to accept strictly `float`s from other libraries that don't add the new stricter version to their return annotations, you can add a type guard or an explicit defensive `float()` call.
Option 2 sounds like a much better plan to me, even if that means we are forever stuck with an ugly wart of `float == float | int` and `typing.StrictFloat == float`.
As far as I am concerned this is already the case. I have used `complex` with the intention that it implicitly means `complex | float | int`, and I have also written `float` with the intention that it really does just mean `float`. The annotations as they are today cannot be trusted if you want to disambiguate what would have been written if these things could have been expressed differently at the time.
Much of the standard library intentionally accepts `int` when float is intended, and that ethos extends to third-party libraries like numpy and, critically, to user code that calls these libraries. I would hope and expect that behavior to continue: it makes Python more beginner-friendly.
I appreciate your other point that a developer knowledgeable enough to use Python's type annotations can handle changing `float` to `float | int`. My main worry is that the make-work involved in changing those annotations will be viewed as a sign that Python type annotations are more of a chore than a benefit.
Why do you find it hard to believe? If the program works correctly with `float | int`, why should I spend time “fixing” something that isn't broken?
What about Option 1b: create `typing.StrictFloat`, discourage/deprecate use of bare `float` in type annotations (through lints and/or type-check errors), and give type checkers leeway in how to interpret `float` annotations outside of unions with `int`. Wait until “in the wild” uses of ambiguous `float` annotations reach an acceptable level (maybe 5 years?), then issue a new PEP formalizing the new meaning of `float` and soft-deprecating `typing.StrictFloat`.
Code whose annotations aren’t updated in that timeframe then (maybe) raises type check errors and gets fixed (or silenced).
This is already true today. Every float annotation ever written is ambiguous. As a user, I have no idea whether the author was aware of the special case or not when I see a `: float` annotation.
Maybe we could add a project-specific (or package-specific) configuration, similar to Rust's editions:
> When creating editions, there is one most consequential rule: crates in one edition must seamlessly interoperate with those compiled with other editions.
>
> In other words, each crate can decide when to migrate to a new edition independently. This decision is ‘private’: it won't affect other crates in the ecosystem.
In Rust it's specified in the `Cargo.toml`, but we could use the `py.typed` or `pyproject.toml` for that.
And instead of year-labelled “editions”, a minimal key-value format would suffice for our purposes, so that we could write something like `promote_types = false`.
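As a sketch, such a per-project opt-in could live in `pyproject.toml`; the table and key names here are purely illustrative and not any existing tool's configuration:

```toml
# Hypothetical pyproject.toml fragment: opt this package out of the
# int -> float (and float -> complex) promotion special case.
[tool.typing]
promote_types = false
```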