Options for a long term fix of the special case for float/int/complex

I don’t think anyone has done that? The people arguing for this have pointed out various options here that allow users to actually switch. I pointed out above that people can just pin their typechecker version until they are updated, and because the new case is both correct and more restrictive, such an option does work as expected.

There’s no way to do a coordinated update (thanks to every typechecker having its own release schedule, and the specification intentionally being decoupled from the language), and this behavior is broken in a way that has real negative impacts. The best we can do here is actually commit to doing something, and then let typecheckers decide how they roll that out to their users. If anything, the people who have been saying “We’ve had this too long, we can never change it, even though we accept it is broken” are actively discounting the people this negatively impacts, because they haven’t offered any alternative that actually removes the ambiguity (and some have gone so far as to say it shouldn’t be changed at all).

This is a problem with the way the specification exists. Paired with people assuming we can’t fix mistakes, leaving implementation up to every single tool separately means that holes in the type system intended to be “pragmatic” are actually harmful, because they block accurate progress. If the specification isn’t pragmatic, typecheckers can choose what level of pedantry is appropriate (or provide multiple levels of it), and there’s no rule actively preventing a compatible expression of what is correct.

3 Likes

I have actively been suggesting an alternative that allows removing the ambiguity in a backwards compatible (if unideal) way.

While I disagree that your proposed alternative removes the ambiguity involved, I was not including you in the people who have expressed that this can never change and are creating impossible dichotomies.

1 Like

How would this work with library code? How does a library using one typechecker communicate to another typechecker what “edition” it is using? We don’t have a mechanism for this, so any attempt to add one is just as breaking as changing the behavior without it, but I concede it could give us more benefit for future changes.

I hit ctrl+enter rather than enter there :frowning:

More to the point, how does it work if you have multiple libraries on different editions? Are typecheckers required to implement all prior editions and know how to appropriately handle the boundary?

I think it would be easier to intentionally tie the type system’s behavior to the Python version than to do this, even accounting for libraries that support multiple Python versions.

I think that’s a very pessimistic way to look at this, and part of the reason you’re seeing so much pushback. I think we can do better than that, if we really try.

There’s no reason the type specification can’t include language for per-project feature flags that allows projects that have this feature enabled to live side-by-side with projects that have not.

Yes, it needs buy-in from type checkers. But I think you have a better chance of success getting that than of convincing every type checker to make this breaking change on its own terms and deal with the consequences and the flurry of user complaints.

We already have one setting in py.typed that every type checker needs to understand[1]. So there is precedent for configuration that is common among all type checkers, and we could add more, given a good enough motivation.


  1. partial ↩︎

1 Like

Adding to that mechanism something that doesn’t already exist is just as breaking. I’m not saying we can’t coordinate with typecheckers about this in any way, but the current tools we have don’t lend themselves to doing that other than socially (which, somewhat by necessity, means organizing with them, and they then organize with their users).

If a project adds a marker to py.typed, it still breaks compatibility with anyone on a version prior to that. We have already seen in packaging, with the wheel specification, that even where such a mechanism exists, this prevents progress.

2 Likes

That really depends on the status quo. If some type checkers can’t deal with unknown content in py.typed, then maybe that’s the wrong place for this kind of configuration and we need something new. Either way, we can definitely come up with a design that doesn’t break with older versions of type checkers.

Yes, you won’t get the feature of the feature flag on the old version. But whether that actually causes a problem will depend on the feature. I don’t see a large potential for issues being caused by float continuing to be treated the way it always has been on older versions of type checkers; that would happen either way.

I thought of other questions in addition to the prior two: how should editions work for runtime typecheckers? And how should this work for namespace packages?

There’s a lot here that could go wrong and that needs definitions, and I don’t see a strong benefit over “pin your typechecker” or “tie the typing specification to the Python version to create a point of coordination and a way to tie it back into Python’s breaking-change policy”.

For reference: Options for a long term fix of the special case for float/int/complex - #104 by jorenham

This idea was mentioned above:

I think that can work as a starting point. A future import is problematic because it means libraries waiting for e.g. 3.15 to become the minimum supported Python version. A type comment directive (I don’t know what that is properly called) makes it possible to write correct annotations as soon as major type checkers support it. I assume this would not crash or anything with older type checkers, even if they might misunderstand the annotations.

I imagine this like:

Type checkers agree to respect a per-file # type: strict_float directive in .py and .pyi files. The meaning of the directive is that it changes the interpretation of float as used in type expressions for that file so that “float means float”.

Without the strict_float directive, type checkers will work as described in the proposal in the “clarifying the special case” thread, where float in type expressions always means float | int, which is more or less the status quo.

The full benefit of this could only happen if float means float everywhere, but that would have to be something for the future, to allow time for adaptation and for fixing annotations. In the meantime, if a library uses strict_float and the users of that library (including other libraries) use strict_float as well, then they no longer have the problems caused by the special case.

What this would mean is that typeshed and libraries could make their annotations accurate and in particular could express overloads and return types correctly. It could be possible immediately for type checkers to infer the types of a and b correctly here:

import numpy as np

def f(x: float):
    a = np.array([1])
    b = np.array([1.0])
    c = np.array([x])  # this needs strict_float

If f is in a module with strict_float then c would be inferred correctly as well; otherwise it would be a union. The other cases can work because, as I understand it, type checkers distinguish between the types of 1.0 and x here.

What prevents inference for a and b right now is the type checker misunderstanding float in the @overload signatures rather than misunderstanding the concrete types of 1 and 1.0. Inference for c then also requires that float in f’s signature be understood as meaning a real float.
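
To illustrate, here is a hypothetical stub sketch (not numpy’s real annotations) of why the directive fixes this. Today list[int] is assignable to Sequence[float], so the float overload swallows integer input; with strict_float each call selects exactly one overload:

# type: strict_float
# hypothetical .pyi sketch; NDArray comes from numpy.typing
from collections.abc import Sequence
from typing import overload

import numpy as np
import numpy.typing as npt

@overload
def array(xs: Sequence[float]) -> npt.NDArray[np.float64]: ...
@overload
def array(xs: Sequence[int]) -> npt.NDArray[np.int64]: ...

Without the directive, array([1]) matches the float overload and is inferred as a float64 array; with it, the call falls through to the int overload as intended.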

Library code that looks like this:

def g(x: float) -> float: ...

should usually be changed to something like:

# type: strict_float
def g(x: float | int) -> float: ...

That will work for a caller with z: float with or without strict_float:

def h(z: float) -> float:
    return g(z)

More difficult are cases involving invariance like list[float]. If caller and callee don’t agree on the meaning of float then is it impossible to pass a list[float] from one to another?
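
As a sketch of the problem (module names here are hypothetical):

# strict_lib.py, a module using strict_float
def consume(xs: list[float]) -> None: ...

# legacy.py, a module without strict_float: list[float] written here
# effectively means list[float | int], which is not assignable to the
# invariant list[float] that consume expects
from strict_lib import consume

xs: list[float] = [1, 2.5]
consume(xs)  # a checker would seemingly have to reject this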

Another case is SupportsFloat. Ideally you want this:

# type: strict_float

from typing import Protocol

class SupportsFloat(Protocol):
    def __float__(self) -> float: ...

I think a type that is not itself defined in a module/stub with strict_float would not be able to write annotations that satisfy this protocol. Maybe this would have to be a change tied to a particular Python version.
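
For example (a sketch under the semantics described above):

# a legacy module without strict_float: this return annotation is read
# as float | int, so a strict checker could conclude that Legacy does
# not satisfy the strict SupportsFloat protocol above
class Legacy:
    def __float__(self) -> float:
        return 1.0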

It would take time for libraries to adapt to this, but I think libraries where this is already problematic (e.g. for inference) will want to opt in quickly, and enthusiastic users, including other libraries, would opt in as well.

Having an opt-in # type: strict_float directive also resolves the questions about ambiguity somewhat, because if the directive is there then it can be assumed that the annotations were updated with the understanding that float means float. Of course any file that doesn’t have strict_float would still be ambiguous.

One day, though, to get the real benefit, something needs to flip the switch and make strict_float the default. That could perhaps be tied to a particular Python version. In the meantime I think tangible benefit can come from doing this incrementally, which would ultimately be necessary anyway because of the needed library changes.

9 Likes

I don’t see a problem with either of those. If runtime type checkers really wanted to support a per-module feature flag, they could[1], and nothing stops you from marking each module in a namespace package separately. In fact that’s what you may want in some cases, since the modules might not be maintained by the same people or in the same repository.

For the record, I’m not disagreeing with you that this would need to happen for the full benefit. However, getting most of the benefit for the part of the ecosystem that cares about it might be a good enough long-term solution if large-scale adoption beyond the standard library and the numeric ecosystem never happens, since at that point you will have an even harder time convincing anyone that the breaking change yields a large enough benefit compared to the disruption it causes.

And with that I’m bowing out of the discussion. My main hope was to steer the discussion in a more productive direction, rather than for it to keep going in circles. You will not be able to convince a majority that the special case should be removed at any cost, nor will you be able to convince a majority that the status quo or proposed future features are good enough to work around it. So we either need a lower transition cost, better features or both.


  1. admittedly, some designs will make that easier or harder on them than others, and you can construct cases where they have no chance of figuring out whether a value crossing module boundaries should use the old or the new semantics, but in those very same cases they’d have trouble figuring out anything at all. Either they have access to the annotation, so they know where it came from and can apply the feature flag, or they only have access to the runtime type and they can’t do anything either way. ↩︎

Maybe there should be a new thread about “How to make the transition towards strict floats less painful” where we concentrate on that question, and this thread here can still discuss whether removing the special case is a good idea in the first place.


Regarding something like # type: strict_float per file: I think doing it per package is sufficient granularity? I really like the idea of putting “strict_float” into the py.typed file, which signals to type checkers that they should not expand float to float | int in this package. This would allow gradual adoption.
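
As a sketch, the marker file could then look like this (“partial” is the existing PEP 561 setting; “strict_float” is the hypothetical new one):

mypackage/py.typed:

partial
strict_float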

It does seem desirable though to have a standard way of telling linters that you really meant just float and that the linter shouldn’t complain about it. Something like

def f(x: float) -> None:  # noqa: only_float
    ...

which signals that the library author has thought about it and thinks only float is correct here.


I think this can work. Libraries and type stubs would all need to be converted, but end-user code can keep using float to mean float | int, as long as it configures its type checker to do it like that.

It is IMO a bad idea to have type annotations mean different things depending on where they are used; it makes them harder to use, understand and learn.

The consequence of this is that, IMO, there are two acceptable solutions I can currently see:

  • Continue to use the special case and add a new JustFloat type somewhere - IMO this is the most likely outcome of all these discussions.
  • Add a new Real protocol in the builtins that has no literal relation to the numbers.py ABCs [1], but instead represents the useful operations you can do on floats & ints (& numpy scalars & numpy arrays & …); a rough sketch follows below

  1. Those are dead ends and should be completely removed from the language ↩︎
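
A minimal sketch of what such a protocol might look like (the name and the exact method set here are assumptions, not a concrete proposal):

from typing import Protocol, Self

class Real(Protocol):
    def __add__(self, other: Self, /) -> Self: ...
    def __sub__(self, other: Self, /) -> Self: ...
    def __mul__(self, other: Self, /) -> Self: ...
    def __neg__(self) -> Self: ...
    def __lt__(self, other: Self, /) -> bool: ...
    def __float__(self) -> float: ...

Both float and int satisfy this structurally, and numpy scalars can as well, without any of them inheriting from the numbers ABCs.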

I would say yes, you won’t be allowed to do this, since allowing it would break invariance.
Users will have two paths available as recourse: using ignore comments, or switching to strict float typing.

(All on the assumption that we allow a narrowly scoped opt in.)


I’m in favor of making it opt in at first.

Tying it to the language version carries a lot of positive social signals.
It indicates that this is a slow and methodical change, with a thought-through transition plan.

It also aligns with providing a future import as the opt in mechanism. And connecting the type system version to the language version makes Python seem more cohesive.

2 Likes

Above, someone said that there are issues with doing this with py.typed, so I don’t know if that can work. That level of granularity would be good, but I see it as a convenience for library authors, whereas a per-file flag would still be needed to handle all cases. If you write a standalone script or notebook that uses numpy etc. then you could use # type: strict_float there even though you don’t have a py.typed file. It could also be useful for a library migrating stubs incrementally, adding the directive one file at a time.

It is a bad idea but the problem is the special case rather than strict_float. The special case makes typing harder to use, understand and learn. The bad idea is that float does not mean float. Having StrictFloat is still confusing because people still have to learn the difference between float and StrictFloat and forever deal with float being an ambiguous annotation.

This would be good, but I think it does not really work the way people expect. The problem with the numeric tower is that it supposes you just have some object x of, say, Real, but it does not say how you can do useful operations with that object, like compute sin(x) or create a 2 of the same type as x, so you can’t really use it for anything more useful than converting x to a float. It is quite clear that the numeric tower was designed by people more interested in class hierarchies than in writing numeric code with different types.

The Real type is just defined as having the operators that make an ordered field, but meaningfully working with a generic field, even without functions like sin and cos, means being able to do basic things like create a 1 or a 0 of the correct type. This, for example, is a bug inherited from the Real ABC:

>>> from fractions import Fraction
>>> f = Fraction(0)
>>> (f.real + 1) / 3
Fraction(1, 3)
>>> (f.imag + 1) / 3
0.3333333333333333

The bug is writing return 0 in a situation where a different type of zero should be returned, but Real does not know how to create anything of the correct type.
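
For reference, this is approximately how the Real ABC defines those properties in CPython’s numbers.py, which is where the int zero comes from:

from numbers import Complex

# trimmed excerpt, approximately as in CPython's numbers.py
class Real(Complex):
    @property
    def real(self):
        # real numbers are their own real component
        return +self

    @property
    def imag(self):
        # real numbers have no imaginary component, but this is an int 0,
        # not a zero of the concrete type
        return 0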

What the numeric tower provides that is useful are the methods __complex__, real, imag, __float__, numerator, denominator and __index__. They are useful not because they allow you to work with a given type, but because they allow you to deconstruct and convert to known types that you can work with. If you wanted to work with the original types you would also need constructors, to do things like make a Complex from two Reals, or a Rational from two Integrals, or create any type from an int, but the numeric tower has no constructors.

If you actually want to work with the given Real type then you need functions like sin and cos that work with that type. In Julia those functions use multiple dispatch, so if you have x you can compute sin(x) if it is defined for the type. In Python the way this works is different: it is not that the type of x knows operations like sin and cos, but rather that you have an object with a set of functions that work with a given type and will coerce inputs to that type, like:

def sin[T](x: T | int) -> T: ...

Generic code here needs the domain/namespace/context object that holds these functions. Typing this generically is complicated but it would be something like:

from typing import Self, Protocol

class EField(Protocol):
    def __add__(self, other: Self | int, /) -> Self: ...
    def __pow__(self, other: int, /) -> Self: ...
    ...

class RealFuncs[E](Protocol):
    def sin(self, x: E | int, /) -> E: ...
    def cos(self, x: E | int, /) -> E: ...
    ...

def generic_code[E: EField](D: RealFuncs[E], x: E):
    return D.sin(x)**2 + D.cos(x)**2

import math
a = generic_code(math, 1)

import cmath
b = generic_code(cmath, 1)

import numpy as np
c = generic_code(np, 1)

import mpmath
d = generic_code(mpmath, 1)

ctx = mpmath.MPContext()
ctx.dps = 50
e = generic_code(ctx, 1)

import gmpy2
f = generic_code(gmpy2, 1)

for v in [a, b, c, d, e, f]:
    print(v, type(v))

It is the domain object D that makes this work, with its functions sin and cos. The other argument to generic_code is always 1, but in the same way that math.cos can accept an int, all of these functions can coerce int as well. It is dispatching on the domain object D, rather than on the type of the argument x, that makes it possible to work with different types and still be able to pass an int for x.

These functions will coerce many more things than just int so the signatures are generally like:

def sin[T](x: Coercible[T]) -> T: ...

It is hard to define the Coercible[T] type in a generic way though beyond writing something like T | int.
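
For illustration, that crude approximation could be spelled as an alias (this spelling is my assumption, not a standard type):

# covers int but nothing else that these libraries coerce (Fraction,
# Decimal, zero-dimensional arrays, ...); note also that a call like
# sin(1) leaves T unsolved, which is part of why this is insufficient
type Coercible[T] = T | int

def sin[T](x: Coercible[T]) -> T: ...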

The array API defines a way to get the domain object from an array:

>>> import numpy as np
>>> a = np.array([1])
>>> D = a.__array_namespace__()
>>> D
<module 'numpy' >
>>> D.sin(a)
array([0.84147098])

Using __array_namespace__ works like multiple dispatch, so it only helps if you are very strict about the types: an array must definitely be an array of the expected type, rather than something like 1 or [1, 2] that np.sin could coerce. Dispatching on the type of a, as this does, is very different from the model of typing where you can expect to be able to pass an int in place of the proper numeric type that you want to work with.

could be as easy as

strict_float = float & ~int

1 Like

This is exactly what I was thinking as I wrote my earlier comment, but you have fleshed it out much more clearly and completely than I could have!

These examples[1] make a strong case that there needs to be some sort of escape hatch on a per-annotation basis: if I have an existing codebase using the old meaning of float, but am calling into a library that has updated to use strict_float, and can’t afford to update my annotations now, there needs to be some way to express that I am providing objects that satisfy the type checking of the library.[2]

While a per-line typing comment could work, comments like these are a pain to format, particularly if a single line contains multiple annotations. I think I favor the creation of a special type in typing that conveys “float and not int” to all type checkers in all instances, as an escape hatch / transition plan (I’m calling it StrictFloat but don’t really care what color that shed is painted).

The runtime implementation of StrictFloat would just be StrictFloat = float, but the typing spec would be updated to state that StrictFloat must be interpreted as “float and not int”.
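
A minimal sketch of that shim (the name and its treatment are hypothetical at this point):

# at runtime just an alias, so isinstance() and existing code keep
# working; type checkers would be required by the spec to read
# StrictFloat as "float and not int"
StrictFloat = float

def set_ratio(r: StrictFloat) -> None: ...

set_ratio(0.5)  # ok
set_ratio(1)    # rejected under the proposed spec wording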

The documentation for this type would be explicit that it exists only to ease this transition, and that its use is discouraged in any code in strict_float mode. Once strict_float becomes the default, this type would be treated the same as typing.Tuple is now.

I think there’s a very real concern that tying this transition to a language version would be too slow, and that it would specifically punish libraries that need strict float typing and support older Python versions (particularly if they use inline typing). Oscar made a very good point above that waiting 5 years before I can even start using from __future__ import strict_float in my library code is untenable.

If we do go this route, then the StrictFloat transitional type will be essential, and it would likely become much more entrenched and be in active use for much longer. TBH, at that point, I’m not even sure it would ever be worth it to make strict_float the default.


  1. although one can make a case that the typing spec could safely require that any method called __float__ must return a float, not an int ↩︎

  2. It is very likely that runtime type checking would also be needed, but that is a separate consideration. ↩︎

Granted! I think that’s a fair criticism. But if the alternative is not to have a transition, I prefer this.

I’m still waiting to use things which are “new” in 3.10 and that’s okay, but there’s also a much clearer workaround for missing runtime code.

Is this resolvable by giving us a way other than a future import to spell this today? For example, could the behavioral backport be offered through typing_extensions, or as a structured comment (like file encodings)?

IMO we’re at a phase in this discussion in which we should start trying to be more concrete about the plan of action. It’s much easier to debate a specific plan – good or bad – than it is to navigate an idea cloud.

To my mind, there are two steps here, which can (and perhaps should) happen simultaneously:

  1. Allow current code to express the idea of “float and not int” through the creation of a typing.StrictFloat type in 3.15. This type should be backported to typing_extensions for use in all currently supported versions of Python.
  2. Declare as part of the typing spec the intent to transition to “float means float” in the long term, and, as part of the typing spec, declare a directive such as # typing: strict_float that can be used to apply that new intent to code now.

While these need not happen simultaneously, (2) will have very little uptake without (1) being in place. Conversely, creation of a typing.StrictFloat type without an explicit declaration of the intent to move to the new meaning of float could stall the momentum to complete the transition.
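
To make that concrete, the two steps might combine like this (both spellings are hypothetical at this point):

# lib_legacy.py, not yet migrated: the escape hatch from step (1),
# assumed to be backported via typing_extensions
from typing_extensions import StrictFloat

def set_gain(db: StrictFloat) -> None: ...

# lib_strict.py, opted in via step (2)'s directive on its first line:
# typing: strict_float
def set_gain(db: float) -> None: ...  # plain float already excludes int here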