Make float.__(i/r)floordiv__ return an int

@ntessore
I didn’t just come up with the term “integer division”, it’s mentioned in the doc (see the first note in the first table) as something it’s also called. And it already returns an integer value regardless of type, so the name makes sense.

But the name is not the point here (call it whatever you like, I can use the term floor div if you prefer), the behavior is. And by the way, as pointed out earlier, math.floor does return an int even when given a float.
The point is that, particularly in the case of liz[ln//3], it can be very useful to use the floor division result as an int in situations where the type matters (such as indexing a list or calling range). That follows from the more general point I explained in more detail earlier: the int API is a superset of the float API.

So far, the counter-arguments are

  1. the cost of the change itself, implementation-wise
  2. the possible and likely execution time and memory usage cost
  3. the risk of changing a long-standing implementation detail

Needless to say, I’m not convinced that any of these reasons is sufficient to (definitively) dig the grave of that feature. An effective implementation would have to be found, yes, but other than that I think it’s worth it, if only because there are use cases enabled by my version and none disabled by it (as far as I can see).

@steven.daprano
Let’s take the most recent example, in a graphics/render system. I have an image, having a width and a height which can be subpixel (aka, a float). I want to slice that image, let’s say approximately every 10 pixels, so I do nslices = height//10, and generate random slicing heights with cutoffs = [random.(...) for _k in range(nslices)]. And it fails when height ends up being a float. Or when the approximate number of pixels, which can be provided by the user, is a float instead of that 10 (and it’s not absurd, since it’s a mean value it doesn’t have to be an integer even if the height of every resulting slice is integer).
I hit the same thing in another context, where I had regular slices and a mouse position: to find the slice the mouse fell on, the calculation was something like slices[mouse_y//total_height], where both the position and the height could be floats.
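
A minimal sketch of the two failures described above. The values and the names `height`, `slice_height` and `slices` are made up for illustration and are not taken from any real code base:

```python
height = 125.5           # sub-pixel image height (illustrative value)
nslices = height // 10   # 12.0: an integral value, but of type float

try:
    range(nslices)
    range_error = None
except TypeError as exc:
    range_error = exc    # range() refuses a float, even an integral one

slices = ["top", "middle", "bottom"]
mouse_y, slice_height = 130.0, 100.0

try:
    slices[mouse_y // slice_height]
    index_error = None
except TypeError as exc:
    index_error = exc    # list indexing refuses a float as well

# The explicit conversions currently needed in both cases.
cutoffs = list(range(int(nslices)))
hit = slices[int(mouse_y // slice_height)]
```

Both `TypeError`s go away only once the result is wrapped in `int()`, which is the extra step being discussed.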

No it doesn’t. It can also return NANs or infinities.

>>> float('inf')//2
nan
>>> Decimal('inf')//2
Decimal('Infinity')

Not really. There are float methods which ints do not support, such as float.fromhex, hex, and is_integer.

And of course many parts of the float API, such as many of the functions in the math module, don’t support ints directly, but rely on coercing the int to a float. Which means that for large values, you will lose precision, or overflow.
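
A small sketch of the precision-loss and overflow behaviour mentioned above, using only the standard library. The specific values are illustrative:

```python
import math

big = 10**400                    # fine as an int, far too large for a float
try:
    math.sqrt(big)               # math.sqrt first coerces its argument to float
    overflowed = False
except OverflowError:
    overflowed = True

exact_root = math.isqrt(big)     # the int-specific function stays exact

n = 2**53 + 1                    # smallest positive int a float cannot represent
lost_precision = float(n) != n   # True: the conversion rounds to 2**53
```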

It seems to me that nslices = int(height)//10 will do what you want. Although round(height/10) may be better.

I suggest you read the messages above, I answered that. And is_integer is of very limited use on a value coming from a // operation…

Yes, so? This is a general fact about ints and floats; it’s not specific to floor div. In the relevant context, where we would pass a // value to such a function, what is the advantage of losing precision sooner, at division time, rather than later, at function-call time? And why should we sacrifice precision for all division results just so that calling math functions on them is quicker?
And more importantly, why should the use of the math float functions be optimized for rather than the int API, when the values are always integers? (Except when they’re nans, but I already answered that too.)

A better one would be to cast the result, since either operand can be a float when the divisor is user-provided (and rounding the divisor will change the result). But in any case, it’s very weird to have to manually cast both operands or the result to an int when the result is, by definition, already an integer value.

In the second case the correct py3 code is slices[int(mouse_y//total_height)] (with one or two slashes, doesn’t matter if both operands are positive), which is hard to read, unnecessarily long and (maybe a personal opinion) quite ugly by passing a function call inside an indexing.
In addition to this, I’m actually losing execution time converting the value to an int, when if it had been built that way since the beginning, the conversion time could perhaps be avoided. And I’ll say it again, it’s weird to have an operation that’s made to return an integer value, be optimized for the float API of the returned value, when that’s likely not what the program will use it for (or it would have done a true div).

(It doesn’t change the fundamental argument of course, but just as a quick sidenote, int.is_integer() was added in 3.12.)


Actually, a lot of the points made in the issue discussion go in my direction: ints being accepted wherever floats are, being an “almost-subclass” of float… (typing-wise, that is)
We can’t take that too literally, of course, because of the hex method, but if that difference was small enough to be ignored in the typing rules, I think it’s a pretty strong point in favor of the int API being a superset of the float API.
And let’s not forget that code such as (a//b).hex() was relying on an implementation detail, and should probably not be granted the kind of support that would normally be specified in the doc. In other words, there was never a guarantee that floordiv results would even support the float API to begin with.

I understand that backward compatibility is important, so that may make this a non starter, but the rest of the arguments haven’t been convincing to me. In particular, going back to the original PEP – there were two key points:

  1. in a dynamically typed language, having the value of the result of an operation depend on the types (not values) of the operands was a “bad thing” (a wart); “true division” and “floor division” did fix that.

  2. almost everywhere, int and float can be used in the same context.

The PEP says: Floor division will be implemented in all the Python numeric types, and will have the semantics of:

a // b == floor(a/b)

it does add: “except that the result type will be the common type into which a and b are coerced before the operation.”

However, I am curious about that second bit, particularly as math.floor(a/b) now returns an integer – which it did not when the PEP was approved.

Given that math.floor() and math.ceil() return ints, it does seem particularly odd that floor division doesn’t. Related to point (1) above, it looks like the change to floor and ceil were a result of PEP 3141 – A Type Hierarchy for Numbers, which came six years after floor division was proposed.
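
To make the contrast concrete, here is a short sketch comparing the result types in current Python 3. The operand values are arbitrary:

```python
import math

a, b = 7.5, 2.0

via_floor = math.floor(a / b)   # 3, an int
via_ceil = math.ceil(a / b)     # 4, an int
via_round = round(a / b)        # 4, an int (one-argument round)
via_floordiv = a // b           # 3.0, a float

same_value = via_floor == via_floordiv             # True
same_type = type(via_floor) is type(via_floordiv)  # False
```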

(which was for Python 3.0, and did not have a __future__ import.)

" 1. __floor__(self), called from math.floor(x), which returns the greatest Integral <= x."

I’m pretty sure that in that context “Integral” refers to an integer type, not an integral value. It also says:

“In 2.6, math.floor , math.ceil , and round will continue to return floats.”

Which extended through to the end of the py2 series (2.7)

Note that the “Real ABC” in the PEP, has:

    def __floordiv__(self, other):
        """The floor() of self/other. Integral."""

Again, Integral meant an Integer type in this context.

Interestingly, there is no other mention of floor division in that PEP. All this makes me wonder if not making floor division return an int was an oversight. Or, not an oversight, maybe too much of a breaking change – while py3 had multiple breaking changes, __future__ division had been around for quite some time, so perhaps it was considered already established, as for the most part, py3 would behave as the __future__ imports did without any other changes.

Does anyone recall the discussion at the time?

As for the arguments in this thread:

Almost all of them apply to the round() and floor() and ceil() – and yet those all were made to return ints.

As for dispatching on type:

As floor division can return any number of types, but will always have an integral value, I’d think you’d get more errors by dispatching on the return type this way than if it always returned an int. Granted, any change is potentially a breaking change, but this seems like it would fix more bugs than it would cause.

NOTE: I’d have to go dig up the code, but I know that when I ported a substantial project to py3 from py2, I found myself having to annoyingly wrap an int() around floor division operations – the idea that if you really need an integer, you’d be passing in integers for both operands is simply not correct. (just like floor and ceil and round …)

Back to the original motivation – yes, we always get an integer value, which is great, but it’s a simple fact that there are places where you can’t use a float with an integer value in place of an integer – but you can almost always use an integer where a float is expected.

Again, backward compatibility is important, changing this would break (at least some) code – so it’s probably not worth it, but that’s not the same as saying it’s not a good idea.


Unlike round, floor, and ceil, // and divmod are binary functions and the reason they return floats instead of integers is, I think, just the result of coercing.

That said, I like the OP’s idea, since it seems SO intuitive. I think many people still mistakenly believe that // always returns an int. I wasn’t aware that it doesn’t until I ran into a problem that has come up since 3.10:
C++ built-in functions no longer do an implicit conversion.

Updating from 3.9 to 3.10 required fixing a lot of code, such as

cfunc(a//b) --> cfunc(int(a//b))

Apart from that, there is a problem with using float for //:

>>> 1//0.1
9.0
>>> divmod(1, 0.1)
(9.0, 0.09999999999999995)

If // were to return an integer, the float-to-integer conversion problem would still remain.
That is, people would expect 1//0.1 to be 10, which unfortunately it never will be.
So, this is the kind of thing: “Don’t do it even if you can do it”.
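
A sketch of why the result above is 9 rather than 10, using exact rational arithmetic from the standard `fractions` module. The variable names are illustrative:

```python
import math
from fractions import Fraction

tenth = Fraction(0.1)                  # exact value stored in the float 0.1
exact_quotient = Fraction(1) / tenth   # exact value of 1 divided by that float

stored_above_one_tenth = tenth > Fraction(1, 10)   # True
quotient_below_ten = exact_quotient < 10           # True

true_div = 1 / 0.1      # 10.0: the exact quotient is rounded to nearest
floor_div = 1 // 0.1    # 9.0: the floor of the exact quotient
exact_floor = math.floor(exact_quotient)   # 9
```

The float 0.1 is slightly larger than one tenth, so the exact quotient is slightly below 10 and its floor is 9, whatever type is used to return it.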

I’m now thinking that math.fdiv would make some sense, as a counterpart to math.fmod. The math.fmod docstring documents it as returning something “equal to x - n*y for some integer n”, and it would be nice to be able to actually get hold of that integer n.

Below is some non-Fraction-based Python code that could (after translation to C, of course) form the basis of such a function.

First, here’s an inner function that does the job of computing both quotient and remainder when dividing a float x by another float y under some restrictions - namely, that both x and y should be finite, that y should be positive, and that x / y shouldn’t be too close to overflowing. It guarantees that the returned remainder is smaller than y in absolute value, but doesn’t make guarantees about the sign of that remainder:

def _fdiv_inner(x: float, y: float) -> tuple[int, float]:
    """
    Divide one float by another, giving integer quotient and float remainder.

    Assumes y is finite and positive, x is finite, and x / y does not overflow.
    Returns a pair (q, r) such that x == q * y + r (exactly) and |r| < |y|.
    """
    acc = 0
    while y <= abs(x):
        q = round(x / y)
        # Could use a single fma in place of the two lines below, if supported:
        # x := -fma(q, y, -x)
        xlo, xhigh = exact_mult(q, y)
        x = x - xhigh - xlo
        acc += q
    return acc, x

Here exact_mult multiplies q by y and gives an exact answer in the form of the sum of two floats: a low and a high part; the code for that is below. We could use a single fused-multiply-add for this, where available. The key point here (which requires some analysis and proof), is that after q = round(x / y), the value x - q * y is always exactly representable as a float with no precision loss; then all we need to do is compute it.
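
As a sanity check rather than a proof, the exactness claims above can be compared against rational arithmetic on a few sample inputs. `split` and `exact_mult` are copied from the full listing in this post; `check_step` and the sample values are hypothetical additions for illustration:

```python
from fractions import Fraction

C = float.fromhex("0x1.0000002000000p+27")  # 2**27 + 1


def split(x):
    """Veltkamp splitting, as in the listing. Returns low, high parts."""
    p = C * x
    h = p + (x - p)
    return x - h, h


def exact_mult(x, y):
    """Dekker multiplication, as in the listing. Returns low, high parts."""
    xl, xh = split(x)
    yl, yh = split(y)
    h = x * y
    return -h + xh * yh + xh * yl + xl * yh + xl * yl, h


def check_step(x, y):
    """Run one loop iteration and compare it with exact arithmetic."""
    q = round(x / y)
    lo, hi = exact_mult(q, y)
    product_exact = Fraction(lo) + Fraction(hi) == q * Fraction(y)
    remainder = x - hi - lo
    remainder_exact = Fraction(remainder) == Fraction(x) - q * Fraction(y)
    return product_exact, remainder_exact


samples = [(0.35, 0.1), (1e10, 3.3), (123456.789, 0.001)]
checks = [check_step(x, y) for x, y in samples]
```

Each entry of `checks` should be `(True, True)`: the two-float product matches the exact product, and the updated `x` matches the exact remainder.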

Given this restricted inner function, here’s an outer function that drives it, deals with special cases, and the like. In particular, we have to do some work to deal with the possibility that x / y might overflow.

def fdiv(x: float, y: float) -> tuple[int, float]:
    """
    Divide one float by another, giving integer quotient and float remainder.

    The result (q, r) satisfies x == q * y + r (exactly), and abs(r) < abs(y),
    with r having the same sign as x.
    
    Raises ValueError if either x or y is not finite, and ZeroDivisionError
    if x is finite and y is zero.
    """
    # Handle error cases.
    if not (math.isfinite(x) and math.isfinite(y)):
        raise ValueError("x and y must be finite")
    if not y:
        raise ZeroDivisionError("Division by zero")

    # Reduce to the case x and y finite, y positive, x nonnegative.
    sign_x, x = sign(x), abs(x)
    sign_y, y = sign(y), abs(y)

    # If x is *much* larger than y, x / y will overflow. In that case we do a
    # form of long division: first divide with a shifted version of x, then
    # shift back and continue with the remainder. In the common case where the
    # exponents of x and y are not too far apart, the for loop body is executed
    # only once. In extreme cases, it'll be executed two or three times.
    exp_diff = math.frexp(x)[1] - math.frexp(y)[1]
    x_shifts = list(range(exp_diff - 954, 0, -954)) + [0]
    acc = 0
    for x_shift in x_shifts:
        q, r = _fdiv_inner(math.ldexp(x, -x_shift), y)
        x = math.ldexp(r, x_shift)
        acc += q << x_shift

    # Ensure that the returned remainder has the same sign as the original x.
    if x < 0.0:
        x += y
        acc -= 1
    x += 0.0

    return acc * sign_x * sign_y, x * sign_x

Here sign does the obvious thing, returning the integer -1 if its argument has its sign bit set (so including the case of negative zero) and 1 otherwise.

Here’s the whole thing, including the various helper functions.


import math

C = float.fromhex("0x1.0000002000000p+27")


def sign(x: float) -> int:
    return int(math.copysign(1.0, x))


def split(x: float) -> tuple[float, float]:
    """Veltkamp splitting. Returns low, high parts."""
    p = C * x
    h = p + (x - p)
    return x - h, h


def exact_mult(x: float, y: float) -> tuple[float, float]:
    """Dekker exact multiplication. Returns low, high parts of the product."""
    xl, xh = split(x)
    yl, yh = split(y)
    h = x * y
    return -h + xh * yh + xh * yl + xl * yh + xl * yl, h


def fdiv(x: float, y: float) -> tuple[int, float]:
    """
    Divide one float by another, giving integer quotient and float remainder.

    The result (q, r) satisfies x == q * y + r (exactly), and abs(r) < abs(y),
    with r having the same sign as x.

    Raises ValueError if either x or y is not finite, and ZeroDivisionError
    if x is finite and y is zero.
    """
    # Handle error cases.
    if not (math.isfinite(x) and math.isfinite(y)):
        raise ValueError("x and y must be finite")
    if not y:
        raise ZeroDivisionError("Division by zero")

    # Reduce to the case x and y finite, y positive, x nonnegative.
    sign_x, x = sign(x), abs(x)
    sign_y, y = sign(y), abs(y)

    # If x is *much* larger than y, x / y will overflow. In that case we do a
    # form of long division: first divide with a shifted version of x, then
    # shift back and continue with the remainder. In the common case where the
    # exponents of x and y are not too far apart, the for loop body is executed
    # only once. In extreme cases, it'll be executed two or three times.
    exp_diff = math.frexp(x)[1] - math.frexp(y)[1]
    x_shifts = list(range(exp_diff - 954, 0, -954)) + [0]
    acc = 0
    for x_shift in x_shifts:
        q, r = _fdiv_inner(math.ldexp(x, -x_shift), y)
        x = math.ldexp(r, x_shift)
        acc += q << x_shift

    # Ensure that the returned remainder has the same sign as the original x.
    if x < 0.0:
        x += y
        acc -= 1
    x += 0.0

    return acc * sign_x * sign_y, x * sign_x


def _fdiv_inner(x: float, y: float) -> tuple[int, float]:
    """
    Divide one float by another, giving integer quotient and float remainder.

    Assumes y is finite and positive, x is finite, and x / y does not overflow.
    Returns a pair (q, r) such that x == q * y + r (exactly) and |r| < |y|.
    """
    acc = 0
    while y <= abs(x):
        q = round(x / y)
        # Could use a single fma in place of the two lines below, if supported:
        # x := -fma(q, y, -x)
        xlo, xhigh = exact_mult(q, y)
        x = x - xhigh - xlo
        acc += q
    return acc, x
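
For testing a function like this, one option is a slow reference built on exact rational arithmetic. The sketch below is a hypothetical helper (`fdiv_reference` is not part of the code above); it assumes the same contract as the fdiv docstring, signed zeros aside:

```python
import math
from fractions import Fraction


def fdiv_reference(x: float, y: float) -> tuple[int, float]:
    """Exact truncating quotient and remainder via Fraction (slow)."""
    if not (math.isfinite(x) and math.isfinite(y)):
        raise ValueError("x and y must be finite")
    if not y:
        raise ZeroDivisionError("Division by zero")
    fx, fy = Fraction(x), Fraction(y)
    q = math.trunc(fx / fy)   # exact quotient, rounded toward zero
    r = fx - q * fy           # exact remainder, with the sign of x
    # The remainder is the mathematical value of fmod(x, y), which is
    # always representable, so this conversion does not round.
    return q, float(r)


q1, r1 = fdiv_reference(1.0, 0.1)
q2, r2 = fdiv_reference(-7.5, 2.0)
q3, r3 = fdiv_reference(1e300, 1e-300)   # quotient too large for any float
```

The remainder can be cross-checked against `math.fmod`, and the quotient is the integer n that the `math.fmod` docstring refers to.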

I have read the messages above. All of them.

If we are going to say that the int API provides a superset of the float API, then ints have to support every method and operator that floats support, which they don’t.

But we can agree that the int API is almost a superset of the float API:

  • missing at least three methods (two as of 3.12);

  • only partial support from the functions in the math module;

  • no int version of infinity or NAN.

There may be other differences.

Ints are not a true subtype (in the Liskov sense) of floats. They might be “close enough” for some purposes, but that leaves plenty of landmines waiting to strike. For example, mypy 0.931 fails to recognise the second type error in this code:


def function(x: float) -> str:
    return x.hex()


print(function(1.2))   # This is okay.
print(function(None))  # Obvious type error.
print(function(1))     # Type error (AttributeError) missed by mypy.

This is the sort of landmine we create when we prefer convenience over correctness :frowning:
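
The runtime side of that example can be sketched as follows. This only shows what the interpreter does and says nothing about how any particular type checker behaves:

```python
def function(x: float) -> str:
    return x.hex()


ok = function(1.2)      # works: float has a hex() method

try:
    function(1)         # accepted by the annotation rules, fails at runtime
    failure = None
except AttributeError as exc:
    failure = exc       # int has no hex() method

# Converting explicitly avoids the error for ints that fit in a float.
via_float = float(1).hex()   # '0x1.0000000000000p+0'
```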

The point of this is that we shouldn’t gloss over the fact that changing the result of float floor division is a breaking change.

  1. You have to pay the conversion cost of float to int one way or another. The only question here is whether it is done in Python code int(x//n) or inside the float implementation of __floordiv__.

  2. Inside the float dunder may be slightly faster, but it is also premature optimization. What sort of work are you doing with those images that the cost of calling int() is significant?

Do you think it is weird that math.sin(0.0) returns the float 0.0 instead of an int?

Is it weird for math.sqrt(100) to return 10.0 instead of an int?

Is it weird for math.log(1.1719142372802612e+16) to return a float instead of an int?

I say that the answer is No for all of those, and I say that it is not weird for float floor division to return a float either. Having a float operation return a float is about as far from “weird” as is possible.

But what about math.floor and .ceil, and round, I hear you ask?

I am not convinced that the Python 3 change to those functions was a good idea. Back in Python 2, we had the very sensible:


>>> math.floor(float('inf'))
inf

This makes perfect sense! The floor of an infinite quantity is still infinite. But now in Python 3 we get an unnecessary exception for a completely unexceptional operation :frowning:
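
For reference, a small sketch of the current Python 3 behaviour being described. `floor_or_error` is a hypothetical helper that records the exception type instead of propagating it:

```python
import math


def floor_or_error(x):
    """Return math.floor(x), or the name of the exception it raises."""
    try:
        return math.floor(x)
    except (OverflowError, ValueError) as exc:
        return type(exc).__name__


inf_result = floor_or_error(float("inf"))   # 'OverflowError'
nan_result = floor_or_error(float("nan"))   # 'ValueError'
finite_result = floor_or_error(2.7)         # 2

# Float floor division, by contrast, stays in float and does not raise here.
floordiv_inf = float("inf") // 1            # nan
```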


It’s something calling the GPU and being processed several times a second for a number of images, so anything is significant - but to a certain extent, if performance were my #1 concern I shouldn’t have chosen Python.
My first concern is practicality, consistency and understandability of syntax. People raise the performance question (which to some extent is a valid objection), so I’m pointing out that the current implementation/behavior also has some performance optimizations done upside down. In this case, the conversion to int has to be done in pure Python rather than in C (inside float.__floordiv__, as you also mentioned), even though code requesting an integer value will more likely use the int API.

None of those is made to return integer values. The sine, square root and log of a number have no particular reason to be an integer; that only happens for particular values, which aren’t trivial to tell apart from those producing non-integers. Whereas the purpose of // is to return an integer value, as pointed out by the doc, except in overflow and non-number situations, which actually are, in my opinion, exceptional operations.
That’s why I agree with the rationale behind the floor, ceil and (I guess) round changes that you disagree with :person_shrugging: The math functions are not all alike: the ones returning integer values are not the same as the ones returning float values.

Also, a question in return to yours: since you consider that // should support inf and nan, why shouldn’t it support dividing by float zero? Why raise a ZeroDivisionError, why shouldn’t the result be a nan?
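
The asymmetry raised in that question can be sketched as below. `floordiv_outcome` is a hypothetical helper used only to capture the result or the exception type:

```python
import math


def floordiv_outcome(a, b):
    """Return a // b, or the name of the exception it raises."""
    try:
        return a // b
    except ZeroDivisionError as exc:
        return type(exc).__name__


by_zero = floordiv_outcome(1.0, 0.0)            # 'ZeroDivisionError'
inf_operand = floordiv_outcome(math.inf, 2.0)   # nan, no exception
nan_operand = floordiv_outcome(math.nan, 2.0)   # nan, no exception
```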

I’m not, but perhaps I should have made that clearer. I know and admit it’s a change that will likely break some code. My argument is that we should still do it, mainly because 1) it’s not breaking a documented behavior, and 2) the situations in which it would break anything are marginal: calling .hex, dispatching on the basis of type, and that’s about it.
And I’d like to point out that this is something routinely done in the stdlib: access to enum values as attributes of other values of the same enum was broken somewhat recently. Passing sets and dicts to iterable-taking functions of the random module got broken too. Sometimes those get deprecated for a few versions before getting removed, and that’s all good and fine by me in this situation: I’m all for warning people that floor division will change its undocumented return type, and waiting one or two versions before making the change.

I agree with you about this, but it IS documented – from the PEP that introduced floor division:

“the result type will be the common type into which a and b are coerced before the operation.”

So yes, it’s documented. But even if it weren’t – I don’t know that every corner case of Python is documented, but this one is not that subtle and has been this way a long time.

  1. the situations it would break anything are marginal : calling .hex, dispatching on the basis of type, and that’s about it.

The other one is the IEEE special values would raise rather than pass through.

No, it’s the kind of landmine we have when we try to apply static typing to a dynamically typed language. I have no idea how MyPy tries to handle the PEP 3141’s Type Hierarchy for Numbers – but apparently not that well (or imperfectly, anyway).

PLEASE don’t make decisions about how Python should be to make it easier to write static type checkers – but if you were going to, then having // always return the same type would make things easier to static type :slight_smile:

It appears you’re not a fan of PEP 3141 – fine. But it was accepted, so it would be better if Python was consistent.


I have no particular strong opinion about this change, but just on a procedural note, per PEP 1, Final standards-track PEPs themselves at least shouldn’t generally be considered user-facing living, canonical documentation as opposed to historical change proposals, which may not reflect the current state of the implementation:

Once resolution is reached, a PEP is considered a historical document rather than a living specification. Formal documentation of the expected behavior should be maintained elsewhere, such as the Language Reference for core features, the Library Reference for standard library modules or the PyPA Specifications for packaging.

There are some remaining exceptions out of necessity, like Active Process PEPs or specialized interoperability standards like many of the Typing ones that don’t yet have another home, but they don’t really apply in the case of that PEP, as it was purely a core language change proposal that was accepted, implemented and marked Final, and the documentation should be the authoritative, canonical reference here.

The issues are more with the imperfection of the numeric tower itself and PEP 484 favoring practicality over purity (at least the latter of which was likely the best decision in practice) than with any particular type checker; in actual use, these issues are relatively rare, and apparently Mypy is working on catching more of these particular corner cases.

Fair point, which is why we need to find a living document that IS canonical.


My main takeaway from this discussion is that math.floor etc. are broken in Python3 and should be fixed.

This automatic coercion into a more precise type is an unsafe operation that should not be done without warning, and nonsense:

>>> import math
>>> math.floor(1e23)
99999999999999991611392

Not sure what you’re trying to demonstrate here. The floating-point value 1e23 is precisely equal to:

>>> (1e23).as_integer_ratio()
(99999999999999991611392, 1)

So when you floor that value, you get the integer 99999999999999991611392. This seems correct to me. Where’s the nonsense?


It seems to me that this particular concern is getting a little off topic for this thread (about floordiv), and rather essentially the same as the one addressed in the previous 1e23 ones (that unfortunately mostly devolved into dumpster fires), which is a separate issue.

Yeah, this has nothing to do with the discussion here.

There are two doc sections relevant here (until another is found, but I doubt it).
Binary arithmetic operations, as quoted by @Rosuav, and Numeric types, table 1, note 1, as quoted by me.

The former is somewhat more precise in its wording, but I don’t find it that explicit: it only says that “floor division of integers results in an integer” but doesn’t address floor division of floats. It also says “the result is that of mathematical division with the ‘floor’ function applied to the result”, which, if that sentence applied to the type (I don’t think it does), would imply that float floor division results in an int, just as math.floor does. And it says “The numeric arguments are first converted to a common type”, which doesn’t provide any information about the return type.
(There are other bits about floor division, but they are not relevant to the return type.)

The latter is more in form of a warning : “The resultant value is a whole integer, though the result’s type is not necessarily int.” Nothing more.
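
To illustrate “not necessarily int”, here is a quick survey of what // returns for a few standard-library numeric types. The operand values are arbitrary, and the survey is not exhaustive:

```python
from decimal import Decimal
from fractions import Fraction

results = {
    "int // int": 7 // 2,
    "float // float": 7.0 // 2.0,
    "int // float": 7 // 2.0,
    "Decimal // Decimal": Decimal(7) // Decimal(2),
    "Fraction // Fraction": Fraction(7) // Fraction(2),
}

# Map each expression to the name of its result type.
result_types = {expr: type(value).__name__ for expr, value in results.items()}
```

Every value here compares equal to 3, while `result_types` comes out as int, float, float, Decimal and int respectively.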

So, if we accept @CAM-Gerlach’s explanation on the standing of PEPs, it would make the behavior undocumented. But then :

I think the solution is to give proper warning that it’s about to change, as in a deprecation warning, and to remind (never hurts) that it’s not documented and that it may change again one day (who knows).

I mean deprecation warning in the doc sense, not in the warnings-module sense. If we could add an actual warning that would be great, but I don’t see how. Maybe on calls to .hex(), but that would produce a lot of false positives when it isn’t called on the result of a floor div.
Maybe inside the floordiv implementation, so as to emit warnings when floor-dividing an overflow, inf or nan? Or when not float_result.is_integer()?

Having very strong opinions (“nonsense”) about technology you don’t understand is risky. It’s never stopped me, though, so welcome to the club :slight_smile:

>>> math.floor(1e23) == 1e23
True

If you think that floating point maths in 2023 is “nonsense”, you should have been around in the 1960s, 1970s and 80s before the IEEE-754 standard was established.

  • Machines where x == y but x/y != 1.0
  • Machines where x != y but x - y == 0.0
  • Machines where 1.0*x != x
  • Other machines where x != 0.0 but y/x crashes with a division by zero error.
  • Yet other machines where 1.0*x can overflow even if x is finite.
  • And my personal favourite, machines where x == y but (x - z) > (y - z).

I believe that all of those anomalies are impossible on IEEE-754 systems.

IEEE-754 floats have many quirks but they are consistent quirks, the same on all machines (ignoring the occasional bug and CPUs that don’t support the standard at all) and unlike past systems, have been carefully designed to be as sensible as possible – even if it doesn’t seem that way at first.

Obligatory link to What Every Computer Scientist Should Know About Floating-Point Arithmetic. For a possibly more understandable explanation, see here.

I believe the reason the second one is more vague is that it’s talking about the entire numeric stack as a whole, hence it attempts to describe what can be seen by numbers in general.

The first reference says this in describing several operations: The numeric arguments are first converted to a common type. By that definition, we should safely be able to assume that, if you divide two values of the same type, the result will be that same type; there’s no conversion necessary to get them to a common type, and then nothing else in the paragraph says anything about conversions.

Oh I am very familiar with these quirks, numerical computation is and has been my job for a long time. The promotion to a more precise and exact type is nonsense (or maybe surprising, if you don’t like the strong wording) and should never be done implicitly. That float 1e23 isn’t int 10**23 was an example to demonstrate the behaviour: you are assigning an exact value to an inexact expression.
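
For readers following along, the numbers behind this exchange can be reproduced with a short sketch. The variable names are illustrative:

```python
import math

x = 1e23                         # nearest double to 10**23, not 10**23 itself

exact = int(x)                   # 99999999999999991611392
floored = math.floor(x)          # the same integer
gap = 10**23 - exact             # 8388608, which is 2**23

equal_to_literal = x == 10**23   # False: int/float comparison is exact
equal_to_floor = floored == x    # True
```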