How to tell sum, in Python 3.12, that it is adding floats?

franklinvp · June 28, 2024, 12:19pm

In Python 3.12, sum of floats switched to an algorithm that gives higher accuracy on most builds.

Fine. I don’t like it, but it doesn’t affect me beyond having to read the documentation again after upgrading.

We get

>>> sum([float(2**53), float(1.0), float(1.0)])
9007199254740994.0
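
For context, a compensated summation produces exactly this higher-accuracy result. Below is a minimal sketch of a Neumaier-style sum, the algorithm franklinvp names later in the thread. That the built-in uses precisely this variant is an assumption here, and `neumaier_sum` is a made-up name for illustration.

```python
def neumaier_sum(values):
    """Compensated (Neumaier-style) summation; an illustrative sketch only."""
    total = 0.0
    comp = 0.0  # running compensation for low-order bits lost in `total`
    for x in values:
        t = total + x
        if abs(total) >= abs(x):
            # Low-order bits of x were lost when adding it to total.
            comp += (total - t) + x
        else:
            # Low-order bits of total were lost instead.
            comp += (x - t) + total
        total = t
    return total + comp


values = [float(2**53), 1.0, 1.0]
compensated = neumaier_sum(values)

# Plain left-to-right addition for comparison.
naive = 0.0
for x in values:
    naive += x

print(compensated)  # 9007199254740994.0
print(naive)        # 9007199254740992.0
```

Each `2**53 + 1.0` rounds back to `2**53`, so plain addition drops both ones, while the compensation term collects them and adds them back at the end.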

Now, when inheriting from float, sum doesn’t seem to realize that it is adding floats:

>>> class fLoAt(float): pass
...
>>> sum([fLoAt(2**53), fLoAt(1.0), fLoAt(1.0)])
9007199254740992.0

It is back to computing (fLoAt(2**53) + fLoAt(1.0)) + fLoAt(1.0).

Question: Is there something that I can do to fLoAt for sum to realize that it is adding floats?

MegaIng · June 28, 2024, 12:22pm

Yes, turn the instances back into float. The sum function deliberately does not special-case subclasses of float, in case, for example, you override __add__. And no, checking whether __add__ has been overridden is not an option, since that would be a potentially massive performance cost.
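
A minimal sketch of that suggestion, assuming the subclass does not customize conversion: calling `float()` on each instance yields exact `float` objects, which `sum` can then treat as floats. The variable names are illustrative.

```python
class fLoAt(float):
    pass


values = [fLoAt(2**53), fLoAt(1.0), fLoAt(1.0)]

# float(x) on an instance of a float subclass returns an exact float.
plain = [float(x) for x in values]
print(all(type(x) is float for x in plain))  # True

# On Python 3.12 this is expected to match the 9007199254740994.0 shown
# earlier in the thread; older versions add left to right instead.
print(sum(plain))
```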

JamesParrott · June 28, 2024, 12:22pm

I recently learned (from elsewhere in the forum) about math.fsum.

>>> class fLoAt(float): pass
...
>>> import math
>>> math.fsum([float(2**53), float(1.0), float(1.0)])
9007199254740994.0
>>>

I know it doesn’t fix sum. But if 16 significant figures of precision are required from a floating point calculation, it’s reasonable to use tools specifically designed for that purpose.
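
As a small check that this also covers the subclass case from the question, the sketch below passes `fLoAt` instances straight to `math.fsum`. The result comes back as a plain `float`.

```python
import math


class fLoAt(float):
    pass


values = [fLoAt(2**53), fLoAt(1.0), fLoAt(1.0)]
result = math.fsum(values)

print(result)        # 9007199254740994.0
print(type(result))  # <class 'float'>
```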

franklinvp · June 28, 2024, 12:34pm

Ouch. They did me wrong with that change …

franklinvp · June 28, 2024, 12:48pm

In my case, it is the other way around. sum used to be the sum of floats, where “sum” means plain float addition. We have libraries that were written to imitate C++ code that adds doubles: not Kahan, not Neumaier, just adding (among other things, of course). Feeding them my own fLoAt was meant to change, for example, how values print when the libraries print intermediate results. Now the results are very different when they are fed float than when they are fed fLoAt.

I guess I will need to go into the libraries and change every use of sum into a for loop.
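
A sketch of what such a replacement could look like. `plain_sum` is a hypothetical helper name, not something from the libraries in question. It adds strictly left to right with `+`, so float and fLoAt inputs give the same result.

```python
def plain_sum(values, start=0.0):
    """Left-to-right addition with +, without any compensation."""
    total = start
    for x in values:
        total = total + x
    return total


class fLoAt(float):
    pass


from_float = plain_sum([float(2**53), 1.0, 1.0])
from_subclass = plain_sum([fLoAt(2**53), fLoAt(1.0), fLoAt(1.0)])

print(from_float)     # 9007199254740992.0
print(from_subclass)  # 9007199254740992.0
```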

jamestwebber · June 28, 2024, 2:20pm

xkcd 1172 strikes again

franklinvp · June 28, 2024, 2:27pm

Nonsense.

Nothing was broken.

Instead, sum, a built-in function, is no longer the most basic interpretation of adding a basic type using that type’s own addition. It is now a more sophisticated algorithm that only kicks in for that specific type. And to get the most basic behavior, a sum of floats associated from left to right, one now needs either a for loop (in Python) or functools.reduce (also in Python).
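
For reference, the functools.reduce spelling mentioned here could look like the sketch below. Passing an initial value is optional, but without one an empty input raises TypeError.

```python
import functools
import operator

values = [float(2**53), 1.0, 1.0]

# Computes (values[0] + values[1]) + values[2], strictly left to right.
left_to_right = functools.reduce(operator.add, values)
print(left_to_right)  # 9007199254740992.0

# With an initial value, an empty input returns that value.
empty_total = functools.reduce(operator.add, [], 0.0)
print(empty_total)  # 0.0
```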

What they did is to cater to those who see the features of operations of float as defects.

The new implementation is wonderful, good to have, just not its placement.

JamesParrott · June 28, 2024, 4:07pm

You’ve got a point.

By the way, if the issue is with a subclass of float, why don’t you define on it precisely the __add__ method that you want?
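
One way to read that suggestion, as a hedged sketch: the methods below are hypothetical additions to the fLoAt class from the question. They do plain float addition and wrap the result, so intermediate values keep the subclass and its custom printing. `__radd__` is included because sum starts from the int 0.

```python
class fLoAt(float):
    """Hypothetical variant whose additions return fLoAt again."""

    def __add__(self, other):
        result = float.__add__(self, other)
        if result is NotImplemented:
            return NotImplemented
        return fLoAt(result)

    def __radd__(self, other):
        result = float.__radd__(self, other)
        if result is NotImplemented:
            return NotImplemented
        return fLoAt(result)

    def __repr__(self):
        return f"fLoAt({float.__repr__(self)})"


total = sum([fLoAt(2**53), fLoAt(1.0), fLoAt(1.0)])
print(repr(total))  # fLoAt(9007199254740992.0)
```

Since the running total is then never an exact float, sum keeps using `+` on the objects, consistent with MegaIng's remark that subclasses are not special-cased.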

But I still think either relying on a particular implementation of floating point arithmetic, or changes in that implementation causing bugs, is a code smell.

It’s better to fix the code so that it’s robust and unaffected by relatively tiny floating point errors. Not using floats, or using approximate-equality comparisons instead of strict equality, are two ways to do so.
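
A short illustration of both options, as a sketch rather than a recommendation for any particular library: `math.isclose` for approximate comparison, and `fractions.Fraction` as one way of not using floats. The two values compared are the results that appear in this thread.

```python
import math
from fractions import Fraction

a = 9007199254740992.0
b = 9007199254740994.0

print(a == b)              # False
print(math.isclose(a, b))  # True, with the default relative tolerance

# Exact rational arithmetic: each float converts to a Fraction exactly.
exact = sum(Fraction(x) for x in [float(2**53), 1.0, 1.0])
print(exact)  # 9007199254740994
```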

Stefan2 · June 28, 2024, 4:58pm

Or start with a fLoAt.

sum([float(2**53), float(1.0), float(1.0)], fLoAt())
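
Spelled out as a runnable sketch: with a start value that is not an exact float, sum appears to fall back to repeated `+`, so the items are added left to right. The result is a plain float, because `fLoAt() + float` returns a float.

```python
class fLoAt(float):
    pass


values = [float(2**53), float(1.0), float(1.0)]
result = sum(values, fLoAt())

print(result)        # 9007199254740992.0
print(type(result))  # <class 'float'>
```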

Stefan2 · June 28, 2024, 5:13pm

Or more generally with a neutral element, which for example also circumvents the ban on summing strings:

class Neutral:
    def __add__(_, x):
        return x

print(sum([float(2**53), float(1.0), float(1.0)], Neutral()))

print(sum(['abc', 'd', 'ef'], Neutral()))

Output (Attempt This Online!):

9007199254740992.0
abcdef

(With '' as start value, you’d get an error falsely claiming “sum() can’t sum strings”.)
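
To see that error for yourself, here is a small sketch that catches it and also shows the usual `''.join` alternative. The exact wording of the message may vary between versions, so it is printed rather than assumed.

```python
words = ['abc', 'd', 'ef']

error = None
try:
    sum(words, '')
except TypeError as exc:
    error = exc  # keep a reference; `exc` is cleared after the except block

print(f"TypeError: {error}")

joined = ''.join(words)
print(joined)  # abcdef
```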