When does the new float summation in Python 3.12 apply?

In Python 3.12, for sum, we have

Changed in version 3.12: Summation of floats switched to an algorithm that gives higher accuracy on most builds.

What is the actual condition? When does it consider that it is adding floats?

I don’t understand it. For example

  1. Pre-3.12 result of sum.

    >>> (float(2**53) + float(1.0)) + float(1.0) 
  2. Python3.12’s sum

    >>> sum([float(2**53), float(1.0), float(1.0)])  
  3. Here it is not happening

    >>> class R(float): pass  # Same definition for the other examples
    >>> sum([R(2**53), R(1.0), R(1.0)])
  4. Here maybe it is happening, I suppose due to the default start=0 giving 0 + R(2**53), which results in a float.

    >>> sum([R(2**53), float(1.0), float(1.0)]) 
  5. Here it is not happening

    >>> sum([R(2**53), R(1.0), float(1.0)])
  6. Here it is not happening

    >>> sum([float(2**53), float(1.0), R(1.0)])

Yes, exactly. This is because the first special casing is actually for adding up an all-int iterable. That fails here, but it consumes the first element in a normal call to __add__, which produces a float instance, and then the special case for all-float kicks in. See the code.

I didn’t understand the second sentence (which is most of your answer).

For example, what does it mean “an all-int iterable”? Does it mean an iterable for which all of its elements are int? Which one is that iterable in the examples?

“the first special casing” and “the special case for all-float kicks in”: I am not sure I understood these. In particular, the last one seems to be the entirety of what I am trying to understand. Namely, when does it consider itself to be adding all floats?

… Meanwhile I am trying to understand the implementation.

There is a PyFloat_CheckExact(item) enclosing the part that is about to do the Kahan-like sum, but then in the cpython/Doc/c-api/float.rst I see

Return true if its argument is a :c:type:`PyFloatObject` or a subtype of
:c:type:`PyFloatObject`. This function always succeeds.

.. c:function:: int PyFloat_CheckExact(PyObject *p)

which confuses me more. If being float or a subtype is what PyFloat_CheckExact checks, then what stops class R(float): pass from going through the same computation as float? This is probably me misinterpreting the code. I am not familiar with CPython.

Read the text under PyFloat_CheckExact, not above it.


Return true if its argument is a :c:type:`PyFloatObject`, but not a subtype of
:c:type:`PyFloatObject`. This function always succeeds.

is the description for PyFloat_CheckExact.
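
The difference between the two checks can be mimicked in pure Python. This is only a rough analogy: the helper names `is_float` and `is_float_exact` are made up for illustration and are not part of any API.

```python
class R(float):
    """Trivial float subclass, as in the examples above."""
    pass


def is_float(obj):
    # Rough analogue of PyFloat_Check: float or any subtype of float.
    return isinstance(obj, float)


def is_float_exact(obj):
    # Rough analogue of PyFloat_CheckExact: float itself, no subtypes.
    return type(obj) is float


print(is_float(R(1.0)), is_float_exact(R(1.0)))  # subclass instance
print(is_float(1.0), is_float_exact(1.0))        # plain float
```

So an instance of R passes the first check but not the second, which is why it does not count as a float for the special-cased loop.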

Is it correct to say that the correction is applied or skipped for each individual addition depending on whether the next item is a float? And there is also a check at the beginning. … So, perhaps the condition is

(1) Start with float and the correction gets applied each time the next item is a float.

What I want to do now is experiment with

class R(float):
  def __add__(self, o):
    return float(o + self)

and a sum that combines instances of R and instances of float.

It doesn’t look like (1) either

>>> sum([float(2**53), 0.0, 0.0, 1.0, 1.0])
>>> sum([float(2**53), R(0.0), 0.0, 1.0, 1.0])

The function first assumes that all elements are of exactly type int and starts adding them up like that. If this isn’t the case, we do one normal __add__-based addition and essentially use the result as the new start. Then we drop into the loop that assumes that all elements are exactly of type float and does the special adding there. If this is ever violated (for even one item), we drop into the general loop that uses the normal __add__ mechanism.
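
The three phases described above can be sketched in pure Python. This is a simplified model based on my reading of the description, not the actual C code: `model_sum` is a hypothetical name, the C-level overflow handling in the int loop is ignored, and the way the pending correction is folded in when leaving the float loop is an assumption.

```python
import math


def model_sum(iterable, start=0):
    """Simplified pure-Python model of the three loops (hypothetical helper)."""
    it = iter(iterable)
    result = start

    # Phase 1: assume every item is exactly an int (or bool).
    if type(result) is int:
        for item in it:
            if type(item) is int or type(item) is bool:
                result += item
                continue
            # Not an exact int: one ordinary addition, then leave this loop.
            result = result + item
            break
        else:
            return result

    # Phase 2: assume every item is exactly a float; compensated summation.
    if type(result) is float:
        total, comp = result, 0.0
        for item in it:
            if type(item) is float:
                t = total + item
                # Accumulate the low-order bits lost in this addition.
                if abs(total) >= abs(item):
                    comp += (total - t) + item
                else:
                    comp += (item - t) + total
                total = t
                continue
            if isinstance(item, int):
                # Ints are tolerated in the float loop (assumption from my
                # reading of the C source), added without compensation.
                total += float(item)
                continue
            # Anything else: fold in the correction (assumption) and fall
            # through to the general loop for good.
            if comp and math.isfinite(comp):
                total += comp
            result = total + item
            break
        else:
            if comp and math.isfinite(comp):
                total += comp
            return total

    # Phase 3: general loop using the ordinary + operator.
    for item in it:
        result = result + item
    return result


class R(float):
    pass


big = float(2**53)
print(model_sum([big, 1.0, 1.0]))              # stays in the float loop
print(model_sum([R(2**53), 1.0, 1.0]))         # 0 + R(...) is a plain float
print(model_sum([R(2**53), R(1.0), R(1.0)]))   # second item ends the float loop
print(model_sum([big, 1.0, R(1.0)]))           # last item ends the float loop
```

In this model only the first two calls keep the correction until the end. Once a non-float item is seen in the float loop, the remaining items are added with plain `+`.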

The interesting edge case you discovered here is that if the first call to __add__ from the end of the int special casing results in a float, then the second special case loop gets used even if we have some arbitrary unknown object in there. (It doesn’t have to be a subclass of float)
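
A small sketch of that edge case, using a made-up class `Weird` that is not a float subclass at all. All that matters is that the first ordinary addition, `0 + item`, hands back an exact float.

```python
class Weird:
    """Hypothetical non-float object whose reflected add returns a plain float."""

    def __radd__(self, other):
        return other + float(2**53)


# This is the addition performed when the int loop gives up on the first item.
first = 0 + Weird()
print(type(first) is float)
```

Since `first` is an exact float, I would expect `sum([Weird(), 1.0, 1.0])` on a 3.12 build to continue in the float loop for the remaining items, even though the list started with an arbitrary object.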
