Please Change the Source Code for the "%" (Modulo) Operator

There are problems when taking the remainder of a pair of numbers. With integers there is no problem, but when it comes to floating-point numbers (I mean float-float or float-int pairs), the shell returns a nonsense answer. There is an example below:

>>> 5.8%2
1.7999999999999998  #Correct Answer: 1.8
>>> 25.34%5
0.33999999999999986 #Correct Answer: 0.34

I recommend checking the numbers before taking them into the process. I will try to explain the draft of the “checker code” for (x%y) that I have in mind:

-If both numbers are integers, the code directly sends these to the main process.

return (x,y) #You can do anything which directly returns x and y. (I preferred to return them as a tuple.)

-If either of the two variables (x and y) is a float, the code first does this (to make sure that both numbers are floats):

x=float(x)
y=float(y)

Then, our code must calculate the lengths of the fractional (after-the-dot) parts. We will use these lengths to make integers.

lenFractx=len(str(x).split(sep=".")[1])
lenFracty=len(str(y).split(sep=".")[1])

After that, we must change our floats into integers (because modulo returns a correct result if both numbers are integers). To change a float into an integer that keeps the same digits (12.34 -> 1234), we must multiply it by 10**length_of_the_fractional_part and round the product, since the multiplication itself is done in binary floating point. There is an example below:

ourFloatNumber = 34.45
fractionalPartLength = 2
ourIntegerNumber = round(34.45 * 10**2)  # 3445; round() guards against binary representation error in the product
But we have two numbers. To preserve their ratio, we must multiply both by the same number, and this number will be 10 to the power of the longer fractional part’s length. So, at this point, our code must do something like this:

if lenFractx > lenFracty:
    x = round(x * 10**lenFractx)
    y = round(y * 10**lenFractx)
    floatMultiplier = lenFractx
else:
    x = round(x * 10**lenFracty)
    y = round(y * 10**lenFracty)
    floatMultiplier = lenFracty

Right now, our variables are ready to go. But you mustn’t forget to divide the remainder by 10**biggest_fractional_length. You may add a new parameter to your remainder function called floatMultiplier (which will also have a default value of None). If this parameter is None, your code won’t do anything extra. If an integer value is passed into it, then your remainder function will divide the result by 10**floatMultiplier and return it.
I have brought all this together here (I assume your modulo function is called remainder and requires only two args):

if "." not in str(x) and "." not in str(y):
    remainder(x, y)
else:
    x = float(x)
    y = float(y)
    lenFractx = len(str(x).split(sep=".")[1])
    lenFracty = len(str(y).split(sep=".")[1])
    if lenFractx > lenFracty:
        x = round(x * 10**lenFractx)
        y = round(y * 10**lenFractx)
        floatMultiplier = lenFractx
    else:
        x = round(x * 10**lenFracty)
        y = round(y * 10**lenFracty)
        floatMultiplier = lenFracty
    remainder(x, y, floatMultiplier)

Stuff about your remainder function:

def remainder(x, y, floatMultiplier=None):
    #bla bla
    #result=something
    if floatMultiplier is None:
        return result
    else:
        return result / (10**floatMultiplier)
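The whole proposal above can be put together into a runnable sketch (the name checked_modulo is my own, and remainder here just wraps Python’s built-in %, standing in for the hypothetical internal routine):

```python
def remainder(x, y, floatMultiplier=None):
    # Stand-in for the "main process": ordinary modulo.
    result = x % y
    if floatMultiplier is None:
        return result
    return result / (10 ** floatMultiplier)

def checked_modulo(x, y):
    # "Checker code": integers go straight through.
    if "." not in str(x) and "." not in str(y):
        return remainder(x, y)
    x, y = float(x), float(y)
    lenFractx = len(str(x).split(".")[1])
    lenFracty = len(str(y).split(".")[1])
    floatMultiplier = max(lenFractx, lenFracty)
    # round() is needed because multiplying a float by a power of ten
    # is itself a binary floating-point operation.
    x = round(x * 10 ** floatMultiplier)
    y = round(y * 10 ** floatMultiplier)
    return remainder(x, y, floatMultiplier)

print(checked_modulo(5.8, 2))    # 1.8
print(checked_modulo(25.34, 5))  # 0.34
```

Note that this sketch breaks down for floats whose str() uses scientific notation (e.g. 1e-10), which is one reason the approach would be hard to adopt in general.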

If you can take this into account, I will be really happy.
Note: I am sorry if there are any mistakes in phrasing or language usage. English is not my mother tongue. Thank you for reading this.

As far as floating-point arithmetic goes, these are correct results. The difference between the correct and the “nonsense” number is within the machine precision of 64-bit floating-point numbers. You can test it yourself without the modulo operator:

In[1]: 3.8-2
Out[1]: 1.7999999999999998
In[2]: 5.34-5
Out[2]: 0.33999999999999986
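A quick way to see where this comes from (a small illustration on my part): Decimal can display the exact binary value that a float literal actually stores.

```python
from decimal import Decimal

# Decimal(float) converts the stored binary value exactly, exposing
# that the literal 5.8 is not representable exactly in base 2:
print(Decimal(5.8))  # slightly below 5.8, hence 5.8 - 2 < 1.8
```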

There’s a nice explanation about this sort of thing here:

https://docs.python.org/3/tutorial/floatingpoint.html

There are also some nice answers on this stackoverflow post: language agnostic - Is floating point math broken? - Stack Overflow


Okay, I feel like a st_p_d (maybe I am 😅). But don’t we have any way to make calculations entirely base 10? I know computers work in base 2, but how do base-10 calculators work?
Edit (answer to myself): It’s possible for them to use extra functions. Maybe they slice those fractional parts and use integer math.

Yes. Use the decimal module.
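For instance, the examples from the original post come out exact with decimal arithmetic:

```python
from decimal import Decimal

# Construct from strings so the values are exact in base 10:
print(Decimal("5.8") % Decimal("2"))    # Decimal('1.8')
print(Decimal("25.34") % Decimal("5"))  # Decimal('0.34')
```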


Yes, you can do your calculations in base 10 by using the decimal module.

But there are good reasons why serious numeric calculations are still done in base 2 (binary):

  • speed

  • accuracy

Decimal (base 10) has the advantage that every number you can write exactly in decimal notation (like 0.1) can be expressed exactly (up to the limit in digits). Decimal is especially useful when you are working with quantities representing money.

But it has the disadvantage that it will be slower, and for many computations, the rounding errors may be larger.
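A small sketch of the money use case mentioned above:

```python
from decimal import Decimal

# Summing prices in base 10 keeps cents exact; the float equivalent
# 0.1 + 0.1 + 0.1 gives 0.30000000000000004 instead of 0.3.
prices = [Decimal("0.10"), Decimal("0.10"), Decimal("0.10")]
print(sum(prices))  # Decimal('0.30')
```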


Thank you all for your explanations and answers. I see the point now. @storchaka, @steven.daprano _ilayn _weeneyde (seems like I can mention max. 2 people because I am a new user, so I couldn’t mention everyone)

Here’s an idea: maybe the float type could operate based on the decimal module under the hood? That way we could eradicate all the floating-point rounding error issues.

How would switching to decimal remove rounding errors in 1/3, for example?


Simple, actually.

The calculation 1 / 3 could be done under the hood like

result = decimal.Decimal('1') / decimal.Decimal('3')

where the value of this result object is

Decimal('0.3333333333333333333333333333')

and this then casted to a float by

float(result)

giving the output

0.3333333333333333

But this example does not portray any advantage over a direct 1 / 3 calculation, which gives the exact same result.


Let me demonstrate a case where using the decimal.Decimal() actually makes a difference.

If you do 0.1 + 0.2, you would expect Python to output 0.3; but no, you get 0.30000000000000004. Strange, right?

decimal.Decimal() to the rescue!

result = decimal.Decimal('0.1') + decimal.Decimal('0.2')

where the value of the result object is

Decimal('0.3')

and then the result object casted to a float

float(result)

giving us

0.3

which is the expected result.

WARNING: If we write result = decimal.Decimal(0.1) + decimal.Decimal(0.2) – i.e., no quotes around the data passed to decimal.Decimal() – the value of the result object in this particular case is Decimal('0.3000000000000000166533453694'), which is even stranger than a direct 0.1 + 0.2 calculation. The float literals 0.1 and 0.2 already carry binary representation error, and Decimal converts them exactly, error included. Therefore, always put quotes around the data you pass to decimal.Decimal().
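The WARNING can be checked directly (a minimal demonstration):

```python
from decimal import Decimal

# A string is converted exactly; a float argument smuggles in the
# binary representation error it already carries:
print(Decimal('0.1') + Decimal('0.2'))  # Decimal('0.3')
print(Decimal(0.1) + Decimal(0.2))      # carries the float error
```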

Anyway, to make division in Python work via decimal.Decimal(), the way __truediv__() (the implementation of the / operator in Python 3.x) operates would need a global upgrade.

Assuming finite (even if large) precision, changing from base 2 to base 10 just changes where the problems occur. My point is that you cannot eliminate all rounding errors.
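For example, 1/3 has no finite representation in base 10, so decimal has to round it just as binary floats round 0.1:

```python
from decimal import Decimal

# With the default 28-digit context, 1/3 is rounded, so the exact
# mathematical identity (1/3) * 3 == 1 fails in base 10 as well:
third = Decimal(1) / Decimal(3)
print(third * 3)       # 0.9999999999999999999999999999
print(third * 3 == 1)  # False
```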


Decimal doesn’t solve all cases.
One could imagine a universe, in which division of ints results in a Fraction, like:


from fractions import Fraction
from typing import overload, Union

@overload
def divide(x: int, y: int) -> Fraction: ...

@overload
def divide(x: float, y: int) -> float: ...

@overload
def divide(x: int, y: float) -> float: ...

@overload
def divide(x: float, y: float) -> float: ...

def divide(x: Union[int, float], y: Union[int, float]) -> Union[Fraction, float]:
    return Fraction(x, y) if isinstance(x, int) and isinstance(y, int) else x / y  # with x / y existing behaviour

assert divide(1, 10) + divide(2, 10) == divide(3, 10)   

But that universe isn’t ours, because when Python division changed (going from 2.7 to 3.0) we were in the other branch, and going back would distort the space-time continuum :wink:


As others have said, it would not eradicate floating point rounding errors, it would just hide the problem from beginners when they discover programming and type their first calculations in the REPL. They will meet rounding issues sooner or later anyway. And decimal is slower than float because it doesn’t use the hardware.

$ python3
Python 3.10.7 (main, Sep  7 2022, 00:00:00) [GCC 12.2.1 20220819 (Red Hat 12.2.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from timeit import timeit
>>> from decimal import Decimal
>>> lst_floats = [0.1]*1000
>>> lst_decimals = [Decimal("0.1")]*1000
>>> timeit("sum(lst_floats)", globals=globals())
4.566519747999791
>>> timeit("sum(lst_decimals)", globals=globals())
57.152030233000005

Decimal doesn’t eliminate rounding issues. It makes them worse.

We expect from pure mathematics that 1/i*i should always equal 1. With floats, that’s not always true. Out of the first million integer values of i, that expectation is violated 15% of the time, with the first two violations being 49 and 98:


>>> a = [i for i in range(1, 1000001) if 1/i*i != 1]
>>> len(a)
148164
>>> a[0:2]
[49, 98]

For Decimal, the invariant 1/i*i == 1 is violated more than 37% of the time, with 32 examples below 100 compared to just two for floats:


>>> a = [i for i in range(1, 1000001) if 1/Decimal(i)*i != 1]
>>> len(a)
371154
>>> sum(1 for x in a if x < 100)
32

The first two violations are for i = 3 and i = 9.

Another important invariant is that averaging two values should always be within the range of the two variables: a <= (a+b)/2 <= b. This invariant is always true for binary floats. But:


>>> a = Decimal('0.10000000000000000009267827205')
>>> b = Decimal('0.10000000000000000009267827207')
>>> a <= (a+b)/2 <= b
False
>>> (a+b)/2
Decimal('0.1000000000000000000926782720')

With that, even a == +a is false.
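That last point can be demonstrated on its own (reusing one of the values above):

```python
from decimal import Decimal

# 29 significant digits: one more than the default 28-digit context.
a = Decimal('0.10000000000000000009267827205')
# Unary + rounds to the context's precision, while == compares exact
# values, so the rounded copy no longer equals the original:
print(a == +a)  # False
```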

Can’t tell from that how much of the speed difference is “because [decimal] doesn’t use the hardware”, since sum has a whole bunch of extra code specifically for fast summing of float values.

To be clear, Decimal defaults to 28 digits of precision, which is not worse than float (i.e. 64-bit binary float), which has only 15 decimal digits of precision (rounded down from 15.95). The comparison 1/i*i != 1 is naive, so I guess Decimal is ‘worse’ from the naive perspective of a novice. If one sums the magnitude of the difference from 1 for the given range, then Decimal is significantly better with its default precision. It’s about the same if the precision is manually set to 16 decimal digits.

>>> sum(abs(1/i*i - 1) for i in range(1, 1000001))
1.644950842205617e-11
>>> ctx = decimal.getcontext()
>>> ctx.prec
28
>>> sum(abs(1/Decimal(i)*i - 1) for i in range(1, 1000001))
Decimal('6.74199E-23')
>>> ctx.prec = 16
>>> sum(abs(1/Decimal(i)*i - 1) for i in range(1, 1000001))
Decimal('6.74277E-11')

How much should we try to hide the fact that number systems like float are approximations? This seems to be another case of our FP whack-a-mole. We cleaned up the printing of FP before and made changes (for the better, I think), but that did not affect the speed of calculations much. Here we are trying to improve accuracy, but at a speed cost.
Maybe we should think of this as an additional FP method rather than an operator.

P.S. If there is a need, is there already a method that does this? Do other languages/libraries do this?

For completeness may I suggest the below (well-known?) text:

What Every Computer Scientist Should Know About Floating-Point Arithmetic

https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html
