Trying to understand rounding in Python

On a scale of “no credit” to “full credit”, it’s 75% of the way to “full” :smile:

Clever, but like my first attempt, it is also dependent on the rounding mode currently in use. We can’t easily change the rounding mode in core Python, but we can in the mpmath module:

import mpmath
magic = mpmath.mpf(0.49999999999999994)

print(repr(mpmath.fadd(0.5, magic))) # mpf('1.0')
print(repr(mpmath.fadd(0.5, magic, rounding='f'))) # mpf('0.99999999999999989')

For 0.5 “it works” under the default rounding mode because nearest-even rounds a halfway case up to give it a final 0 bit. Under to-minus-infinity or to-0 rounding, though, it throws the trailing one bit away and leaves the last remaining bit odd.

So if you want this, take my second version. As already said, the current rounding mode has no effect on its result. It also has the benefit of being obvious instead of clever :wink: .

Most users couldn’t care less. If you can find a conforming implementation of 754 binary floats (as I said before, Python’s is not), then, as in Python’s decimal module, changing the rounding mode could be straightforward.

Or not :wink:. One highly valid criticism of 754 is that it didn’t specify a concrete way for implementations to expose all the adjustable moving parts it requires of a conforming implementation. As a result, there is no portable way to change them.

I showed above a way to force a specific rounding mode under mpmath, but even in mpmath there’s no way to set it “in general”. Instead some specific named mpmath functions (like fadd()) accept an optional rounding= argument. The rounding used by the infix binary operators (like +, *, …) is always nearest/even.

BTW, mpmath’s “round to nearest int” function is named nint(), and also does nearest/even rounding.

>>> import mpmath
>>> for i in range(10):
...     x = i + mpmath.mpf(0.5)
...     print(x, mpmath.nint(x))

0.5 0.0
1.5 2.0
2.5 2.0
3.5 4.0
4.5 4.0
5.5 6.0
6.5 6.0
7.5 8.0
8.5 8.0
9.5 10.0

@TimPeters,

I have to thank you for your time and engagement; we have discussed this in depth and at length,
with the result:
A.) with small tricks we can achieve ‘human compatible’ rounding for IEEE values
( which IMHO does not solve irritations due to scaling when rounding to decimal places ),
B.) we can’t produce ‘human compatible’ rounding for basic arithmetic results where
‘hardware rounding’ hits before the programmer’s intention.

You contributed profound knowledge, I a persistent intention; no solution within IEEE
binaries.

You consider ties to even a good thing, while I criticise the lack of ties away from
zero as an alternative, as it is needed, for example, in spreadsheets to produce
‘people-friendly’ results.

If we want better human compatibility we’d have _Decimals (in C), Python decimal or
fractions to choose from. But - IMHO - very few programmers would consider
rewriting well-matured complex applications in a new datatype.

So we can expect deviations and surprises in ‘computational mathematics’
for some time to come.

[ edit ]
I want to add one point where I see a risk of fooling newbies, and where it’s important
to raise awareness:

Add floats a and b, giving a correctly rounded sum and exact error.
Mathematically, a + b is exactly equal to sum + error.

The claim in the SO example is valid from a binary point of view, preserving the deviations
in the binary representation of the operands, but not from a decimal POV, not accounting for
‘what we see’.
See the example:

exact_add( 4.4, 2.2 ) --> (6.6000000000000005, 0.0)  

such cases are rare, and people experimenting with improved algorithms might
get trapped by results ‘often looking good’ and not investigate in detail that
exact_add( 0.2, 0.1 ) --> (0.30000000000000004, -2.7755575615628914e-17)

points neither to the decimally correct 0.3, nor to its binary representative
0.29999999999999998890… Be aware of the different understandings of ‘exact’
or ‘correct’ between binary- and decimal-oriented people!
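For reference, the exact_add in the SO post is presumably the classic “2Sum” construction (Møller/Knuth); a minimal sketch, my reconstruction rather than necessarily the exact SO code:

def exact_add(a, b):
    # Knuth's 2Sum: s is the correctly rounded float sum, err the exact
    # rounding error, so that a + b == s + err holds exactly in the reals.
    s = a + b
    bb = s - a
    err = (a - (s - bb)) + (b - bb)
    return s, err

print(exact_add(4.4, 2.2))  # (6.6000000000000005, 0.0)
print(exact_add(0.2, 0.1))  # (0.30000000000000004, -2.7755575615628914e-17)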

A question on a detail ( we managed to get me confused ): we have one! option
where ‘decimal’ can be steered to half-away rounding, but NO option in any
datatype where operations round result ties away?
_Decimal64 in ‘C’:

4503599627370496.0DD + 0.5DD  -->  4503599627370496.0  
4503599627370496.0DD + 1.5DD  -->  4503599627370498.0  

python decimal:

import decimal
decimal.getcontext().prec = 16
print(decimal.Decimal('4503599627370496') + decimal.Decimal('0.5'))
print(decimal.Decimal('4503599627370496') + decimal.Decimal('1.5'))
-->  
4503599627370496  
4503599627370498  

[ /edit ]

Yes, I don’t care that it’s “not what people expect”. If that’s what they want, decimal’s BasicContext supplies it, and in other ways too tries to act like a feeble desk calculator (e.g., only uses 9 digits of precision).
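For instance (my quick illustration; prec=9 and ROUND_HALF_UP are BasicContext’s documented defaults):

import decimal

ctx = decimal.BasicContext
print(ctx.prec, ctx.rounding)              # 9 ROUND_HALF_UP
print(ctx.divide(decimal.Decimal(2), 3))   # 0.666666667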

nearest-up is a biased rounding mode, period. Instead of applying your intellect to contriving special cases where it isn’t, try instead writing ever-more realistic programs that don’t force special, contrived distributions.

I already did that in my youth :wink:. Although my experiments were using decimal arithmetic and mostly focused on US “round to whole dollar” tax rounding rules (which legally require nearest-up), the results broadly matched what Mark showed for binary arithmetic on the “paired sum” algorithm: nearest-even’s mean error was close to 0, while nearest-up’s was dead obviously “too high” in every run. Indeed, nearest-even’s worst result was better than nearest-up’s best result across runs.

This isn’t “a thing” among people who crunch numbers for a living. Mark already gave the best “head argument” in his first sentence: across the 10 possible final decimal digits, nearest-up leaves one unchanged, rounds 4 down, and rounds 5 up. Systematically biased “too high”. nearest-even leaves one unchanged, rounds 4 down, rounds 4 up, and what it does with a final “5” depends on the parity of the next more-significant digit. Unless you contrive a distribution to skew that more to “even” or “odd”, half the time it will round down and the other half up.
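A quick way to see the bias numerically (my own sketch, not Mark’s experiment): round a large batch of random 3-digit values to 2 digits under both modes and compare the mean error.

import random
from decimal import Decimal, Context, ROUND_HALF_EVEN, ROUND_HALF_UP

even = Context(prec=2, rounding=ROUND_HALF_EVEN)
up = Context(prec=2, rounding=ROUND_HALF_UP)

N = 100_000
err_even = err_up = Decimal(0)
for _ in range(N):
    x = Decimal(random.randrange(100, 1000)) / 10  # 10.0 .. 99.9, exact
    err_even += even.plus(x) - x  # error from rounding x to 2 digits
    err_up += up.plus(x) - x

print("mean error, nearest-even:", err_even / N)  # close to 0
print("mean error, nearest-up:  ", err_up / N)    # close to +0.05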

That isn’t a proof that it’s always unbiased, and there can be no such proof because it isn’t: you can in fact contrive distributions where it isn’t.

Contrived cases aren’t compelling, though. Here, let’s say we have a 2-digit decimal calculator, start with 10, and repeatedly add and then subtract 0.5.

Under nearest-even, 10+0.5 rounds to 10, and then 10-0.5 also rounds to 10. Nothing changes, no matter how often we do the ± 0.5 dance.

Under nearest-up, 10+0.5 rounds to 11, and then 11-0.5 also rounds to 11. Oops! Worse, on the next iteration, it leaves us with 12. Time after that, with 13. And so on. After 10 iterations, we’re at 20: we doubled the number we started with, merely by suffering 20 rounding errors. The error will continue increasing without bound, no matter how long we keep going.

Did I contrive this by starting with 10, an even number? Not intentionally, but it’s a fair objection. Start with 11 instead. Doesn’t make any real difference to nearest/up - that continues adding another 1 on every ± 0.5 iteration. Under nearest/even, first 11+0.5 rounds to 12, and then 12-0.5 also rounds to 12. And that’s it: it sticks at 12 now forever after.

Now with years of computer experience, that sets off glaring alarm bells. You never want to build mission-critical software using a gimmick that’s known to go off the charts without bound, no matter how rarely. On modern boxes, something that can go wrong one time in a million can be expected to go wrong a thousand times every second :wink:. Calculator and spreadsheet users don’t have a sufficiently paranoid mindset to be trusted with floating-point code in contexts with consequences.

I’ll note that 754’s directed (to plus or minus infinity) rounding modes are intended to be biased, though. Their primary intended use is to help mathematical library authors write software “interval arithmetic” packages that run much faster than can be done without HW support. They weren’t intended for general use.
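decimal’s directed modes can illustrate the idea; a toy sketch (real interval packages are far more careful):

from decimal import Decimal, Context, ROUND_FLOOR, ROUND_CEILING

down = Context(prec=4, rounding=ROUND_FLOOR)
up = Context(prec=4, rounding=ROUND_CEILING)

def iv_add(lo1, hi1, lo2, hi2):
    # Round the lower bound down and the upper bound up, so the true
    # sum is guaranteed to lie inside the result interval.
    return down.add(lo1, lo2), up.add(hi1, hi2)

print(iv_add(Decimal('1.2344'), Decimal('1.2345'),
             Decimal('2.0001'), Decimal('2.0002')))
# (Decimal('3.234'), Decimal('3.235'))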

Very true - they won’t. Even if they were convinced that, say, decimal was a much better choice for their needs, another old saying applies with vicious consistency: the fast drives out the good.

Indeed, it’s something of a miracle that binary 754 took over the world. The hardware needed to implement it is hairier and slower than the zoo of incompatible, vendor-specific, half-assed float implementations that came before it, and writing software libraries for IEEE 754 is a seemingly endless tedious maze of special cases (is it a NaN? an infinity? subnormal? if it’s a zero, does the sign bit matter? etc) which take lots of code and cycles to navigate.

I’m still amazed that they even got the standard approved, let alone near-universally adopted. Standards usually cater to the “least common denominator” among long-time industry practices, not mandate cutting-edge, brand new schemes.

It’s not Mark’s fault that you’re doing an inaccurate conversion to decimal :smile:. Do it with infinite precision, and there’s no problem.

>>> import decimal
>>> from decimal import Decimal as D
>>> a = 4.4
>>> b = 2.2
>>> c = 6.6000000000000005
>>> d = 0
>>> a + b == c + d
True

# Now let's see what it should look like in decimal. First we'll set the Inexact
# trap, so we get an error if anything we try isn't computed exactly.
>>> ctx = decimal.getcontext()
>>> ctx.traps[decimal.Inexact] = True

# Now convert all the inputs to decimal.
>>> da = D(a)
>>> db = D(b)
>>> dc = D(c)
>>> dd = D(d)

# And show their true, infinite-precision, decimal equivalents.
>>> da
Decimal('4.4000000000000003552713678800500929355621337890625')
>>> db
Decimal('2.20000000000000017763568394002504646778106689453125')
>>> dc
Decimal('6.60000000000000053290705182007513940334320068359375')
>>> dd
Decimal('0')

# OK! Are the decimal sums equal too?
>>> da + db == dc + dd
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
decimal.Inexact: [<class 'decimal.Inexact'>]

# Oops! While conversion to decimal is always done exactly, there are
# too many digits for floating addition to hold all the result digits exactly.
# Rather than try to out-think that, let's just double the precision and
# try again.
>>> ctx.prec *= 2
>>> da + db == dc + dd
True

So, ya, the sums are exactly the same in binary or decimal.

I think you could benefit from reading this appendix in the Python tutorial. I first wrote it years ago, and Mark has contributed to it too. It appears to have been quite successful in easing newcomers into not flipping out :wink: over the fact that binary floats really aren’t decimal floats.

Set decimal rounding to ROUND_UP.

>>> ctx = decimal.getcontext()
>>> ctx.prec=16
>>> ctx.rounding = decimal.ROUND_UP
>>> print ( decimal.Decimal('4503599627370496') + decimal.Decimal('0.5') )
4503599627370497
>>> print ( decimal.Decimal('4503599627370496') + decimal.Decimal('1.5') )
4503599627370498

But I’m not clear on what you mean by “round results ties away”. You gave what appears to be the same example in C and Python, which gave the same results, but you didn’t say anything about what you wanted to see for output. I have no guess as to how “round results ties away” would differ to you from “half away rounding”, or what your example was intended to illustrate.

Contrived cases aside, I’m trying to figure out how fair this actually is with randomly selected data. Suppose we assume that every floating point value actually represents the range of real values around it that round to that value. For the most part, this won’t change the argument; the values that round to 1.5 are centered around 1.5, so anything you can say about fairness will still be true. But that’s not the case around numbers that are exact powers of two. The real numbers that would round to the floating point value 1.0 are going to be skewed slightly high, since there are more representable floats a smidgen below 1.0 than a smidgen above.

My gut feeling is that this can’t make enough difference to affect any sort of meaningful statistic, but I also can’t prove that.

The argument I paraphrased was for decimal floats, but you seem to have binary in mind.

So let’s stick to binary.

Suppose the source format has s significand bits, and the target format t. Then there are 2**(s-t) possible bit patterns that may need to be discarded when throwing out the last s-t bits of a source value.

Throwing out the all-0 bit pattern doesn’t lose any information. It’s not “rounding” anything in the sense of losing information, and when information isn’t lost there’s no possible ill effect on accuracy. The all-0 bit pattern is irrelevant, but that point is often overlooked in flawed analyses.

That leaves 2**(s-t) - 1 bit patterns that do lose information if they’re thrown out. That’s an odd number. Any case that leaves the last retained bit untouched is losing information in the “too low” direction. Any case that adds 1 to the last retained bit is losing information in the “too high” direction.

to-plus-infinity rounding is “too high” in every case, as biased high as possible.

to-minus-infinity rounding is “too low” in every case, as biased low as possible.

to-nearest-up is “too high” in 2**(s-t-1) cases and “too low” in the remaining 2**(s-t-1) - 1, too high more often than too low. It’s biased high.
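That count is easy to check by brute force (my snippet), e.g. with k = s - t = 3 discarded bits:

k = 3                      # number of trailing bits discarded
half = 1 << (k - 1)        # the halfway trailing pattern, 0b100
lossy = range(1, 1 << k)   # the all-zero pattern loses no information; skip it
up = sum(t >= half for t in lossy)   # nearest-up rounds ties and above up
down = sum(t < half for t in lossy)
print(down, up)            # 3 4 -> "too high" wins by one case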

to-nearest-even is evenly split between the cases, except for throwing out the trailing bit pattern 2**(s-t-1), which is at the midpoint of the odd number of not-all-zero trailing-bit cases. Whether that ends up too low or too high can’t be determined by staring at the trailing bits, but depends on the last of the retained bits. Assuming the last retained bit is 0 or 1 with equal likelihood, “too low” and “too high” are again equally likely, in which case it’s not biased in either direction.

Although it is biased toward leaving the last retained bit 0 instead of 1. Which isn’t directly related to accuracy, so hard to care.

to-nearest-odd (round halfway cases to the nearest odd representable value) is also unbiased, but almost never (never, period?) used. In some obscure study I can’t recall now, the argument for using nearest-even instead of nearest-odd was that the last bit of a rounded result is very often destined to be thrown away too, by rounding a later operation using the result. Throwing away a trailing 0 doesn’t lose more information, so better to force a 0 than a 1.

Another unbiased method has no universally accepted name: “Von Neumann” rounding was common when I was younger, but it’s not quite what Von Neumann actually proposed. “round-to-odd” is what I see most often now, but I dislike that name, because it’s not at all the “to-nearest-odd” described just above.

Under this, if the trailing bits are all 0, just throw them away. We’re done. Else “or” a 1 bit into the last retained bit. In effect, the last retained bit becomes a “sticky bit” recording whether it, or any bit to its right, was ever set.
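On non-negative integers, with k trailing bits to discard, the rule is tiny (my sketch):

def round_to_odd(x: int, k: int) -> int:
    # Discard the low k bits; if any of them were set, "or" a 1 into
    # the last retained bit (the "sticky bit").
    retained = x >> k
    if x & ((1 << k) - 1):
        retained |= 1
    return retained

print(bin(round_to_odd(0b10_000, 3)))  # 0b10 (exact: unchanged)
print(bin(round_to_odd(0b10_001, 3)))  # 0b11 (inexact: sticky bit set, "too high")
print(bin(round_to_odd(0b11_001, 3)))  # 0b11 (last retained bit already 1: "too low")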

This requires a bit of thought, which I’ll leave to you. You might at first think “but that’s always too high!”, but it’s not. If the last retained bit was already a 1, it’s “too low”. In fact, whether it’s too high or too low has everything to do with the original last retained bit, and nothing to do with the bits we’re throwing away. It’s not trying to be a “nearest” method at all.

It has the implementation advantage that it never requires propagating a carry, and so a finite intermediate result can never overflow to an infinity. OTOH, a very small but non-zero intermediate result can never underflow to a zero either.

Short course: stick to the default nearest-even. Kahan knows better than anyone :stuck_out_tongue_winking_eye:

That works for me! :smiley:


Again I have to thank you for your time and engagement! Continuing, as we have a
good run bringing things to the point.

Yes, I don’t care that it’s “not what people expect”.

Ok, I’ll make that one of my points: this :wink: ‘ignorance’ :wink: needs
to be better / more clearly communicated to people.

Contrived cases aren’t compelling, though. Here, let’s say we have a 2-digit decimal calculator, start with 10, and repeatedly add and then subtract 0.5.

Applause, that’s a good argument. Think about putting it in the manuals / tutorials.
But … it’s just pulling the corner cases, where it’s logical to experience
irregularities, into what I named the ‘central area’ where we normally expect
‘normal’ math.

gimmick that’s known to go off the charts without bound

I think your example would stop at 99, but you are right, it’s evil; in my system of
‘wanted mathematical properties’ it shows ‘harmed reversibility’ … at normally
far-distant corner cases.

with vicious consistency: the fast drives out the good.

Good wisdom. Two other points I want to communicate:
A.) ‘good’: there are! alternatives, at least partly better fulfilling ‘human common’
expectations,
B.) ‘slow’: decimal-compatible calculations are slower than binary, but NOT by as much
as the ‘old tales’ of 100 to 1000 times slower. On modern machines the atomic
steps are hard to measure. I just tried timing a 1000-iteration loop in Python:
decimal takes between 0.5 and 20 times the time of binary, with an estimated
mean around 4. That melts down to a few percent in the overhead of real
application programs with number crunching, and to fractions of a percent in
interactive use, so it is no longer noticeable to users.

It’s not Mark’s fault that you’re doing an inaccurate conversion to decimal :smile:.

It’s not what I wanted; I again got trapped by IEEE representation imprecision. I
propose to take ( SRT ) strings for conversion to decimal:

from decimal import Decimal as D

a = 4.4
da = D(a)
print('decimal from float 4.4  :', da)   # prints 4.40000000000000035527136788...
da = D(str(a))
print('decimal from string 4.4:', da)    # prints 4.4

the second is what users ‘see’, and would expect to be calculated.

‘this appendix’

I’d already read it more than once; it is! good, but even together with this one it
didn’t boost my knowledge as much as our discussion here.

Set decimal rounding to ROUND_UP.

thanks, I was short-sighted / confused; I think ROUND_HALF_UP is more what I was looking for.
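I checked quickly (my snippet) that ROUND_HALF_UP gives the ties-away behavior on the earlier example:

import decimal
ctx = decimal.getcontext()
ctx.prec = 16
ctx.rounding = decimal.ROUND_HALF_UP
print(decimal.Decimal('4503599627370496') + decimal.Decimal('0.5'))  # 4503599627370497
print(decimal.Decimal('4503599627370496') + decimal.Decimal('1.5'))  # 4503599627370498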

same example in C and Python,

yes, I wanted to keep it short; I wanted to point out that both systems produce ‘to even’,
improved now by your hint about where to set the rounding mode. If there is something
similar for C / gcc ‘_Decimalxxx’ I’d like to know; I haven’t found it yet.

Short course: stick to the default nearest-even. Kahan knows better than anyone

As ‘to even’ isn’t a complete solution, and neither is Kahan summation, though it’s much
better, I’d like to be able to use ‘ties away’ for single calculations and Kahan summation
for mass data; alas, current implementations claiming compatibility with ‘the standard’
don’t allow it.

It would stop at 100, but that’s incidental. I pictured a 2-digit decimal calculator so people could do it all in their head without strain. Change it to, e.g., a 9-digit calculator, and start at 100000000, and the ± 0.5 loop would go up 1 at a time for 900000000 iterations. Do a similar thing in Python decimal’s default context, with 28 digits, and the human race would probably go extinct long before it stopped going up.

Alas, that says less about the speed of decimal than about the sloth of CPython when doing native binary float arithmetic. Interpreter overhead is very high compared to the cost of HW float arithmetic. You could probably get a speedup of around a factor of 20 for simple binary float loops just by running the Python code under the PyPy implementation instead of under CPython.

However, PyPy is much slower than CPython for decimal float arithmetic.

In any case, people who care about float speed use numpy instead, which skips all interpreter overheads entirely by doing whole-array binary-float operations “in one gulp” at peak C speed. It’s of no use at speeding decimal float operations, though.

Can’t help there! Never used the decimal facilities/packages in C, C++, Java, …

The standards define the results of individual additions. They certainly don’t disallow an environment from offering other facilities, such as functions that add a list of floats, or that multiply two matrices, that define results in ways they like.

As noted before, Python’s math.fsum() gives the best possible sum (as-if to infinite precision, with just one rounding at the end). And I believe Raymond Hettinger is looking into changing CPython’s implementation of the builtin sum() so that if it detects it’s adding a sequence of floats, it will use a form of “compensated addition” (Kahan-like). Why wouldn’t he pass the sum on to math.fsum() instead? Speed again. That’s much slower than doing compensated addition. Which in turn is slower than doing the “pair sum” algorithm Mark featured in his SO post, which numpy uses to get better binary-float sums.
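For illustration, the textbook form of Kahan’s compensated summation (my sketch, not CPython’s actual code):

import math

def kahan_sum(xs):
    total = 0.0
    c = 0.0                  # running compensation for lost low-order bits
    for x in xs:
        y = x - c            # apply the correction carried from the last step
        t = total + y        # low-order bits of y may be lost here...
        c = (t - total) - y  # ...recover them algebraically
        total = t
    return total

vals = [0.1] * 10
print(sum(vals), kahan_sum(vals), math.fsum(vals))
# 0.9999999999999999 1.0 1.0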

No standard forbids any of this… Standards only say certain facilities must be made available. They don’t say other facilities are disallowed. In fact, 754 doesn’t even say, e.g., that

double add(double x, double y)
{
    return x + y;
}

in a 754-conforming implementation has to use 754’s definition of + for the add. The standard mandates nothing about how its facilities need to be made available. It’s fine if, e.g., a conforming implementation requires that the only way to get at 754’s addition is by spelling it ieee_754_conforming_addition(x, y).

It would stop at 100,

how do you display 100 on a 2-digit calculator?

the Python code under the PyPy implementation instead of under CPython.
However, PyPy is much slower than CPython for decimal float arithmetic.

for the moment I’m confused enough between import module, import module as X, from module import xxxx, from module import *, explicitly needing to name the module sometimes, sometimes not, the reach of statements … and similar. I will try to get the functionality together first, then speed.
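For my own notes, the four forms side by side:

import decimal               # qualified access: decimal.Decimal, decimal.getcontext()
import decimal as dec        # the same, under a shorter alias: dec.Decimal
from decimal import Decimal  # binds just the one name: Decimal
from decimal import *        # binds everything the module exports - risky outside the REPL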

different options / modules to calculate …

I admit having difficulties enough understanding Python’s scheme of rounding, setting ‘prec’ and ‘quantize’; I am trying to pick the ones which do what I want …
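As far as I understand it so far (my own test): ‘prec’ governs the significant digits of arithmetic results, while ‘quantize’ rounds to a fixed exponent (decimal places):

import decimal

decimal.getcontext().prec = 4
print(decimal.Decimal(1) / 3)  # 0.3333 - 4 significant digits
print(decimal.Decimal('2.5678').quantize(decimal.Decimal('0.01')))  # 2.57 - 2 decimal places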

Is joking allowed in python?

x = str(2**52)
y = "0.5"
print(x + y)   # string + is concatenation: prints 45035996273704960.5

by spelling it ieee_754_conforming_addition(x, y)

that’s cruel but illustrates very well how I feel at the moment among lots of
details which partly work, but only partly …

I haven’t worked through it yet, but I think this: Python decimal.ROUND_HALF_UP Examples
can help in gaining insight into how things work - for others seeking ‘half up’.

Scientific calculators (the only kind I’m likely to use), and the decimal module, use “scientific notation” if there are more digits than fit in the display. That is, they tack on an exponent:

>>> from decimal import * # to illustrate a point later
>>> from datetime import *
>>> ctx = getcontext()
>>> ctx.prec = 2
>>> ctx.rounding = ROUND_HALF_UP
>>> base = Decimal(10)
>>> delta = Decimal(0.5)
>>> for i in range(100):
...     print(base)
...     base += delta
...     base -= delta
10
11
12
13
... and so on ...
97
98
99
1.0E+2 # and that's how 100 is displayed with 2 digits
1.0E+2
1.0E+2
1.0E+2
1.0E+2
1.0E+2
1.0E+2
1.0E+2
1.0E+2
1.0E+2
>>> base == 100
True

While it can be convenient (saves typing) in interactive mode, from module import * should be avoided otherwise. As illustrated by the example above, it adds all sorts of names to the current namespace, and unless you’re an expert on the module there’s no clue about where names came from. That’s why I’ve never used that form of import in replies before this.

Absolutely! Python was named in honor of the British comedy group Monty Python :smiley:.

Can’t guess what that means. But it’s a pretty safe bet that whatever you’re trying is working as designed and documented.

There are learning curves to be climbed, for sure. Nothing is instantly obvious to anyone.

Absolutely! Python was named in honor of the British comedy group Monty Python :smiley:.

sorry, missed that; perhaps morphing John Cleese’s face into the python logo was an error or a joke?

confusion, joking, fooling newbies …

There are learning curves to be climbed, for sure. Nothing is instantly obvious to anyone.

for example, additional to ‘prec’ and ‘quantize’, having ‘__round__’
with a difference between ‘()’ and ‘(0)’:

import decimal
decimal.getcontext().rounding = decimal.ROUND_HALF_UP
print(decimal.Decimal('2.5').__round__())    # no argument: ignores the context, half-even
print(decimal.Decimal('2.5').__round__(0))   # with 0: quantizes using the context rounding
  
-> 2  
3  

is ‘designed and documented’?
And another problem bothering me, converting to decimal:

import decimal 
decimal.getcontext().prec = 17
decimal.getcontext().rounding = decimal.ROUND_HALF_UP
decimal.Decimal( str( 641441655184871.3 ) )
  
641441655184871.2  

has a deviation ending in ‘2’ instead of the expected ‘3’; I assume a consequence
of the different interpretation of 641441655184871.25 by ‘str’ and the decimal
conversion? I can work around it with
decimal.Decimal( '641441655184871.3' ), but that doesn’t help
when I get the value as a variable from another source:

x = 641441655184871.3 
import decimal 
decimal.getcontext().prec = 17
decimal.getcontext().rounding = decimal.ROUND_HALF_UP
decimal.Decimal( str( 'x' ) )   # note: 'x' here is a one-character string, not the variable x
  
-> InvalidOperation: [<class 'decimal.ConversionSyntax'>]  

I realized that already:
>>> 600000000000000.3
→ 600000000000000.2 , which is mimicked by / from ‘C’, having a chain of representables
600000000000000.1
600000000000000.2
600000000000000.4
while application programs like gnumeric or IEEE calculators like weitz.de count
600000000000000.1
600000000000000.3
600000000000000.4
I feel somehow set back to zero …

thus my learning curve goes up, but there are still problems …

It’s how it was designed (and, BTW, is not a design I agree with - IMO, decimal facilities should never ignore the context), but decimal.__round__() currently isn’t documented at all. There’s an issue report open against that, and a pull request proposing doc changes.

No, and your mental model remains more complicated than reality.

Literals in Python source code (and the same is true of nearly all programming languages) have meaning entirely independent of context. "123" always denotes an object of class str, 123 always of int, 1.23 always of float (binary!), and 123j always of complex. There are no literals in Python for objects of class decimal.Decimal.
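A quick way to check (the REPL shows the classes directly):

>>> type("123"), type(123), type(1.23), type(123j)
(<class 'str'>, <class 'int'>, <class 'float'>, <class 'complex'>)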

So your 641441655184871.3 literal denotes a binary float, and whether or not you import decimal first is irrelevant to that. In fact, the Python compiler created the binary float at compile-time.

That literal has more precision than a binary float can capture, so some is rounded away, leaving this decimal approximation to the actual binary float stored (this is covered in the tutorial’s floating-point appendix, which perhaps you should read again):

>>> 641441655184871.3
641441655184871.2

That’s where the trailing 2 comes from, and neither decimal nor str() really have anything to do with that (apart from that str() produces the shortest decimal string that converts back to the original float passed to str()).

Sorry, but I’m lost now as to what you really want.

hi, and thanks for still watching this thread,

I agree with you and have commented.

but that doesn’t help when I get the value as variable from another source: … lost

having another application interpreting float 600000000000000.25 as ~00.3 ( ties away ), I’d like an opportunity to get that handled as ~00.3 by python decimals too, to become able to calculate 600000000000000.3 + 0.1 to 600000000000000.4 instead of 600000000000000.3 .
I found a workaround to export as string from the other application; cumbersome, an option to convert ‘ties away’ in python would be more convenient.
General proposal … as python is designed and able to handle ‘contexts’ … what about implementing something similar for floats? or at least for float → string conversion?

I don’t know what “~00.3” means to you. If you mean to say 600000000000000.3, just say that. It doesn’t really help to make up your own private shorthand notations on the fly.

That aside, I don’t understand the example even if I guessed right about what you meant. Your input is exactly representable as a binary float, so rounding has nothing to do with it. Perhaps you’re suffering from this illusion:

>>> x =  600000000000000.25
>>> x
600000000000000.2

That’s because str(float) and repr(float) create the shortest decimal string that can reproduce the input. That never requires more than 17 significant digits, but in this case it just so happens that 16 digits are enough. But no information was actually lost, as can be seen by forcing it to display more digits, or converting it to Decimal (which shows the infinitely precise decimal value of any finite binary float):

>>> format(x, '.5f')
'600000000000000.25000'
>>> from decimal import Decimal as D
>>> D(x)
Decimal('600000000000000.25')

Again I have to guess too much at what you mean. Are those Python literals? Exact mathematical values? Is + supposed to mean mathematical addition - binary float addition - some hypothetical addition operation? If at all possible, show complete (input and output) actual code examples, instead of relying on inherently imprecise English. For example,

>>> 600000000000000.3 + .1
600000000000000.4

So your specific example already displays what you said you want it to display.

It’s again a case where nearest-even vs half-up is irrelevant, because there isn’t a “halfway” case involved. With more precision:

>>> import decimal
>>> ctx = decimal.getcontext()
>>> ctx.prec = 100
>>> D(600000000000000.3) + D(.1)
Decimal('600000000000000.3500000000000000055511151231257827021181583404541015625')

The infinitely precise result is strictly larger than 600000000000000.35, so under any form of “nearest” rounding, rounding to a single digit after the radix point must round up to 4.

As I said at the very start, unless you’re a numeric expert who knows exactly what they’re doing, sticking to strings is your very best approach. But don’t try to sell me on “cumbersome” - I don’t know of any numeric environment under which it isn’t dead easy to get a string representation of floats :wink:.

IMO that’s extremely unlikely to happen in CPython. But I’ve explained “why” before.


x = 600000000000000.3
y = 0.1
print( x )
print( x + y )
→ 600000000000000.2
600000000000000.4

lacks some consistency; I had hoped there could be a way to convert to decimal and get consistency back …

Sorry, but I’m still lost. Binary floats are emphatically not stored in base 10. Complaining about their lack of “consistency” when viewed as decimal values is as futile as complaining that Unicode strings aren’t tasty chocolate :wink:.

If you want decimal “consistency”, you need to work in base 10. That’s what the decimal module is for. Then you get “consistency” regardless of whether you work directly from binary floats:

>>> from decimal import Decimal as D
>>> x = D(600000000000000.3)
>>> y = D(0.1)
>>> print(x)
600000000000000.25
>>> print(x+y)
600000000000000.3500000000000

or instead work from exact decimal inputs:

>>> x = D('600000000000000.3')
>>> y = D('0.1')
>>> print(x)
600000000000000.3
>>> print(x+y)
600000000000000.4

What you cannot - and can’t ever - do, is expect the Decimal constructor to guess at information that isn’t present in its input. It’s doing the best that is possible to do (“as if to infinite precision with one rounding at the end”) with the information you give it.

:chocolate_bar:

Not nearly tasty enough. Also, now my screen has lick marks on it. 0/10 would not recommend.


Hi,
If you want a good tool to explore IEEE floating numbers, look at https://www.h-schmidt.net/FloatConverter/IEEE754.html
It’s not for doubles, but it does show how rounding etc. works at the bit level.
John

@TimPeters, just in case you are still reading here:

I’m not complaining about fewer sweets, but about the lack of consistent math with bin-float values, and I think I actually have formal proof that they don’t qualify as ‘numbers’: numbers to drive math with are required to have two operations and fulfill distributivity, which IEEE binaries don’t provide. In consequence, consistency isn’t possible. But I’d like to get as close as possible, which alas isn’t the direction many devs take.

A question aside from the topic of this thread, but to show that your hints get attention: I started to look at mpdecimal, and am starting to like it, but suffer from a point which looks cumbersome to me. It’s all wrapped in functions working on pointers to mpd_t variables, and these functions do not return values but set the pointed-to values, which requires several steps of assigning if one wants to try ‘mixed math’. Did I overlook something, or did someone, for the integration into python, invent something like ‘mpd_t_from_bin64( x )’ or ‘mpd_t_from_uint64( il )’ or similar, which returns something that can be used in the functions it’s nested in?

Let me give an example, calculating a broken-exponent power of an arbitrary integer. I’m used to writing something like

‘pow( n, x )’

, and would think something like

‘mpd_to_bin64( mpd_pow( mpd_from_int( n ), mpd_from_bin64( x ) ) )’

would be a convenient substitute, while

  • create mpd variable a,
  • create mpd_variable b,
  • create mpd variable result,
  • set a from n,
  • set b from x,
  • calculate mpd power,
  • set bin result from mpd result,
  • destroy mpd var a,
  • destroy mpd var b,
  • destroy mpd var result …

is a clear structure, but ‘less easy’ for me to write, and perhaps also time-consuming
for the computer?
( I’m not very experienced in coding as you see, pls. bear with me. )

You’re on the right track. Coding in C is painful. That’s why we use Python :wink:.

It’s worse than you’re thinking, though, because essentially every operation can fail, so after almost every operation you have to check for an error, and decide what to do about it. So your code needs to be more like:

  • create mpd variable a,
  • If that failed (e.g., out of memory), return your own error indication.
  • create mpd_variable b,
  • If that failed, release the memory you used for variable a, and return your own error indication.
  • create mpd variable result,
  • If that failed, release the memory you used for variables a and b, and return …
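For contrast, Python’s decimal module (built on libmpdec) wraps all of that plumbing and error handling for you; a sketch of what the whole pow dance above collapses to (n and x here are hypothetical inputs):

from decimal import Decimal

n, x = 7, 0.5  # hypothetical inputs: integer base, binary-float exponent
result = float(Decimal(n) ** Decimal(x))  # create, compute, convert - one line
print(result)  # ~2.6457513, i.e. 7 ** 0.5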

Hardware floating point incurs none of those costs. That’s why it’s very much faster. And, as noted before, “the fast drives out the good” for most applications most of the time.

Mathematically, hardware floats are a finite subset of the rationals (plus some oddball NaN and Inf elements), but a set that isn’t even closed under simple arithmetic. “Elegance” isn’t its goal. “As elegant as possible without sacrificing much in peak speed” is more like it.
