Rounding to significant figures - feature request for math library

The Python standard library doesn't have any function (that I am aware of) to round a number to a number of significant figures. This is a very useful thing to be able to do, and I have written a simple function for it for 2-3 projects now. We have the built-in round() and math functions such as ceil() and floor(), but nothing for rounding to a number of significant figures. Could we add this feature to the math library?

I see quite a bit of activity on this in blog posts and on Stack Overflow. For example, here is a Stack Overflow thread on this very topic:
math - How to round a number to significant figures in Python - Stack Overflow

Implementing it in the math library would be trivial, I guess. I've got a simple Python function in a Gist that does the job.

Have you considered adding something similar to the math library? I'm sure there are more efficient ways to do it than my little function, but either way it feels like a quick one to implement.
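For reference, here is a minimal sketch of the kind of function I mean (my own naming, not the exact code from the Gist), using math.log10 to work out how many decimal places to pass to round():

```python
import math

def round_sig(value, figures):
    """Round value to the given number of significant figures (sketch)."""
    if value == 0:
        return 0.0
    # The position of the leading digit determines how many decimal
    # places we need to keep to retain `figures` significant digits.
    places = figures - int(math.floor(math.log10(abs(value)))) - 1
    return round(value, places)
```

For example, round_sig(1234.5678, 2) gives 1200.0 and round_sig(0.0012345, 2) gives 0.0012.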

We already have a builtin that does this:

>>> import math
>>> math.pi
3.141592653589793
>>> round(math.pi, 2)
3.14
>>>

Something like this?

float('%.*g' % (figures, value))

Or this?

float(f'{value:.{figures}g}')
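For example (taking math.pi as the value and 2 significant figures), both spellings give the same result:

```python
import math

value, figures = math.pi, 2
a = float('%.*g' % (figures, value))   # old-style % formatting
b = float(f'{value:.{figures}g}')      # f-string formatting
print(a, b)  # both 3.1
```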

Could you describe some of the use-cases you’ve found in practice? It’s certainly a common need to be able to display computed values to a given number of significant figures, but that need is already covered by string formatting. I’ve encountered few cases where I want to round to a float and then continue computing with the rounded value - in that situation the rounding operation is usually throwing away accuracy, and it’s better to continue the computation with the unrounded value instead.

@guido: that’s rounding to 2 decimal places, not significant figures.

3.14 is three significant figures. And in case you are tempted to imagine that the difference between the two is always just 1 digit, not so:

  • 1234.5678 rounded to 2 decimal places would be 1234.57;
  • 1234.5678 rounded to 2 significant figures would be 1200.
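To see the difference in code (using the string-formatting one-liner suggested upthread for the significant-figures case):

```python
x = 1234.5678
print(round(x, 2))        # 2 decimal places -> 1234.57
print(float(f'{x:.2g}'))  # 2 significant figures -> 1200.0
```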

Calculators and programming languages typically only support rounding to decimal places, despite the fact that in practice, rounding to significant figures is much more important to actual human scientists and engineers.

I expect that the reason for this is that to do round-to-sig-figs properly, you need a numeric type like Decimal with variable precision. A fixed precision type like float makes it impossible to distinguish between 1200 (two sig figs) and 1200.00 (six sig figs).
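decimal.Decimal can carry exactly that distinction:

```python
from decimal import Decimal

two_sf = Decimal('1.2E+3')   # 1200, two significant figures
six_sf = Decimal('1200.00')  # 1200.00, six significant figures
print(two_sf == six_sf)          # True: equal in value
print(str(two_sf), str(six_sf))  # '1.2E+3' '1200.00': distinct precision
```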

You can use this for limiting the precision of the mantissa of a number to what e.g. an instrument can actually capture (e.g. measuring values with 1% accuracy). Unlike with normal rounding, the exponent does not matter.

Leaving extra digits in place can lead to amplification of errors in subsequent calculations, so rounding to a known number of significant digits early can help reduce such errors.

That said, I believe a function rounding to a known precision given as percentage would probably have more practical value than using the number of significant digits.

Nope, sorry. I’m not buying it. Exactly the opposite is true - it’s the rounding that can lead to amplification of errors in subsequent calculations.

Example: you measure the diameter and the circumference of your dinner plate, because you need to get a value of π that you're going to use in further calculations. The diameter turns out to be 18.2cm; the circumference 57.2cm. You divide to get an approximation 3.1428571428571432… to π. Now you round, to 3 significant figures say, on the basis that your measuring tools only give you that much accuracy, to get 3.14. That's a worse approximation than the one you started with. And you could do statistical simulations to convince yourself that this isn't an isolated example – by rounding you're increasing the variance of the error in your calculations.
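Running the numbers (rounding to 3 significant figures via the string-formatting trick mentioned upthread):

```python
import math

approx = 57.2 / 18.2              # 3.142857142857143
rounded = float(f'{approx:.3g}')  # 3.14
print(abs(approx - math.pi))      # ~0.0013: error before rounding
print(abs(rounded - math.pi))     # ~0.0016: error after rounding is larger
```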


Here’s NIST’s “Good Laboratory Practices” on the subject:

Note: do not round intermediate calculations; rounding intermediate values can cause rounding errors in the final results and should only take place after the final expanded uncertainty has been determined.

Source: https://www.nist.gov/system/files/documents/2019/05/14/glp-9-rounding-20190506.pdf


Yes @steven.daprano has it right. Lots of uses for this in the scientific community.

Here's a use case for why I needed significant-figures rounding today. I have been working on my Python trading bot that trades across thousands of markets with very different price levels. Some assets are priced in the thousands of dollars and some at tiny fractions of a dollar.

My algorithms produce price predictions at full floating-point precision. But there is huge uncertainty, so I only want to keep each prediction to 2 significant figures.

For example:
Asset A - prediction 234563.44566 → $ 230,000 (to 2 sig figs)
Asset B - prediction 0.1175584566 → $ 0.12 (to 2 sig figs)

So for each asset the amount of decimal places to round to depends upon how big the number is, hence the log in the script I shared here.

Hope that makes more sense.
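The one-liner suggested upthread handles both price scales without any per-asset logic:

```python
for prediction in (234563.44566, 0.1175584566):
    # 'g' formatting picks the right number of decimal places automatically
    print(float(f'{prediction:.2g}'))  # 230000.0, then 0.12
```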

I guess I wasn’t clear. You don’t round intermediate results. You round your measurements before starting the calculation. E.g. say your thermometer shows 12.76°C, but you know that it only has a precision of 3 digits. You’d then use 12.8°C to start off your calculation.

An alternative would be using interval arithmetic for your calculation (you then start with intervals around the measured values, based on the precision of the device), but that’s not as generally useful :slight_smile:
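A toy illustration of the idea – this is not MPFI, just a hand-rolled sketch with made-up names, assuming each measurement comes with a relative precision:

```python
def interval(value, rel_precision):
    """Interval around a measurement, e.g. rel_precision=0.01 for 1%."""
    delta = abs(value) * rel_precision
    return (value - delta, value + delta)

def mul(a, b):
    """Product of two intervals: extremes of all endpoint products."""
    products = [x * y for x in a for y in b]
    return (min(products), max(products))

# A 1%-accurate temperature times a 1%-accurate coefficient:
lo, hi = mul(interval(12.8, 0.01), interval(2.0, 0.01))
print(lo, hi)  # the true product is guaranteed to lie in [lo, hi]
```

(A real implementation would also round the interval endpoints outward; the sketch ignores that.)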

BTW: There's a nice library for doing interval arithmetic, called MPFI (see Nathalie Revol's home page). It's based on MPFR and GMP. I'm not aware of a Python wrapper for this, though.


It is still not clear why you round the values. Normally you only need to round them at the output, e.g. using a format specification like f'{value:.{figures}g}'.

Rounding before output makes sense only in special cases (mainly for optimization):

  • to occupy less memory
  • to make operations faster

These optimizations do not hold for floating point numbers of fixed size (like float).

How can this make the values closer to the real values? If it does not make them closer, it can only move them further away.

The point is not to make the values closer to the real values, but to use only values with a precision that can be trusted.

In the above example, the thermometer should really be showing 12.8°C instead of pretending to know better and displaying 12.76°C, but you often have instruments which show more digits than can be trusted.

Also note that you typically work with lots of measurements, so it’s better to rely on statistics to get closer to the actual values rather than pretending that extra digits in individual values give you more accuracy (they could very well be the result of noise, offsets, missing calibration, etc.).


But then you have the initial error (noise, offsets, missing calibration, etc.) in addition to the rounding error. How is that better?


If the thermometer measures a temperature of 12.7613764°C but shows you 12.76°C, it adds an error of 0.0013764°C, which is fine, because it is insignificant in comparison with the measurement error. If you write it as 12.8°C, you add an error of about 0.04°C, which may not be so good and can increase the first digit of the error. If you write it as 13°C, you add an error of about 0.24°C, which is larger than the original error.
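The arithmetic, spelled out:

```python
true_value = 12.7613764
# Each coarser rounding of the display adds a larger error:
for shown in (12.76, 12.8, 13.0):
    print(shown, abs(shown - true_value))
```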

There is a benefit to rounding numbers – we humans are bad at recognizing and remembering long sequences of digits. But it has a cost (increasing the resulting error), and we should take it into account. For computers there is no benefit in rounding input data, because a rounded floating point value takes the same amount of memory to store and the same time to process.

In any case, the number of significant decimal digits is a bad metric for precision. The relative difference between 99.8 and 99.9 is 0.1%, but between 0.00123 and 0.00124 it is almost 1% – 10 times larger. In both cases it is 3 significant digits.
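The relative differences in that example:

```python
def rel_diff(a, b):
    """Relative difference between two adjacent representable values."""
    return abs(a - b) / min(a, b)

print(rel_diff(99.8, 99.9))        # ~0.001, i.e. 0.1%
print(rel_diff(0.00123, 0.00124))  # ~0.008, i.e. almost 1%
```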


To steer this thread back on topic a bit: I’m not convinced that the proposed addition is a good fit for the math module. In most of the cases where I’ve seen people asking for this on Stack Overflow, what those people actually turn out to want is rounding for output (that is, formatting to a string) rather than rounding to another float. And that makes sense: you usually convert back to a human-readable string in human-friendly format at the end of a machine computation, when you’re crossing back over the interface from computer to human. The same applies to the existing round function - it’s surprising how rarely two-argument round - as opposed to string formatting - is actually the right solution in practice. The number of instances of questions on Stack Overflow where the questioner is complaining that round isn’t giving them the right number of trailing zeros is telling.

There are technical issues too: the result of rounding to a number of significant figures, as proposed, would be another binary float, which means both that it doesn’t keep track of significant zeros (as Steven pointed out upthread), and also that it’s not actually representing the value that you want, thanks to the usual What-You-See-Is-Not-What-You-Get nature of binary floating-point. If you round 32.702715 to four significant figures, what you likely want is 32.70 (where that last zero is treated as significant), but what you’d get would be something that displays as 32.7, and whose actual value is 32.7000000000000028421709430404007434844970703125. The WYSINWYG part also bites on the input side, of course - we’d get the usual complaints about round_to_sig_figs(2.675, 3) giving 2.67 rather than the expected 2.68.
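Both effects are easy to reproduce, using the formatting one-liner as a stand-in for the proposed function:

```python
from decimal import Decimal

x = float(f'{32.702715:.4g}')  # round to four significant figures
print(x)             # 32.7 - the significant trailing zero is gone
print(Decimal(x))    # the value the float actually stores
print(float(f'{2.675:.3g}'))   # 2.67, not 2.68: 2.675 is stored slightly low
```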

For the particular financial-domain use-case you present, it sounds as though the natural form of the output, if not simply a string, might be a decimal.Decimal instance rather than a float. I believe that a round-to-some-number-of-significant-figures operation could make sense for decimal.Decimal (in fact it’s already there in the internals), but that’s a different proposal.
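For what it's worth, the decimal module can already express the operation through a context's precision (applying it to the price-prediction numbers from upthread):

```python
from decimal import Context

ctx = Context(prec=2)  # two significant digits
# create_decimal rounds its argument to the context's precision
print(ctx.create_decimal('234563.44566'))  # 2.3E+5
print(ctx.create_decimal('0.1175584566'))  # 0.12
```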

Added to that, as Serhiy demonstrates, there’s a one-line solution available for those rare cases that a float-to-float round-to-significant-figures operation really is what’s needed, and that one-line solution is not subject to problems with math.log10 inaccuracy.


what those people actually turn out to want is rounding for output (that is, formatting to a string) rather than rounding to another float.

Maybe part of the problem is that this is much more difficult to understand and remember than round(value, figures)?

The documentation on round() does not mention or link to string formatting alternatives.

The string formatting documentation is probably impossible to understand for non-experts.


Agreed. It would be better to round to precision intervals (e.g. expressed as percentages) as basis for working on input values. Even better is to use proper tools such as MPFI.

Rounding based on significant number of digits is common for creating output values, as Mark already mentioned.

Also agreed :slight_smile: The use case is rather special and not something that needs to be in the stdlib.


Built-in round() is a dinosaur. It predates %-style formatting and was inherited from ABC.

I think there is room for improvement.

>>> import numpy as np
>>> np.round(0.005,2)
0.0
>>> round(0.005,2)
0.01

I'm not sure how bad this mixing of rounding behaviours is.
It may be subtle, but no one is happy to discover it – unless you are a tax officer :slightly_smiling_face: