Allow fractions.Fraction to return 1/6 for "0.17", "0.167", "0.1667", etc

Leengit · February 18, 2022, 10:20pm

I would be happy to see this functionality available now in simplefractions and/or any other similar PyPI library. If at a later date we somehow reached consensus to add it to stdlib too, we could do that then.

pf_moore · February 18, 2022, 10:50pm

Yes, you understand my position correctly. I’d view 0.3, 0.7, 5.3 and 1000.3 differently, because they are an exact number of tenths, and 10 is a small enough denominator that I’d expect a user to naturally think in terms of tenths. By contrast, I’d be less convinced that someone would intuitively feel the difference between 33/100 and 34/100, so viewing both of them as 1/3 seems natural to me (ironically, 0.34 is interpreted as 17/50, so I guess that’s another discrepancy).

Your definition is entirely reasonable on its own terms (the specification is clear and understandable). My expectations, on the other hand, have too much of a “do what I mean” flavour to be implementable. So I’m not trying to say your implementation is wrong, just that it violates my expectations in a number of cases which feel common enough to me that I wouldn’t use it in practice. Whereas simplefractions.simplest_from_float() gives me the answer I’d expect in those cases (the trade-off being, it gives more complex fractions for things like 0.3333333).

Leengit · February 18, 2022, 11:12pm

In case it is not futile to look at this yet another way …

I am imagining that we would be modifying the fractions.Fraction constructor so that instead of

def __new__(cls, numerator=0, denominator=None, *,
    _normalize=True):

it would be something like

def __new__(cls, numerator=0, denominator=None, *,
    _normalize=True, assume_rounded=False):

The functionality I propose would come into effect only if the user constructed the fraction via something like

a = fractions.Fraction("0.3", assume_rounded=True)

In particular, if you are thinking “0.3” is exact, then maybe you won’t be setting assume_rounded=True in the first place. And if you do happen to think it might be rounded in some peculiar case then would returning a value of 1/3 be what you would want in that case?

(Apologies if the name assume_rounded is not according to convention or is otherwise inappropriate. In that case, let’s change it.)

fungi · February 18, 2022, 11:20pm

One way of looking at the differences between 0.3, 0.7, 5.3 and
1000.3 is that they have 1, 1, 2 and 5 significant digits
respectively, so have different levels of precision on their own.
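Those digit counts can be checked with the standard decimal module. This is only a sketch: the helper name significant_digits is made up for illustration, and it assumes that "significant" means the digits Decimal keeps after dropping leading zeros.

```python
from decimal import Decimal


def significant_digits(text):
    """Count the digits Decimal stores for a decimal string.

    Leading zeros are dropped, trailing zeros are kept.
    (Hypothetical helper, not part of any library.)
    """
    return len(Decimal(text).as_tuple().digits)


counts = [significant_digits(s) for s in ("0.3", "0.7", "5.3", "1000.3")]
print(counts)
```

Under the same assumption, "0.70" counts as two significant digits while "0.7" counts as one.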

steven.daprano · February 19, 2022, 1:06am

Lee said:

“I would expect "0.15" -> 2/13 and "0.150" -> 3/20 and "0.3" -> 1/3 and "0.7" -> 2/3. In particular, in a situation where something is reported to the nearest tenth, my first assumption for “0.7” would be that it comes from 2/3.”

These are very odd expectations. Under what circumstances are you expecting somebody who wanted 2/13 to write 0.15 instead of 2/13?

I think the problem is not in your description of what is getting computed, but why it is being computed.

To me, this requested feature seems horribly like DWIM:

http://www.catb.org/jargon/html/D/DWIM.html

It feels to me like you are coming from the perspective of somebody who knows that they want 2/13 as an exact fraction, but for some mysterious reason is forced to write it as 0.15 rather than “2/13” or 0.15384615384615385 or even 0.153846 like you might see on a cheap $2 calculator, and wants the Fraction constructor to Do What I Mean and return that 2/13 fraction.

But imagine that you were somebody who actually wanted 6/41, or 9/59. These are no weirder fractions to desire than 2/13, they also round to 0.15 if forced to use only 2 decimal places, and they are closer to 0.15 than 2/13. Your DWIM function fails to return them. What a disappointment you would feel.

There are rather a lot of possible fractions which might also have been written as 0.15, starting with the fraction 3/20 which is exactly 0.15, and it really isn’t clear why 2/13 should be considered “better” or “more likely”, rather than (say) 19/130 (equally close to 0.15, but from the opposite direction).

I don’t think that you are trying to just minimise the denominator, without caring about being the closest rational approximation. But perhaps I’m wrong?

There are 50 rational pairs where both the numerator and denominator are no greater than 100 which are rounded to 0.15 to two decimal places. After normalising by cancelling common factors, there are 31 such fractions.

One of those is exactly 0.15 (3/20). Another 22 of them are closer to 0.15 than 2/13.

If the true value might be any fraction that rounds to 0.15, why is 2/13 a better guess than 6/41 or 9/59?
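For anyone who wants to reproduce that kind of count, here is a rough brute-force sketch. The variable names are illustrative, and "rounds to 0.15" is taken to mean "lies within 0.005 of 0.15".

```python
from fractions import Fraction

target = Fraction("0.15")
half_step = Fraction("0.005")  # half of the last printed decimal place

# Every reduced p/q with 1 <= p, q <= 100 that lies within half a step of 0.15.
# The boundaries 0.145 and 0.155 are 29/200 and 31/200 in lowest terms, so no
# fraction with denominator <= 100 sits exactly on them and the tie-breaking
# rule for rounding does not affect the result.
candidates = {
    Fraction(p, q)
    for q in range(1, 101)
    for p in range(1, 101)
    if abs(Fraction(p, q) - target) < half_step
}

reference = Fraction(2, 13)
# Fractions strictly closer to 0.15 than 2/13 is (this includes 3/20 itself).
closer = sorted(f for f in candidates if abs(f - target) < abs(reference - target))

print(len(candidates), "distinct fractions round to 0.15")
print(len(closer), "of them are closer to 0.15 than 2/13")
```

The set comprehension stores reduced fractions, so pairs such as 3/20 and 6/40 collapse into a single entry.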

Leengit · February 19, 2022, 4:14pm

I saw a talk recently where it was reported that the approach did the right thing in 90.3% of their test cases. I knew that they didn’t have that many test cases and was quickly able to estimate that they were achieving 28/31 by using best_fraction("90.3e-2").

I think that this is typical. When the denominator is relatively low, there are many situations where people will instinctively provide enough digits for it to be reconstructed. I would like to provide a tool that does this computation. That people might apply the tool in situations where it is not applicable … is a problem that pretty much all tools can suffer from. I suppose we could put child restraints on – e.g., so that the algorithm would refuse to run unless there were at least N significant digits in the input – but my inclination is to let users decide for themselves.

mdickinson · February 19, 2022, 8:16pm

Sure. But that sounds like a case where what’s clearly wanted is the simplest fraction in the interval (0.9025, 0.9035). With the current description of best_fraction, as a user I couldn’t be sure whether it was going to give me the simplest fraction, or whether it was going to give me something else because that something else is closer to the value 0.903 than the simplest fraction.

mdickinson · February 19, 2022, 8:30pm

To give one example along the same lines: suppose it’s reported that 28.3% of voters responded “yes” in some poll, and you want to figure out the smallest possible number of people involved in that poll. The answer is 46 (13/46 = 0.282608…). But best_fraction("0.283") gives 15/53 instead, and I have no idea why 15/53 should be considered the “best” approximation to 0.283.
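The poll question can be answered by brute force over denominators. This is a minimal sketch, assuming the reported percentage was rounded half-up so that the true share lies in [0.2825, 0.2835); smallest_denominator is a made-up name, not a function from any library.

```python
from fractions import Fraction
from math import ceil


def smallest_denominator(low, high):
    """Return the fraction with the smallest denominator in [low, high).

    Assumes low < high; otherwise the loop would never terminate.
    """
    q = 1
    while True:
        p = ceil(low * q)  # smallest numerator with p/q >= low
        if Fraction(p, q) < high:
            return Fraction(p, q)
        q += 1


# 28.3% reported to one decimal place of a percent.
poll = smallest_denominator(Fraction("0.2825"), Fraction("0.2835"))
print(poll)  # 13/46
```

The first denominator that admits any numerator inside the interval is the smallest possible number of respondents.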

ericvsmith · February 19, 2022, 8:35pm

I’m opposed to this being in the stdlib. It should be on PyPI. If it’s clearly a success there, then we could talk about including it in Fractions.

Leengit · February 19, 2022, 10:00pm

You make a good point. The result from simplefractions is not my first choice but it is easier to explain than my approach, and I would accept that instead of the implementation above for best_fraction. In that case I would like the fractions.Fraction constructor to take an assume_rounded parameter (or some other name) that defaults to False. When set to True, it would infer the interval as best_fraction currently does and then find the simplefractions answer within that interval.
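To make that two-step idea concrete, here is a rough sketch. It is not the simplefractions or best_fraction code: both function names are invented for illustration, the interval is assumed to be closed and half a unit in the last written place on either side, and only positive inputs are handled. The search uses the usual continued-fraction recursion for the simplest rational in an interval.

```python
from decimal import Decimal
from fractions import Fraction
from math import floor


def simplest_between(low, high):
    """Simplest fraction in the closed interval [low, high], for 0 < low <= high."""
    n = floor(low)
    if n == low:
        return Fraction(n)      # low itself is an integer
    if n + 1 <= high:
        return Fraction(n + 1)  # an integer lies inside the interval
    # Both ends share the integer part n: recurse on the reciprocals of the
    # fractional parts, which swaps the roles of the two endpoints.
    return n + 1 / simplest_between(1 / (high - n), 1 / (low - n))


def from_rounded_string(text):
    """Hypothetical stand-in for Fraction(text, assume_rounded=True)."""
    value = Decimal(text)
    # Half a unit in the last place that was written, e.g. 0.05 for "0.3".
    half_step = Fraction(1, 2) * Fraction(10) ** value.as_tuple().exponent
    low, high = Fraction(value) - half_step, Fraction(value) + half_step
    if low <= 0:
        raise ValueError("this sketch only handles strictly positive intervals")
    return simplest_between(low, high)


for text in ("0.17", "0.3", "0.7", "0.15", "0.150", "0.283", "90.3e-2"):
    print(text, "->", from_rounded_string(text))
```

Because the interval is read off the string, "0.15" and "0.150" give different answers, which is the behaviour being debated in this thread.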

If there is support for the extra bells and whistles, we could support both the simplefractions and best_fraction approaches by allowing assume_rounded to support multiple values, or by introducing additional defaulted parameters.

Whether or not this is incorporated into fractions, I would be happy to see this kind of functionality in simplefractions.

pf_moore · February 19, 2022, 10:56pm

Why not just publish it yourself? It doesn’t need to be incorporated into another package to be useful…

pyby · February 23, 2022, 4:55pm

Agreed. I think the main issue is that, for example, with this function, 0.7 gives back 2/3. While that may be the case sometimes, I don’t really see why anyone would use 0.7 to represent 2/3 unless specifically needing one decimal place. Yes the math works but then fractions that are exact like 7/10 are ignored in situations like this. I think it’s only useful if you give 3 or more decimal places. Or, if there’s an exact match, you give that instead. Or even better, multiple options are given. Otherwise, it’s pointless and inaccurate.

Leengit · February 23, 2022, 5:10pm

I am confused by this response. If you want “0.7” to be 7/10 then you don’t set assume_rounded=True in the fractions.Fraction constructor. You set assume_rounded=True when you want “0.7” to be the simpler fraction 2/3.

pyby · February 23, 2022, 5:18pm

So basically what you’re asking to be added is just an extra parameter to a command? Because if that’s the case then that would probably be fine.

steven.daprano · February 26, 2022, 1:14pm

Lee said:

“If you want “0.7” to be 7/10 then you don’t set assume_rounded=True in the fractions.Fraction constructor. You set assume_rounded=True when you want “0.7” to be the simpler fraction 2/3.”

What if I want 0.7 to be 5/7, or 8/11, or 9/13, or 11/16, or 12/17, or one of the many dozens of other simple fractions that round to 0.7?

More to the point, if the only information I have is that when rounded as a decimal, my fraction is 0.7, how can I know which of the many fractions that round to 0.7 is the one I want?

Leengit · February 28, 2022, 5:46pm

“how can I know which of the many fractions that round to 0.7 is the one I want?”

Indeed there are infinitely many. This function (as amended by the suggestion of @mdickinson) chooses the simplefractions answer, which is the fraction with the lowest numerator and denominator. If that is not what the user wants then the user would not request the functionality. Would it help if the name of the flag that selects this behavior (and which is False by default) were assume_rounded_from_simplest_fraction?

pf_moore · February 28, 2022, 7:15pm

I think the point that you’re missing here is that you’re proposing that one of a multitude of possible alternative behaviours is worth adding to the Fraction constructor. But you don’t provide any sort of justification for why that specific behaviour is sufficiently important to add, when the others are not.

We get that the new behaviour is optional. We get that people shouldn’t enable it if it’s not what they want. What you haven’t explained is why people who want a different behaviour aren’t also entitled to having an option to enable their preferred behaviour.

zware · February 28, 2022, 7:47pm

Instead of a boolean “assume it’s rounded” option, what about an “allowed error” option that, if non-zero, specifies the maximum absolute amount by which the returned Fraction may deviate from the exact value after simplification? If you know how much rounding you’re dealing with, you can specify a constant value; otherwise you can calculate a value based on your input in whatever way makes sense for your application.
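A possible shape for such an option, written as a standalone function rather than a constructor argument. The name within_error and its signature are assumptions made for this sketch, and the search is a plain loop over denominators rather than anything optimised.

```python
from fractions import Fraction


def within_error(value, max_error):
    """Lowest-denominator fraction no further than max_error from value.

    With max_error == 0 this simply returns the exact value.
    """
    value, max_error = Fraction(value), Fraction(max_error)
    if max_error < 0:
        raise ValueError("max_error must be non-negative")
    q = 1
    while True:
        candidate = Fraction(round(value * q), q)  # nearest p/q to value
        if abs(candidate - value) <= max_error:
            return candidate
        q += 1  # terminates at q == value.denominator at the latest


print(within_error("0.7", "0.05"))  # 2/3
print(within_error("0.7", 0))       # 7/10
```

The existing Fraction.limit_denominator method takes the opposite knob: it bounds the denominator and minimises the error, whereas this sketch bounds the error and minimises the denominator.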

EpicWink · February 28, 2022, 11:13pm

It’s not always the case that the user choosing which functionality to use (the software developer) is the same as the user entering the input (the end user, or a client application).

steven.daprano · February 28, 2022, 11:51pm

“assume_rounded_from_simplest_fraction”

I want to say “you cannot possibly be serious” but I fear that you actually are.

Not every functionality needs to be crammed into the default constructor. Boolean flag arguments are a code smell (if not outright anti-pattern).

So your proposal to add this to the constructor is already a bit whiffy. But at the point that we are proposing a parameter name which is more than double the length of the fully qualified class (fractions.Fraction), it positively reeks.

So let’s think a bit harder about the API:

What is it that this thing actually does? Give a short but descriptive name for it.
What arguments does it accept? Just strings, or floats and Fractions, etc.?
Obviously it returns a Fraction. So it could be an alternative constructor, like Fraction.from_decimal and from_float, or a method like limit_denominator.
If this is based on the continued fraction algorithm for Best Rational Within An Interval, the obvious API is for a method that allows the user to provide the interval.

Consider the design principles expressed in the Zen of Python (import this). While they aren’t necessarily intended to be taken entirely seriously, they do offer some good guidelines to think about. In this case, I argue that the koan “Explicit is better than implicit” applies.

If your aim is to have a function that takes a decimal written as a string, and returns the best rational approximation to that assuming that the string was rounded to N digits, then your method should take two arguments: the decimal string, and the number of digits.

Don’t rely on trailing zeroes being significant:

"%.2f" % (19/27)  # 0.70
"%.2g" % (19/27)  # 0.7

although I guess it wouldn’t be too bad if the user could explicitly opt in to “guess the number of significant figures from the string”, e.g. if you pass -1 as the number of digits.
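As a sketch of what the explicit version might look like, the function below only computes the interval implied by the rounding. The name rounding_interval is invented here, "number of digits" is assumed to mean decimal places rather than significant figures, and -1 is used as the opt-in for guessing from the string.

```python
from decimal import Decimal
from fractions import Fraction


def rounding_interval(text, digits=-1):
    """Return (low, high) for a decimal string rounded to `digits` decimal places.

    digits=-1 opts in to guessing the number of places from the string itself.
    (Hypothetical helper; it does not check that text fits in `digits` places.)
    """
    value = Decimal(text)
    if digits == -1:
        digits = -value.as_tuple().exponent  # places actually written
    half_step = Fraction(1, 2) / Fraction(10) ** digits
    center = Fraction(value)
    return center - half_step, center + half_step


print(rounding_interval("0.7", 2))  # same interval as "0.70" with guessing
print(rounding_interval("0.7"))     # guessed: one decimal place
```

Passing the digit count explicitly means "0.7" and "0.70" describe the same interval when the caller says so, regardless of how the string happened to be formatted.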