I have recently started learning Python using online courses.
While watching a video on Python this evening, I saw the comparison operators discussed:
=, ==, !=, <, >, <=, >=
During the video I wondered whether an approximate operator (a wavy equals sign) would be useful in Python. I am not sure if it already exists in some way (but I don’t think a specific one does).
I am not entirely sure what the uses could be for an operator like this, but I am sure there could be many. For instance, it could be a way to analyse scientific data for results that are similar to a specific answer. A range could be applied, for instance ± 10% or 5%.
An approximately equals operator symbol could look something like this:
~=
Who would decide, and what would be the real benefit? What if it existed with a range of ± 5%, but I need 1% or 8%?
Consider that we can already use the available operators and functions to test whether a value is within a specific range. Sorry, I don’t mean to rain on your parade, but I don’t think it would be feasible or particularly useful.
There is a math.isclose function, but no operator. The problem with an operator is that there is no place for additional information, such as your 10% or 5% values. But with math.isclose you can use the rel_tol (relative tolerance) parameter to achieve that.
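To make that concrete, here is a small sketch of the ± 10% and ± 5% cases using math.isclose from the standard library. The sample values are made up for illustration. Note that rel_tol is scaled by the larger of the two magnitudes, so it is not quite the same as "within 10% of the target".

```python
import math

target = 100.0  # made-up reference value for illustration

# The caller picks the tolerance on each call.
within_10_percent = math.isclose(105.0, target, rel_tol=0.10)   # |diff| = 5, allowed 10.5
outside_10_percent = math.isclose(112.0, target, rel_tol=0.10)  # |diff| = 12, allowed 11.2
within_5_percent = math.isclose(104.0, target, rel_tol=0.05)    # |diff| = 4, allowed 5.2

# The classic floating-point case: exact equality fails, isclose succeeds
# with its default rel_tol of 1e-09.
exactly_equal = (0.1 + 0.2 == 0.3)
close_enough = math.isclose(0.1 + 0.2, 0.3)

print(within_10_percent, outside_10_percent, within_5_percent)
print(exactly_equal, close_enough)
```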
The devil is, as they say, in the details. Approximation is a vague
concept: something humans are fairly good at (for varying
definitions of “good”), but the same can’t really be said for
computers. Ironically, you’d need to settle on an extremely precise
definition of what “approximation” means before you could teach a
computer to perform that task.
There was a similar discussion recently about determining a number’s
“order of magnitude” with Python, and it quickly became apparent
that there was no clear agreement on what an order of magnitude is
in concrete terms. Particularly, how you’d determine a number’s
order of magnitude in isolation isn’t the same as how you’d
determine it when comparing it against another number.
All that is to say that it’s an interesting idea, but without an
actual definition and examples, it’s hard for anyone to even start
evaluating it.
Yes, approximation is a vague concept for a computer to implement, but in a world where AI is becoming more widely used, maybe it is something that could have applications.
As for operators, a ~= approximation operator could round out the category, even if it does not have practical applications.
I am not a mathematician or computer scientist, so I can’t comment on how difficult it would be to implement, but I imagine defining it would be a key step. I grabbed this off a webpage:
"Approximation theory is a part of mathematics. Approximation is employed when it’s difficult to seek out the precise value of any number."
Also, there is an approximation that occurs, I think, when a float is converted to an integer in Python (it gets rounded down, I think?). So in that sense approximation already occurs in Python when a float is converted to an integer. It might be the case that any float that gets converted to an int, or any float that is rounded, becomes an approximation?
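For reference, a quick check of what the built-in conversions do. int() drops the fractional part (truncation toward zero), which matches rounding down only for positive numbers. math.floor() and round() behave differently.

```python
import math

truncated_pos = int(2.7)        # 2: fractional part dropped
truncated_neg = int(-2.7)       # -2: truncated toward zero, not rounded down
floored_neg = math.floor(-2.7)  # -3: floor always rounds down
rounded = round(2.7)            # 3: round() goes to the nearest integer

print(truncated_pos, truncated_neg, floored_neg, rounded)
```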
It would be interesting if an object had the ability to define an ‘approximately equal’ operator. I’ve had cases where I use __eq__ (semantically incorrectly) for that. Imagine a __approx_eq__:
I wouldn’t say that int or str should have __approx_eq__ defined but allowing objects to and then use ~= or something like that could be kind of cool.
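Here is a rough sketch of how that could look today. Everything in it is hypothetical: Python has no ~= operator and never calls __approx_eq__, so a helper function (approx_eq, a made-up name) stands in for the operator. The Point class and its hard-coded 1e-6 distance are also just assumptions for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Point:
    x: float
    y: float

    def __approx_eq__(self, other):
        # Hypothetical dunder: Python itself never calls this method.
        if not isinstance(other, Point):
            return NotImplemented
        # The class decides what "close" means: within 1e-6 units here.
        return math.dist((self.x, self.y), (other.x, other.y)) <= 1e-6


def approx_eq(a, b):
    """Stand-in for a hypothetical `a ~= b` expression."""
    method = getattr(type(a), "__approx_eq__", None)
    if method is None:
        return a == b  # types without the dunder fall back to plain equality
    result = method(a, b)
    return a == b if result is NotImplemented else result


p = Point(0.0, 0.0)
q = Point(0.0, 1e-9)
print(p == q)           # dataclass equality compares the fields exactly
print(approx_eq(p, q))  # the class-defined approximate comparison
```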
I can’t be categorical and say it won’t have a practical application. People use approximations all the time in daily life, e.g. measuring an angle in a pool game, painting a picture, measuring a distance, etc.
It just seemed to be a missing mathematical symbol from the comparison operators group (and a mathematical symbol I am somewhat fond of using).
Maybe it is something that applies to rounded values, either float or int, e.g. 3.14 ~= 3.142 or 1900 ~= 1949?
Perhaps it could also mean ‘also equals’ and allow for rapid switching between a low-precision and a high-precision value? For example, before performing a complex, time-consuming calculation precisely, one could check whether a rough answer is close to being correct. This could be useful in time-critical situations where the necessary answer cannot be fully known because of rapidly changing parameters. The time saved might be marginal, but in time-critical situations it could be an advantage.
Maybe it could apply to language translations using string values, e.g. “Howdy” ~= “How do you do?”
I am not an experienced programmer, so what I am saying is hypothetical…
Assert that two numbers (or two ordered sequences of numbers) are equal to each other within some tolerance.
…
Tolerances
By default, approx considers numbers within a relative tolerance of 1e-6 (i.e. one part in a million) of its expected value to be equal. This treatment would lead to surprising results if the expected value was 0.0, because nothing but 0.0 itself is relatively close to 0.0. To handle this case less surprisingly, approx also considers numbers within an absolute tolerance of 1e-12 of its expected value to be equal. Infinity and NaN are special cases. Infinity is only considered equal to itself, regardless of the relative tolerance. NaN is not considered equal to anything by default, but you can make it be equal to itself by setting the nan_ok argument to True. (This is meant to facilitate comparing arrays that use NaN to mean “no data”.)
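The zero problem described above is easy to reproduce with the standard library. This sketch uses math.isclose as a stand-in rather than approx itself, passing the tolerances quoted above explicitly. One assumption to flag: math.isclose scales its relative tolerance by the larger of the two values, not by the expected value, so the two are not identical in general.

```python
import math

REL = 1e-6   # relative tolerance quoted above
ABS = 1e-12  # absolute tolerance quoted above

tiny = 1e-13

# Relative tolerance alone: nothing but 0.0 is relatively close to 0.0.
rel_only = math.isclose(tiny, 0.0, rel_tol=REL)

# Adding an absolute tolerance makes tiny values count as equal to zero.
rel_and_abs = math.isclose(tiny, 0.0, rel_tol=REL, abs_tol=ABS)

# Special values: infinity is close only to itself, NaN to nothing.
inf_self = math.isclose(math.inf, math.inf)
inf_large = math.isclose(math.inf, 1e308, rel_tol=REL)
nan_self = math.isclose(math.nan, math.nan)

print(rel_only, rel_and_abs, inf_self, inf_large, nan_self)
```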
There’s enough debate and variation around the best definition of “approximately equal”, and the answers differ enough by application, that picking one definition to be the meaning of an ~= operator is never going to satisfy enough people.
Having a function like math.isclose, which can be controlled explicitly using extra parameters, is much more flexible (and exists already!).
Considering how @ was introduced, I suppose ~= could map to a __approx__() magic method. Users could override it to mean whatever they want (e.g. math.isclose()), similar to how the numpy folks handled the matrix multiplier.
Of course what you say is possible. As @fungi said, I think this is the sort of feature whose utility someone would have to demonstrate on real code.
The problem with that is that it gets the control backwards.
For operator overloading, we want each class to override what the operator means. Putting dunder methods in the class to override operator behaviour is the right choice.
But for approximate equality it is the caller who needs to decide what it means to be “approximately equal”, not the objects themselves. The caller needs to decide what counts as “close enough”: for numeric comparisons, whether to use (for example) absolute or relative error, or ULPs (units in last place); for fuzzy text matching, what sort of matching algorithm to use and the threshold at which two strings are considered the same.
Only the caller can make that choice, not the objects themselves.
A function works well for that:
approx_equal(x, y, absolute_error=0.01)
or a context manager:
with fuzzy_match_threshold(distance=3, similarity=0.9):
    str1 == str2
but there is no way to put that logic into a dunder.
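As a minimal sketch of the function approach, here are two caller-controlled comparisons. Both are illustrations, not existing APIs: approx_equal follows the call shown above (the relative_error parameter is an added assumption), and fuzzy_equal is a made-up helper that uses difflib.SequenceMatcher as one possible matching algorithm.

```python
from difflib import SequenceMatcher


def approx_equal(x, y, absolute_error=0.0, relative_error=0.0):
    """Numeric comparison where the caller picks the tolerances."""
    diff = abs(x - y)
    return (diff <= absolute_error
            or diff <= relative_error * max(abs(x), abs(y)))


def fuzzy_equal(s1, s2, similarity=0.9):
    """Text comparison where the caller picks the similarity threshold."""
    # ratio() is 2 * matches / total length, between 0.0 and 1.0.
    return SequenceMatcher(None, s1, s2).ratio() >= similarity


# The same pair of values is "equal" or not depending on the caller.
print(approx_equal(1.0, 1.005, absolute_error=0.01))
print(approx_equal(1.0, 1.005, absolute_error=0.001))
print(fuzzy_equal("colour", "color", similarity=0.9))
print(fuzzy_equal("colour", "color", similarity=0.95))
```

The point is that the tolerance travels with each call, which an operator has no room for.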
One could define an Approx class with a __eq__ which could be used like this
Approx(a, tol) == b
I would provide a naive implementation but I’m on my phone.
I don’t think it’s a good idea though. Defining an approx method for objects where it makes sense (like float), one that takes inputs controlling what ‘approximate’ means, is the way to go IMO.
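For anyone curious, a naive version of such an Approx class might look like the sketch below. It assumes tol is an absolute tolerance and that the wrapped values support subtraction; both are choices made for this example, not something specified above.

```python
class Approx:
    """Naive wrapper: equal to anything within `tol` of `value`."""

    def __init__(self, value, tol):
        self.value = value
        self.tol = tol  # assumed to be an absolute tolerance

    def __eq__(self, other):
        try:
            return abs(self.value - other) <= self.tol
        except TypeError:
            # Let Python try the other operand, then fall back to False.
            return NotImplemented

    def __repr__(self):
        return f"Approx({self.value!r}, {self.tol!r})"


a = 0.1 + 0.2
print(a == 0.3)                # plain float comparison
print(Approx(a, 1e-9) == 0.3)  # tolerant comparison
print(0.3 == Approx(a, 1e-9))  # also works reversed, via the reflected __eq__
```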
Indeed. I thought about this some more later on and realized involving kwargs would be a problem and setting opinionated defaults would likely be a headache to most people involved.
I considered __approx__ implemented between strings too, say to return True if a regex pattern matches a string, e.g. if "a" ~= "abcde": .... We have similar issues however. Deciding on which regex to use could be a battle.
@ajoino A bit tangential, one alternative application of an Approx() class is direct use in match-case comparisons. This example is from R. Hettinger’s toolkit:
>>> x = 1.1 + 2.2 # Rounds up to 3.3000000000000003
>>> match Approximately(x):
...     case 3.0:
...         print('No')
...     case 3.3:
...         print('Yes, we want this to be a match.')
...     case _:
...         print('No')
...
Yes, we want this to be a match.