Wow! … I’ve rarely created a thread with such intense activity …
I am interested in all aspects of computer math - from Prof. Kahan’s nickname (rather ‘Velvel’ than ‘Welvel’) through accuracy, speed, space, usability, universality …
and in having and spreading qualified info about it, which is often missing from discussions about the shortcomings of the status quo.
For that, this thread, besides answering my original question(s), gave a lot! of info that I’ll need some time to go through.
Before pausing this thread … a little awareness test … about negligibly small deviations … please guess before trying to code it! Setting ‘a’ to ‘0.2 + 0.1’, how many times can you calculate ‘a = 2 * a - 0.3’ in binary64 (Python floats) and still get something close to 0.3? Please post your guesses without spoiling the result. Then estimate how many calculations go into building a spaceship, and after that take a look at How Not to Land an Orbital Rocket Booster.
Correct rounding is useful at a lower level. The fact that the decimal module has correctly rounded arbitrary-precision arithmetic provides a good basis for building intervals and other things on top, and means that a lot of the hard work is already done.
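As a small illustration of that building block (my own toy, not something from the post above) - every decimal operation, including sqrt, comes back correctly rounded to whatever precision the context asks for:
from decimal import Decimal, getcontext
# Correctly rounded sqrt(2) at two different working precisions.
for prec in (28, 50):
    getcontext().prec = prec
    print(Decimal(2).sqrt())   # correctly rounded to `prec` significant digits
getcontext().prec = 28         # restore the default precision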
Within arb the arf type reimplements mpmath’s mpf type and provides correct rounding for elementary arithmetic. The arb type is then built on top. Functions like sin are implemented for arb but not for arf, and they provide bounds rather than correct rounding. While correct rounding is useful, it is generally only guaranteed for a single operation, so at a higher level it is usually more useful to have something like arb that gives hard bounds on the result of a multistep calculation.
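For instance (a hedged sketch, assuming python-flint exposes elementary functions such as .sin() on arb, as recent releases do), the answer comes back as an enclosure rather than a correctly rounded value:
import flint
x = flint.arb(2)
y = x.sin()    # an enclosure of sin(2): a midpoint plus a radius that is
               # guaranteed to contain the true value, not a correctly rounded result
print(y)       # printed in the usual "[midpoint +/- radius]" form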
There are many aspects to this but one of them is that to make the intervals fully useful you really want to have multiprecision interval arithmetic so that there is always a general way to make the intervals smaller. In fact this is really the reason I want multiprecision arithmetic in the first place. I don’t actually want thousands of digits but just a few correct ones but I am sometimes prepared to use thousands of digits if necessary to get there.
With arb this is catered for because you have two properties that are very useful as a foundation for rigorous numerics:
Computing at any fixed “precision” p will use reasonably bounded computational expense to give an interval that is guaranteed to enclose the true result.
If p is made large enough then the interval can be made arbitrarily small at the expense of increased computational cost.
These two properties together make it possible to have controlled computational cost while still being able to solve hard problems without error when needed. Most of the time you won’t need to increase p from the default 53 bits but when you do need to you can.
There is a third property that arb provides loosely and is not guaranteed in general:
For simple operations the relative width of the interval will be something like ~1/2^p for a given precision p.
This is a much weaker guarantee than correct rounding. Strengthening this to something that resembles correct rounding would compromise the “reasonably bounded computational expense” property though. In many applications that is not a good trade off and it is better to compute a wide interval quickly and then afterwards consider whether or not there is any reason to spend more resources making it smaller.
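As a rough illustration of the first two properties (again with python-flint, and assuming flint.ctx.prec is the knob that sets the working precision in bits):
import flint
x = flint.arb(2)
for p in (53, 128, 256):
    flint.ctx.prec = p                  # working precision in bits
    y = x.sin() * x.exp() - x.sqrt()    # a small multistep calculation
    print(p, y)   # the printed radius shrinks roughly like 2**-p as p grows
flint.ctx.prec = 53                     # restore the default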
(Oscar did a better job of explaining the arb approach, so I’m giving up on improving my post on that side. Hopefully this is at least partially useful.)
In fact, they (arbitrary-precision floating-point numbers) are used. The arf_t (used for the ball midpoint) is more or less the same as mpfr_t (but with no signed zero and an “unlimited” exponent). On the other hand, the ball radius has a limited mantissa (30 bits) and an fmpz_t exponent, just like arf_t.
The rest is more or less similar to interval arithmetic: the result of an operation over balls is a ball (not a minimal one!) which contains the result of the (mathematically exact) operation applied to any choice of points in the input balls.
No. Signed integers are implemented (fmpz_t, which may be better than mpz_t for small values, but the same or worse for “big enough” integers) and used; see above.
I don’t see a big difference. We can hardly do anything here:
>>> x = flint.arb(mid=0, rad=3)
>>> x
[+/- 3.01]
>>> x - x
[+/- 6.01]
It’s possible we’re talking about different things. I’m going solely by Fredrik Johansson’s paper (which Oscar linked to). While Fredrik wrote the arb library, that’s not what the paper is about. The paper is about how some transcendental functions were implemented inside arb. The paper couldn’t be clearer about some of the things I said about it, like:
We use the mpn layer of GMP, since the mpz layer has unnecessary overhead.
and
The main contribution of the paper, described in section 4, is an improved version of Smith’s rectangular splitting algorithm for evaluating Taylor series, in which we use fixed-point arithmetic efficiently …
Whether or when “magical” interval arithmetic is helpful isn’t really relevant here. It’s not used in the function implementations Fredrik sketched. They’re using fixed-point arithmetic (conceptually scaled non-negative integers, all with the same scale factor, so “we can add numbers without any rounding or shift adjustments”). Software help isn’t needed to keep track of rigorous error bounds for the very constrained kinds of arithmetic these specific function implementations use. “Head arguments” are (believed to be) enough in this context, and don’t cost machine cycles.
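To show the flavor of that (a toy of my own, not the paper’s algorithm): if every quantity is a non-negative integer scaled by the same power of two, a Taylor series can be summed with plain integer additions, the truncation error of each step is easy to bound by hand, and rounding back to a float happens only once, at the very end.
# Toy fixed-point evaluation of exp(x) for 0 <= x < 1 via its Taylor series.
# All quantities are integers scaled by the same factor 2**prec.
prec = 64
scale = 1 << prec

def exp_fixed(x, terms=30):
    x_fix = int(x * scale)    # fixed-point image of x (good enough for a toy)
    term = scale              # x**0 / 0! == 1, in fixed point
    total = 0
    for n in range(1, terms + 1):
        total += term                         # plain integer add, no rounding
        term = (term * x_fix // scale) // n   # next term, truncated downward
    return total / scale      # convert back to float only at the very end

print(exp_fixed(0.5))         # close to math.exp(0.5) ~= 1.6487...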
Yes, but it’s not so much “correctly rounded” on its own that’s helpful here but that there are multiple rounding modes supported “as if to infinite precision”. The “directed” rounding modes (to plus or minus infinity) were added to 754 specifically to enable more-efficient computation of narrowest-possible rigorous interval bounds. Intervals do “blow up” in practice, and narrowest-possible is a very worthy goal.
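For example, here’s a minimal sketch of that idea using decimal’s directed rounding modes (my own toy; it only yields a valid bracket because every step here is monotone non-decreasing in its inputs):
from decimal import Decimal, localcontext, ROUND_FLOOR, ROUND_CEILING

def bracket(f):
    # Evaluate f once rounding every step toward -inf, once toward +inf.
    with localcontext() as ctx:
        ctx.rounding = ROUND_FLOOR
        lo = f()
        ctx.rounding = ROUND_CEILING
        hi = f()
    return lo, hi

lo, hi = bracket(lambda: Decimal(1) / Decimal(3) + Decimal(2).sqrt())
print(lo, hi)   # two decimals that straddle the exact value of 1/3 + sqrt(2)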
But this is all far afield (to my eyes) to the question of whether decimal would benefit users by adding some support for more common transcendental functions. I’m a “good enough” guy in general, but like the possibility for extreme “purity” when that’s needed. However, I’m essentially never in favor of letting “extreme purity” block “good enough” forever. Particularly in CPython, where other libraries with solid Python bindings already exist, and attract world-class experts to expend the sometimes extreme effort required to achieve extreme purity.
Usually the same here, but as a pragmatic matter if I “just” double the precision and run again, getting essentially the same result to single precision, “good enough” usually rules for me.
Way back when, I worked on compiler and library development for Cray Research, selling very expensive early “supercomputers” with novel (at the time) 64-bit HW floats. People complained they were getting wrong results. So, as a hack, we added a compiler option to chop each float result back to N bits. Jaws dropped around the world. Plots of results against the number of bits retained often dramatically showed that their code was numerically unstable. The more bits were retained, the more consistent the results - but they generally matched the results they claimed were “right” when N was picked to match the HW they were porting from.
“Extra” precision covers a multitude of (but not all) numeric sins.
Correction gratefully accepted! It’s a mistake I’ve made before - the one and only blind spot I have.
Yup, what “serious numeric programmers want” is “all of the above, but stressing different aspects at different times”. “The fast drives out the good” is a general rule of reality, although 754 and Python both worked some against it.
Using interval arithmetic is revealing here. Here’s a start:
>>> from mpmath import iv
>>> p1 = iv.mpf(.1)
>>> p2 = iv.mpf(.2)
>>> p3 = iv.mpf(.3)
Those intervals are zero-width: the input floats are taken as being exact(*).
The upper bound (which starts out a bit above 0.3) then relentlessly grows across iterations:
>>> a = p1 + p2
>>> for i in range(50):
... a = 2.0 * a - p3
... print(a)
...
[0.2999999999999999889, 0.30000000000000009992]
[0.2999999999999999889, 0.30000000000000021094]
[0.2999999999999999889, 0.30000000000000043299]
[0.2999999999999999889, 0.30000000000000087708]
[0.2999999999999999889, 0.30000000000000176525]
[0.2999999999999999889, 0.30000000000000354161]
[0.2999999999999999889, 0.30000000000000709433]
[0.2999999999999999889, 0.30000000000001419975]
[0.2999999999999999889, 0.30000000000002841061]
[0.2999999999999999889, 0.30000000000005683232]
[0.2999999999999999889, 0.30000000000011367574]
[0.2999999999999999889, 0.30000000000022736257]
[0.2999999999999999889, 0.30000000000045473625]
[0.2999999999999999889, 0.3000000000009094836]
[0.2999999999999999889, 0.3000000000018189783]
...
About every 3 iterations, the uncertainty spreads to cover another decimal digit, and there’s no end to that: multiplying by 2 doubles the interval’s width on each trip through the loop, and 2^3.32 ≈ 10. While mpmath emulates binary floats with 53 bits of precision by default, its exponent range is essentially unbounded, so this continues far, far beyond the range of Python’s floats.
(*) Note that “the input floats” are binary floats, and 0.1 does not mean “one tenth” here - it means “whatever binary float Python created in response to the 0.1 literal”. If you mean “one tenth”, the argument has to be a string instead:
>>> from mpmath import iv
>>> iv.mpf(.1)
mpi('0.10000000000000001', '0.10000000000000001')
>>> iv.mpf(".1")
mpi('0.099999999999999992', '0.10000000000000001')
Your description here suggests that you are just eyeballing the results. If you want to build a robust algorithm over the top of these numerics then you want more than this. This is the situation that motivates having something like arb and where I am quite happy for the intervals to grow as long as I know that it is happening and never end up with a falsely accurate-seeming result. I can deal with a result like x \in [-\infty,\infty] but I don’t want x = 0.1 without even the knowledge that x is definitely a well defined number.
It’s not a property of any specific implementation of interval arithmetic, it’s that the approach itself is inherently blind to the higher-level reasoning humans see at once: “if I subtract x from itself, it doesn’t matter what values x may span: the result is exactly 0”.
Correlations among inputs are common in real life, but IV is blind to them. Much the same holds for any univariate polynomial - for example, x*(x-1). Suppose we know x is in the interval [-3, 3]. IV thinks “OK, x may be as close to +inf as 3, and x-1 may be as close to -inf as -4, so the product may be as close to -inf as -12”.
But that can’t actually happen, because x has the same value in both subexpressions, whatever it may be. The actual smallest value the expression can have is -0.25, if x happens to be 0.5.
Johansson heavily relies on univariate Taylor-series polynomials, and can “out-think” these things in his head.
Right - but note the “usually”. It means what it says.
Which I usually don’t want. “One & done”, speculation, playing around, … plus the obvious one: when I suspect a numeric problem and try this, it’s usually the case that the doubled-precision result shows “yup! numeric problem - the result is waaaay off from the single-precision result”.
Sure. Me too. That’s just not usually the business I’m in, though, and “try it with twice the precision” is good enough for most of my actual purposes.
If I need more than that (which I usually don’t), I’ll usually think about proof next. “Oops! the result isn’t plausible - start over with more precision” (lather, rinse, repeat), would be a last-resort approach to my sensibilities. Although it may well be the most pragmatic approach to solving Johansson’s problem (implementing “correct rounding” in a context where the amount of precision needed apparently cannot be known in advance - “table maker’s dilemma”).
In any case, what does any of this suggest to you about Python’s future? I don’t really see any connection to the OP’s complaint (decimal is missing a bunch of transcendental functions). Perhaps that was the point of your much earlier:
You think Python should support robust interval arithmetic first? If so, that’s pretty much the same to me as saying “no way”. As you also said:
People don’t “find time” for such big undertakings. They have to ruthlessly force time for it, by dropping most other interests for the duration. Somebody has to want it a whole lot, and fight for it. “Interval arithmetic for Python” would predictably be an uphill battle, starting with that there are already capable packages implementing it for binary floats.
So I expect (but don’t know) that the only realistic hope for this path would be for Stefan Krah to implement it inside libmpdec.
You are right; I was talking more about data structures in the current flint (not sure how much they differ from ~10 years ago). Indeed, Fredrik uses mpn_* functions directly in the code for some elementary transcendental functions, while e.g. MPFR usually uses its own primitives (like mpfr_mul, etc., which in turn use the mpn layer). I’m not sure how big the gain is; I doubt it’s much.
That depends on what “x” is and what “subtraction” is.
Just as in ball arithmetic, which is one form of IV. You shouldn’t expect any magic here:
>>> x = flint.arb(0, 3)
>>> x
[+/- 3.01]
>>> x*(x-1)
[+/- 12.1]
>>> y = mpmath.iv.mpf([-3, 3]);y
mpi('-3.0', '3.0')
>>> y*(y-1)
mpi('-12.0', '12.0')
My mental model for x - x is that I want to subtract two quantities, a - b. Each spans some interval, maybe infinite, maybe with some nontrivial probability distribution. To account for correlations I should use a correlation function, right? The delta function suggested above sounds too strong to me for the real world.
That’s a fair point, but - alas - it is typically overlooked entirely when enthusiasts try to “sell” interval arithmetic to the public. The typical user:
has some numerical code they’re queasy about
is told “redeclare all your floats to be of an interval type - then you’ll get guaranteed rigorous bounds on their final values! all by magic”
they do that
they do get rigorous bounds
which are so very wide they’re most often useless
In their original code, the meaning of, e.g., x*(x-1) was utterly clear. After converting to an interval type, its actual meaning is something of a mental nightmare. They weren’t told about this in advance, and there’s very little accessible info about how they “should” go on to rewrite their code to get useful results from intervals.
Kahan has written extensively elsewhere about how there are essentially only two gimmicks he’s seen that “working programmers” (including scientists - not limited to those “naïve” about floating point) can actually use successfully to determine when their code is numerically rotten:
The ability to use extra precision, and preferably significantly more than they “believe they should need”.
The ability to change the default rounding mode.
Those match my experience. Those can very quickly expose problems. The second may be surprising, but it’s really not in hindsight: numerically naïve code is often so very delicate that the tiniest changes can “blow up” very quickly. For example, catastrophic cancellation can change a difference in the last bit into a difference in one of the most-significant bits via a single + or -. In modern HW, changing the rounding mode doesn’t even cost more cycles (although very few programming languages make it easy to change it).
Python’s decimal module makes both very easy for users (except for the initial one-time burden of learning how to spell them). Those are powerful tools just about anyone can use without significant training.
Fixing flawed code remains expert-level work, although it’s often possible to get away with just using more precision.
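A concrete (toy) illustration of both knobs in decimal - my own example, not Kahan’s:
from decimal import Decimal, getcontext, ROUND_HALF_EVEN, ROUND_UP

def shaky():
    # Catastrophic cancellation: the tiny term is lost at low precision.
    return (Decimal(1) + Decimal("1e-40")) - Decimal(1)

getcontext().prec = 28                   # the default
print(shaky())                           # 0E-27: the 1e-40 has vanished entirely

getcontext().prec = 60                   # knob 1: more precision
print(shaky())                           # 1E-40: the true answer reappears

getcontext().prec = 28
getcontext().rounding = ROUND_UP         # knob 2: a different rounding mode
print(shaky())                           # no longer 0 - a red flag that the
                                         # low-precision result was untrustworthy
getcontext().rounding = ROUND_HALF_EVEN  # restore the default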
hi,
Being short on time and lacking the skills to understand all the points this discussion has branched into … just let me throw in six conceptual points and one technical question:
Let’s accept that after nearly 100 years of computing, and 40 years of the standard, we are close to - but not fully in - a position to provide users with what they need: ‘numerical mathematics’.
Let’s recall Prof. Kahan: ‘Redesigning computers costs less than retraining people, so it behooves us to adapt computers to the way people have evolved rather than try to adapt people to the way computers have evolved.’
In that sense Tim is one of the few people who repeatedly mention this goal - in other words, but with the same meaning.
Reading this thread is, for a normal user, … Chinese … but to use computers safely they need to know about the internals once those deviate from what they expect - in this respect, deviate from ‘math’. And they do.
Even all our simple hints - ‘round your results’, ‘calculate scaled to integers’, ‘use decimal datatypes’ - often help, but not always.
We require users - ‘users’ … you know what kind of people those are? - to ‘know about their use case’, ‘choose appropriate datatypes’ and ‘use numerically stable algorithms’ … WHAT!!!
And the technical question:
I’m astonished - if I recall correctly, I’ve read quite a few warnings that switching rounding modes severely harms performance on modern systems because cache lines get discarded … wrong?