Contexts for floating-point math?

Yes, with thread-local storage. That’s true not just for rounding modes, but for FPE exceptions/status flags as well.
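For illustration, the decimal module already works this way: its contexts are stored per thread, so changing the context in one thread doesn’t leak into another. A minimal sketch:

```python
# Sketch: decimal contexts are already thread-local, which is the model
# suggested here for float rounding/FPE settings as well.
import threading
from decimal import Decimal, getcontext

results = {}

def worker():
    # This thread gets its own context; changing its precision here
    # does not affect the main thread's context.
    getcontext().prec = 6
    results["thread"] = Decimal(1) / Decimal(7)

t = threading.Thread(target=worker)
t.start()
t.join()

results["main"] = Decimal(1) / Decimal(7)  # still at the default 28 digits
print(results["thread"], results["main"])
```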

I don’t see it in your example, but I think I understood you correctly. Yes, it’s expected that all code in the new context will work with the same altered settings for FPE: e.g. not raising exceptions in certain cases, or using some non-default rounding mode.

IIUIC, they are trying to make rounding more “local”, i.e. some per-function setting. Perhaps it’s closer to Oscar’s proposal to have mathematical functions (like sin or add) on the context object.

I’m not aware of any proposal to replace fesetround()/fegetround() in C, though the C committee is a strange place nowadays and has done strange things for floating-point math in C.

Really?

>>> FE_TONEAREST, FE_UPWARD = 0, 0x800  # magic number might be different on your system
>>> import ctypes, math
>>> libm = ctypes.CDLL('libm.so.6')
>>> libm.fesetround.argtypes = [ctypes.c_int]
>>> libm.fesetround.restype = ctypes.c_int
>>> libm.fesetround(FE_TONEAREST)
0
>>> math.log(123)
4.812184355372417
>>> libm.fesetround(FE_UPWARD)
0
>>> math.log(123)
4.812184355372418

Sorry for posting again many months later, after forgetting about this issue, but the topic might come up again, so this answer might be helpful. Yes, you can already change the rounding mode using ctypes, but this actually breaks stuff. I tried this on my Linux machine and got the following results.

>>> import ctypes
>>> libm = ctypes.CDLL("libm.so.6")
>>> libm.fesetround.argtypes = [ctypes.c_int]
>>> libm.fesetround.restype = ctypes.c_int
>>> FE_TONEAREST = 0
>>> FE_DOWNWARD = 1024
>>>
>>> libm.fesetround(FE_DOWNWARD)
0
>>> 0.1
0.09999999999999999

This shows that the interpretation of literals in the REPL is also affected by the rounding mode. More precisely, the conversions between str and float are affected, which breaks an invariant Python aims to keep: the repr-eval roundtrip.

>>> 0.1 == eval(repr(0.1)), 0.1 == float(repr(0.1))
(False, False)

Floating-point arithmetic is also affected:

>>> 1.0/10.0
0.09999999999999999

Here, the numbers 1.0 and 10.0 can be represented exactly, so they themselves are not affected. However, the result is rounded differently. So, is floating-point arithmetic not pure in Python? CPython’s optimizer assumes it is pure, since it performs constant folding, which is a correct optimization only for pure operations:

>>> libm.fesetround(FE_TONEAREST)
0
>>> def test():
...     libm.fesetround(FE_DOWNWARD)
...     x = 1.
...     y = x/10.
...     z = 1./10.
...     return y, z, y==z
...
>>> test()
(0.09999999999999999, 0.1, False)

That is another thing that would break if changing the rounding mode were supported. Do I currently consider these things bugs? No, because CPython currently does not support changing rounding modes. If you do it using ctypes, you are on your own. You can also change the contents of a tuple using ctypes, even though tuples are documented as immutable.

However, if changing the rounding mode became supported (by providing a function in the math module), I would consider the broken repr-eval roundtrip and any constant folding of floats to be bugs.


Some code in CPython assumes a specific rounding mode. I would guess that temporarily switching to that one would fix the eval-repr roundtrip.

CPython simply ignores the possibility of the default rounding mode being altered in any way (ctypes, extension module, whatever).

Note that C compilers also do constant folding, which suffers from the same problem. Some compilers have dedicated options (e.g. -frounding-math in gcc) to disable optimizations that assume default floating-point rounding behavior (round-to-nearest).

Maybe it would be nice to have a similar option in CPython. But I don’t see a big problem without it: if you care about differences in constants of up to ±1 ulp — why not specify them in code directly, rather than by constant expressions like 1/10?

That is exactly the reason why there are proposals in the C and C++ committees to remove that facility: it is extremely hard to get it right. Sometimes, its correctness depends on implementation details of the compiler, like which terms are folded.

People who would use fesetround would care. Surely, one would probably not just write 1.0/10.0, but other terms, like in a formula x*(a+b). If I want to round towards -\infty, can I just use fesetround(FE_DOWNWARD)? No, it’s not that easy. If x were negative, one would need to calculate a+b with fesetround(FE_UPWARD) first. The whole concept of global rounding modes is broken. In fact, it would be more appropriate to set the rounding mode for each operation separately (which is also what Hans Boehm outlined in the C++ paper I linked previously). This could be done by a context object.
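The x*(a+b) pitfall can be demonstrated directly. The sketch below uses ctypes on Linux/glibc; the FE_* values are the x86 ones (as elsewhere in this thread) and may differ on other platforms:

```python
# Sketch of the problem, assuming Linux/glibc and x86 FE_* constants.
import ctypes

libm = ctypes.CDLL("libm.so.6")
libm.fesetround.argtypes = [ctypes.c_int]
FE_TONEAREST, FE_DOWNWARD, FE_UPWARD = 0, 0x400, 0x800

a, b, x = 0.1, 0.2, -1.0

# Naive attempt: round everything down, hoping for a lower bound of x*(a+b).
libm.fesetround(FE_DOWNWARD)
naive = x * (a + b)        # a+b is rounded DOWN, so negating it overshoots UP

# Correct: since x < 0, the inner sum must be rounded UP first.
libm.fesetround(FE_UPWARD)
s = a + b
libm.fesetround(FE_DOWNWARD)
correct = x * s

libm.fesetround(FE_TONEAREST)
print(naive, correct)      # naive lies ABOVE the true product; correct lies below
```

So a single global “round down” setting does not even give a lower bound for this two-operation formula.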


I agree. What we should be aiming for in Python is something like:

ctx = Context(rounding=half_even)
result = ctx.add(a, b)

ctx_dn = Context(rounding=down)
ctx_up = Context(rounding=up)
r = ctx_dn.divide(a, ctx_up.add(b, c))

Or alternatively:

r = ctx.divide_dn(a, ctx.add_up(b, c))

I think that for something like half-even it makes sense to create a context object with the rounding mode as a property of the context that you use for many operations. For something like rounding down, the only reason you would make that a context setting is because you are working with global rounding modes and the only way to affect operations is by mutating those global variables. It never makes sense to say that you want every operation to round down: e.g. for r above, if you want r to be definitely rounded down, then you need b+c to be definitely rounded up (assuming a, b, c positive).
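Per-operation directed rounding doesn’t even need the hardware modes; it can be emulated exactly with rational arithmetic. Here is a rough sketch of the kind of context object described above (the Context/add/divide names are hypothetical, not an actual proposal):

```python
# Hypothetical sketch of a per-context API.  Exact directed rounding is
# emulated with Fraction: compute the exact result, convert to float
# (round-to-nearest), then step one ulp in the requested direction if the
# conversion rounded the wrong way.
import math
from fractions import Fraction

class Context:
    def __init__(self, rounding=0):   # -1 = down, 0 = nearest, +1 = up
        self.rounding = rounding

    def _round(self, exact):
        f = float(exact)              # correctly rounded to nearest
        if self.rounding < 0 and Fraction(f) > exact:
            f = math.nextafter(f, -math.inf)
        elif self.rounding > 0 and Fraction(f) < exact:
            f = math.nextafter(f, math.inf)
        return f

    def add(self, a, b):
        return self._round(Fraction(a) + Fraction(b))

    def divide(self, a, b):
        return self._round(Fraction(a) / Fraction(b))

ctx_dn = Context(-1)
ctx_up = Context(+1)
print(ctx_dn.divide(1.0, 10.0))   # strictly below 1/10
print(ctx_up.divide(1.0, 10.0))   # 0.1 (the nearest double already lies above 1/10)
```

This is far slower than flipping FPU bits, of course, but it shows that the rounding direction can be an argument of the operation rather than global state.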


I assume a and b are just different float literals, to trigger constant folding. Then, honestly, I don’t see a fundamental difference here wrt the previous example. You can also use variables for a and b to apply custom rounding settings consistently.

IMO, the difference wrt the notion of the “current context” is tiny. Like with decimal’s getcontext()/setcontext() - you can apply different context settings at any level, down to individual arithmetic operations.

What’s wrong with (like gmpy2/decimal):

ctx = Context(rounding=half_even)

setcontext(ctx)
r = a+b

ctx_dn = Context(rounding=down)
ctx_up = Context(rounding=up)

with localcontext(ctx_up):
    r = b + c
with localcontext(ctx_dn):
    r = a / r    

?

The first example is nice if you assume that only one person (or library etc.) is using the global context, but breaks down as soon as you have e.g. two libraries that want to use the decimal module at the same time. The second example also doesn’t work if you need to call any functions that themselves may be affected, or need to not be affected, by you messing with the context.

Who is supposed to control the context inside func here? Did the authors of func know it would be used this way?

def stuff(x, func):
    with localcontext(ctx):
        func(x)

If you really want to do something as simple as a / (b + c) rounded down though then why would you prefer this:

with localcontext(ctx_up):
    r = b + c
with localcontext(ctx_dn):
    r = a / r 

over this:

r = ctx_dn.divide(a, ctx_up.add(b, c))

I would say that neither is especially easy to read, but the latter, using only local variables and no manipulation of global state, is far easier to understand and use in library code.


If the author of func() doesn’t force use of the context they want, they have to expect that the thread-local “global” context will be used. That’s precisely what the code above does: it makes a copy of the thread-local context and temporarily makes the copy the thread-local context. A difference is that if func() changes the thread-local context and intends for that change to persist, it will lose, as the copy is thrown away when the with block exits.
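The decimal module shows this behavior concretely: localcontext() installs a copy of the thread-local context, so any change func() makes to “the” context is visible inside the block but discarded on exit.

```python
# localcontext() installs a copy of the thread-local decimal context;
# changes made inside the block are thrown away when it exits.
from decimal import getcontext, localcontext

def func():
    getcontext().prec = 50           # func tries to change the current context

getcontext().prec = 28
with localcontext():
    func()
    assert getcontext().prec == 50   # visible inside the block...
assert getcontext().prec == 28       # ...but gone once the block exits
print("original context restored")
```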

However, the latter forces use of all the settings in the canned ctx_dn and ctx_up. It’s quite common, e.g., to want to use “the current” working precision too, which canned contexts can’t know.

So I’d usually expect to see something like

with localcontext(rounding=decimal.ROUND_FLOOR):
    r  = b + c

instead. Very explicit and clear. Calling arbitrary func() within the block would, however, be a Bad Idea™.

But one size doesn’t fit all. decimal contexts also expose almost all functionality via context methods, so things like your

r = ctx_dn.divide(a, ctx_up.add(b, c))

work. I’ve used that too - but only when I know the canned contexts have sufficient precision.

Shortcuts can also help. Prime example: I love mpmath’s extraprec(bits) context manager. Precision is, by far, the setting I want to fiddle with most often, and usually specified as a delta from whatever the current precision is.

Come to think of it, I don’t actually have much use for the hardware IEEE-754 context gimmicks. I know the committee intended the directed rounding modes to be useful for implementing interval arithmetic, but changing FPU control bits is extraordinarily slow on modern boxes.


How noticeable is this in the context of Python code that does scalar arithmetic with floats?

Assuming that the context was implemented in C and the FPU control bits were used for ctx_dn.add(x, y) then would you actually notice the overhead when using CPython?

Yes, and this is awkward if you want to write code that is not affected by the global context. I would think of it like this: the global context is for end users, and the default position for library code is that it should usually be isolated from the global context in both directions. That means that in library code you have to choose between these:

def func(x, y):
    # Every public function needs to do this: 
    with localcontext(myctx):
        return x + 1/y

Or

def func(x, y):
    return myctx.add(x, myctx.divide(1, y))

The latter is far simpler to reason about, especially once you start thinking about generators or callback functions or anything else. I think it is just better to start from the premise that there will be interfaces providing pure functions that are not affected by and do not affect global state. Afterwards, if you want global contexts for end-user convenience, then okay, but it means that library code needs to carefully avoid that global context.

In the case of float, which is a fixed-precision type, I’m not sure that changing the global context adds much value. I would skip that idea and focus on providing local contexts that can be used without mucking around with global variables.

This is why I suggested this:

That is actually the way I would think about directed rounding. The divide_dn operation is a different operation from divide. It doesn’t make sense to flip a flag and say “okay, now divide always means divide_dn”, but it does make sense to have a calculation where you sometimes call divide_dn and sometimes divide or divide_up. The idea of a global flag here presumably comes from what makes sense at the hardware level, so I guess that gets exposed more or less directly in C, but I don’t think it is a good model for actually writing numeric code, and I don’t think that Python should emulate that interface.


Yes, exactly. Therefore, I would suggest that CPython should not do temporary changes of the rounding mode, since then everybody has to pay CPU cycles for something that only a few would use. Instead, CPython should just document in its C API that if third-party code (like in extension modules) changes the rounding mode (or other parts of the floating-point environment?), it should restore it before other Python code is executed.

Concerning support for changes to the floating-point environment in the Python standard library: many proposals have been rejected with the argument that they would be better implemented as a PyPI package. I think changing rounding modes is an expert-only tool, so there is no need to have it in the standard library. (Since I consider it actually harmful for the reasons outlined above, I argue against it. If I did not consider it harmful, I would simply have ignored the whole thread.)

Just one more extra: Are functions in math.h required by the C23 standard to be correctly rounded?

Answer

My reading of the standard is: No. Correct rounding is only expected for a few “basic” functions.

If I try on my Linux machine:

>>> import ctypes
... libm = ctypes.CDLL("libm.so.6")
... libm.fesetround.argtypes = [ctypes.c_int]
... libm.fesetround.restype = ctypes.c_int
... FE_TONEAREST = 0
... FE_DOWNWARD = 1024
... 
>>> import math
>>> 
>>> x = math.pi/2.0
>>> 
>>> libm.fesetround(FE_DOWNWARD)
0
>>> math.sin(x)
1.0
>>> _.as_integer_ratio()
(1, 1)

Since \pi/2 is an irrational number, it cannot be exactly represented by a floating point number. So, x cannot be exactly equal to \pi/2. Therefore, the true mathematical value of sin(x) is strictly smaller than 1. So, if we set the rounding mode to FE_DOWNWARD, a correctly rounded result must be smaller than 1.0. However, this is not what we get here…


Actually, there can be some value in this, but it would be for changing error control, like numpy’s errstate, rather than for precision or rounding. I would still prefer that both numpy and the stdlib provide non-global contexts for these things, but as long as a + b is something that can only be influenced by global variables, it would be useful to be able to influence whether it uses nan vs warning vs error for elementary operations, and e.g. to disable nan-pass-through. I imagine that is probably not something CPython would want to implement though.

Depends on how often it’s done and on the implementation of Python. Float arithmetic generally goes much faster under PyPy than under CPython, because lists of floats under PyPy magically specialize to contiguous arrays of raw machine bits (no “Python object” space, or representation conversion, overheads).

But in general, no, people using floats in Python aren’t primarily concerned about speed.

Doubt it in general, but don’t know - and don’t really care :wink:

A good use for fiddling global rounding is as a low-effort way to do empirical perturbation analysis: fiddle rounding to, say, round-to-0, once at the start, and rerun the program. If results change “a lot”, the numerical stability of the code is in question. With luck, iterate fiddling the mode on smaller regions of code, and sometimes you can zero in on the function that’s overly sensitive to slightly different rounding.
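A toy version of that perturbation analysis can be scripted from Python itself. Again this assumes Linux/glibc via ctypes, with the x86 FE_* values (adjust the constants for your platform):

```python
# Toy perturbation-analysis harness: rerun the same computation under a
# different global rounding mode and compare.  Assumes Linux/glibc and
# x86 FE_* constants.
import ctypes

libm = ctypes.CDLL("libm.so.6")
libm.fesetround.argtypes = [ctypes.c_int]
FE_TONEAREST, FE_TOWARDZERO = 0, 0xC00

def computation():
    # A numerically naive summation; each += rounds in the current mode.
    total = 0.0
    for _ in range(100_000):
        total += 0.1
    return total

runs = {}
for name, mode in [("nearest", FE_TONEAREST), ("toward-zero", FE_TOWARDZERO)]:
    libm.fesetround(mode)
    runs[name] = computation()
libm.fesetround(FE_TONEAREST)

# A large spread between the runs hints at numerical instability.
print(runs, abs(runs["nearest"] - runs["toward-zero"]))
```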

There just isn’t “one size fits all” to be had.

I’m not following. Now you’re forcing use of ctx’s precision, which is precisely what I didn’t want to do. I want the precision the function’s caller used.

“The fast drives out the good.” HW has direct support for quite a large global floating-point “control and reporting state”, fiddling which is the only way to change what the HW does. All implementations that care a lot about speed will use it as directly as possible.

Unless they say they’re compliant with the relevant fp standards, they’re not even required to round +, -, *, / correctly. If they do say they’re compliant, they do have to round those correctly, and also \sqrt{x}. Other than those, no guarantees.

“Correct rounding” isn’t yet “a thing” across platform libms. Over time, they do seem to be converging on “faithful rounding” (one of the two representable floats closest to the infinitely precise result, usually less than 1 ULP error), but they generally ignore the rounding mode in effect (all of IEEE’s rounding modes are “faithful”).


This will actually be required only if the rounding mode of the current context differs from FE_TONEAREST. I.e., it will cost a test of some struct field, unless the user is in the camp of the “few”.

This requires reinventing all floating-point arithmetic and the related stdlib parts (the math/cmath modules).

It is something we have in the gmpy2 package.

Keep in mind that rounding-mode control is only a tiny part of the proposal. The floating-point context notion could also be used to control exceptions (vs. returning an IEEE 754 value) and to access the FPE status flags.

Then the caller passes ctx explicitly:

def func(a, b, c, ctx):
    return ctx.divide_dn(a, ctx.add_up(b, c))

I understand that this is not what “end users” would often want, but for library code this is the best way. The caller passes in one context with one precision and one error-handling mode, but directed rounding is not part of the context parameters, because it never makes sense to call func with a context that always rounds down.


For the actual current state of libm accuracy across platforms, this paper from late last year did its best to investigate and summarize:

Lots of work!


That is a wonderful package. As I have learned from its docs, it also provides many correctly rounded algorithms - as opposed to the C standard library, which, as @tim.one explained, does not provide comparable guarantees.

According to the man pages, there are only two things from the floating-point environment that can be accessed through portable interfaces: the rounding mode and the floating-point exceptions. Did I overlook something?

I have provided feedback on possible problems if the rounding mode is changed.

I think reading and clearing the floating-point exceptions is OK. (However, this too is affected by constant folding: the floating-point exceptions would be set when the code is compiled, not when it is executed. It would even be possible to add bytecode instructions that raise the floating-point exceptions at runtime.)
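Reading and clearing the flags already works today via ctypes, using the same caveats as before (Linux/glibc; the flag values below are the x86 &lt;fenv.h&gt; ones and may differ elsewhere):

```python
# Sketch: reading/clearing FPE status flags from Python via ctypes.
# Assumes Linux/glibc; the flag values are the x86 <fenv.h> ones.
import ctypes

libm = ctypes.CDLL("libm.so.6")
FE_OVERFLOW = 0x08
FE_ALL_EXCEPT = 0x3D   # glibc/x86: INVALID|DIVBYZERO|OVERFLOW|UNDERFLOW|INEXACT

libm.feclearexcept(FE_ALL_EXCEPT)
huge = 1e308
result = huge * 10.0   # overflows to inf at runtime (one operand is a
                       # name, so this expression is not constant-folded)
flags = libm.fetestexcept(FE_ALL_EXCEPT)
print(result, bool(flags & FE_OVERFLOW))
```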


That’s true, but I think that libm’s state of the art will be subject to change in the near future. Recent editions of IEEE 754 recommend correct rounding of mathematical functions.

And it’s not an interface to hardware floating-point arithmetic and libm’s functions. In particular, it is much slower:

$ python -m timeit -u nsec -s 'a=1.1;b=2.2' 'a*b'
5000000 loops, best of 5: 66.3 nsec per loop
$ python -m timeit -u nsec -s \
   'import gmpy2;gmpy2.set_context(gmpy2.ieee(64));a=gmpy2.mpfr(1.1);b=gmpy2.mpfr(2.2)' \
   'a*b'
1000000 loops, best of 5: 290 nsec per loop

Yes, but that is not part of the C standard; it’s rather Python’s own implementation of floating-point arithmetic and its bindings to libm:

With the context notion, this could be configurable, like in the decimal module (or gmpy2).


Thanks for the detailed reply. For me, these are two rather orthogonal aspects of the proposal: (a) rounding modes and (b) error handling (on both the C and Python sides).

I think changes to error handling can be implemented rather easily without causing disruption (the easiest way would be to exclude from constant folding any operation that would raise a floating-point exception; already now, 1.0/0.0 is not folded).
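That last point can be checked directly: CPython’s optimizer folds 1.0/10.0 into the constant 0.1, but leaves 1.0/0.0 unfolded, since folding it would raise.

```python
# CPython folds constant float expressions at compile time, unless the
# folding itself would raise a floating-point error.
def folded():
    return 1.0 / 10.0      # folded to the constant 0.1

def not_folded():
    return 1.0 / 0.0       # left as a runtime division

assert 0.1 in folded.__code__.co_consts          # result was precomputed
assert 0.0 in not_folded.__code__.co_consts      # operands survive unfolded

try:
    not_folded()
except ZeroDivisionError:
    print("1.0/0.0 raises at run time, not compile time")
```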