`math.exp` and `math.log` delegate to `exp` and `log` methods?

NeilGirdhar · December 16, 2024, 6:20pm

I’m not sure what you mean. The graphs do not contain concrete array values. They contain ordinary symbols just like SymPy symbols. Is it the fixed-width dtype that’s bothering you? If so, you could create a dtype like symbolic, but I don’t see what’s wrong with just leaving the dtypes as float64 or something.

oscarbenjamin · December 16, 2024, 6:27pm

Neil Girdhar:

oscarbenjamin:

The other thing that is missing is the actual generic functions like sin and cos.

sin and cos are already in the specification. You can use them like so:
from array_api_compat import get_namespace
import numpy as np

def f(x):
  xp = get_namespace(x)
  return xp.square(xp.sin(x)) + xp.square(xp.cos(x))

There is a basic point here that you keep overlooking. That sin function needs to be in a library that does not depend on any of the array libraries. There needs to be an agreed upon package that provides these functions so that users know where to get them from. The packages providing these should not be numpy and scipy though because they should be usable without having numpy and scipy installed.

NeilGirdhar · December 16, 2024, 6:33pm

I guess the Array API designers considered that. Their design allows all adopters to release independently of any central library (except array_api_compat, which has a tiny interface). And yet the Array API still satisfies the goal of allowing library-agnostic code to be written.

Yup, that’s what the Array API designers did . All you need is array_api_compat.

Anyway, I think we’re really drifting from the point of the proposal at the top. Why don’t we move the Array API discussion somewhere else if you like?

The only reason I brought it up was to say that the interoperability problem with arrays is already solved. And while I guess you disagree, I feel like we should discuss this somewhere else.

I think the proposal at the top is asking for a way to write code that:

operates on probability distributions (at least), and
interoperates between probability distributions and arrays (ideally).

ngoldbaum · December 16, 2024, 6:56pm

That’s a good point, it would be better if NumPy had a smarter heuristic than “look for a sin attribute on the scalar” there. We could even eventually deprecate getting a sin attribute in the future to save on overhead a tiny bit and do something about the proliferation of methods in downstream APIs that exist mostly so this example works.

oscarbenjamin · December 16, 2024, 11:30pm

Exactly. The point that I am trying to make here is that there should be a non-heuristic way of doing this: an agreed mechanism for how you compute sin(x) for some arbitrary object x whenever that is a well defined operation. There should also then be a sin function somewhere that implements that mechanism without needing to depend on e.g. NumPy. Then NumPy can either use that function or reimplement it by the same mechanism but it would also be used for many other things besides NumPy’s object arrays.

oscarbenjamin · December 17, 2024, 4:32pm

There are difficulties in making generic functions without multiple dispatch. The multiple dispatch approach works better in Julia where it is a core part of the language, not just a feature of some mathematical functions. In Julia any function/callable can be used in a symbolic expression which enables a form of symbolic meta-programming that goes much further than just computing a function like sin(x) over different domains or representing array operations.

There are multiple dispatch libraries in Python but the idea has not really taken off and there are issues around how exactly to implement it efficiently. This is not such a problem in Julia because dispatch is resolved by the type system typically at “compile time”. Now with typing in Python it is more problematic because the type system has not been designed in a way that can work in general for runtime dispatch.

The approach taken in Python for mathematical code is more that you have a math module which has functions like math.sin etc that are defined for a particular domain. Then each function coerces to its own type for the operation. This means that you typically have a well defined domain of calculation but you need to be explicit everywhere about which is the domain that is being used and that you use all the mathematical functions from there.

There are advantages in using an explicit domain like cmath.sin compared to the multiple dispatch approach but for it to work interchangeably and extensibly you need that “math module” interface to be consistent across the different libraries and types and you need a way to query an object to get its associated “math module”. The array API provides these two things for arrays with fixed-width dtypes but is missing a lot of the features that you would want to make this sort of thing useful for other types.

The most obvious thing that you would want if going beyond the dtypes in the array API is context management. For e.g. decimal, mpmath and gmpy2’s mpfr type the equivalent of the “math module” is instead referred to as a “context” e.g.:

>>> import math
>>> from decimal import getcontext
>>> ctx = getcontext()
>>> math.sqrt(2)
1.4142135623730951
>>> ctx.sqrt(2)
Decimal('1.414213562373095048801688724')

Now a whole bunch of complications are introduced by the fact that the context is a mutable object. Different libraries model the global precision and rounding modes in different ways e.g. for decimal a + b always uses the global precision whereas for mpmath the context is associated with the individual instances: the precision of the context connected to a is used for the operation a + b. This distinction is not necessarily something that can be handled in a generic way without knowing anything about what types you are working with.

For simple end user things it is fine to set the global precision at the start of a script but for library usage you will want to use the context methods like ctx.add(a, b) rather than a + b. Now if you want to have arrays of Decimal or arrays of mpmath objects you would want to have a way of doing the equivalent of ctx.add(a, b) for those arrays: you need a mechanism that allows both context-explicit and context-implicit operations to distribute over the array operations.

Another thing that is needed if working with multiprecision types is a well-defined way to deconstruct an instance into a structured format for converting one type into another. All floating point types can be represented in terms of integers as m*b**e with the exception of special values like nan and inf. The Real ABC in the numeric tower does not provide any way to detect special values or to get these integers. Real only defines __float__ but the whole point of using Decimal etc is that they can do things that float cannot so you don’t want to use float as an interchange format for converting between e.g. gmpy2.mpfr and decimal.Decimal. The context object could be used for conversion, deconstruction and for detecting special values but these things need to provide a consistent interface somehow:

>>> ctx.is_infinite(2)
False
>>> math.isinf(2)
False

Besides multiprecision then you have things like the uncertainties package mentioned above but then also things like interval arithmetic and ball arithmetic. It is very useful to be able to repeat some floating point calculation with these things to gain measures of uncertainty or hard bounds on rounding errors. At the same time though it is not always possible to write generic code that does not need to care about the distinction between floating point and interval arithmetic so there is a need to be able to distinguish which case you have. A context object could provide information about what sort of type/arithmetic it works with which implies what sort of operations are supported. This can be propagated both at runtime and through the type system.

Beyond approximate types you have exact calculations and here rather than having a context you would have a “domain” object e.g. in sympy or sage you have ZZ, QQ for integers and rational numbers or GF(p) for integers mod p or ZZ[x,y] for polynomials in x and y with coefficients in ZZ and so on. Now all operations will use the domain object explicitly and for conversions between different types e.g.:

# sympy
>>> from sympy import *
>>> x, y = symbols('x, y')
>>> R = QQ[x,y]
>>> R
QQ[x,y]
>>> R.ring
Polynomial ring in x, y over QQ with lex order
>>> px, py = R.gens
>>> (px + py)**4
x**4 + 4*x**3*y + 6*x**2*y**2 + 4*x*y**3 + y**4
# python-flint
>>> import flint
>>> R = flint.fmpz_mpoly_ctx.get_context(2, flint.Ordering.lex, 'x')
>>> R
fmpz_mpoly_ctx(2, '<Ordering.lex: 0>', ('x0', 'x1'))
>>> x0, x1 = R.gens()
>>> p = (x0 + x1)**4
>>> p
x0^4 + 4*x0^3*x1 + 6*x0^2*x1^2 + 4*x0*x1^3 + x1^4
>>> p.sqrt()
x0^2 + 2*x0*x1 + x1^2

In this situation you use your domain object to create instances like R(1) or R.one etc. Now the domain object can give you all sorts of information like R.is_Field tells you whether or not division is possible in the domain. The domain object should really be passed around explicitly in any code that operates with the elements of the domain because it is often necessary to query its properties and functions are often methods on the domain object. These domains have different notions of division, sqrt etc that are not the same as those for floating point numbers e.g. a square root mostly does not exist but sometimes does.

I can list many more examples of these things but the main point is just that all of these namespaces, contexts and domains are all kind of the same thing and have the same purpose and similar interfaces. They could be usable in an interchangeable way except that they have not been designed with compatible interfaces and so there is no way to write generic code that can work for each of them. An alternative to multiple dispatch that is potentially more suitable for Python is to have a specified interface for contexts. That is sort of what the array API at least in terms of how it specifies functions like sin and cos.

A consistent interface for these things would make it possible to write generic library code that can be applied extensibly to new types by passing contexts around explicitly. For end users though I would still want them to be able to write some code without thinking about contexts explicitly and to be able to use global context/precision if they want:

from genericmath import sin, exp

# Users don't necessarily need to think about contexts:
def user_code(x):
    return sin(x) * exp(x)

# Always pass explicit contexts in generic library code:
def library_code[T: Real](x: T, ctx: TransContext[T]) -> T:
    return ctx.mul(ctx.sin(x), ctx.exp(x))

Passing the context explicitly as an argument here allows us to constrain the types in two ways:

I am supposing here that TransContext means a Context (protocol-)subclass with common transcendental functions (e.g. sin and exp).
We can also constrain the type T e.g. here it is constrained to be Real so we know what operations like +, < etc are supported although we may prefer ctx.add etc.

Note that in the typical multiple dispatch approach there is no ctx argument and thids would just have to dispatch on the type of T. It would not be possible to have a function that takes the same type with different contexts e.g. you might want to allow an int to be treated as a float in some cases but as an exact integer in others:

library_code(1, ctx=RR) # float: 0.12345
library_code(1, ctx=Sym) # symbolic: sin(1)*exp(1)

Without the ctx argument you would have to choose between these possible outputs as the dispatch for int. In Julia I think this can be resolved by specifying through the type system which dispatch to use. In Python the runtime dispatch only sees the runtime type which is int. That is why we need the runtime object ctx to carry this additional type information.

Having a separate ctx argument also enables another capability that you would not have with multiple dispatch: you can make a context for a type e.g. int without needing to have any control over the code for that type. Actually it is also possible to do that with multiple dispatch but it gets messy (this is called “type piracy” in Julia).

A well defined typed interface like this would allow this generic code to be fully checked to ensure that the different types and contexts are indeed compatible. This can propagate all the way back to a user who is trying to pass the wrong domain or incompatible types into some function.

The array API example from before is:

from array_api_compat import get_namespace

def f(x):
    xp = get_namespace(x)
    return xp.square(xp.sin(x)) + xp.square(xp.cos(x))

To me using get_namespace like that is not really right for users or for library code but can make sense at the API boundary between the two: if end users doing scientific/mathematical calculations are not necessarily expected to pass context/namespace objects around everywhere then some library API needs to have a way to conjure one up. It might be better though if that API allowed providing the context explicitly rather than using the equivalent of get_namespace:

def f(x, ctx = np):
    return ctx.square(ctx.sin(x)) + ctx.square(ctx.cos(x))

mikeshardmind · December 17, 2024, 4:42pm

I was thinking about it closer to a central registry of implementations, modeling somewhere between type-piracy in Julia and traits in rust, only with a dedicated module for handling libraries registering their own types and implementations since we don’t have a compiler in play here that can work all of this out at compile time. This becomes more obviously necessary with functions that take multiple arguments where one of the arguments may have come from a calculation another library did.

While I didn’t spell all of it out, I agree with the majority of your assessment above, though I don’t think passing a context explicitly around is necessary for newly written code, this could just be a few uses of contextvars, some context managers that manipulate them, and something for libraries interacting with that module to call once during import to register their types and implementations.

NeilGirdhar · December 17, 2024, 5:26pm

Can we please move this to a new thread?

Just FYI, the Array API already supports contexts. The context lives on every array object.

In the Array API, you just do

x = my_library.asarray(..., context=context)

Then when you pass x around, all of the operations have access to the context. You can write an Array API library that does all the things you want. E.g.,

>>> x = lib.asarray(2, context=ctx)
>>> xp = get_namespace(x)
>>> xp.sqrt(x)
Decimal('1.414213562373095048801688724')

def user_or_library_code[T: Array](x: T) -> T:
    xp = get_namespace(x)
    return xp.sin(x) * xp.exp(x)  # much easier to read.

library_code(numpy.asarray(1)) # float: 0.12345
library_code(symbolic_lib.asarray(1)) # symbolic: sin(1)*exp(1)

I understand that you don’t like the design of the Array API, but there are benefits to pulling the namespace out of x:

it eliminates multiple dispatch,
it eliminates mismatches between the genericmath version and the multiple array library (numpy, jax, whatever you’re using) versions, and
it’s probably more computationally efficient since the various math functions don’t each need to query for the array namespace (the query happens once per function).

Anyway, I think that ship has sailed. NumPy 2 has been out for a long time now, and many Array API adopters have fully implemented it. I also really don’t think complaining about its design or wanting to introduce a competing standard is answering the post.

oscarbenjamin · December 17, 2024, 5:33pm

I have not once said that I don’t like it. Let me be very explicit here: I think that the array API is very good.

I just think that more than the array API is also needed. That is also not intended to be a criticism of the array API though: as I said before it had a well-defined scope.

ntessore · December 17, 2024, 5:40pm

There are probably also instances where the semantics of the Array API and the semantics of a “maths API” are different. A custom Matrix library might want to do xp.exp(A) element-wise in the former but math.exp(A) as a matrix exponential in the latter.

NeilGirdhar · December 17, 2024, 5:52pm

Sorry if I misinterpreted your comments! I just thought with all of the “limitations” that you’ve suggested that you were dissatisfied.

I hope you at least see that everything that you want to do, you can already do with Array API, but maybe not in a way that is pleasing to you. (For example, you can’t always pass bare Python scalars to Array API functions).

I think we can work within the context of the Array API to ultimately make room for the probability distributions in this post. I haven’t yet seen any limitations to the Array API.

That would be incredibly confusing since exp is defined to be elementwise. The function you’re looking for is called linalg.matrix_power. Array types that change the meaning of operations are a design error (e.g., see the numpy.matrix type).

ntessore · December 17, 2024, 6:39pm

Indeed, that is my point. In the Array API, the exp operation is defined element-wise, which makes sense when you are working in a numerical setting where a matrix is a stack of numbers. But when you are working in an analytical setting, the exp operation might take an element from the space of NxN matrices to the general linear group. So it makes little sense to me to try and bend the Array API out of shape to fit this application.

NeilGirdhar · December 17, 2024, 6:47pm

My point is that you should choose a different name. The meaning of the identifier exp is elementwise exponentiation irrespective of the context.

The idea that you should use the same identifier but have it mean something different in a different context was exactly what NumPy did with their matrix type. This was a horrible design error that they then spent a decade trying to correct.

That’s why you should use matrix_power when you mean matrix-power.

That said, there is a way in which exp could do what you want, and that is if the dtype itself were matrices. For example,

>>> x = matrix_lib.asarray([[1,2],[3,4]], dtype=matrix_2x2_float32)
>>> x.shape
()
>>> xp.exp(x)  # Okay.  Result also has shape ().

Not sure who would ever use something like that, but it’s possible within the context of the Array API.

The whole point to having a unified exp function is that you can write functions that use these functions and they mean the same thing no matter what array type you’re passing in. If exp changes meaning depending on the type, then you lose the main benefit of having an API in the first place.

jamestwebber · December 17, 2024, 6:50pm

What do you mean? According to who?

NeilGirdhar · December 17, 2024, 6:55pm

According to the API. I mean this in general: no matter how you define the API, every identifier needs to have a consistent definition so that it can be useful. You could define exp to be “either elementwise or not”, but that severely limits how it can be used. People would invariably ask for a way to distinguish matrix-power from elementwise-power.

oscarbenjamin · December 17, 2024, 8:41pm

The spec does not say this at all. It says

exp(x: array, / ) → array¶

Calculates an implementation-dependent approximation to the exponential function for each element x_i of the input array x (e raised to the power of x_i, where e is the base of the natural logarithm).

You seem to be so deep in this idea that everything is an array that I can’t see how else to explain this besides putting a matrix into an array:

In [1]: from flint import arb_mat

In [2]: import numpy as np

In [3]: M = arb_mat([[0, 0], [0, 0]])

In [4]: M
Out[4]: 
[0, 0]
[0, 0]

In [5]: a = np.array([M])

In [6]: a
Out[6]: 
array([[0, 0]
       [0, 0]], dtype=object)

In [7]: np.exp(a)
Out[7]: 
array([[1.00000000000000,                0]
       [               0, 1.00000000000000]], dtype=object)

In [8]: a.shape
Out[8]: (1,)

This is the element-wise exponential of the array a: a new array of the exponential of the scalars in the original array. The scalar in this array is a 2x2 matrix and the exponential of that (scalar) matrix is the matrix exponential and not the element-wise one. There is no mathematical sense in element-wise exponentiation for the elements of a ring of matrices.

The array API spec for exp also says:

x (array ) – input array. Should have a floating-point data type.

In other words it has absolutely nothing to say about what should happen when the scalars in the array are matrices: that is simply out of scope for the array API.

Now the question is: why is it that np.exp worked for this example?

The answer is because arb_mat has an .exp method:

In [9]: M.exp()
Out[9]: 
[1.00000000000000,                0]
[               0, 1.00000000000000]

Calling .exp() on the elements of an object array is just a weird numpy convention rather than any well-specified or widely implemented way to compute exp(x) for an arbitrary object x. This is what the array API does not cover: how do you compute the exp of a scalar?

Your suggestion above was that the array API covers everything:

Neil Girdhar:

In the Array API, you just do
x = my_library.asarray(..., context=context)
Then when you pass x around, all of the operations have access to the context. You can write an Array API library that does all the things you want. E.g.,
>>> x = lib.asarray(2, context=ctx)
>>> xp = get_namespace(x)
>>> xp.sqrt(x)
Decimal('1.414213562373095048801688724')

I didn’t read this closely enough at first and thought that this was something already implemented that I had missed from specification. I tried numpy and jax before looking in the spec to see that there is no context argument to asarray. Then I read what you said more closely:

You can write an Array API library that does all the things you want .

So if I want the context management for decimals that go into an array I should implement my own multidimensional array library? And I should do that by implementing the extremely complex array API even though having decimals in the array is already explicitly outside the scope of that specification?

That is definitely not what the array API is for!

It should be possible to compose implementations of multidimensional arrays with implementations of scalars. That means that there needs to be some API for the scalars that can (among other things) go into arrays though. If this was Julia then that API would just be multiple dispatch. Since we don’t have multiple dispatch we need something else.

The array API is great but not everything is an array and even arrays can only exist by holding scalars that are generally not arrays themselves. The array API (prudently) looks past this complexity by declaring that it covers 13 fixed-width machine precision types and by assuming that the array library itself is going to be the one that implements exp(x) for each of those scalar types. No mechanism of extensibility for exp is needed if the library already computes the exp of every possible scalar.

zhangyx · December 17, 2024, 9:11pm

I searched within this thread but found no occurrence of keyword “defer”. I am not sure if I understood OP’s problem correctly, but the defer object idea I’ve been working on might solve this problem from a completely different angle:

Python 3.14.0a1+ (heads/feat/defer-expr:39136869fb, Dec 11 2024, 20:36:44) [Clang 16.0.0 (clang-1600.0.26.6)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from random import random
>>> X = defer.Mutable(random)
>>> 3*abs(X)**2 + 1
1.950801234788364
>>> 3*abs(X)**2 + 1
1.5514411634812437
>>> 3*abs(X)**2 + 1
1.2161244611889943

NeilGirdhar · December 17, 2024, 9:19pm

That is elementwise exponentiation.

Yes, I specifically mentioned the case where the elements are themselves “matrices” here:

Neil Girdhar:

That said, there is a way in which exp could do what you want, and that is if the dtype itself were matrices. For example,
>>> x = matrix_lib.asarray([[1,2],[3,4]], dtype=matrix_2x2_float32)
>>> x.shape
()
>>> xp.exp(x)  # Okay.  Result also has shape ().

Good point, that’s true.

Your point that decimals are out of scope is a good one. In that sense, the Array API is too narrowly defined for some of the things you want to do. That said, maybe you’d get some traction petitioning them to broaden their scope since the benefit would be the interoperability that you want.

Regarding the “complex API”, you can just implement what you like I think. Your library just won’t work in every situation. But yes, if you want Decimal to adopt the API, you’ll have to write it yourself since no one has done that.

I don’t see why this is such an important point since you can always put scalars into arrays.

I think everything can be put into an array. I personally like this design.

Regarding probability distributions (the topic of this thread), they are most likely themselves going to be arrays (in the sense that they have a shape, support indexing, etc.). So, I don’t see this point about scalars to be that motivating.

oscarbenjamin · December 17, 2024, 11:24pm

What exactly is the resolution for the probability distributions? Is it an approach that also works in other situations where you might want to compute exp(x) and log(x) for some object x?

NeilGirdhar · December 18, 2024, 12:17am

That’s a great question. I don’t know what Matt’s looking for in the long run.

`math.exp` and `math.log` delegate to `__exp__` and `__log__` methods?

`math.exp` and `math.log` delegate to `exp` and `log` methods?