Numeric Generics - Where do we go from PEP 3141 and present day Mypy?

When you get far enough into writing generic code for different implementations of numbers, rings, and fields, you find that you almost always need to know a few properties of the kind of ring/field/number you are dealing with in order to write the proper code. In other words, nontrivial code cannot be completely generic. The problem with the numbers ABCs is that, apart from Integral and Rational, they don’t provide enough information to do anything useful except in very simple situations.

For example, in the case of Real: if I have a Real that is not a float, then what could it be? It might be one of the np.float32 etc. types that @rgommers referred to above, or it might be:

  1. Decimal (assuming Decimal was allowed)
  2. mpmath.mpf (an arbitrary precision binary float)
  3. sympy.Float (an mpmath.mpf with an attached measure of accuracy)
  4. An interval (or ball) from an interval arithmetic library.
  5. An element of the algebraic field Q(sqrt(2)) as provided in SymPy/SageMath and others.
  6. A more complicated symbolic representation of an exact real number.
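
To see how little the ABCs tell you here, a quick stdlib-only check shows that an isinstance test against Real says nothing about precision or exactness, and that Decimal is deliberately not even registered as Real:

```python
# How little the numbers ABCs actually tell you: Decimal is registered
# only as Number, not Real, and passing an isinstance(x, Real) check
# still reveals nothing about precision, exactness, or which functions
# are safe to apply to x.
from numbers import Number, Rational, Real
from decimal import Decimal
from fractions import Fraction

print(isinstance(1.5, Real))                 # True
print(isinstance(Fraction(1, 3), Rational))  # True
print(isinstance(Decimal("1.5"), Real))      # False: not registered as Real
print(isinstance(Decimal("1.5"), Number))    # True: only as Number
```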

None of the above can in general be handled by the same generic code unless you are doing something extremely simple. Projects like SymPy and SageMath that have lots of different types like this have associated “context” or “domain” objects that keep track of which kind of field/number is being represented and how to implement the things that need non-generic implementations.
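
As a rough illustration of the domain idea (the names and attributes below are invented for this sketch; SymPy’s actual domain system is much richer), the domain object carries exactly the properties that generic code needs but the ABCs don’t expose:

```python
# Hypothetical minimal "domain" object: generic code consults the domain's
# properties to choose between exact and inexact strategies.
from dataclasses import dataclass
from fractions import Fraction

@dataclass(frozen=True)
class Domain:
    name: str
    is_exact: bool    # is arithmetic in this domain exact?
    is_ordered: bool  # can elements be compared with < ?

QQ = Domain("QQ", is_exact=True, is_ordered=True)       # rationals
RR53 = Domain("RR53", is_exact=False, is_ordered=True)  # 53-bit floats

def solve_linear(a, b, domain):
    """Solve a*x + b == 0, choosing a strategy from the domain's properties."""
    if domain.is_exact:
        return Fraction(-b, a)  # exact rational arithmetic, no rounding
    return -b / a               # inexact: caller must tolerate rounding error

print(solve_linear(2, 1, QQ))        # Fraction(-1, 2)
print(solve_linear(2.0, 1.0, RR53))  # -0.5
```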

That’s not to say that there can’t be generic code for these. The problem is that nontrivial generic higher-level code will need to call into some non-generic lower-level routines. The numbers ABCs don’t provide enough of a usable foundation for the code above them to be generic without being highly suboptimal.

A simple example would be a function that takes a Real and returns some calculated result where the calculation needs more than basic arithmetic, such as computing the exponential exp(x) of one of its arguments. There is no extensible way to do that in Python that will perform correctly both for ordinary floats and for 1000-digit mpmath.mpf floats (where you care about the 1000-digit accuracy). The ABC only defines __float__, so you can use math.exp, but that reduces your 1000 digits to 53 bits.
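
The same trap can be demonstrated with stdlib Decimal standing in for mpmath: math.exp goes through __float__, so the extra digits are silently discarded, while Decimal’s own exp() honours the context precision:

```python
# math.exp coerces its argument through __float__ (~16 significant
# digits), whereas Decimal.exp() computes at the context precision.
import math
from decimal import Decimal, getcontext

getcontext().prec = 50        # ask for 50 significant digits
x = Decimal(1)

via_float = math.exp(x)       # coerced to float: precision lost
via_decimal = x.exp()         # computed at 50-digit precision

print(via_float)              # 2.718281828459045
print(via_decimal)            # e to 50 significant digits
```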

Likewise, nontrivial usage of Decimal that actually needs the important decimal-ness of Decimal probably needs to know that it is using Decimal, so that it can set up a context, set rounding modes and precision, and so on. Nontrivial usage of mpmath needs to set the precision and to use the mpmath functions for sin, cos, exp, etc. The only thing the Real ABC provides for interoperability is __float__, but the whole point of saying Real rather than float is that you want to handle more than float, and the whole point of using mpf/Decimal is that they are better than float in some ways: converting everything to float with __float__ misses the point of using the other types in the first place.
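
For concreteness, here is the kind of non-generic setup Decimal code typically needs: precision and rounding live on a context object rather than on the values, so generic code that only sees “a Real” has no way to manage them:

```python
# Decimal precision and rounding are properties of a context object,
# not of the values themselves.
from decimal import Decimal, localcontext, ROUND_FLOOR, ROUND_HALF_EVEN

with localcontext() as ctx:
    ctx.prec = 6
    ctx.rounding = ROUND_FLOOR
    print(Decimal(1) / Decimal(7))   # 0.142857 (floored to 6 digits)

with localcontext() as ctx:
    ctx.prec = 6
    ctx.rounding = ROUND_HALF_EVEN
    print(Decimal(2) / Decimal(3))   # 0.666667
```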

The ABCs don’t give any way to know whether operations will be exact, but that distinction typically leads to completely different algorithms. Likewise, in some fields or rings some operations will be unknowable or undecidable, which also leads to completely different algorithms. Nor do the ABCs provide any information that would enable exact conversions even between different floating-point types (__float__ doesn’t cut it, and neither does as_integer_ratio).
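
To illustrate: as_integer_ratio does give an exact rational value for individual types like float and Decimal, but it is not part of the Real ABC, so generic code cannot rely on it, and round-tripping through __float__ silently loses information:

```python
# Exact conversion is possible per-type via as_integer_ratio, but the
# Real ABC doesn't require that method, and __float__ is lossy.
from decimal import Decimal
from fractions import Fraction
from numbers import Real

d = Decimal("0.1")
exact = Fraction(*d.as_integer_ratio())   # exactly 1/10
lossy = Fraction(float(d))                # nearest binary float to 0.1

print(exact)                              # 1/10
print(exact == lossy)                     # False: __float__ lost information
print(hasattr(Real, "as_integer_ratio"))  # False: not part of the ABC
```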

I would like to see better interoperability of different numeric types in Python, but I think that making type annotations the primary focus of that is misguided. We really need to improve the actual semantics and usefulness first and think about typing afterwards. I worry that some of the people in the mypy issue are just looking for a way to hint all their code and quiet mypy, and perhaps don’t really have a clear reason for using the ABCs in the first place. It would be a shame to solve their problem the easy way without actually improving anything useful.

One thing that I think would be a big improvement for generic numeric code would be a stdlib version of the math module with overloadable singledispatch functions for common mathematical operations like sin, cos, etc. Then you could write sin(obj) and have it do the right thing for every type of obj, and use that in a generic routine instead of needing to choose from math.sin, cmath.sin, numpy.sin, mpmath.sin, sympy.sin, … These could be used recursively, so e.g. a numpy array with dtype=object could use the singledispatch sin to operate on its elements. (Note that this is exactly how it works in Julia: one sin function overloaded by different types.)
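
A minimal sketch of what such a module might look like today, using functools.singledispatch (the module and registrations here are hypothetical, not an existing stdlib API):

```python
# One generic sin() that each numeric type can register an
# implementation for, with a float fallback as the default.
import cmath
import math
from functools import singledispatch

@singledispatch
def sin(x):
    # default: fall back to float, accepting the loss of precision
    return math.sin(float(x))

@sin.register
def _(x: complex):
    return cmath.sin(x)

@sin.register
def _(x: list):
    # containers can recurse through the same generic function, as in
    # the object-dtype numpy array example above
    return [sin(item) for item in x]

print(sin(math.pi / 2))         # 1.0
print(sin(1j))                  # cmath.sin handles complex input
print(sin([0.0, math.pi / 2]))  # [0.0, 1.0]
```

Third-party types like mpmath.mpf or sympy.Expr could then register their own high-precision or symbolic implementations without the caller changing anything.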