Description
PEP 483 describes the theory of type systems. The germane part is subtype relations. It declares the second (strong) type criterion: “every function from first_type is also in the set of functions of second_type”. This seems to be the reason why NewType requires “a proper class (i.e., not a type construct like Union, etc.)”. What I am proposing is some way to loosen the criterion to “every function within an important subset of functions from first_type is also in the set of functions of second_type”.
Example
Would it be possible to use some other annotation information to say that an object belongs to a type in a different typing system? e.g.
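import typing
from typing import NewType
from numpy.typing import NDArray
from scipy.sparse import sparray
# typing.ExternalType and typing.External are the proposed additions; they do not exist today.
# other_classes is a placeholder for any further concrete array classes.
Array = typing.ExternalType("Array", NDArray, sparray, *other_classes)
CovarianceMatrix = NewType("CovarianceMatrix", Array)
arr: typing.External[NDArray, CovarianceMatrix]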
Here, typing.ExternalType declares a name to be the root of a separate type system as well as the python types that are allowed to take on this type. Equivalently, this declares the limited subset of functions as the intersection of legal functions on all arguments after the name. NewType, Union, etc. can then declare arbitrary additional types. The annotation typing.External takes a parameter that can restrict the python type, and then any number of types that live in independent ExternalType graphs.
Use Case
I would like to be able to use mathematical types to prevent avoidable errors in numerical algorithms. Consider the following:
def z_score(x: float, mean: float, stdev: float) -> float:
    return (x - mean) / stdev
def hyperparam_search(data) -> float:
    """Calculate the optimal variance of the training data"""
    ...
def fit_something(data, x: float, mean: float) -> float:
    var = hyperparam_search(data)
    return z_score(x, mean, var)  # should be z_score(x, mean, sqrt(var))
This seems like a clear case of “situations where a programmer might want to avoid logical errors by creating simple classes”, which is the stated purpose of NewType. Properly annotating these functions with Variance = NewType("Variance", float) and StdDeviation = NewType("StdDeviation", float) would show the problem.
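As a minimal sketch of what that annotation could look like when only plain float is involved: the body of hyperparam_search and the helper to_stdev below are hypothetical stand-ins I made up for illustration, not the real code.

```python
import math
from typing import NewType

Variance = NewType("Variance", float)
StdDeviation = NewType("StdDeviation", float)


def z_score(x: float, mean: float, stdev: StdDeviation) -> float:
    return (x - mean) / stdev


def hyperparam_search(data: list[float]) -> Variance:
    """Hypothetical stand-in: returns the population variance of the data."""
    mean = sum(data) / len(data)
    return Variance(sum((d - mean) ** 2 for d in data) / len(data))


def to_stdev(var: Variance) -> StdDeviation:
    # The only place where a Variance is turned into a StdDeviation.
    return StdDeviation(math.sqrt(var))


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
var = hyperparam_search(data)
# z_score(6.0, 5.0, var)  # a type checker should reject this: Variance is not StdDeviation
score = z_score(6.0, 5.0, to_stdev(var))
```

At runtime both NewTypes are plain floats, so the distinction costs nothing; it only exists for the static checker.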
Now consider that because of duck typing, my functions work on float | np.float64. However, I can’t create Variance and StdDeviation on a union type. In fact, this was the exact problem I ran into. I had built an equation using variance because of convexity rules, but a reviewer asked to see plots in standard deviation because the units were more understandable. When I modified my code, I introduced the error.
More generally, this problem manifests a lot when working on difficult math, because testing a function requires proving the proper invariants and identifying the algorithm's condition number, which are both hard. At the same time, a lot of code can utilize sparse matrices or other types, which are faster, via duck typing without any modification. Off the top of my head, I can think of a dozen kinds of sparse or dense matrices that I use on any given day: inequality constraints, positive definite matrices, positive semidefinite matrices, covariance matrices, finite difference matrices, Toeplitz matrices, etc.
Alternative/Current Workaround
import numpy as np
class FakeFloat(np.float64, float):  # type: ignore # incompatible method definitions
    def __new__(cls, other):
        # Return the argument unchanged, so no FakeFloat instance exists at runtime
        return other
class StDev(FakeFloat):
    pass
class Variance(FakeFloat):
    pass
def foo(a: Variance) -> None: pass
foo(StDev(3.0))
mypy will correctly point out that foo() takes a variance, not a standard deviation.
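To show why this costs nothing at runtime, here is a stdlib-only variant of the same trick. It is a sketch that assumes only the builtin float matters, so it drops the np.float64 base and the type: ignore that comes with it; the body of foo is made up for illustration.

```python
class FakeFloat(float):
    """Exists only for the type checker; never instantiated at runtime."""

    def __new__(cls, other):
        # Returning the argument itself means StDev(3.0) evaluates to the plain float 3.0.
        return other


class StDev(FakeFloat):
    pass


class Variance(FakeFloat):
    pass


def foo(a: Variance) -> float:
    return a * 2.0


sd = StDev(3.0)
print(type(sd).__name__)  # the runtime type is the builtin float, not StDev
# foo(sd)  # a type checker is expected to flag this call: StDev is not Variance
result = foo(Variance(3.0))
```

The classes act purely as labels: the checker sees StDev and Variance as distinct, while the running program only ever handles ordinary floats.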
What I’d love to understand, even if this isn’t possible
I’m a bit out of my depth when it comes to this level of type theory, but
- Does type theory (beyond python) have an existing concept of “every function within an important subset of functions from first_type is also in the set of functions of second_type”? Sort of like how probability theory has a concept of conditional independence: $X \bot\!\!\!\bot Y \mid Z$
- Where does the documentation specify how static analyzers should use things like __mro_entries__ and __supertype__?
- How does this differ when handling Protocol and Union types?
- Is there a way to see this interactively?
I have a better understanding of typing now, but I once posted related questions under Python Help, and once as a discussion in Pyright.