In NumPy, I have decided to go the metaclass route to describe datatypes. There are various reasons for this, but overall I like the resulting system:
`array.dtype` is an instance of a `DType` (type/class). The `DType` (like a typical type) describes behaviour with respect to the possible values.
`array.dtype` can additionally store “parameters”:
- Storage parameters (e.g. for fixed-length strings)
- “Homogeneous properties” that would normally be attached to a single value, but apply to all array elements; for example, a physical unit.
- Most importantly, the split gives a clear level of abstraction: users write new DTypes (classes/types), and functions operate on arrays with certain DTypes (the class; the user-provided functionality deals with the instances). (This is multi-method-like dispatching.)
This may be a bit different from Python, where dispatching directly on the type (multi-methods) is not common. But I don’t see much of an alternative: NumPy requires a bit more structure in its (d)types than most Python types.
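To make the class/instance split above concrete, here is a minimal pure-Python sketch. The names (`StringDType`, `itemsize`) are invented for illustration, not the actual NumPy API: behaviour lives on the DType class, while storage parameters live on the instance that `array.dtype` would hold.

```python
class DTypeMeta(type):
    """Metaclass: every DType *class* is an instance of DTypeMeta."""


class DType(metaclass=DTypeMeta):
    parametric = False
    abstract = False


class StringDType(DType):
    """Hypothetical fixed-length string DType (illustration only)."""
    parametric = True

    def __init__(self, itemsize):
        # A storage parameter: it lives on the *instance* (the thing
        # stored in `array.dtype`), while behaviour lives on the class.
        self.itemsize = itemsize


dt = StringDType(itemsize=8)
# `dt` plays the role of `array.dtype`; its class is the user-written
# DType, and that class is in turn an instance of DTypeMeta.
```

Functions would then dispatch on `type(dt)` (the DType class), which is exactly the multi-method flavour described above.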
This means we have roughly:

```python
class DType(metaclass=DTypeMeta):
    parametric: bool
    abstract: bool

    @classmethod
    def __common_dtype__(cls, other: DTypeMeta) -> DTypeMeta:
        """Return a DType that can describe the values of both
        cls and other (or return NotImplemented)."""
        # some logic...
        return cls

    def __dtype_setitem__(self, item_pointer: "char *", value):
        """C-level method, setting `item_pointer` to represent value."""
        ...

    # And some more functions/metadata.
```
A few of these are classmethods and others are instance methods, but in general I like the idea of using the type (and thus a metaclass instance) as the level of abstraction that the user can modify.
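A runnable Python sketch of such a `__common_dtype__` protocol, including a binary-operator-style `NotImplemented` fallback to the other operand (the concrete DTypes and the `common_dtype` helper are invented for this sketch):

```python
class DTypeMeta(type):
    pass


class DType(metaclass=DTypeMeta):
    @classmethod
    def __common_dtype__(cls, other):
        return NotImplemented


def common_dtype(a, b):
    # Mirrors binary-operator dispatch: ask `a` first, then give
    # `b` a chance if `a` returns NotImplemented.
    result = a.__common_dtype__(b)
    if result is NotImplemented:
        result = b.__common_dtype__(a)
    if result is NotImplemented:
        raise TypeError(f"no common DType for {a!r} and {b!r}")
    return result


class Int64(DType):
    @classmethod
    def __common_dtype__(cls, other):
        if other is cls:
            return cls
        return NotImplemented


class Float64(DType):
    @classmethod
    def __common_dtype__(cls, other):
        if other in (cls, Int64):
            return cls  # float64 can describe (most) int64 values
        return NotImplemented

# common_dtype(Int64, Float64) resolves to Float64 via the fallback.
```

Note that everything here operates on the classes themselves; instances never enter the promotion logic.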
The tricky part
The methods above are fairly natural in C, and I want them to be easily and quickly available from C: NumPy has to call many of them frequently at the C level.
I have done just that: `DTypeMeta` is a C-defined subclass of `type` and extends the (heap)type struct with additional slots. In a sense, I am adding my own slot fields (although right now, not behind an `nb_slots`-like pointer).
That is a bit awkward: the limited-API `PyType_FromSpec` functions do not support metaclasses that extend the type struct.
I need to allow users to write new DTypes, ideally in C and dynamically. I am not worried about Python ABI stability (unless it concerns things like HPy).
However, I need users to subclass `np.dtype` and call a `(PyType|PyDType)_FromSpec` API provided by NumPy to “fill in” the DType-specific slots (basically `PyType_Ready`, but as a FromSpec API).
I have not yet figured out a good way to do this:
- I can fix `DTypeMeta.__basicsize__` by allocating an opaque `void *npy_dtype_slots` struct (so that I can extend it in the future). This should make things fairly clean for static types/DTypes. I would then add my own `PyType_Ready()`-style function for the user.
- From Python, things seem OK: the metaclass can call `type.__new__`, which I think allocates the correct size, and then fill in the slots (based on Python-side definitions, which could be capsules).
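In pure Python, that dynamic path could look roughly like the sketch below. The `dtype_from_spec` helper is hypothetical (invented for this sketch); the real API would fill C-level slots, e.g. from capsules, rather than setting Python attributes:

```python
class DTypeMeta(type):
    # In C, DTypeMeta extends the type struct; creating a class
    # through the metaclass (which ends in type.__new__) allocates
    # the correct, larger size.
    pass


class DType(metaclass=DTypeMeta):
    pass


def dtype_from_spec(name, slots):
    """Hypothetical FromSpec-style helper: create the class via the
    metaclass, then fill in the DType-specific slots."""
    cls = DTypeMeta(name, (DType,), {})
    for slot_name, impl in slots.items():
        # Stand-in for filling C-level slots (e.g. from capsules).
        setattr(cls, slot_name, impl)
    return cls


UnitDType = dtype_from_spec(
    "UnitDType",
    {"__dtype_setitem__": lambda self, item_pointer, value: None},
)
```

The point of the sketch is only the shape of the flow: metaclass call for allocation, followed by a separate slot-filling step, which mirrors the static `PyType_Ready`-plus-spec idea.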
Both of those together may just be good enough: static declaration can do most things, and dynamic declaration from Python should be fine (or at least solvable). Some things might not work (or might be ugly), but so be it.
But I am wondering if I am missing some better solution to all of this. Maybe there are ways to avoid the whole problem, or even a good pattern for completely extending the type struct. (I am aware that ABCs store things in a special slot, but that seemed a bit convoluted.)
- One thing I could probably try is to modify/fix `PyType_FromSpec` to gracefully allocate the larger struct size for metaclasses (which might allow me to create a function that calls `PyType_FromSpec` internally). That might solve the problem of allowing dynamic creation of new DTypes from C (and it would be nice if users don’t have to use the static type API!).
- I thought that I could allow the user to only create a “mixin” base class, with NumPy creating `NewDType(user_mixin, np.dtype)`. But that seems