I’m helping a group add type annotations to the data structures in an existing project. The basic idea is that there’s a primary data array, an always-present metadata array (with one entry per row of the data array), and, depending on the subclass, some additional arrays. They want to be able to index all of these arrays simultaneously:
from dataclasses import dataclass

import numpy as np


@dataclass
class ZippedData:
    data: np.ndarray
    metadata: np.ndarray

    def __getitem__(self, idx) -> tuple[np.ndarray, np.ndarray]:
        return self.data[idx], self.metadata[idx]


@dataclass
class ExtraZipped(ZippedData):
    addon: np.ndarray

    def __getitem__(self, idx) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
        return *super().__getitem__(idx), self.addon[idx]
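For concreteness, the intended usage looks like this (the sample arrays here are just illustrative):

pair = ZippedData(data=np.arange(6).reshape(3, 2), metadata=np.array([10, 11, 12]))
d_row, m_row = pair[0]  # one data row together with its metadata entry

triple = ExtraZipped(data=pair.data, metadata=pair.metadata, addon=np.zeros(3))
d_row, m_row, a_row = triple[1]  # the subclass yields a 3-tuple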
It seems like the thing to do is to make the class generic:
from dataclasses import dataclass
from typing import Unpack

import numpy as np


@dataclass
class ZippedData[*Ts = Unpack[tuple[()]]]:
    data: np.ndarray
    metadata: np.ndarray

    def __getitem__(self, idx) -> tuple[np.ndarray, np.ndarray, *Ts]:
        return self.data[idx], self.metadata[idx], *self._extra_data(idx)

    def _extra_data(self, idx) -> tuple[*Ts]:
        return ()


@dataclass
class ExtraZipped(ZippedData[np.ndarray]):
    addon: np.ndarray

    def _extra_data(self, idx) -> tuple[np.ndarray]:
        return (self.addon[idx],)
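To spell out what I expect the checkers to infer, here’s a spot check with typing.reveal_type (available at runtime since Python 3.11; type checkers also understand it):

from typing import reveal_type

zd = ZippedData(np.zeros(3), np.zeros(3))
reveal_type(zd[0])  # I expect tuple[ndarray, ndarray], since *Ts defaults to empty

ez = ExtraZipped(np.zeros(3), np.zeros(3), np.zeros(3))
reveal_type(ez[0])  # I expect tuple[ndarray, ndarray, ndarray]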
I get complaints from both mypy and pyright about the default implementation of _extra_data(), even though it matches the default in the class header. I can simply # type: ignore[return-value] it, which is what I’m doing, but I hope there’s a way to do this properly. It would be nice if a type checker could even go as far as to complain when _extra_data() isn’t overridden in a subclass whose *Ts no longer unpacks to an empty tuple.
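For reference, the suppression I’m using sits on the default body (the error code is mypy’s; I believe pyright’s closest equivalent is a # pyright: ignore[reportReturnType] comment):

    def _extra_data(self, idx) -> tuple[*Ts]:
        # the checker can't see that () satisfies the *Ts default of tuple[()]
        return ()  # type: ignore[return-value]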
I don’t have a particular need for it at this point, but I’m curious if ExtraZipped could itself be made generic, so that you could extend to another additional field by subclassing.
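My untested sketch of that would look something like the following; the Ts2 and _more_data names and the EvenMoreZipped example are just my guesses at the shape, and the same complaint about the default body presumably recurs one level down:

@dataclass
class ExtraZipped[*Ts2 = Unpack[tuple[()]]](ZippedData[np.ndarray, *Ts2]):
    addon: np.ndarray

    def _extra_data(self, idx) -> tuple[np.ndarray, *Ts2]:
        return (self.addon[idx], *self._more_data(idx))

    def _more_data(self, idx) -> tuple[*Ts2]:
        return ()  # presumably flagged for the same reason as above


@dataclass
class EvenMoreZipped(ExtraZipped[np.ndarray]):
    extra: np.ndarray

    def _more_data(self, idx) -> tuple[np.ndarray]:
        return (self.extra[idx],)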