Generics in metadata of `Annotated`

Hi folks,

Context:
I am developing a feature for the pipefunc package (pipefunc/pipefunc on GitHub: lightweight function pipeline (DAG) creation in pure Python for scientific workflows), which makes it easy to construct pipelines (workflows) from a set of callables. I would now like to check that the types passed between the pipeline functions match. The package supports N-dimensional map-reduce operations whose outputs are stored in a numpy.ndarray(..., dtype=object), or, written as a type hint, np.ndarray[Any, np.dtype[object]].

Problem:
I want to be able to additionally keep the type of the elements of the object array in the type hint.

I thought I could use typing.Annotated for this and store the element type of the object array in the metadata. For example, for strings I would write Annotated[np.ndarray[Any, np.dtype[object]], str]. The problem is that this is quite verbose, so I would like to use generics instead:

T = TypeVar("T")
Array = Annotated[np.ndarray[Any, np.dtype[object]], T]

and then have folks use it like:

def f(x: Array[int]) -> int: ...

However, as I discovered, generics don’t work in Annotated’s metadata.
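To make the failure concrete, here is a minimal sketch of what I see at runtime. To keep it free of dependencies, list[object] stands in for np.ndarray[Any, np.dtype[object]]; that substitution is an assumption of this sketch only. As far as I can tell, Annotated collects type parameters from the annotated type but not from the metadata, so the alias cannot be subscripted:

```python
from typing import Annotated, TypeVar

T = TypeVar("T")

# Stand-in for np.ndarray[Any, np.dtype[object]] (assumption, keeps the sketch stdlib-only).
ObjectArray = list[object]

Array = Annotated[ObjectArray, T]

# The TypeVar is stored verbatim as metadata ...
print(Array.__metadata__)
# ... but it is not registered as a type parameter of the alias.
print(Array.__parameters__)

# Subscripting therefore fails instead of substituting T.
try:
    Array[int]
    error = None
except TypeError as exc:
    error = exc

print(error)
```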

So I came up with:

T = TypeVar("T")

class Array(Generic[T]):
    def __class_getitem__(cls, item):
        return Annotated[np.ndarray[Any, np.dtype[object]], item]

But this doesn’t work with mypy (as expected, I guess), and it also gives me a lot of trouble when evaluating ForwardRefs, so I am looking for advice on how best to implement this.

Related:

class Array(Generic[T], np.ndarray[Any, np.dtype[object]]):
    def __class_getitem__(cls, item):
        return Annotated[np.ndarray[Any, np.dtype[object]], item]

I’d expect something like that to work. You can add some tricks with TYPE_CHECKING if you need to make __class_getitem__ extra invisible for some reason. Type checkers ignore metadata, so to encode np.ndarray[Any, np.dtype[object]] for them, just have that part as a parent class.
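A rough sketch of how this could look end to end, including one possible reading of the TYPE_CHECKING trick and reading the metadata back at runtime. It rests on assumptions: list[object] stands in for np.ndarray[Any, np.dtype[object]] so the snippet runs without NumPy, and element_type is a hypothetical helper, not something from pipefunc.

```python
from typing import (
    TYPE_CHECKING,
    Annotated,
    Generic,
    TypeVar,
    get_args,
    get_origin,
    get_type_hints,
)

T = TypeVar("T")

# Stand-in for np.ndarray[Any, np.dtype[object]] (assumption of this sketch).
ObjectArray = list[object]


class Array(Generic[T], ObjectArray):
    """Type checkers see a generic subclass; at runtime Array[X] is Annotated[...]."""

    if not TYPE_CHECKING:
        # Hidden from type checkers so they treat Array[X] as an ordinary generic.
        def __class_getitem__(cls, item):
            return Annotated[ObjectArray, item]


def f(x: Array[int]) -> int:
    return len(x)


def element_type(hint):
    """Hypothetical helper: return the element type stored in the metadata, or None."""
    if get_origin(hint) is Annotated:
        base, *metadata = get_args(hint)
        if base == ObjectArray and metadata:
            return metadata[0]
    return None


# include_extras=True keeps the Annotated wrapper instead of stripping it.
hints = get_type_hints(f, include_extras=True)
print(element_type(hints["x"]))
print(element_type(hints["return"]))
```

Without include_extras=True, get_type_hints strips the Annotated wrapper and the element type is lost, so a runtime check along these lines needs that flag.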

As for the forward refs issue, that’s one of the common challenges of any runtime type manipulation. I ended up with code that did tricks/hacks such as looking at parent stack frames to heuristically determine reasonable globals/locals to evaluate them in. In general, though, forward refs cause challenges for a bunch of libraries, and the easiest answer is not to try to support every edge case involving them, and instead to give users ways to work around them, like registering certain forward refs so they are later included in the eval context.

Yes that works, thanks a lot @mdrissi!

For future reference, the full implementation is in pipefunc/pipefunc Pull Request #6, "Add parameter and output annotations to PipelineFunction class" by basnijholt, where I also verify the metadata at runtime.
