Custom fixed-size heterogeneous sequence

pyo3 generates classes that behave as fixed-size heterogeneous collections without being subclasses of tuple. I want to type them in a way that lets type checkers destructure them using the old-style iteration protocol. I.e. this is the correct typing for one of them:

from typing import Self, Literal, assert_type, overload

class MyTupleLike:
    __match_args__ = ("_0", "_1")
    _0: int
    _1: str

    def __new__(cls, _0: int, _1: str, /) -> Self: ...
    def __len__(self) -> Literal[2]: ...
    @overload
    def __getitem__(self, key: Literal[0]) -> int: ...
    @overload
    def __getitem__(self, key: Literal[1]) -> str: ...

x = MyTupleLike(1, "2")

a, b = x
assert_type(a, int)
assert_type(b, str)

match x:
    case MyTupleLike(a, b):
        assert_type(a, int)
        assert_type(b, str)

However, no type checker I tried (mypy, Pyright, or ty) understands this.

  • mypy and Pyright are, as usual, quite unhelpful and just say “it’s not iterable” without explaining why they think that. Pyright gives more context but is actually the most wrong: it cites “__iter__ method not defined” as the reason for it “not being iterable” and doesn’t mention the old-style iteration protocol at all.

  • ty is much more helpful and tells me

    info: It has no __iter__ method and its __getitem__ method has an incorrect signature for the old-style iteration protocol
    info: __getitem__ must be at least as permissive as def __getitem__(self, key: int): ... to satisfy the old-style iteration protocol

However, that’s not true: as long as __getitem__ supports all ints >= 0 and < len, the object is of course iterable. Switching the key type to int doesn’t help, since the information about which element is an int and which is a str is then lost.
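
For reference, here is a minimal runtime sketch of the old-style iteration protocol. The class name Pair and its internals are hypothetical stand-ins, not what pyo3 actually generates. It only shows that unpacking works at runtime without an __iter__ method, provided __getitem__ raises IndexError past the end:

```python
from __future__ import annotations


class Pair:
    """Hypothetical stand-in for a generated tuple-like class."""

    def __init__(self, first: int, second: str) -> None:
        self._items = (first, second)

    def __len__(self) -> int:
        return 2

    def __getitem__(self, key: int) -> int | str:
        # Raising IndexError past the end is what stops old-style iteration.
        if not 0 <= key < 2:
            raise IndexError(key)
        return self._items[key]


pair = Pair(1, "2")
a, b = pair  # iter() falls back to calling __getitem__ with 0, 1, 2, ...
```

Here the interpreter builds a sequence iterator that calls __getitem__ with increasing indices until IndexError is raised, which is the behaviour the type checkers would have to model.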

Is there some secret trick to make this work or is this another limitation of Python’s type system?


As a workaround, maybe adding __iter__ (possibly behind an if TYPE_CHECKING: check) would help, even if it doesn’t exist at runtime?
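
A minimal sketch of that workaround, assuming a hypothetical pure-Python runtime class in place of the pyo3-generated one. The __iter__ declaration is visible only to type checkers, and its element type is necessarily the union, so it would make unpacking type-check without preserving which position holds which type:

```python
from __future__ import annotations

from collections.abc import Iterator
from typing import TYPE_CHECKING


class MyTupleLike:
    def __init__(self, _0: int, _1: str) -> None:
        self._0 = _0
        self._1 = _1

    def __len__(self) -> int:
        return 2

    def __getitem__(self, key: int) -> int | str:
        if key == 0:
            return self._0
        if key == 1:
            return self._1
        raise IndexError(key)

    if TYPE_CHECKING:
        # Seen only by type checkers; never defined at runtime.
        def __iter__(self) -> Iterator[int | str]: ...


x = MyTupleLike(1, "2")
a, b = x  # a checker would likely infer int | str for both names
```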

My goal is that the assert_types work. No workaround that doesn’t achieve that helps. E.g. @AlexWaygood suggested I add

@overload
def __getitem__(self, key: int) -> str | int: ...

but that doesn’t help either: the overload spec says that if multiple overloads match, the return value is the union of all matching overloads. And since Python doesn’t have Exclude (or Not together with Intersection), I can’t do

@overload
def __getitem__(self, key: Intersection[int, Not[Literal[0, 1]]]) -> Never: ...

but that doesn’t help either: the overload spec says that if multiple overloads match, the return value is the union of all matching overloads.

Nope, step 5 would filter out the third overload. These pass even with the overload added. It doesn’t explain why type checkers don’t seem to handle unpacking correctly, though.

assert_type(x[0], int)
assert_type(x[1], str)

I’ve been told in python/typeshed pull request #12294 (“Accept `__getitem__`-iterables in `builtins.enumerate`”, by jorenham) that wrapping this in iter() is the recommended workaround, and that the typeshed maintainers are not willing to add additional support for this flavor of iterable in the stdlib stubs.
In the stubs for e.g. ctypes.Array (which only has __getitem__ but no __iter__ at runtime), you can see an example of the previously suggested __iter__ workaround: stdlib/_ctypes.pyi in python/typeshed at commit 567b488fc28978642bfeb70b29f11453b0ff124c.
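
A minimal sketch of the iter() wrapping, again assuming a hypothetical pure-Python MyTupleLike. To my understanding the builtins stub for iter() accepts objects that only define __getitem__, so checkers should be satisfied with this, though the element type is still the union rather than a per-position type:

```python
from __future__ import annotations


class MyTupleLike:
    """Hypothetical runtime class with __getitem__ but no __iter__."""

    def __init__(self, _0: int, _1: str) -> None:
        self._items = (_0, _1)

    def __getitem__(self, key: int) -> int | str:
        # Tuple indexing raises IndexError past the end, ending iteration.
        if key < 0:
            raise IndexError(key)
        return self._items[key]


x = MyTupleLike(1, "2")

# Wrapping in iter() yields a regular iterator, which enumerate() accepts.
pairs = list(enumerate(iter(x)))
```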

If you decide to go with the __iter__ approach, then I’d recommend marking it as @typing.type_check_only, which only works in a .pyi or within one of those if TYPE_CHECKING: blocks.
For what it’s worth, I personally think that everything that works at runtime should also be accepted by type checkers. But I’ve noticed that it’s not a very popular opinion around here.

@sterliakov has a good answer to this very question as long as you are allowed to redefine the base class of your tuple-like class. That is, make the class inherit from tuple while type checking:

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    base = tuple[int, str]
else:
    base = object

class MyTupleLike(base):
    _0: int
    _1: str

    if not TYPE_CHECKING: # define iterable methods only when not type checking
        def __iter__(self): ...
        def __getitem__(self, key): ...

Of course not, this should be fixed by the type checkers, not worked around by lying in the type stubs. These types don’t have an __iter__ method, so we shouldn’t pretend they do.

No, because of the above and also because it doesn’t get me closer to my goal of having precise typing for heterogeneous unpacking.

No, I’m not. I don’t control the code here, just the typing.

I think that if Python decided to fully deprecate the old-style iteration protocol and instead introduced a “fixed-length heterogeneous iterator” type that solves the problem in the title, I would try to get that into pyo3. Otherwise I’ll work with what exists.


I see. Then maybe define the tuple-inheriting class in a stub (.pyi) file?

That was never in question. After Kroppeb’s answer, the remaining issues here are:

  1. There is no way to express a heterogeneous collection in a type-safe way using the new-style iteration protocol.

  2. No type checkers correctly use overloads for the old-style iteration protocol. This should work:

    class MyTupleLike:
        def __new__(cls, _0: int, _1: str, /) -> Self: ...
        def __len__(self) -> Literal[2]: ...
        @overload
        def __getitem__(self, key: Literal[0]) -> int: ...
        @overload
        def __getitem__(self, key: Literal[1]) -> str: ...
        @overload
        def __getitem__(self, key: int) -> int | str | Never: ...
    
    a, b = MyTupleLike(1, "2")
    assert_type(a, int)
    assert_type(b, str)
    

So the next steps should be:

  1. We should introduce something like typing.HeterogeneousIterator[*Types], which could be used for fixed-size heterogeneous collections as well as variable-size ones (like def __iter__(self) -> HeterogeneousIterator[str, *tuple[int, ...]]).
  2. Type checkers should be fixed to support the above test case.

What @blhsing is suggesting is that in the .pyi stub file you make MyTupleLike inherit from tuple, even if that isn’t the case at runtime.


Lying about base classes can cause other problems, such as narrowing patterns not working correctly.

For instance, if a user has a value that is either one of your types or a tuple, isinstance(unknown, tuple) works correctly at runtime, but since you’ve lied to the type checker…
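
A minimal sketch of that mismatch, assuming a hypothetical runtime class whose stub (not shown) declares it as class MyTupleLike(tuple[int, str]). At runtime the class is not a tuple, so the isinstance branch is skipped, whereas a checker reading the stub would assume the opposite:

```python
class MyTupleLike:
    """Hypothetical runtime class; imagine its .pyi stub claims a tuple base."""

    def __init__(self, _0: int, _1: str) -> None:
        self._items = (_0, _1)

    def __getitem__(self, key: int):
        return self._items[key]


def describe(value: object) -> str:
    # A checker trusting the stub would expect MyTupleLike to take this branch.
    if isinstance(value, tuple):
        return "tuple"
    return "not a tuple"


real = describe((1, "2"))
fake = describe(MyTupleLike(1, "2"))
```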
