# Use * operator to type fixed-size collections

Every now and then, I happen to work with fixed-size homogenous collections… and find no satisfying way to annotate them.

Let’s say I have a function to compute in-place a Fibonacci generalized sequence depending on the first two terms:

``````def compute_fibonacci_terms(seq, n_terms):
a, b = seq
for _ in range(n_terms):
a, b = b, a + b
seq.append(b)
return seq
``````
``````>>> compute_fibonacci_terms([3, 4], n_terms=5)
[3, 4, 7, 11, 18, 29, 47]
``````

To annotate this function, the two choices I see are

``````def compute_fibonacci_terms(seq: list[int], n_terms: int) -> list[int]: ...
``````
``````def compute_fibonacci_terms(seq: tuple[int, int], n_terms: int) -> list[int]: ...
``````

The first solution allows to pass list with any number of items, resulting in a `ValueError` in this (admittedly contrived) example. The second solution is even worse, disallowing passing lists (which is why this function was designed that way) and rightly reporting an error on `seq.append`.

Even when working with tuples, the situation is not ideal when it comes to large tuples:

``````def get_weekday_names(locale: str) -> tuple[str, str, str, str, str, str, str]:
...
``````

Here type-checking works well, but is quite cumbersome to read and error-prone.

## Proposal

What I’d like to propose here is to overload the `__mul__` operator of `type` to denote fixed size homogenous collections.

So the above functions could be written as

``````def compute_fibonacci_terms(seq: list[int * 2], n_terms: int) -> list[int]:
...
``````
``````def get_weekday_names(locale: str) -> tuple[str * 7]:
...
``````

At runtime, `type` (as well as of some typing special forms usable as type annotations) would define `__mul__` to ony accept an integer and return some special typing construct, similar to `typing.GenericAlias`. Type checkers would only allow literal integers.

## Considerations

I believe type checkers should only allow this syntax inside generic types derived of `Collection` [1].
Type-checking wise, `tuple[int, int, int]` and `tuple[int * 3]` should be considered fully equivalent.

Since `type` and typing special constructs cannot currently be multiplied by anything, this should be fully backwards compatible. One case that could be problematic are defered anotations, since `list["LateBind" * 3]` would become `list["LateBindLateBindLateBind"]` , but I believe type checkers can just flag this as invalid (and suggest using `list["LateBind * 3"]` instead).

Future improvements this may open the way for...
• Typing of heterogenous sequences (`list[int * 3, str, int * 2]`), even it I find that less readable and probably a way less common use case;
• Bounded iterable size: for example, `itertools.tee` could have an overload
``````def tee[I, N: Literal[int]](iterable: Iterable[I], n: N=2) -> tuple[I * N]: ...
``````
or even, to get back to my example,
``````compute_fibonacci_terms[N: Literal[int]](seq: list[int * 2], n_terms: N) -> list[int * (N + 2)]: ...
``````
• Some way to denote “at least X” / “at most X” items ?

Has something in these lines been discussed before? I’m pretty sure I read a discussion about this issue somewhere here, but I couldn’t find it and I don’t think it really had a proposal.

NB: I hesited between Typing and Ideas for this post, I preferred the later since this has implications on runtime and does not rely on advanced typing notions, but feel free to move it!

1. maybe `arg: int * 3` could be made sugar for `arg: Collection[int * 3]`, but I’m not sure I’m fan of the idea. ↩︎

There was some previous discussion on this problem at Feature Request: Type hint for Fixed Length Homogeneous Sequences · Issue #786 · python/typing · GitHub

2 Likes

This is an interesting idea, but…

I don’t think lists could ever work in this context. First, I think it’d be extremely difficult for a typechecker to know how many elements might be in a given list; these are basically dependent types, right? That’s a very advanced typing concept.

Secondly, assume you have a class that has a field of type `list[int * 4]` and you happen to have a list of 4 ints, the checker knows it, and you pass it in. Then a few lines later you pop an element and now it’s a `list[int * 3]`, but the class instance still thinks it has 4 elements.

I think for this to be sound the type should be immutable, so tuples. Maybe `tuple[int * n]` (or `tuple[int; n]`) could just be a shorthand for a homogenous tuple of n elements?

2 Likes

I see your point. Indeed, I did not considered the type checker work that would be needed outside of the “passing a literal list to a typed function”, but I suppose allowing `list[int * 5]` would not make sense within the current scope of type checkers.

I think this “syntax” may be useful even limited to frozen collections (aka tuples and frozensets?) though! This is more in line with the discussion linked by Alex above, indeed

1 Like

This sounds akin to design-by-contract which I think is beyond the scope of the type system. However, there are some libraries on PyPI that implements this but I’m not sure if they do static analysis.

I like the proposal. The implementation is simple, and type-checkers should detect the odd cases.

This is already detected as a typing error by mypy:

``````y: list[int, str] = []
``````

error: “list” expects 1 type argument, but 2 given [type-arg]

A neat idea to ease an ergonomic pain-point. I’d agree that it’s tedious to write `tuple[T, T, T, T]`. However I’d personally prefer no more “special cases” be added until more rigorous foundations are specified for the type system in general. The usage of the subscript syntax in annotations feels overloaded and inconsistent enough as it is.

5 Likes

Yeah, I totally understand that! That’s more or less the reason why I didn’t push my proposal about extracting kwargs from function signatures further. Hope this project and related debates will progress quickly and lead so exciting features!

Do you think special-casing `tuple[SomeType, Literal[int]]` could still be relevant? Or even `tuple[SomeType, TypeVar(bound=Literal[int])]`, which is AFAIK not doable today!

This is the part of the proposal that feels most consistent to me, at least as far as consistency with the existing `tuple` annotation syntax. For example:

1. `tuple[T, U, V]` - heterogenous tuple of known length
2. `tuple[T, …]` - homogenous tuple of unknown length
3. `tuple[T, 5]` - homogenous tuple of known length

(1) and (2) we already have; (3) seems like a reasonable extension of (2) and (I imagine) could be implemented as sugar for (1).