Every now and then, I work with fixed-size homogeneous collections… and find no satisfying way to annotate them.
Let’s say I have a function that extends, in place, a generalized Fibonacci sequence from its first two terms:
def compute_fibonacci_terms(seq, n_terms):
    a, b = seq
    for _ in range(n_terms):
        a, b = b, a + b
        seq.append(b)
    return seq
>>> compute_fibonacci_terms([3, 4], n_terms=5)
[3, 4, 7, 11, 18, 29, 47]
To annotate this function, the two choices I see are
def compute_fibonacci_terms(seq: list[int], n_terms: int) -> list[int]: ...
def compute_fibonacci_terms(seq: tuple[int, int], n_terms: int) -> list[int]: ...
The first solution allows passing a list with any number of items, which results in a ValueError in this (admittedly contrived) example. The second solution is even worse: it disallows passing lists (which is the whole reason this function was designed that way), and type checkers rightly report an error on seq.append.
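To make the first failure mode concrete, here is a small sketch that repeats the function with the list[int] annotation and calls it with a three-item list. The annotation is satisfied, so a type checker has nothing to report, but the unpacking fails at runtime.

```python
def compute_fibonacci_terms(seq: list[int], n_terms: int) -> list[int]:
    a, b = seq  # raises ValueError unless seq has exactly two items
    for _ in range(n_terms):
        a, b = b, a + b
        seq.append(b)
    return seq


# A three-item list matches list[int], yet the call cannot succeed.
try:
    compute_fibonacci_terms([3, 4, 5], n_terms=5)
except ValueError as exc:
    error_message = str(exc)
    print(f"ValueError: {error_message}")
```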
Even when working with tuples, the situation is not ideal when it comes to large tuples:
def get_weekday_names(locale: str) -> tuple[str, str, str, str, str, str, str]:
    ...
Here type checking works well, but the annotation is cumbersome to read and error-prone to write.
Proposal
What I’d like to propose here is to overload the __mul__ operator of type to denote fixed-size homogeneous collections.
So the above functions could be written as
def compute_fibonacci_terms(seq: list[int * 2], n_terms: int) -> list[int]:
    ...
def get_weekday_names(locale: str) -> tuple[str * 7]:
    ...
At runtime, type (as well as some typing special forms usable as type annotations) would define __mul__ to only accept an integer and return some special typing construct, similar to types.GenericAlias. Type checkers would only allow literal integers.
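A rough sketch of that runtime behaviour, under loud assumptions: type itself cannot be patched from Python code, so a metaclass stands in for it here, and the names Repeated, RepeatableMeta and Celsius are all hypothetical. Nothing below is an existing typing API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Repeated:
    """Hypothetical stand-in for the construct returned by `SomeType * n`."""

    item_type: type
    count: int


class RepeatableMeta(type):
    """Hypothetical metaclass emulating the proposed `type.__mul__`."""

    def __mul__(cls, count):
        # Only integers are accepted; anything else falls through to the
        # usual TypeError for unsupported operands.
        if not isinstance(count, int):
            return NotImplemented
        return Repeated(cls, count)


class Celsius(float, metaclass=RepeatableMeta):
    """Example user type; built-ins such as int cannot be patched this way."""


week_of_temperatures = Celsius * 7
print(week_of_temperatures.item_type.__name__, week_of_temperatures.count)
```

With the real proposal, the same operator would live on type, so int * 2 and str * 7 would work without any metaclass.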
Considerations
I believe type checkers should only allow this syntax inside generic types derived from Collection [1].
Type-checking wise, tuple[int, int, int] and tuple[int * 3] should be considered fully equivalent.
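As an illustration of what that equivalence would mean at runtime, the expanded form can already be built today (Python 3.9+) by repeating a one-element tuple of types. This is not the proposed syntax, and type checkers do not accept it as an annotation; it only shows the alias that tuple[int * 3] would presumably be equivalent to.

```python
from typing import get_args

explicit = tuple[int, int, int]
# Subscripting with a tuple of types is the same as listing them one by one,
# so repeating a one-element tuple produces an equal alias.
repeated = tuple[(int,) * 3]

print(explicit == repeated)
print(get_args(repeated))
```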
Since type and typing special constructs cannot currently be multiplied by anything, this should be fully backwards compatible. One case that could be problematic is deferred annotations written as strings, since list["LateBind" * 3] would become list["LateBindLateBindLateBind"], but I believe type checkers can just flag this as invalid (and suggest using list["LateBind * 3"] instead).
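The string pitfall is easy to reproduce with current Python, since str already defines multiplication as repetition:

```python
from typing import get_args

# The string is repeated before list[...] ever sees it.
repeated_name = get_args(list["LateBind" * 3])
print(repeated_name)

# Quoting the whole expression keeps it intact for later evaluation.
quoted_expression = get_args(list["LateBind * 3"])
print(quoted_expression)
```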
Future improvements this may open the way for...
- Typing of heterogeneous sequences (list[int * 3, str, int * 2]), even if I find that less readable and probably a much less common use case;
- Bounded iterable size: for example, itertools.tee could have an overload
  def tee[I, N: Literal[int]](iterable: Iterable[I], n: N = 2) -> tuple[I * N]: ...
  or even, to get back to my example,
  def compute_fibonacci_terms[N: Literal[int]](seq: list[int * 2], n_terms: N) -> list[int * (N + 2)]: ...
- Some way to denote “at least X” / “at most X” items?
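For comparison, the closest approximation of the tee idea that I know of today is one overload per literal count, which does not scale. The sketch below assumes a hypothetical wrapper named typed_tee; it is not how itertools.tee is actually annotated.

```python
import itertools
from collections.abc import Iterable, Iterator
from typing import Literal, TypeVar, overload

T = TypeVar("T")


@overload
def typed_tee(
    iterable: Iterable[T], n: Literal[2] = 2
) -> tuple[Iterator[T], Iterator[T]]: ...
@overload
def typed_tee(
    iterable: Iterable[T], n: Literal[3]
) -> tuple[Iterator[T], Iterator[T], Iterator[T]]: ...
@overload
def typed_tee(iterable: Iterable[T], n: int) -> tuple[Iterator[T], ...]: ...
def typed_tee(iterable, n=2):
    # Hypothetical wrapper: the runtime behaviour is exactly itertools.tee.
    return itertools.tee(iterable, n)


first, second, third = typed_tee([1, 2, 3], 3)
print(list(first), list(second), list(third))
```

Every additional size needs its own overload, which is the repetition that tuple[I * N] would remove.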
Has something along these lines been discussed before? I’m pretty sure I read a discussion about this issue somewhere here, but I couldn’t find it and I don’t think it really had a proposal.
NB: I hesitated between Typing and Ideas for this post. I preferred the latter since this has implications at runtime and does not rely on advanced typing notions, but feel free to move it!
[1] Maybe arg: int * 3 could be made sugar for arg: Collection[int * 3], but I’m not sure I’m a fan of the idea.