Recursive type annotation for a nested list of lists of lists of

I’m trying to figure out the way to write type annotations to represent a type nested[T,N] such that:

  • nested[T,0] = list[T]
  • nested[T,1] = list[list[T]]
  • nested[T,2] = list[list[list[T]]]
  • etc

Is there a good way of doing this?

I have hundreds of functions with signatures mostly being like:

def func1(p: nested[T,0]) -> T:
    ...
def func2(p: nested[T,0], q: nested[T,0]) -> nested[T,0]:
    ...
def func3(p: nested[T,N], q: nested[T,N], N: int) -> nested[T,N]:
    ...
def func4(p: nested[T,N], N: int) -> nested[T,N-1] | T:
    ...

There is no problem annotating nested[T,0] (i.e. list[T]) but the difficulty is that the functions taking nested[T,N] also need to call the nested[T,0] functions when N = 0.

It would be great but I imagine that it is too much to ask for a type checker to understand the connection between the integer parameter N and the nesting level in the types of p and q. I do want a checker to at least understand when different nesting levels are being mixed though like if p and q don’t have the same level but I have not succeeded in making that work yet. So far I have come up with this:

    dup: TypeAlias = list[T]
    dmp: TypeAlias = list["dmp[T]"]

Then I can use dup as nested[T,0] and dmp as nested[T,N] but there is no way to distinguish between nested[T,N] and nested[T,N-1]. This definition of dmp is of an infinitely nested type so there is no distinction between nesting levels and no understanding of the relationship between dmp and dup.

Here is a complete demo of the sort of code I have and what I have come up with so far for the annotations:

from __future__ import annotations
from typing import TYPE_CHECKING, TypeAlias, TypeVar

if TYPE_CHECKING:
    T = TypeVar("T")
    dup: TypeAlias = list[T]
    dmp: TypeAlias = list["dmp[T]"]

    def _dup(p: dmp[T], /) -> dup[T]: ...
    def _dmp(p: dup[T], /) -> dmp[T]: ...
else:
    def _dup(p, /): return p
    def _dmp(p, /): return p

def dup_mul(c: int, p: dup[int]) -> dup[int]:
    """Multiply a dense univariate polynomial by a constant."""
    return [a * c for a in p]

def dmp_mul(c: int, p: dmp[int], lev: int) -> dmp[int]:
    """Multiply a dense multivariate polynomial by a constant."""
    if not lev:
        return _dmp(dup_mul(c, _dup(p)))
    else:
        return [dmp_mul(c, pi, lev - 1) for pi in p]

p = [_dmp([1, 2, 3]), _dmp([2, 4, 6])]
print(dmp_mul(3, p, 1))

This type checks fine and a checker can pick up on various errors like if the elements in the list are not actually int or if you use dup_mul when you should use dmp_mul etc. However there is no checking/understanding of the nesting levels so you could do e.g. p + p[0] and a type checker would not recognise that the types are inconsistent. Also I need to call redundant do-nothing functions _dup and _dmp to “convert” between the two types at the base of recursion.

No. This in general requires dependent types, which python’s type system does not currently have.

You can maybe get a few workarounds and explicitly provide overloads for all the functions using Literal[1], Literal[2], … but there is no way to make a general solution.

1 Like

Okay thanks for confirming.

I had hoped that maybe this recursive case might be something that a type checker could handle. It is a simpler version of typing the shape of a numpy array so I thought maybe TypeVarTuple could do something here.

In context these functions are hardly ever called with explicit values so recognising that e.g. N = 1 means list[list[int]] doesn’t really help anything. Rather the situation is that we have been called with nested[T,N] but the local code will need to operate with nested[T,N-1] and probably returns a nested[T,N] but sometimes returns a nested[T,N-1] or a nested[T,N+1].

There are basically two cases that it would be helpful if a type checker could understand:

  • nested[T,N] is list[T] when the lev parameter is 0 because the N in the type is connected to the parameter value. I assume that this is what requires dependent types and is therefore not possible as you say.
  • nested[T,N] is not the same as nested[T,N-1]. This is what I hoped I could do and wondered if TypeVarTuple could work somehow.

If we can distinguish nested[T,N] from nested[T,N-1] and nested[T,N+1] then that handles almost all cases where the nesting can get mixed up.

I tried this which seems to work a little bit with pyright:

type DMP[T, L, *L1] = list[DMP[T, *L1]]

def func[T,L,*L1](p: DMP[T,L,*L1]) -> DMP[T,L,*L1]:
    if 3:
        return [p[0]] # checks fine
    elif 5:
        return p[0] # check error
    else:
        return [func(p[0])] # checks fine

That looks horrible though so if there someway I can wrap it up to make it look nice that would be better. Also mypy doesn’t like it:

$ mypy demo3.py 
demo3.py:1: error: TypeVarTuple cannot be split  [type-arg]
Found 1 error in 1 file (checked 1 source file)

Another workaround is to make a type alias that’s the union of all of the types you want and then provide explicit overloads for however many cases you want to explicitly handle with a fallback using the type alias. E.g. something like this

type NestedList[T] = list[T] | list[NestedList[T]]

@overload
def something[T](p: list[T]) -> T: ...
@overload
def something[T](p: list[list[T]]) -> list[T]: ...
@overload
def something[T](p: list[list[list[T]]]) -> list[list[T]]: ...
@overload
def something[T](p: list[list[list[list[T]]]]) -> list[list[list[T]]]: ...
@overload
def something[T](p: NestedList[T]) -> NestedList[T]: ...

Though, if you tend to use the general version of that type and rarely have a concrete level in a type annotation then this won’t help you very much since it’ll all just collapse to the alias.

2 Likes

Yes, that’s the situation. There are hundreds of these functions and for the most part they are just called by each other. Even the higher-level interfaces that expose them for use elsewhere all go via classes that don’t have concrete levels either.

I guess I’ll just stick with:

    dup: TypeAlias = list[T]
    dmp: TypeAlias = list["dmp[T]"]

There are a lot of benefits from annotations and checking here just by keeping track of the T regardless of the nesting.