Improve support for infinite and recursive types

Improve support for infinite and recursive types

There seem to be two characteristics (among several) we can use to categorise types:

  1. By their cardinality:
    1. finite
    2. infinite
  2. By their “referential-ness”:
    1. Self-referential
    2. Non self-referential.

I would like to propose some support for correctly typing these objects, and possibly also inspecting/querying these properties on instantiated objects.

Most of what follows naturally applies to iterables and containers, so I’ll limit my discussion to these for convenience.

To ease discussion, let’s suppose the following helper functions exist:

def is_finite(something) -> bool: ...

def is_self_referential(something) -> bool: ...

Some illustrative examples

  • A finite list
x = [1,2,3]

assert is_finite(x)
  • An infinite iterable, such as the following generator
def all_integers() -> Generator[int]:
  i = 1
  while (i := i + 1):
    yield i

x = all_integers()

assert not is_finite(x) 

Note: We would need to determine this without iterating through the container or relying on a maximum recursion exception being raised.

  • A self referential list
x = x[0] = [None]  # x = [[...]]

assert is_self_referential(x)
  • A non self referential list
x = [1, 2, 3]

assert not is_self_referential(x)

Some motivating examples

These sort of structures do arise naturally. Consider trying to type hint something a bit like a JSON

type JSONLike = dict[str: int | JSONLike]

Python (and tools like MyPy) support recursive types (cf. link and link).

Similarly, there is support in Python for nicely handling self referential data structures, especially in the repr method (link).

Additionally, anyone who has written a flatten function (example from NumPy) is at risk of encountering a self referential list.

Possible solutions

Some form of annotation or a predicate could be ideal for assessing the cardinality:

from typing import Finite, Infinite, Annotated

# Using Finite as an annotation
type Finite[JSONLike] = dict[str: int | Finite[JSONLike]]

def all_integers() -> Infinite[Generator, int]: ...


# Using Finite as a predicate
type Annotated[JSONLike, Finite] = dict[str: int | Annotated[JSONLike, Finite]]

def all_integers() -> Annotated[Generator[int], Infinite]: ...


# Using is_finite as a predicate
from annotated_types import Predicate

JsonLike = Annotated[JSONLike, Predicate(is_finite)]

I could imagine very similar functionality for the self referential types.

Not all infinite objects are self referential, and hence the two are distinct. Consider for example:

def integer_sequences() -> Generator[Generator[int]]:
  def all_integers(i) -> Generator[int]:
    k = i
    while (k := k + 1):
      yield k
  yield (all_integers(i=j) for j in all_integers(1))

all_integer_sequences = integer_sequences()
# all_integer_sequences -> [[1,2,3,...], [2,3,4,...], [3,4,5,...], ...]

In this example, all_integer_sequences is infinite, but it is not self referential.

Similarly, we can construct a self referential object which is finite, consider

"""
linked_list = 
a -> b -> c
     ^    |
     |    V
     e <- d
"""
linked_list: SelfReferential[Finite[LinkedList]] = ...
linked_list: Annotated[LinkedList, Finite, SelfReferential] = ...

assert len(linked_list) == 5
assert is_finite(linked_list)
assert is_self_referential(linked_list)

Benefits of possible solutions

As mentioned with the JSON example, this allows us to represent the types “more correctly” and closer to the intention of the developer without having to worry about all the corner cases of Python.

Likewise, it can enable us to detect dead code, infinite loops, etc.

It enables us to detect and raise proper exceptions. With the flatten example, if we had access to the is_finite method, we could easily raise some appropriate exception to the caller rather than getting caught in an infinite recursion.

Which is best?

Based on the examples given, I think there is utility in both the type annotations and also the run-time methods.

I appreciate there are lots of corner cases, especially surrounding efficient runtime checking of these properties, or statically analysing the consequences of some of these properties, but nonetheless I think there is utility is the added expressiveness this offers for expressing the intent and the desired constraints on the types.

1 Like

Looking at the rest of the post, it seems these are properties of the objects themselves? I also don’t think typing “referential-ness” is not really possible on mutable objects, and at the same time strict immutable objects usually don’t have self reference unless they were mutable at a time and then frozen later.

For the finite/infinite: determining whether a generator will generate items forever or for a finite amount of time is undecidable. I don’t know how far typecheckers are willing to go to implement those checks or if developers will annotate their generators with the correct annotation.

1 Like

The run time finiteness checks are non-trivial to implement. Isn’t proving any arbitrary iterator is finite, equivalent to solving the halting problem?

With the annotations, is mypy or Python expected to do anything special? Or is this simply to add a check to avoid passing nominally Infinite iterators into something that only supports Finite iterables? Likewise SelfReferential types into something that only supports NonSelfReferential types.

Personally I don’t see much value in this. But one never knows when something will be useful.

I’m confused. First you assert that all_integers is NOT finite. Then later you annotate it as Finite. Is that a typo?

Separately, am I right that you are really looking mostly for a runtime check? Since the type system already supports these, and Annotated is for extra runtime properties.

Is that a typo?

Good catch about the typo, all_integers was meant to be Infinite (not Finite), I have edited the original post to correct the typo there.

am I right that you are really looking mostly for a runtime check?

No. It would be nice to have a runtime checker where one might feasibly exist, but I don’t think that’s where the emphasis is. The foremost tool I was after was something much closer to the type annotation, where I can express the cardinality of the object in the type as a static property I expect/promise in a function contract. (Whether such a property is actually checkable is less clear).

I would like some mechanism where I can say “these infinitely recursive corner cases” are out of contract, and it strikes me that a type annotation would be ideal for expressing this intent.

The type system seems to support these recursive style type annotations (the “self-referential” property as I called it), but I don’t think it captures the finite aspect of the data structure. The two are related but distinct properties as far as I can tell.

I should highlight, I would like some facility where I can express the constraint on the type. Whether the constraint is easily checkable, or even checkable at all, is a “nice to have” bonus by comparison. Ultimately it might have to be a “take it on trust” property of the type, and it’s not to say that a type checker can exploit all the consequences of this, but there may be some aspects of this that it would/could exploit, (e.g. reachability comes to mind).

It sounds as if in the case of recursive types you want to be able to say “objects of type won’t contain cycles”. But you’re okay if that’s not actually checkable. Checking it may require an arbitrary amount of memory, which makes it unattractive to have a runtime check. You may also only care about cycles through certain attributes or operations and not through others – if an object x has fields f and g (of possibly different types), and you only traverse it through x.g, you won’t care if x.f points to something containing a cycle.

For “infinite” objects (mostly never-ending iterators I imagine) the situation is even worse – there, checking if it’s finite is not only expensive, it’s sheer impossible without destroying the iterator’s state.

Given this I see little hope to get the feature you appear to want. From the responses it already looks like there’s little appetite for something checked by type checkers. If you’re looking to use Annotated and write a checker yourself, you don’t need Python to change – you can just do that as a 3rd party package.

2 Likes

I don’t see this as appropriate for the type system. There’s a difference between a type being recursive and a value of that type containing cycles of values within its structure. The latter appears to describe the case you are trying to avoid, but this isn’t something modelable by type.

2 Likes

Sometimes potential self-reference and potential infinite loops are bugs, sometimes they aren’t. It depends on the application. A Finite or Infinitetype hint won’t highlight a bug the way assigning an int to a str variable does. I’m not sure I see a significant benefit, though I can foresee an overzealous dev insist on adding Finite and Infinite to every collection in a code base.