RepeatableIterable type

Hello,

I published a package on PyPI with source code on GitHub:

The goal is to avoid silent bugs, such as those that occur when you iterate more than once over an iterator.
See also:

Currently, only the get_repeatable_iterable() function in my package is pleasant to use and actually prevents the bug.
But RepeatableIterable suffers from the fact that it “cannot be subscripted”.
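To make the bug concrete, here is a minimal sketch that is not taken from the package. The function name flatten_twice is hypothetical; it only illustrates how a second loop over a one-shot iterator silently does nothing.

```python
from typing import Iterable, List


def flatten_twice(chunks: Iterable[List[int]]) -> List[int]:
    """Iterate over `chunks` two times, extending the result each time."""
    result: List[int] = []
    for chunk in chunks:
        result.extend(chunk)
    # If `chunks` is a one-shot iterator, it is already exhausted here,
    # so this second loop runs zero times and no error is raised.
    for chunk in chunks:
        result.extend(chunk)
    return result


from_list = flatten_twice([[1], [2]])
from_generator = flatten_twice(chunk for chunk in [[1], [2]])

print(from_list)       # [1, 2, 1, 2]
print(from_generator)  # [1, 2]
```

Both calls type-check against Iterable, which is why the problem is easy to miss.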

I tried to improve it, see this script: ri_vs_mypy.py


from typing import Union, List, TypeVar
from python_repeatable_iterable import *

T = TypeVar("T")

RI = Union[
    list[T],
    tuple[T],
    range,
    str,
    bytes,
    bytearray,
    memoryview,
    set[T],
    frozenset[T],
    dict[T=Tuple[U,V]],
    dict_keys,
    dict_values,
    dict_items,
]

def foo(x: RI[List]) -> List:
    result = []
    for y in x:
        result.extend(y)
    for y in x:
        result.extend(y)
    return result

a = [[],[]]
print(foo(a))

I’m using Python 3.11 so I do not have access to type RI = syntax yet.
Moreover, I would like to do some fancier things,
like RI[T] = Union[..., dict[T=Tuple[U,V]],...],
a kind of type destructuring.
This is still work in progress but I would be very happy to receive constructive feedback to improve my package :).

Thanks for reading, best regards,
Laurent Lyaudet

I also noticed that:

  • dict_keys,
  • dict_values,
  • dict_items

are not Generic types and I cannot use dict_keys[T] for example.
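One possible workaround, sketched here under the assumption that the abstract view types are acceptable in annotations, is to use KeysView, ValuesView and ItemsView from collections.abc, which can be subscripted. The function keys_twice is hypothetical. Note that these ABCs also match views of other mappings and of subclasses, so they do not give an exact-type guarantee.

```python
from collections.abc import ItemsView, KeysView, ValuesView
from typing import TypeVar

K = TypeVar("K")


def keys_twice(keys: KeysView[K]) -> list[K]:
    """Iterate over a keys view two times; views can be re-iterated."""
    return [*keys, *keys]


prices = {"a": 1, "b": 2}

# The concrete view objects are instances of the generic ABCs.
keys_ok = isinstance(prices.keys(), KeysView)
values_ok = isinstance(prices.values(), ValuesView)
items_ok = isinstance(prices.items(), ItemsView)

doubled = keys_twice(prices.keys())
print(doubled)  # ['a', 'b', 'a', 'b']
```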

I’m pretty sure Collection[T] would cover that entire union; it’s what I usually use for this exact case. collections.abc.Collection is a Sized Iterable Container.


And here’s the proof: mypy Playground
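For readers without access to the link, the following is a small sketch of the same idea; it is not the contents of the linked playground. It annotates the foo function from the first post with Collection and shows at runtime that a generator is not a Collection.

```python
from collections.abc import Collection
from typing import TypeVar

T = TypeVar("T")


def foo(x: Collection[list[T]]) -> list[T]:
    """Same double loop as in the original script, typed with Collection."""
    result: list[T] = []
    for y in x:
        result.extend(y)
    for y in x:
        result.extend(y)
    return result


# A list has __len__, __iter__ and __contains__, so it is a Collection.
list_is_collection = isinstance([[1], [2]], Collection)
# A generator has neither __len__ nor __contains__, so it is not.
generator_is_collection = isinstance((y for y in [[1], [2]]), Collection)

print(foo([[1], [2]]))  # [1, 2, 1, 2]
```

A static checker should likewise reject passing a generator where Collection is expected.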

Thanks. Unfortunately Collection does not exclude some subtypes that are not RepeatableIterable. For example, you can look at my source code here:

The class MySet(set) is a Collection, but it is not a RepeatableIterable.
That is one of my big problems also.
I don’t even know if mypy can type-check with type(some_object) is some_class instead of isinstance(some_object, some_class).
Right now my function to obtain a repeatable iterable works, but I don’t know if there is a truly nice way to express that in the type system.
I’m using get_repeatable_iterable() as a safeguard inside other functions that take an Iterable argument; for now I’m keeping Iterable instead of RepeatableIterable in the type hints.
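The difference between the two checks can be sketched as follows. DrainingSet and ensure_repeatable are hypothetical names invented for this illustration; they are not the MySet class or the get_repeatable_iterable() function from the package, whose source is not reproduced here.

```python
from collections.abc import Collection, Iterable


class DrainingSet(set):
    """Hypothetical set subclass whose iteration empties the set."""

    def __iter__(self):
        while self:
            yield self.pop()


def ensure_repeatable(iterable: Iterable) -> Iterable:
    """Hypothetical safeguard: trust exact builtin types, copy the rest."""
    if type(iterable) in (list, tuple, set, frozenset):
        return iterable
    return list(iterable)


draining = DrainingSet({1, 2, 3})

# isinstance() accepts the subclass, an exact type check does not.
passes_isinstance = isinstance(draining, Collection)
passes_exact_type = type(draining) is set

# The safeguard copies the unknown subclass into a plain list once,
# so both later passes see the same items.
safe = ensure_repeatable(draining)
first_pass = sorted(safe)
second_pass = sorted(safe)
```

The exact type check is what makes the safeguard fall back to a copy for the subclass.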

That’s fair enough. Note, however, that NewType does not actually work with a Protocol (or at least it is not supposed to), and Iterable is a Protocol. You would therefore have to manually create a nominal type that implements just __iter__ and use that as your marker type.

i.e. something like the following seems more sensible:

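from typing import Iterable, Iterator, Type, TypeVar, cast

T = TypeVar("T")

# The dict view classes are not importable builtins; derive them from instances.
dict_keys = type({}.keys())
dict_values = type({}.values())
dict_items = type({}.items())

class RepeatableIterable(Iterable[T]):
    def __iter__(self) -> Iterator[T]:
        # instances of RepeatableIterable don't actually exist
        return NotImplemented

    def __new__(
        cls,
        iterable: Iterable[T],
        safe_classes: Iterable[Type] = (),
    ) -> 'RepeatableIterable[T]':
        """
        Here is an implementation avoiding the previous problem.
        """
        # Exact type match on purpose: subclasses are not trusted.
        iterable_type = type(iterable)
        for some_class in (
            list,
            tuple,
            range,
            str,
            bytes,
            bytearray,
            memoryview,
            set,
            frozenset,
            dict,
            dict_keys,
            dict_values,
            dict_items,
            *safe_classes,
        ):
            if iterable_type is some_class:
                # Known-safe type: return the object itself, only retyped.
                return cast('RepeatableIterable', iterable)
        # Anything else is copied into a list, which can be re-iterated.
        return cast('RepeatableIterable', list(iterable))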

That said, I don’t really think this is in the spirit of duck typing. Restricting other people’s collections just because they might do something weird does not seem very Pythonic. The safe_classes parameter does allow passing in custom types to address this, but that only really helps if you force people to create the RepeatableIterable themselves.


I feel like this is a misguided attempt at solving a problem that doesn’t really exist. Yes, Iterable is too loose of a type hint. Collection, however, should cover everything I can imagine being useful in this situation. If someone creates a Collection class that isn’t repeatably iterable, they have IMO broken the contract of the ABC. Because it’s Python, this can never be enforced anyway, and adding arbitrary restrictions on which types are acceptable is not going to solve anything. If you really want to help with this, create a wrapper that tracks how many items were yielded the first time and checks that all future iterations yield the same number of items. Or just make a copy of the input iterable.
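A minimal sketch of the wrapper idea might look like the following. The class name CountCheckedIterable and the choice of RuntimeError are assumptions made for this illustration. The check only runs when a pass is consumed to the end, so a loop that breaks early is not verified.

```python
from typing import Generic, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


class CountCheckedIterable(Generic[T]):
    """Hypothetical wrapper: every full pass must yield the same item count."""

    def __init__(self, iterable: Iterable[T]) -> None:
        self._iterable = iterable
        self._expected: Optional[int] = None

    def __iter__(self) -> Iterator[T]:
        count = 0
        for item in self._iterable:
            count += 1
            yield item
        if self._expected is None:
            # First complete pass: remember how many items were seen.
            self._expected = count
        elif count != self._expected:
            raise RuntimeError(
                f"expected {self._expected} items, got {count}"
            )


checked_list = CountCheckedIterable([1, 2, 3])
list_passes = (list(checked_list), list(checked_list))

checked_gen = CountCheckedIterable(x for x in [1, 2, 3])
gen_first_pass = list(checked_gen)
try:
    list(checked_gen)  # the generator is exhausted, so 0 items != 3
    second_pass_failed = False
except RuntimeError:
    second_pass_failed = True
```

With a wrapper like this, an exhausted iterator fails loudly on the second pass instead of silently yielding nothing.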

Out of curiosity, what makes sets problematic?

set isn’t problematic. What he meant is that it’s possible to create a subclass of any of these types that does something weird inside __iter__, like removing elements, or that does something different on the first iteration, like yielding an extra imaginary item. So it’s mostly a theoretical issue, not a practical one.


Thank you very much :).
It took me a little time to type everything correctly and add test code for the typing.
I have now released version 2.0.0 of python-repeatable-iterable and listed you in CONTRIBUTORS.md.

Well, I had the problem in production, so the problem exists :).
I understand your point of view that it is not Pythonic,
but the same argument could be used to reject all typing as not Pythonic ;) XD.
At least I force nobody to use my package :).