Make itertools.product object support the Sequence interface

The objects produced by itertools.product() are iterable but do not implement many other Collections methods. However, since the iterables that the product is run over are stored as tuples in self.pools it is relatively straightforward to determine in advance the length of this iterable. Similarly the nth item in this iterable can be efficiently determined without having to pass through the first n terms and a tuple is contained in this sequence if and only if each of it’s terms is contained in the iterables that this product is run over. For example:

class ProductObject:
    # Note self.pools is a list of tuples.
    ...
    def __len__(self):
        return math.prod([len(pool) for pool in self.pools])
    def __contains__(self, item):
        # We should also check that item is a tuple of length len(self.pools)
        return all(x in pool for x, pool in zip(item, self.pools))
    def __getitem__(self, index):
        result = []
        for i in range(len(self.pools)):
            size = math.prod([len(pool) for pool in pools[i+1:]])
            result.append(self.pools[i][index // size])
            index -= size * (index // size)
        return tuple(result)

This would allow the following to be done:

>>> X = product(range(5), 'abc')
>>> len(X)
15
>>> X[5]
(1, 'c')
>>> (7, 'x') in X
False

Of course this should be implemented at the C level for performance, but by doing this and adding:

  • __contains__
  • __len__
  • __getindex__

this would make itertools.product objects instances of collections.abc.Sequence.

Note, it is also relatively straightforward to implement __reversed__ so that we can also do:

>>> reversed(X)[1] == X[-2]
True

I personally agree but people have been asking for this for many years so it’s not happening any time soon.

See Issue 24849: Add __len__ to map, everything in itertools - Python tracker

Thank you for letting me know about Issue 24849. I think I agree with a lot of the statements raised in that discussion, in particular:

No, you may not iterate the iterator in order to compute the len, because then the iterator would be exhausted.

However while this is definitely a problem for the general case of trying to add a __len__ method to all of the itertools functions, I think that this is not a problem for this specific case of itertools.product since it must ensure that its iterables are reentrant and it does this by converting them to tuples. So there may be some merit in considering implementing these methods for just the “combinatorial” itertools functions (e.g. not itertools.takewhile).

This proposal repeats the mistake of the one in Issue 24849, that of confusing, or trying to mash together, non-iterator iterables, with a non-trivial __iter__ function, such as range, and iterators, with trivial def __iter__(self): return self and non-trivial __next__. The itertools module comprises callables that specifically return iterators, not iterables, and that is extremely unlikely to change.

If one wants a corresponding iterable class, write it, and call the itertools’ function in the iterable’s __iter__. Write the specific rules and methods one wants. Mark, your ProductObject (just call it Product) would be an example if completed with __init__ and __iter__. Of course, the input rule could be loosened to accept other sequences with __len__, __contains__, and __getitem__. Either test on input or let exceptions happen, as you wish. There is no need to smash the two classes together.

We (mostly Raymond) have provided tools that do the hard part of fast and accurate iteration. Use them to produce the specific class one wants.

A bit more on how product and Product are different: a Product based on fixed sequences is fixed. There is a 1-to-1 mapping between Product tuples and indexes. Indeed, __contains__ could be replaced or implemented by an index method that is the inverse of aproduct[n]. Iterators, on the other hand, represent mutable sequences with a real or fake order. Repeated next(it) is essentially equivalent to repeated alist.pop(0), except for the terminating exception.