Allow itertools.islice to support negative start and stop values

itertools.islice currently raises "ValueError: Indices for islice() must be None or an integer: 0 <= x <= sys.maxsize." when its start or stop argument is negative. However, this restriction could be lifted.

For example, by having a queue of items we can see n items into the future. This would allow us to determine whether we are within n items of the end of the iterable. Obviously this comes at the cost of having to queue up n items and so is impractical when stop=-1_000_000, for example.

One prototype for this is below. It omits the start and step arguments, but these can be added:

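import collections
import itertools

def islice(iterable, stop):
    if stop >= 0:
        for index, item in enumerate(iterable):
            if index >= stop:
                break
            yield item
    else:  # stop < 0
        iterable = iter(iterable)
        # Buffer the first abs(stop) items (fewer if the iterable is shorter).
        queue = collections.deque(itertools.islice(iterable, -stop))

        # Each new item proves that the head of the queue is not among the
        # last abs(stop) items, so the head can be yielded.
        for item in iterable:
            queue.append(item)
            yield queue.popleft()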

Iterators don’t necessarily have an end. It is unsafe to assume that
every iterator has a known or predictable end, or indeed an end at all.

What would islice(itertools.count(1), -3) do?

So islice(iterable, -3) would return an iterable that stops three items before iterable stops.

This means that islice(itertools.count(1), -3) would return an iterable that behaves just like itertools.count(1), since this iterable never terminates.

How does it actually do that, though? An iterator itself doesn’t provide any indication of whether it ends or not.

That is true. However, you don’t need to know whether the iterator stops or not; you only need to know whether the iterator stops in the next n steps.

The example block in the first post shows one way to do this. Fill a queue with the first n items of the iterator. Each time the iterator produces another item, the item at the head of the queue is known to be at least n items from the end, so append the new item, then pop and yield the head. When the iterator is exhausted, the n items left in the queue are the last n, so you stop without yielding them.

Hence the example in the first post has this behaviour and islice(itertools.count(1), -3) returns an iterable that yields the numbers 1, 2, 3, ...
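To make that behaviour concrete, here is a minimal sketch of the queue approach applied to both an infinite and a finite iterable. The name drop_last is hypothetical and only handles a negative stop written as a positive count n; it is not part of itertools.

```python
import collections
import itertools

def drop_last(iterable, n):
    """Hypothetical helper: yield every item except the last n (n > 0)."""
    iterator = iter(iterable)
    # Look-ahead buffer holding up to n items.
    queue = collections.deque(itertools.islice(iterator, n))
    for item in iterator:
        # A new item arrived, so the head is at least n items from the end.
        queue.append(item)
        yield queue.popleft()

# Infinite source: the result is also infinite, so take only the first five.
first_five = list(itertools.islice(drop_last(itertools.count(1), 3), 5))
print(first_five)  # [1, 2, 3, 4, 5]

# Finite source: the last three items are withheld.
print(list(drop_last(range(10), 3)))  # [0, 1, 2, 3, 4, 5, 6]

# Source shorter than n: nothing is yielded.
print(list(drop_last("ab", 3)))  # []
```

With an infinite source the look-ahead never runs dry, so the output simply lags the input by n items.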

This is islice_extended from the more_itertools project. I recommend looking in more_itertools any time itertools is not enough :)

As you can see from more_itertools docs, there are many iteration-related functions that can be helpful. But most of them aren’t universally useful enough to be in the standard library.

Mark suggested:

“So islice(iterable, -3) would return an iterable that stops three
items before iterable stops.”

Right. How is islice supposed to know where the iterable stops, without
losing the benefit of iterators, which is that values are computed
lazily, on demand?

Your idea of a queue is just an arbitrarily large look-ahead function
with a cache, and caches are problematic because they become stale. When
you look ahead into an iterator, the value you get out of it now may
not be the same value you would have got if you waited until you needed
the value.

Then there are the iterators that have side-effects.

Your queue needs to do an arbitrarily large amount of work ahead of time
just to get the first value out of the iterator.
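A small sketch can illustrate the look-ahead cost and the side-effect concern. It assumes a queue-based helper like the one proposed in the first post (named drop_last here, a hypothetical name) and a source that records each value it is asked to produce.

```python
import collections
import itertools

def drop_last(iterable, n):
    """Hypothetical helper: yield every item except the last n (n > 0)."""
    iterator = iter(iterable)
    queue = collections.deque(itertools.islice(iterator, n))
    for item in iterator:
        queue.append(item)
        yield queue.popleft()

produced = []  # every value the source has been asked for so far

def noisy_source():
    """Source with a visible side effect each time a value is produced."""
    for value in range(10):
        produced.append(value)
        yield value

lazy = drop_last(noisy_source(), 3)
first = next(lazy)

# One value came out, but n + 1 = 4 values were pulled from the source.
print(first)     # 0
print(produced)  # [0, 1, 2, 3]
```

The consumer has seen only one value, yet the source has already run its side effects for four, which is the eager work described above.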

Your queue idea might suit your purposes, in which case you should just
write yourself a little utility function and use that. But as a general
purpose iterator tool suitable for the itertools module, I think that
the disadvantages outweigh the advantages.

If you search the Python-Ideas mailing list, I believe that you will
find that this has been suggested, and rejected, before.

Something similar was proposed before (by me, 14 years ago):

https://mail.python.org/archives/list/python-dev@python.org/thread/C6S6SL5D5QNOGR3FITMJYZXYOTRWV6D3/#C6S6SL5D5QNOGR3FITMJYZXYOTRWV6D3

It seems that Roundup links without “issue” no longer work, so here is the link to the patch:

https://bugs.python.org/issue1749857