Backward enumerator

Currently python provides enumerate(Iterable) for forward enumeration. The API returns a generator of tuples of index and item.

However, there lacks a builtin function to enumerate from the back of a Sequence (Iterable plus len and getitem). Currently the most popular solution on StackOverflow is reversed(list(enumurate(...))) (reference to this post). This consumes the entire generator and stores them as a list before it can be reversed.

I propose providing an additional builtin function enumerate_back or add an optional flag enumerate(..., reversed=True) to support reversed enumeration of Sequence objects.

P.S. For the flag version, the input type should be narrowed from Iterable[T] to Sequence[T].

Here is a demo python implementation of enumerate_back:

from typing import TypeVar, Sequence, Generator, Any

T = TypeVar("T")


def enumerate_back(seq: Sequence[T]) -> Generator[tuple[int, T], Any, None]:

    if not hasattr(seq, "__len__") or not hasattr(seq, "__getitem__"):
        raise TypeError(f"{seq} does not support sequencing")

    indexes = range(len(seq) - 1, -1, -1)
    items = reversed(seq)
    yield from zip(indexes, items)

Given that __getitem__() may incur reading from file or even a remote fetch, the proposed API will be more memory efficient than the solution from the aforementioned SO post which consumes everything before reversing.


My own use case (ROS2)
import rclpy
from rclpy.node import Node
from collections import deque

class TaskScheduler(Node):
    def __init__(self, maxlen: int = 10):
        self.task_queue = deque[tuple[float, callable]](maxlen=maxlen)

    # Irrelevant code omitted ...

    def schedule_task(self, ddl: float, task: callable):
        while len(self.task_queue) >= self.task_queue.maxlen:
            # No timeout (wait forever) - we need at least one task to be executed
            rclpy.spin_once(self)

        # Insert the task into task queue in ascending order
        # Searching from back because the new task is likely to be the last
        for idx, (ts, _) in enumerate_back(self.task_queue):
            if ts < ddl:
                self.task_queue.insert(idx + 1, (ddl, task))
                break
        else:
            self.task_queue.appendleft((ddl, task))

EDIT: inspired by an earlier GitHub issue shared by @Stefan2 (thanks!), I came up with another alternative approach of implementing this feature. IMO this will be the most ideal one among all proposed solutions:

from typing import Iterable
import builtins

class enumerate(builtins.enumerate):

    # EDIT2: make `isinstance(..., collections.abc.Reversible)` happy.
    def __new__(cls, obj, **kwargs):
        if hasattr(obj, "__len__") and hasattr(obj, "__reversed__"):
            return cls.reversible(obj, **kwargs)
        else:
            return builtins.enumerate(obj, **kwargs)

    class reversible(builtins.enumerate):
        def __init__(self, iterable: Iterable, start: int=0):
            self.iterable = iterable
            self.start = start

        def __reversed__(self):
            indexes = range(len(self.iterable) - 1, self.start - 1, -1)
            items = reversed(self.iterable)
            return zip(indexes, items)

Usage:

# Works on all objects that supports len() and reversed()
reversed(enumerate(range(10)))
# Throws TypeError for objects that do not support these methods
# (e.g. a Generator)
reversed(enumerate((i for i in range(10)))) # TypeError
2 Likes

Why not ((len(seq) - i, e) for i, e in enumerate(reversed(seq)))?

edit: Only took me three tries :joy:

1 Like

Because the actual index needs to be computed in each iteration :joy:

That’s true for enumerate too?

Well it should incur fewer numeric operation (for each iteration).

Sure, a tiny bit. Does that micro-optimization justify adding this code, though? I’m skeptical.

Oh, another option to avoid this:

zip(itertools.count(), reversed(seq))

Is that better?

I am running timeit tests for them. The point is to provide a standardized way of doing this so you do not have to hand write it every time and suspect its correctness when debugging.

The same claim that you made applies to enumerate() itself - everyone can implement their own in a few lines of python. But yet it exists and has been used widely.

1 Like

Because all indices are 1 too high.

1 Like

This is equivalent to my demo implementation after a tiny modification, but more efficient (maybe due to lack of yield from and lack of type checking?)

zip(itertools.count(len(seq) - 1, -1), reversed(seq))

Test Code:

def back_enumerate(seq):
    ... # See above

def comparison1(seq):
    return ((len(seq) - i, e) for i, e in enumerate(reversed(seq)))

def comparison2(seq):
    from itertools import count
    return zip(count(len(seq) - 1, -1), reversed(seq))

from timeit import timeit

seq = range(10000)

print("back_enumerate", timeit('list(back_enumerate(seq))', globals=globals(), number=1000))
print("comparison1   ", timeit('list(comparison1(seq))', globals=globals(), number=1000))
print("comparison2   ", timeit('list(comparison2(seq))', globals=globals(), number=1000))

Timeit results:

~/playground $ python3 back_enumerate.py
back_enumerate 0.557957124983659
comparison1    0.8546551250037737
comparison2    0.44767408299958333 # Best
1 Like

Yeah, that needlessly wraps the zip iterator in a generator, slowing it down. It’s an anti-pattern.

1 Like

I’ve modified the original implementation and added the SO solution for comparison. My modified original implementation now wins constantly.

Test code
from typing import TypeVar, Sequence

T = TypeVar("T")

def back_enumerate(seq: Sequence[T]):

    if not hasattr(seq, "__len__") or not hasattr(seq, "__getitem__"):
        raise TypeError(f"{seq} does not support sequencing")

    indexes = range(len(seq) - 1, -1, -1)
    items = reversed(seq)
    return zip(indexes, items)

def no_type_check(seq: Sequence[T]):
    indexes = range(len(seq) - 1, -1, -1)
    items = reversed(seq)
    return zip(indexes, items)

def stack_overflow(seq: Sequence[T]):
    return reversed(list(enumerate(seq)))

def comparison1(seq: Sequence[T]):
    return ((len(seq) - i, e) for i, e in enumerate(reversed(seq)))

def comparison2(seq: Sequence[T]):
    from itertools import count
    return zip(count(len(seq) - 1, -1), reversed(seq))

from timeit import timeit

seq = range(10000)

def test(stmt: str):
    t = timeit(stmt, globals=globals(), number=10000)
    return f"{t:.8f}"

print("back_enumerate", test('list(back_enumerate(seq))'))
print("no_type_check ", test('list(no_type_check(seq))'))
print("stack_overflow", test('list(stack_overflow(seq))'))
print("comparison1   ", test('list(comparison1(seq))'))
print("comparison2   ", test('list(comparison2(seq))'))

timeit Results:

back_enumerate 4.11149596
no_type_check  4.09617529 # back_enumerate without type checking
stack_overflow 4.35428938
comparison1    8.54001383
comparison2    4.47707404

Also note this:

Could you please include the needed imports?

I don’t think this is needed often enough to justify adding to the stdlib. It might be welcome in the more_itertools library, though, or maybe as a recipe in the itertools docs.

3 Likes

Sry, fixed.

Interesting. How about exposing enumerate().__reversed__() when the object being enumerated supports __len__ and __reversed__?

1 Like

Coming back with a demo implementation (I’ll update my OP and add this one):

from typing import Iterable
import builtins

class enumerate(builtins.enumerate):
    def __init__(self, iterable: Iterable, start: int=0):
        self.iterable = iterable
        self.start = start

    def __reversed__(self):
        indexes = range(len(self.iterable) - 1, self.start - 1, -1)
        items = reversed(self.iterable) # Will throw TypeError if not supported
        return zip(indexes, items)

Usage:

reversed(enumerate(range(100)))
Reproducible Test Code
from typing import Iterable, Sequence
import builtins

def back_enumerate(seq: Sequence):

    if not hasattr(seq, "__len__") or not hasattr(seq, "__getitem__"):
        raise TypeError(f"{seq} does not support sequencing")

    indexes = range(len(seq) - 1, -1, -1)
    items = reversed(seq)
    return zip(indexes, items)

def no_type_check(seq: Sequence):
    indexes = range(len(seq) - 1, -1, -1)
    items = reversed(seq)
    return zip(indexes, items)

def stack_overflow(seq: Sequence):
    return reversed(list(enumerate(seq)))

def comparison1(seq: Sequence):
    return ((len(seq) - i, e) for i, e in enumerate(reversed(seq)))

def comparison2(seq: Sequence):
    from itertools import count
    return zip(count(len(seq) - 1, -1), reversed(seq))

class enumerate(builtins.enumerate):
    def __init__(self, iterable: Iterable, start: int=0):
        self.iterable = iterable
        self.start = start

    def __reversed__(self):
        indexes = range(len(self.iterable) - 1, self.start - 1, -1)
        items = reversed(self.iterable) # Will throw TypeError if not supported
        return zip(indexes, items)

from timeit import timeit

seq = range(10000)

def test(stmt: str):
    return timeit(stmt, globals=globals(), number=10000)

results = [
    ("back_enumerate", test('list(back_enumerate(seq))')),
    ("no_type_check ", test('list(no_type_check(seq))')),
    ("enumerate cls ", test('list(reversed(enumerate(seq)))')),
    ("stack_overflow", test('list(stack_overflow(seq))')),
    ("comparison1   ", test('list(comparison1(seq))')),
    ("comparison2   ", test('list(comparison2(seq))')),
]

baseline = max(tuple(zip(*results))[1])

for name, time_cost in results:
    pct = 100.0 * time_cost / baseline
    print(name, f"{time_cost:.8f}", f"({pct:.2f}% baseline)")

timeit Test Results:

back_enumerate 4.07600862 (47.78% baseline)
no_type_check  4.09555842 (48.01% baseline)
enumerate cls  4.14349108 (48.57% baseline)
stack_overflow 4.34685933 (50.95% baseline)
comparison1    8.53149825 (100.00% baseline)
comparison2    4.45040158 (52.16% baseline)

That was literally the suggestion discussed in that github issue and rejected by the core devs in there. You would need to counter those arguments, and performance alone is not going to be enough.

My latest proposed solution does not modify reversed() at all – instead it implements __reversed__ method in the enumerate class.

IMO this makes enough distinction for the proposal to be considered again.

Did you read the linked issue all the way through? That approach was explicitly stated in the discussion, and the arguments against the feature were stated in full awareness of it.

1 Like

My bad. I just noticed it. I jumped ahead of myself when reading it.