Groupby iterator consumed by groups unpacking

Consider a dummy function generating a groupby with a single group:

from itertools import groupby

def get_groupby():
    return groupby(range(1))

The following works as expected:

>>> group = next(get_groupby())
>>> print(group)
(0, <itertools._grouper object at 0x...>)
>>> print(list(group[1]))
[0]

But this doesn’t:

>>> [group] = get_groupby()
>>> print(group)
(0, <itertools._grouper object at 0x...>)
>>> print(list(group[1]))
[]

It looks like the _grouper iterator was consumed by unpacking the groupby object, but I can’t understand why.

With a generator function, it works as I expect:

def gen_func():
    yield 0, iter(range(1))

>>> [group] = gen_func()
>>> print(group)
(0, <range_iterator object at 0x...>)
>>> print(list(group[1]))
[0]

Could anyone enlighten me on what’s going on here?

Did you see the note in the groupby documentation about how the state of the underlying iterator is shared by the groupby iterator as well as each group iterator?

When unpacking, the number of targets on the left has to match the number of items on the right. Since the length of an arbitrary iterator isn't known in advance, the interpreter has to try to fetch one more item than the left-hand side needs, to confirm that the right-hand iterator has nothing left.
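
To make that extra fetch visible, here is a minimal sketch. CountingIterator is a hypothetical helper written for this illustration (it is not part of the standard library); it simply counts how many times __next__ is called while a single-target unpacking runs.

```python
class CountingIterator:
    """Hypothetical helper: wraps an iterable and counts calls to __next__."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self.next_calls = 0

    def __iter__(self):
        return self

    def __next__(self):
        # Count every attempt, including the one that raises StopIteration.
        self.next_calls += 1
        return next(self._it)


counter = CountingIterator([0])
[item] = counter           # one target on the left-hand side

print(item)                # 0
print(counter.next_calls)  # 2: one fetch for the item, one to check exhaustion
```

The second call is the one that raises StopIteration and tells the interpreter there are no surplus items.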

Unfortunately, each itertools._grouper depends on state it shares with the groupby iterator that produced it (see the pure Python equivalent of itertools.groupby in the documentation). Fetching that one extra item advances the groupby iterator past the current group, so the _grouper object you already hold has nothing left to yield.
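
The same effect can be reproduced by hand, without unpacking. The sketch below reuses get_groupby from the question and performs the extra fetch explicitly; the generator expression at the end is just one possible workaround (an assumption about what you want, namely the group's items as a list), not the only way to do it.

```python
from itertools import groupby


def get_groupby():
    return groupby(range(1))


# Do by hand what unpacking does: fetch one item, then try to fetch another.
gb = get_groupby()
key, grouper = next(gb)
leftover = next(gb, None)  # the extra fetch; advances the shared iterator
stale = list(grouper)      # the group is no longer current, so nothing is left

print(leftover)  # None
print(stale)     # []

# Possible workaround: turn each group into a list while it is still current,
# so the extra fetch can no longer invalidate it.
[(key2, items)] = ((k, list(g)) for k, g in get_groupby())

print(key2, items)  # 0 [0]
```

Because list(g) runs before the generator is asked for a second item, the group's contents are already saved by the time the groupby iterator moves on.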


I did, but failed to see it was the cause of my problem…

But I totally see now, that’s very clear, thank you both!
