Let generator.close() return StopIteration.value

I often write functions that work through a lot of data, applying arbitrary transformations. The best interface for this is often a generator, which might look like this:

def transform():
    while True:
        try:
            data = yield
        except GeneratorExit:
            break
        # gather data
        ...
    # compute
    ...
    return result

In short, the generator first gathers a lot of data before performing a computation and returning the final result.
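Filled in concretely — with a running sum standing in for the elided gather/compute steps, purely as an illustration — the pattern looks like this:

```python
# A minimal, runnable instance of the pattern above; the running-sum
# computation is an illustrative stand-in for the elided steps.
def transform():
    gathered = []
    while True:
        try:
            data = yield
        except GeneratorExit:
            break                 # stop gathering, fall through to compute
        gathered.append(data)     # gather data
    return sum(gathered)          # compute and return the final result

g = transform()
next(g)                           # prime the generator to the first yield
for x in (1, 2, 3):
    g.send(x)
```

At this point the generator has gathered everything and is ready to compute, but retrieving the final result is exactly the sticking point: close() discards it, so today you have to fish it out of StopIteration by hand.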

The only way to obtain the return value of such a generator is currently to throw GeneratorExit into it yourself and catch the resulting StopIteration manually:

def close_and_return(g):
    try:
        g.throw(GeneratorExit)
    except StopIteration as e:
        return e.value
    except GeneratorExit:
        # the generator was already finished, or did not catch
        # GeneratorExit, so there is no return value
        return None
    else:
        # the generator yielded instead of exiting; a real close()
        # would raise RuntimeError here
        return None
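Put together with a filled-in generator (the name summing and the running-sum body are illustrative, not from the original post), the helper is used like this — the extra GeneratorExit handler guards the already-exhausted case:

```python
def summing():
    gathered = []
    while True:
        try:
            gathered.append((yield))   # gather data
        except GeneratorExit:
            break
    return sum(gathered)               # compute

def close_and_return(g):
    try:
        g.throw(GeneratorExit)
    except StopIteration as e:
        return e.value                 # generator caught GeneratorExit and returned
    except GeneratorExit:
        return None                    # generator was already finished

g = summing()
next(g)
g.send(10)
g.send(20)
print(close_and_return(g))             # prints 30
```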

It would be much more convenient if the .close() method of the generator, which already catches StopIteration, also returned its value. Since .close() currently never returns anything, this change would not break existing code. The improved convenience might even give a new lease of life to generator return values, and to this type of coroutine more generally. Or, at the very least, we’d know why we are wasting precious keypresses to put the third type parameter into the Generator[...] annotation.

That being said, I do imagine the existing behaviour was chosen on purpose, but I didn’t find anything specific in e.g. PEP 479.

+1

A possible extension to this idea would be to always keep the generator return value in the generator object and return it once on .close(). This enables simple access to the return value whether or not the close() was what stopped the generator.
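In today's Python that extension can already be prototyped in userland — a sketch, where ReturningGenerator and its value attribute are hypothetical names invented for illustration:

```python
# Userland prototype of "keep the return value on the object".
# ReturningGenerator and .value are hypothetical names, not an existing API.
class ReturningGenerator:
    def __init__(self, gen):
        self._gen = gen
        self.value = None                  # the return value, once known

    def __next__(self):
        return self.send(None)

    def send(self, item):
        try:
            return self._gen.send(item)
        except StopIteration as e:
            self.value = e.value           # capture on normal exhaustion
            raise

    def close(self):
        try:
            self._gen.throw(GeneratorExit)
        except StopIteration as e:
            self.value = e.value           # generator caught GeneratorExit and returned
        except GeneratorExit:
            pass                           # generator was already finished
        return self.value
```

Note that holding the value on the wrapper does keep it alive for as long as the wrapper itself lives — that lifetime cost is part of the trade-off.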

It is either too early or too late.

  1. If generator.close() is called before the iteration starts, the code of the generator function has not even started to execute yet, and there is no return value.
  2. If it is called on a non-exhausted generator, the generator function is paused at a yield expression, which will be interrupted by GeneratorExit before any return statement is reached.
  3. If the generator object has been exhausted, the returned value was only available as an attribute of the StopIteration raised by the last __next__(), and was likely discarded when StopIteration was implicitly (in for) or explicitly handled. By the time generator.close() is called, the returned value is already gone.
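The three situations can be checked directly in current Python — a small sketch:

```python
def gen():
    yield 1
    return "done"

# Case 1: closed before iteration starts -- the body never ran,
# so no return value exists yet.
g = gen()
g.close()

# Case 2: closed mid-iteration -- GeneratorExit interrupts the
# paused yield before the return statement is ever reached.
g = gen()
next(g)
g.close()

# Case 3: exhausted generator -- the value travelled on the
# StopIteration raised by the last __next__() and is gone once
# that exception has been handled.
g = gen()
next(g)                 # -> 1
try:
    next(g)
except StopIteration as e:
    print(e.value)      # prints done; the only place the value is visible
g.close()               # too late: the value is already gone
```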

In case 3 you would need to save the returned value in the generator object. That prolongs the life of the returned value, and can even create unwanted reference loops which keep the returned value, the generator object and all linked objects alive even longer.

In case 2 you need to explicitly catch GeneratorExit in a generator function and silence it. It is considered an antipattern. GeneratorExit was intentionally not made a subclass of Exception to prevent it from accidentally being silenced.
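That design decision is easy to verify: GeneratorExit derives directly from BaseException, so a broad `except Exception` cannot swallow it by accident:

```python
# GeneratorExit deliberately sits outside the Exception hierarchy.
print(issubclass(GeneratorExit, Exception))      # prints False
print(issubclass(GeneratorExit, BaseException))  # prints True

def g():
    try:
        yield
    except Exception:
        pass            # too narrow to catch GeneratorExit

it = g()
next(it)
it.close()              # GeneratorExit sails past `except Exception`
```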


I only want to improve the ergonomics of the specific case that a generator exits gracefully because of the call to close(), i.e. the StopIteration case here.

That’s not quite the same; the call to close() raises GeneratorExit, not StopIteration (the other exception that close() checks for). So there won’t be any useful return value: StopIteration only happens when the generator hits a return statement.

Indeed, but the generator can only exit gracefully by explicitly catching GeneratorExit and returning.

Now I’m not sure if @storchaka considers that “silencing” and hence part of the anti-pattern he mentions, but I don’t, because the generator still acts on the GeneratorExit by exiting. The fact that close() ignores StopIteration seems to confirm that interpretation.

For this particular example, what about this workaround?

def transform(set_result: Callable[[T], None]) -> T:
    while True:
        try:
            data = yield
        except GeneratorExit:
            # compute the result from the gathered data first,
            # then hand it to the caller before re-raising
            ...
            set_result(result)
            raise
        # gather data
        ...


def main():

    def set_result(res):
        nonlocal result
        result = res

    result = None
    g = transform(set_result)
    next(g)
    for data in data_set:
        g.send(data)
    g.close()
    print(f"{result = }")

or something like:

STOP_TRANSFORM = object()  # a sentinel object

def transform():
    while True:
        data = yield
        if data is STOP_TRANSFORM:
            break
        # gather data
        ...
    # compute
    ...
    return result

def close_and_return(g):
    try:
        g.send(STOP_TRANSFORM)
    except StopIteration as e:
        # send() raises StopIteration carrying the return value
        return e.value
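Filled in with a running sum (again purely as an illustration), the sentinel approach works end to end — note that send() raises StopIteration when the generator returns, so the helper has to catch it to recover the value:

```python
STOP_TRANSFORM = object()  # a sentinel object

def transform():
    gathered = []
    while True:
        data = yield
        if data is STOP_TRANSFORM:
            break
        gathered.append(data)        # gather data
    return sum(gathered)             # compute

def close_and_return(g):
    # send() raises StopIteration carrying the return value
    try:
        g.send(STOP_TRANSFORM)
    except StopIteration as e:
        return e.value

g = transform()
next(g)
for data in (1, 2, 3, 4):
    g.send(data)
print(close_and_return(g))           # prints 10
```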

Thanks, but the example is already a workaround that works well. The idea is to make the workaround unnecessary.

However, there is an upside to using a sentinel value, in that it makes the entire boilerplate “nicer” to some people:

while (data := (yield)) is not SENTINEL:
    …

But then you are just replicating the GeneratorExit mechanism by other means.