Foot Gun: no RuntimeError on non-`None` return from generator outside of `yield from`

I accidentally created a generator function by including an unreachable yield statement after a return statement while re-writing some code. This made returning a list create an empty generator, which was pretty confusing and sucked up a bunch of my time:

def y1(*args):
    return args

def y2(*args):
    return args
    yield 3

print("y1: " + str(list(y1(1, 2, 3, 4))))

# OUTPUT: y1: [1, 2, 3, 4]

print("y2: " + str(list(y2(1, 2, 3, 4))))

# OUTPUT: y2: []

I’m not sure what changes are possible, since StopIteration with a non-None argument has a valid (albeit confusing) use in PEP 380, as mentioned in the associated issue.

I’d love a RuntimeError to happen here if it’s possible to do so while honouring StopIteration use in PEP 380, but I’m not sure if there are better alternative solutions or if there is any possibility of catching a non-None valued StopIteration error specifically outside the context of a yield from call stack.
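
For readers unfamiliar with the PEP 380 behaviour mentioned above, here is a minimal sketch of where a generator's return value goes. The names inner and outer are made up for illustration.

```python
def inner():
    yield 1
    return "done"  # attached to the StopIteration raised when inner() finishes


def outer():
    # yield from re-yields inner()'s items, then evaluates to its return value
    result = yield from inner()
    yield result


# list() consumes the StopIteration and discards the value attached to it
plain = list(inner())

# Driving the generator by hand exposes the returned value
gen = inner()
first = next(gen)
try:
    next(gen)
except StopIteration as stop:
    returned = stop.value

delegated = list(outer())
```

Here plain is [1], returned is "done", and delegated is [1, "done"]. This is why y2 in the original post looks empty: its only content is the return value, which list() never sees.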

Folks have suggested linter changes, but it’d be nice to have a way to prevent this from being the awful foot-gun that it was without one. This isn’t an issue in Python 2.

Non-None return value from generator should throw RuntimeError #127306

PEP 380

If you add type hints to your parameters and return type, a type checker will warn you about the bug:

from typing import Any

def y2(*args: Any) -> tuple[Any]:
    return args
    yield 3

Output (run online):

main.py:3: error: The return type of a generator function should
                  be "Generator" or one of its supertypes [misc]

Linters already warn about this. For example B901 from bugbear:

B901: Using return x in a generator function used to be syntactically invalid in Python 2. In Python 3 return x can be used in a generator as a return value in conjunction with yield from. Users coming from Python 2 may expect the old behavior which might lead to bugs. Use native async def coroutines or mark intentional return x usage with # noqa on the same line.

Making this a RuntimeError would essentially break a lot of code which I don’t think is justified.

Can you give me an example of code that would break if a RuntimeError was triggered only outside the call chain of a yield from?

Using return with a non-None value outside of a yield from call returns an empty generator. I can’t imagine this would ever be what someone would have wanted, outside of the (IMHO obscure) case of sub-generators in PEP 380?

Similar to using linting tools or an IDE, but it works without any external tools:

import types


def y1(*args):
    return args


def y2(*args):
    return args
    yield 3


assert not isinstance(y1(1), types.GeneratorType)  # y1 returns a plain tuple
assert isinstance(y2(1), types.GeneratorType)  # the stray yield makes y2 a generator

print("y1: " + str(list(y1(1, 2, 3, 4))))

# OUTPUT: y1: [1, 2, 3, 4]

print("y2: " + str(list(y2(1, 2, 3, 4))))

# OUTPUT: y2: []

Make sure to test your code and use assertions to catch issues early.

I guess this is possibly confusing for generators, but we can’t remove it, not without breaking coroutines. Coroutines, which are defined exactly the same as generators, rely on StopIteration for return values. Even if we changed how coroutines themselves worked to get around StopIteration, we’d break legacy coroutines by removing support for non-None return values:

import types
import asyncio

@types.coroutine
def legacy():
    yield from asyncio.sleep(1)
    return 42

async def main():
    print(await legacy())  # 42

asyncio.run(main())

Thanks @ZeroIntensity – if coroutines leverage this behaviour, I imagine it’s not a fixable problem.

There’s existing maintained real-world code that uses generators as synchronous coroutines too, sometimes to avoid the function color problem, others to not use asyncio as the program’s event loop.
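
As a rough illustration of that style (not taken from any particular project), a synchronous coroutine driver can look like the sketch below. The functions fetch, task and run, and the ("get", key) request format, are all hypothetical.

```python
def fetch(key):
    # Hypothetical "effect": yield a request, receive the answer via send()
    value = yield ("get", key)
    return value * 2


def task():
    # Each yield from evaluates to the sub-generator's return value
    a = yield from fetch("x")
    b = yield from fetch("y")
    return a + b


def run(gen, store):
    """Drive gen to completion, answering each ("get", key) request from store."""
    try:
        request = next(gen)
        while True:
            op, key = request
            request = gen.send(store[key])
    except StopIteration as stop:
        # The generator's return value arrives on the StopIteration
        return stop.value


result = run(task(), {"x": 1, "y": 10})
```

The driver depends on non-None return values at every level: fetch returns to task through yield from, and task returns to run through StopIteration.value.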

Would changing the underlying implementation of coroutines to use another exception type for yielding execution be possible?

Is there any reason why the dead code yield 3 makes this function a generator?

import types


def y2(*args):
    return args

    exit()
    yield 3


assert isinstance(y2(1), types.GeneratorType)

Probably not. I think the most common case for implementing coroutines via generators is probably for C extensions that need to perform asynchronous operations. There’s nothing in the C API right now for defining an async def function, so implementing an object that supports the generator protocol (via next() and StopIteration) is your only option. We’d break tons of extensions by changing how coroutines return values.
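
A pure-Python analogue of what such an extension does might look like the sketch below. The Ready class is hypothetical; a real extension would implement the same slots in C.

```python
import asyncio


class Ready:
    """Hypothetical awaitable that completes immediately with a fixed result."""

    def __init__(self, result):
        self.result = result

    def __await__(self):
        return self  # __await__ must return an iterator

    def __iter__(self):
        return self

    def __next__(self):
        # Ending the iteration is how the result reaches the awaiting code
        raise StopIteration(self.result)


async def main():
    return await Ready(42)


value = asyncio.run(main())
```

The await expression picks up 42 from the StopIteration raised in __next__, so any change to how StopIteration carries values would affect objects like this one.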

Thanks. I figured this wouldn’t be an easy problem to solve, and it looks like existing uses paint us into a corner here.

Appreciate the feedback.

Or to be more precise: Generators returning non-None values is not a problem to be solved.

Now, there is a potential problem here, and it’s the way that you can turn something into a generator by adding a yield that you might not notice. If you want to focus on solving THAT problem, that would be more doable (and it would have solved your original issue, since you’d have known that you had a generator there). There are a few directions you could go with that; a type checker has been suggested, and another option would be to configure your editor to highlight generator functions in some way.
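
For those who prefer not to rely on a type checker or editor, one possible runtime approach is a small guard decorator built on inspect.isgeneratorfunction. The decorator name not_a_generator is made up for this sketch.

```python
import inspect


def not_a_generator(func):
    """Hypothetical guard: refuse to define func if its body contains a yield."""
    if inspect.isgeneratorfunction(func):
        raise TypeError(f"{func.__name__}() is a generator function")
    return func


@not_a_generator
def y1(*args):
    return args


try:
    @not_a_generator
    def y2(*args):
        return args
        yield 3  # unreachable, but still makes y2 a generator function
except TypeError as exc:
    error = str(exc)
```

The check runs once, when the function is defined, so the accidental generator is reported at import time rather than when a caller notices an empty result.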

As I mentioned previously, I am certainly interested in alternative ways of solving the problem. I agree that valued StopIteration exceptions aren’t the problem; that was just proposed as a possible solution because it had ergonomics I liked.

I understand this is a divisive subject, but I am not personally interested in adopting type checking and annotation as a solution for this problem for reasons that appear to have been discussed extensively in other threads.

Hopefully your (and others’) suggestions to solve this with type checking will be helpful to others.

Appreciate all the feedback.

No no, that’s fine. I don’t use type checkers and annotations either. They are optional for a reason. There’s no shame in not using type hints.

Look into your editor’s configs to see if you can have generator functions shown differently.

exit is a free variable and can be rebound, so it’s not guaranteed that yield 3 is dead code.

It seems that if yield is present in the function body, it turns the function into a generator. However, according to the definition of the return statement, any code after return is considered dead code, as the function execution is terminated at that point.

At least, visually, it can be ambiguous. But luckily, I don’t even trust the code I write. I always assert and test rigorously.

The yield statement:

Using yield in a function definition is sufficient to cause that definition to create a generator function instead of a normal function.

The return statement:

In a generator function, the return statement indicates that the generator is done and will cause StopIteration to be raised.
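
The two rules quoted above can be observed on the compiled function itself. This is a small sketch using CPython's code-object flags; the function names are made up.

```python
import inspect


def normal():
    return 1


def unreachable_yield():
    return 1
    yield  # never executed, but its presence is what matters


# The compiler marks the code object with CO_GENERATOR when the body contains yield
normal_flag = bool(normal.__code__.co_flags & inspect.CO_GENERATOR)
gen_flag = bool(unreachable_yield.__code__.co_flags & inspect.CO_GENERATOR)

# Calling the generator function runs none of its body; it only builds a generator
gen = unreachable_yield()
try:
    next(gen)
except StopIteration as stop:
    returned = stop.value
```

The flag is decided when the function is compiled, before any of its code runs, which is why reachability of the yield plays no part.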

The alternative design would have been to alter the function definition syntax to declare it as a generator and make yield and yield from syntax errors if that declaration is not present. Many other languages do this, such as JavaScript:

function *generator() {
   yield 5;
}
function syntaxError() {
    yield 5; // SyntaxError: yield expression is only valid in generators
}

This also makes defining an empty generator easier:

function *generator() {}

Whereas in Python you have to do something like:

def empty_generator1():
    return
    yield

def empty_generator2():
    yield from []

This is much like the fact that Python lacks variable declarations so that:

a = 2
def f():
    print(a) 
    return
    a = 7
f()

Raises UnboundLocalError: cannot access local variable 'a' where it is not associated with a value because a is assigned in the unreachable code at the end.

The Python compiler takes a multiple pass approach and the presence of unreachable code can change how reachable code is compiled. It is a tradeoff that reduces verbosity and it cannot be changed now.
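
The effect of the earlier pass can be checked directly on the code object. This sketch reuses the example above and inspects co_varnames, which lists the names the compiler decided are local to f.

```python
a = 2


def f():
    print(a)
    return
    a = 7  # unreachable, but still makes `a` a local of f


# `a` was classified as a local when f was compiled, before any code ran
is_local = "a" in f.__code__.co_varnames

try:
    f()
except UnboundLocalError as exc:
    caught = type(exc).__name__
```

The global a = 2 is never consulted inside f, because the decision that a is local was made for the whole function body at compile time.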

I use return x; yield fairly often to create a generator that raises StopIteration with a value. I don’t see how it’s a foot gun.