I accidentally created a generator function by including an unreachable yield statement after a return statement while re-writing some code. This made returning a list create an empty generator, which was pretty confusing and sucked up a bunch of my time:
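A minimal sketch of the kind of code that triggers this. The function name `load_items` and its contents are hypothetical, not the original code:

```python
def load_items():
    # Intended to return a plain list...
    return [1, 2, 3]
    # ...but this leftover, unreachable yield turns the whole
    # function into a generator function.
    yield


result = load_items()      # a generator object, not a list
items = list(result)       # the generator finishes immediately

# The returned list is not lost: it travels on StopIteration.value.
try:
    next(load_items())
except StopIteration as stop:
    returned = stop.value

print(type(result).__name__, items, returned)
```

Iterating over the result silently produces nothing, which is why the mistake is easy to miss.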
Not sure what’s possible change-wise, as StopIteration with a non-None argument has a valid (albeit confusing) use in PEP 380, as mentioned in the associated issue.
I’d love a RuntimeError to happen here if it’s possible to do so while honouring StopIteration use in PEP 380, but I’m not sure if there are better alternative solutions or if there is any possibility of catching a non-None valued StopIteration error specifically outside the context of a yield from call stack.
Folks have suggested linter changes, but it’d be nice to have a way to prevent this from being the awful foot-gun that it was without one. This isn’t an issue in Python 2.
Non-None return value from generator should throw RuntimeError #127306
Linters already warn about this. For example B901 from bugbear:
B901: Using return x in a generator function used to be syntactically invalid in Python 2. In Python 3 return x can be used in a generator as a return value in conjunction with yield from. Users coming from Python 2 may expect the old behavior which might lead to bugs. Use native async def coroutines or mark intentional return x usage with # noqa on the same line.
Making this a RuntimeError would essentially break a lot of code which I don’t think is justified.
Can you give me an example of code that would break if a RuntimeError was triggered only outside the call chain of a yield from?
Using return with a non-None value outside of a yield from call returns an empty generator. I can’t imagine this would ever be what someone would have wanted, outside of the (IMHO obscure) case of sub-generators in PEP 380?
I guess this is possibly confusing for generators, but we can’t remove it, not without breaking coroutines. Coroutines, which are defined exactly the same as generators, rely on StopIteration for return values. Even if we changed how coroutines themselves worked to get around StopIteration, we’d break legacy coroutines by removing support for non-None return values:
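As an illustration of that pattern (a sketch, not the example from the original post), here is a generator-based coroutine whose return value reaches its caller through `yield from`. The names `fetch_value`, `main_task` and `run` are hypothetical:

```python
def fetch_value():
    # Sub-coroutine: suspends once, then returns a result.
    reply = yield "need-data"
    return reply * 2


def main_task():
    # yield from forwards sends to the sub-coroutine and evaluates to its
    # return value, which is carried by StopIteration.value.
    doubled = yield from fetch_value()
    return doubled + 1


def run(coro, reply):
    # Tiny driver standing in for an event loop: it answers every
    # request with `reply` until the coroutine finishes.
    try:
        next(coro)
        while True:
            coro.send(reply)
    except StopIteration as stop:
        return stop.value


final = run(main_task(), 20)
print(final)
```

If a non-None return raised RuntimeError, both `return` statements in the two coroutines above would have to be treated specially.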
There’s existing, maintained real-world code that uses generators as synchronous coroutines too, sometimes to avoid the function color problem, other times to avoid using asyncio as the program’s event loop.
Probably not. I think the most common case for implementing coroutines via generators is probably for C extensions that need to perform asynchronous operations. There’s nothing in the C API right now for defining an async def function, so implementing an object that supports the generator protocol (via next() and StopIteration) is your only option. We’d break tons of extensions by changing how coroutines return values.
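A pure-Python stand-in for what such an object looks like from the caller's side. This is only an analogy for the C case, and the class `ManualAwaitable` is hypothetical:

```python
class ManualAwaitable:
    """Hand-written iterator that finishes with a value, like a generator."""

    def __init__(self):
        self._steps = 2

    def __iter__(self):
        return self

    def __next__(self):
        if self._steps:
            self._steps -= 1
            return "pending"
        # Finish with a result, the same way `return value` does
        # inside a generator.
        raise StopIteration("result")


def caller():
    # yield from accepts any iterable and picks up StopIteration.value.
    value = yield from ManualAwaitable()
    return value


gen = caller()
seen = []
try:
    while True:
        seen.append(next(gen))
except StopIteration as stop:
    outcome = stop.value

print(seen, outcome)
```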
Or to be more precise: Generators returning non-None values is not a problem to be solved.
Now, there is a potential problem here, and it’s the way that you can turn something into a generator by adding a yield that you might not notice. If you want to focus on solving THAT problem, that would be more doable (and it would have solved your original issue, since you’d have known that you had a generator there). There are a few directions you could go with that; a type checker has been suggested, and another option would be to configure your editor to highlight generator functions in some way.
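A runtime check in the same spirit is also possible without a type checker or editor support. The decorator below is a hypothetical helper, not something in the standard library; it relies only on `inspect.isgeneratorfunction`:

```python
import inspect


def not_a_generator(func):
    # Fail at definition time if func turned out to be a generator function.
    if inspect.isgeneratorfunction(func):
        raise TypeError(f"{func.__name__} is unexpectedly a generator function")
    return func


@not_a_generator
def fine():
    return [1, 2, 3]


try:
    @not_a_generator
    def broken():
        return [1, 2, 3]
        yield  # stray yield left over from a rewrite
except TypeError as exc:
    message = str(exc)

print(fine(), message)
```

The cost is that every function meant to be ordinary has to be decorated, so this only helps where the guard is applied deliberately.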
As I mentioned previously, I am certainly interested in alternative ways of solving the problem. I agree that valued StopIteration exceptions aren’t the problem; it was just proposed as a possible solution as it had ergonomics I liked.
I understand this is a divisive subject, but I am not personally interested in adopting type checking and annotation as a solution for this problem for reasons that appear to have been discussed extensively in other threads.
Hopefully your (and others’) suggestions to solve this with type checking will be helpful to others.
It seems that if yield is present in the function body, it turns the function into a generator. However, according to the definition of the return statement, any code after return is considered dead code, as the function execution is terminated at that point.
At least, visually, it can be ambiguous. But luckily, I don’t even trust the code I write. I always assert and test rigorously.
The alternative design would have been to alter the function definition to declare it as a generator and make yield and yield from syntax errors if this is not present. Many other languages do this, such as JavaScript:
function *generator() {
    yield 5;
}
function syntaxError() {
    yield 5; // SyntaxError: yield expression is only valid in generators
}
This also makes defining an empty generator easier:
function *generator() {}
Whereas in Python you have to do something like:
def empty_generator1():
    return
    yield

def empty_generator2():
    yield from []
This is much like the fact that Python lacks variable declarations so that:
a = 2

def f():
    print(a)
    return
    a = 7

f()
Raises UnboundLocalError: cannot access local variable 'a' where it is not associated with a value because a is assigned in the unreachable code at the end.
The Python compiler takes a multi-pass approach, and the presence of unreachable code can change how reachable code is compiled. It is a tradeoff that reduces verbosity, and it cannot be changed now.
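Both effects can be observed on the compiled code objects without running the function bodies. A small sketch, with hypothetical function names:

```python
import inspect


def unreachable_yield():
    return [1, 2, 3]
    yield  # never executed, but still makes this a generator function


def unreachable_assignment():
    return
    a = 7  # never executed, but still makes `a` a local variable


# The generator flag is set on the code object at compile time.
is_generator = bool(unreachable_yield.__code__.co_flags & inspect.CO_GENERATOR)

# The names the compiler decided are local to the function.
local_names = unreachable_assignment.__code__.co_varnames

print(is_generator, local_names)
```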