(Speaking for the SC, although the text is all mine and I apologise for not having the time to make it shorter.)
Given the newly uncovered change in semantics, the SC feels we need to unaccept the PEP and revert the implementation (at least for 3.12). It’s not so much about the likelihood of code breaking as it is about creating more confusion about scopes in an already muddled area (class bodies). The (realistically very small) performance win doesn’t seem worth it.
The SC discussed this at length and, apart from the decision to unaccept the PEP (for now), we haven’t really drawn any conclusions. What we really want is to know what the community thinks. So here are some ideas and questions we discussed, for everyone here to consider as well:
Does the change of semantics (of comprehensions in class bodies) make sense? Unrolling comprehensions into for loops is easy to explain, but the way the loop variable is handled makes it more complicated than that. Class bodies are already subtly different, and we can’t reasonably change that (looking up names in the class body from methods would be dangerously confusing). Does it make more sense for comprehensions to behave like for loops, or like functions?
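To make the for-loop-versus-function question concrete, here is a minimal sketch of how class bodies behave in released Python 3 versions (the class and attribute names are made up for illustration). A for loop in a class body can read class-level names and leaves its loop variable behind as a class attribute. A comprehension evaluates only its outermost iterable in the class scope and otherwise behaves like a nested function.

```python
class WithLoop:
    values = [1, 2, 3]
    scale = 10
    scaled = []
    # A plain for loop runs directly in the class body, so it can read
    # `scale`, and its loop variable `v` ends up as a class attribute.
    for v in values:
        scaled.append(v * scale)


class WithComprehension:
    values = [1, 2, 3]
    scale = 10
    # The outermost iterable (`values`) is evaluated in the class scope,
    # so this works, and `v` does not leak into the class namespace.
    doubled = [v * 2 for v in values]
    # `scale` is looked up from inside the comprehension, which skips the
    # class scope, so this raises NameError (assuming there is no global
    # named `scale`).
    try:
        scaled = [v * scale for v in values]
    except NameError as exc:
        scaled_error = type(exc).__name__
```

Here `WithLoop.v` exists after the class is created, while `WithComprehension` has no `v` and no `scaled`, only the recorded `scaled_error`.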
If the new semantics make sense, can we reasonably change the semantics without a transition period? That isn’t just about breaking existing code but also about how discoverable the error is if you run code that relies on the new behaviour in an older version of Python.
If we need a transition period, what would that look like? In this case, I think we could have the compiler detect the problematic cases (comprehensions in class bodies) and require a future import to enable the new semantics or emit a warning that the semantics will change. Considering how rare affected code appears to be, the future import and warning would probably be seen very rarely.
Alternatively, if the new semantics don’t make sense (or aren’t really desirable enough to warrant the change), what about just not doing the optimisation in that case? (That’s assuming it’s a detectable case.) That would mean comprehensions in class bodies would have more frames in a traceback than comprehensions elsewhere. Is that too confusing to users?
And if we’re talking about not performing the optimisation in class bodies, what about not performing it in other places where there are end-user noticeable effects, specifically when the loop variable clashes with a function-local variable? (That might make the implementation of this PEP easier, since it would not need any special handling of the loop variable at all.) This would sharpen the traceback-difference question, since a change in a different part of the function (e.g. using the same loop variable name for an unrelated loop) would change whether the comprehension is optimised, and thus what the traceback looks like.
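As a small illustration of the clash in question (function and variable names are hypothetical), this is the isolation that any implementation has to preserve: the comprehension’s loop variable must not overwrite a function-local variable of the same name, whereas an ordinary for loop does overwrite it.

```python
def with_comprehension():
    item = "outer"
    # The comprehension's `item` is separate from the function-local `item`.
    squares = [item * item for item in range(3)]
    return item, squares


def with_loop():
    item = "outer"
    squares = []
    # A plain for loop rebinds the function-local `item` on every iteration.
    for item in range(3):
        squares.append(item * item)
    return item, squares
```

Calling `with_comprehension()` returns `("outer", [0, 1, 4])`, while `with_loop()` returns `(2, [0, 1, 4])`.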
That does leave open the question of what ‘end-user noticeable effects’ should include, though. Clobbering variables would obviously be right out. The presence of frames in a traceback? What you see in a debugger? What locals() returns? The SC accepted that those things can change (by accepting PEP 709). Does it get more problematic if they only sometimes change, though?
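For anyone who wants to see the traceback effect for themselves, here is a rough sketch (the function names are made up) that records which frames appear when an exception is raised from inside a comprehension. The exact list depends on the interpreter version and on whether the comprehension gets its own frame, so no particular output is claimed here.

```python
import traceback


def boom(n):
    # Always fails, so the traceback passes through the comprehension.
    raise ValueError(n)


def build():
    return [boom(n) for n in range(3)]


try:
    build()
except ValueError as exc:
    # Collect the function name of every frame in the traceback.
    frame_names = [frame.name for frame in traceback.extract_tb(exc.__traceback__)]

print(frame_names)
```

If the comprehension runs as its own function, a `<listcomp>` entry shows up between `build` and `boom`; if it does not, that entry is absent.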
In all of that, does it matter how big the impact of the optimisation is? PEP 709 has only a relatively small impact on real code. Would the answers to any of the above questions change if we’re talking about, say, a 10% overall performance benefit? And at what point do the optimisations become disruptive enough, especially if it isn’t clear when they are or aren’t applied, that we need a way to disable them altogether, for debugging if nothing else?
(FWIW, if the semantic differences can be resolved in the next couple of days, the SC is not opposed to considering an updated PEP for 3.12… but it’s getting really tight, and considering the relatively small size of the performance win it’s probably a better idea to take a few more weeks and land it early in 3.13 instead.)