PEP 709: Inlined comprehensions

Hello! Following up on the previous discussion in "Inlining comprehensions -- what are the compatibility constraints?", I’ve written a short PEP for inlining comprehensions.

The text of the PEP is currently available at PEP 709: Inlined comprehensions by carljm · Pull Request #3029 · python/peps · GitHub – once the PR is merged, I’ll update this thread with the link to the rendered PEP.

Looking forward to your feedback!

22 Likes

Until it’s merged, here’s the rendered preview of the PR:

1 Like

I was never a fan of the nested functions, so this looks good to me. I’ll leave it to others to consider the cost versus the benefit of the proposed change.

2 Likes

Thanks for this, it looks great.

I have a question about generator expressions, which are not mentioned in the PEP. I guess they’re probably more difficult, but there could be even greater potential for optimisation: a locally consumed generator expression would not need an actual generator function, with its associated per-yield frame-switching cost.

Are generator expressions out of scope here?

2 Likes

Thanks for bringing up generator expressions; I will add some text to the PEP about them. The intent is not to change them at all in this PEP. In general they are more difficult to inline, because the generator may escape, and if it does it needs to be a real resumable generator function.

But you’re right that in many cases where we know the generator does not escape (e.g. something like tuple(x for x in lst)) we can avoid the generator overhead. If PEP 709 is merged, the most straightforward approach would be to convert such comprehensions at compile time to (inlined) list comprehensions, and then e.g. introduce a fast path for list-to-tuple conversion. There’s already discussion of this possibility at Handling "tuple/any/all comprehensions" in the compiler. · Issue #545 · faster-cpython/ideas · GitHub
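As a hedged sketch of the equivalence such a compile-time rewrite would rely on (illustrative only, not the actual compiler transformation): when a generator expression is consumed immediately by a constructor like tuple(), building a list first produces the same result without any generator frame:

```python
# Illustrative sketch, not the real compiler rewrite: a genexp consumed
# immediately by tuple() is observably equivalent to a list comprehension
# followed by list-to-tuple conversion.
lst = [1, 2, 3]

via_genexp = tuple(x * x for x in lst)      # today: creates a generator
via_listcomp = tuple([x * x for x in lst])  # candidate rewrite: list first

assert via_genexp == via_listcomp == (1, 4, 9)
```

The rewrite is only safe precisely because the generator cannot escape here, which is the condition discussed above.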

2 Likes

I pushed an update to the PEP which addresses this. I realized that although the reference implementation doesn’t inline any generator expressions, the PEP should be clear that this is planned as a likely follow-up.

Thank you. I figured that you intentionally left GEs out, but wondered if something could be done.

The PEP is now merged and live on the PEPs site:

4 Likes

Thanks for the PEP, Carl.

I’m definitely +1 on this change and would love to see the GE optimization as well.

2 Likes

What about other name collisions between the loop variable and other variables accessible in the function? For example, calling builtin functions:

def f(lst):
    x = SomeClass([dir for dir in lst])
    # Calling the builtin dir()
    return dir(x)

or referencing global variables:

result = 1
def f(lst):
    results = [calculate(result) for result in lst]
    if results[0] != result:
        ...

The change from NameError to UnboundLocalError suggests the semantics would change in these cases, as well.

1 Like

Great observation, I was wondering if someone would ask this 🙂 The semantics don’t change for these cases, but only because the compiler also pushes and pops its internal scope data, so that e.g. the variable is still treated as a global (LOAD_GLOBAL is used) in the outer scope, and treated as a local only in the inner scope.
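A small runnable check of the behavior being preserved here, using the global-shadowing example from the question above (this passes both before and after PEP 709’s inlining):

```python
# Inside the comprehension, `result` is the loop variable; outside it,
# `result` still refers to the module global. PEP 709 keeps this intact.
result = 1

def f(lst):
    results = [result * 10 for result in lst]
    # This `result` is compiled as a global load, not the loop variable.
    return results, result

assert f([2, 3]) == ([20, 30], 1)
```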

The change from NameError to UnboundLocalError remains mostly because telling the compiler that a name doesn’t exist at all is more complex than telling it the name is a global. To be honest, it might be possible to work through that and eliminate the change to UnboundLocalError; I haven’t tried because I didn’t feel it mattered enough to be worth the extra complexity.

I debated whether to include these details (and similar ones regarding edge cases with cells) in the PEP. On the one hand, including all these details could reassure an observant reader of the PEP, such as yourself, that the necessary cases have been considered (and tested in the implementation) and semantics won’t change. On the other hand, I thought the purpose of the PEP was to evaluate the behavior changes it does cause, which means that implementation details involved in avoiding behavior changes in other edge cases might be an unnecessary distraction.

If you think it’s important in evaluating the PEP to fully understand how all edge cases are handled in the implementation, I can certainly extend the PEP with that level of detail.

A PEP is supposed to encapsulate discussion around the idea, so yes, since it came up in the discussion it should probably be mentioned in the PEP 🙂

(Also, I’m curious how this can work in the face of unknown globals, but I should probably take a look at the implementation instead.)

1 Like

I’ll add these details to the PEP.

The compiler treats any referenced name that isn’t a local (i.e. is referenced but never assigned) as a global; it doesn’t know or care whether the global exists or is “known” in any sense.
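This is easy to observe with the dis module: a name that is referenced but never assigned in a function compiles to LOAD_GLOBAL, whether or not it exists at runtime (the hypothetical name `mystery` below is never defined anywhere):

```python
import dis

def f():
    # `mystery` is referenced but never assigned in f, so the compiler
    # emits LOAD_GLOBAL for it, even though it doesn't exist at all.
    return mystery

opnames = [instr.opname for instr in dis.get_instructions(f)]
assert "LOAD_GLOBAL" in opnames
```

Defining f is fine; only calling it would raise NameError.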

Which also means (I just realized, and verified) that the entire section about UnboundLocalError vs NameError in the PEP is just wrong 🤦 The handling of globals that I mentioned above already covers this case identically (because it’s exactly the same case as far as the compiler is concerned; the only difference is whether the name exists in module globals at runtime.)

I think that bit of the PEP dates back to a much earlier version of the implementation, but I didn’t add a test specifically for the NameError/UnboundLocalError difference, and didn’t realize that I also fixed that difference when I fixed the use of globals generally. So it’s no wonder you thought that section of the PEP implied that globals would be broken; it would indeed imply that, if it were true 🙂
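A quick check of the corrected claim: after a comprehension whose loop variable shadows a name that doesn’t exist as a global, referencing that name still raises a plain NameError, not UnboundLocalError (this holds on released CPython versions with and without inlining):

```python
# `missing` exists only as the comprehension's loop variable; referencing
# it afterwards should raise NameError, not UnboundLocalError.
def f(lst):
    vals = [missing for missing in lst]
    return missing  # not defined in this scope

try:
    f([1, 2])
    raise AssertionError("expected NameError")
except UnboundLocalError:  # subclass of NameError, so check it first
    raise AssertionError("should not be UnboundLocalError")
except NameError:
    pass  # expected
```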

That’s a curious quirk. I think I like it, but it effectively does mean that Python has sub-function scope built into the language. Do other implementations than CPython have views on this?

ChrisA

Yes; really just into the bytecode compiler. “Sub-function scope” is a reasonably good name for what this PEP introduces.

I don’t know! I can reach out to directly ask someone from PyPy, at least. Are there other implementations actively keeping up with CPython that I should proactively reach out to?

Does it matter? Isn’t this just internal to the CPython implementation?
It isn’t a language feature, just an implementation optimization.

Since it has user-visible effects beyond just performance (on tracebacks and the output of locals(), as described in the PEP), albeit minor ones, it could be considered a change to the language, and other implementations may want to adapt to those effects.

Not sure. MicroPython has lagged behind lately (I think it’s halfway through the 3.x versions somewhere), IronPython is talking about a version 3.4 (although it includes “some syntax and features from newer Python versions”), and Jython is just starting on a 3.x beta.

Brython appears to be at version 3.11, although I wouldn’t call that a major implementation. But they do seem to be tracking CPython reasonably closely.

I looked up Cython’s FAQ; they don’t claim to be a Python implementation, though it is an official goal to be able to “compile regular Python code and run (most of) the normal Python test suite”. The FAQ currently lists Python 3.6 as a requirement. Does anyone know if that’s also the language compatibility target?

So, of the major implementations, it’s probably only PyPy, sadly.

BTW, to quickly eyeball a list of Pythons and see if any seemed to be current, I used this list from the Python wiki. So if anyone here is involved in an active Python implementation (or knows of one) that isn’t on this list, do please add it!

Since GraalPy tries not to lag behind CPython by more than a year or so, we would be affected by this. Even if it might be considered a minor visible effect and/or an implementation detail, it is always surprising (to me) how many of those details end up being relied upon 🙂

Because of details such as this, we have recently switched to re-using the PEG parser and following the CPython bytecode compiler as closely as we possibly can, so I think we would just pick up those same changes in observable behaviour without much headache.

3 Likes