It took me some time to understand this strange behavior:
>>> [j:=3*i for i in range(3)]
[0, 3, 6]
>>> [j:=3*i for i in range(3) if j > 2]
[0]
>>> [j:=3*i for i in range(3) if j < 2]
[0, 3]
>>> [j:=3*i for i in range(3) if j < 2]
[]
I think these weird results are a combination of two factors:
the assignment expression’s variable j is not visible in the if-clause, and thus a global variable is used,
the assignment expression leaks the assigned variable into the global namespace.
As a result, the value of j used in the if-clause is not the one from the current iteration, but rather the one from the previous iteration.
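One way to check this reading is to write the comprehension out as a plain loop. The sketch below is only an approximation of `[j:=3*i for i in range(3) if j < 2]`: the function name `run` and passing the starting value of `j` as a parameter are assumptions made for illustration, standing in for whatever `j` was left behind by the previous statement.

```python
def run(j):
    """Loop roughly equivalent to [j := 3*i for i in range(3) if j < 2].

    The parameter j stands in for the value left over from an earlier statement.
    """
    out = []
    for i in range(3):
        if j < 2:        # the condition is tested first, so it sees the old j
            j = 3 * i    # only then is the result expression evaluated
            out.append(j)
    return out, j

print(run(0))  # ([0, 3], 3)  same list as the first `j < 2` run above
print(run(3))  # ([], 3)      same list as the second, identical statement
```

The two calls differ only in the starting value of `j`, which would explain why the same statement typed twice gives two different lists.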
I find this behavior very misleading, and would call it a bug rather than a (bad) feature. My questions:
is my observation correct,
do other people also consider this behavior unwanted,
what can/should be done to fix it: propagate the assigned values through the whole comprehension, or avoid the leak into the global namespace, or both (my preference).
I’m not looking for a way to solve this. I never use assignment expressions in comprehensions anyway. But look at the last two statements: they are exactly the same, yet they yield different results. And if you change the name of the variable j, you again get something different. This looks like a flaky design to me, and certainly a very confusing one. While most of Python does precisely what you think it does, that is certainly not the case here.
I do find it surprising that the assignment expression leaks out of a comprehension (I only learned that in a recent discussion). It’s been known about for several years so I guess it isn’t considered a bug? Or maybe there’s some reason to allow it.
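For what it’s worth, the leak is easy to see in isolation. This is a minimal sketch, assuming Python 3.8 or later; `leak_demo` and `loop_i` are made-up names for illustration. As far as I understand PEP 572, the target of `:=` inside a comprehension is bound in the containing scope, while the loop variable stays private to the comprehension.

```python
def leak_demo():
    tripled = [j := 3 * loop_i for loop_i in range(3)]
    try:
        loop_i  # the loop variable does not exist outside the comprehension
        loop_var_visible = True
    except NameError:
        loop_var_visible = False
    # j, on the other hand, is an ordinary local of leak_demo at this point
    return tripled, j, loop_var_visible

print(leak_demo())  # ([0, 3, 6], 6, False)
```

At the interactive prompt the containing scope is the module, which is why `j` ends up as a global there.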
Wrong. It does do precisely what I think it does. I even hid the results from my view and tried to predict them, got all of them correct without any trouble.
There are a few notable exceptions to the usual rule of “evaluate left to right”, and if you don’t comprehend them (pun intended), you’ll be very confused. Some are fairly obvious to anyone who’s done any sort of programming work (e.g. the body of a function isn’t executed at the time of definition; it waits until it’s called), but others are less obvious. Keep in mind this evaluation order:
expr2 if expr1 else expr3
[expr3 for var in expr1 if expr2]
With that in mind, everything else makes sense. It’s only if you expect to first evaluate the result expression and only afterwards the condition that it’ll be confusing; and while that might seem logical, it also wouldn’t work the way every other guard does - imagine [1/x for x in range(-5, 5)] and then add a guard against division by zero [1/x for x in range(-5, 5) if x] which clearly has to be checked prior to the 1/x part.
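If you want to see that order for yourself, a small trace does it. This is just a sketch: `cond`, `result` and the `calls` list are hypothetical helpers that record when each part of the comprehension runs.

```python
calls = []

def cond(x):
    calls.append(("cond", x))    # record every evaluation of the guard
    return x != 0

def result(x):
    calls.append(("result", x))  # record every evaluation of the result expression
    return 1 / x

values = [result(x) for x in range(-1, 2) if cond(x)]

print(values)  # [-1.0, 1.0]
print(calls)
# [('cond', -1), ('result', -1), ('cond', 0), ('cond', 1), ('result', 1)]
```

For each item the guard runs before the result expression, and for `x == 0` the result expression never runs at all, which is what keeps the division safe.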
I think you’re overblowing the problem here a bit. Calling something “disturbing” might be appropriate if you’re calling out someone’s lack of faith, but this is simply a fact to be learned. Calling the design “flaky” is definitely inaccurate - this is entirely reliable and dependable, it just wasn’t what you came in expecting. Treat it as a discovery moment, welcome it, and move on.
Yes, you’re right. I should have called it ‘surprising’ rather than ‘disturbing’. And it was surprising to me, though obviously not to others. I would have liked the variables assigned in the comprehension to have local scope. But it is what it is, and changing it would be a breaking change. So it is something to learn and remember, and I’ve just done that. I’m even confident enough now to start using assignment expressions in comprehensions. Thanks all for the explanations.