`x or x or x` versus `(x or x) or x`

jeff5 · March 29, 2024, 2:41pm

Another way of looking at this is through the AST (3.12):

>>> print(ast.dump(ast.parse("x or x or x", mode='eval'), indent=2))
Expression(
  body=BoolOp(
    op=Or(),
    values=[
      Name(id='x', ctx=Load()),
      Name(id='x', ctx=Load()),
      Name(id='x', ctx=Load())]))
>>> print(ast.dump(ast.parse("(x or x) or x", mode='eval'), indent=2))
Expression(
  body=BoolOp(
    op=Or(),
    values=[
      BoolOp(
        op=Or(),
        values=[
          Name(id='x', ctx=Load()),
          Name(id='x', ctx=Load())]),
      Name(id='x', ctx=Load())]))

I’m not totally sure these mean the same thing. At least, it appears that in the triple-or, evaluation takes the truthyness of each in turn, returning the first truthy value, or the last value. In the nested case, it does this for the first pair (returning x) then has to compute the truthyness of that value (again).

jamestwebber · March 29, 2024, 3:27pm

Hm, I think you could make the argument that expression still evaluates correctly, but that future evaluations might not be what you expect because an assumption about short-circuiting was violated.

This is arguably a bug in that the specification states what will be evaluated, and it’s common to rely on that behavior (e.g. to avoid evaluating something that raises exceptions). Although this specific example is pathological and it’s not clear if a realistic example exists.

kknechtel · March 29, 2024, 5:49pm

Python doesn’t know the types at compile time. But it might be able to apply some simple optimizations when it sees literals.

I wouldn’t say that the second evaluation is “extra” in 3.12, because it makes sense if you consider what the parse tree looks like. Instead, there’s a missed optimization that used to be applied, but I guess was overlooked now in the changes (yet again) to the bytecode spec and/or the code generator.

jamestwebber · March 29, 2024, 5:54pm

It doesn’t do this if the first element is a truthy variable, either.

jamestwebber · March 29, 2024, 6:02pm

Oh, I think I understand what’s happening better. edit: this statement was incorrect, nvm

In 3.12 the evaluation of (x or x) doesn’t return True, it returns x. So you end up testing the truthiness of x again thus you call __bool__ again, printing aha or whatever.

In 3.10 it seems like the initial truthiness short-circuits the whole thing?

class B:
    def __init__(self, val):
        self.val = val
    def __bool__(self):
        print(f'aha: {self.val}')
        return True

a = True

(B(1) or a) or B(2)
# prints 
# aha: 1
# aha: 1

Because (B(1) or a) evaluates to B(1) by testing it for truthiness, and then B(1) or B(2) does the same.

MegaIng · March 29, 2024, 6:53pm

In the end, this change in behavior happened from Python 3.11 to 3.12 because of the removal of JUMP_IF_FALSE_OR_POP, see Remove `JUMP_IF_FALSE_OR_POP` and `JUMP_IF_TRUE_OR_POP` · Issue #567 · faster-cpython/ideas · GitHub . I think the peephole optimizer hasn’t quite been updated to optimize the new sequence of instructions.

tjreedy · March 29, 2024, 7:02pm

@pablogsal This thread is about a possibly incomplete change in parsing/optimizing in 3.12. Seems better to ping you here rather than suggest an issue.

pablogsal · March 29, 2024, 7:18pm

Thanks for the ping! This looks more on the compiler side CC @iritkatriel

nathan-chappell · April 2, 2024, 1:08pm

Not sure that’s a good example, many orms trigger side effects (eg database queries) when attributes are accessed.

MegaIng · April 4, 2024, 9:24am

Yes, right, side effects in a general sense are fine. But two accesses to the same attribute should result in the same value each time.

cben · April 4, 2024, 9:56am

[everything is this response is from 3.12.2]
AFAICT flat x or y or z is compiled in 3.12 same x or (y or z).

Python 3.12.2 (main, Mar 12 2024, 11:02:14) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import dis
>>> dis.dis('x or (y or z)')
  0           0 RESUME                   0

  1           2 LOAD_NAME                0 (x)
              4 COPY                     1
              6 POP_JUMP_IF_TRUE         8 (to 24)
              8 POP_TOP
             10 LOAD_NAME                1 (y)
             12 COPY                     1
             14 POP_JUMP_IF_TRUE         3 (to 22)
             16 POP_TOP
             18 LOAD_NAME                2 (z)
             20 RETURN_VALUE
        >>   22 RETURN_VALUE
        >>   24 RETURN_VALUE

That looks like a sensible optimization to me — ignoring “broken” objects, or is associative:

>>> for x,y,z in itertools.product([0,1], repeat=3):
...    print(x,y,z, '(x or y) or z =', (x or y) or z, 'x or (y or z) =', x or (y or z))
... 
0 0 0 (x or y) or z = 0 x or (y or z) = 0
0 0 1 (x or y) or z = 1 x or (y or z) = 1
0 1 0 (x or y) or z = 1 x or (y or z) = 1
0 1 1 (x or y) or z = 1 x or (y or z) = 1
1 0 0 (x or y) or z = 1 x or (y or z) = 1
1 0 1 (x or y) or z = 1 x or (y or z) = 1
1 1 0 (x or y) or z = 1 x or (y or z) = 1
1 1 1 (x or y) or z = 1 x or (y or z) = 1

Same for (x and y) and z vs. x and (y and z).
However mixing is NOT associative e.g. (x or y) and z != x or (y and z).

But it’s not enough to be “sensible optimization”; Python strives for predictable behavior, so does this match parser, and does this match official grammar?

Well, AST surprised me

it’s neither this associativity nor that, but a single flat “BoolOp with 3 children” :

>>> print(ast.dump(ast.parse('(x or y) or z'), indent=2))
Module(
  body=[
    Expr(
      value=BoolOp(
        op=Or(),
        values=[
          BoolOp(
            op=Or(),
            values=[
              Name(id='x', ctx=Load()),
              Name(id='y', ctx=Load())]),
          Name(id='z', ctx=Load())]))],
  type_ignores=[])
>>> print(ast.dump(ast.parse('x or (y or z)'), indent=2))
Module(
  body=[
    Expr(
      value=BoolOp(
        op=Or(),
        values=[
          Name(id='x', ctx=Load()),
          BoolOp(
            op=Or(),
            values=[
              Name(id='y', ctx=Load()),
              Name(id='z', ctx=Load())])]))],
  type_ignores=[])
>>> print(ast.dump(ast.parse('x or y or z'), indent=2))
Module(
  body=[
    Expr(
      value=BoolOp(
        op=Or(),
        values=[
          Name(id='x', ctx=Load()),
          Name(id='y', ctx=Load()),
          Name(id='z', ctx=Load())]))],
  type_ignores=[])

Similarly x and y and z is a flat “And with 3 children”.
(This is not new — python 3.11.8 also parses flat 3-children.)
Mixes can not be flattened. x and y or z is parsed as '(x and y) or z' BUT x or y and z is parsed as x or (y and z) — that is, and has higher precendence.

What about grammar?

6.17. Operator precedence confirms and has higher precedence.

6.11. Boolean operations explains that x or y, x and y do short-circuiting, but not how exactly it behaves for longer chains.
It gives grammar:

or_test  ::=  and_test | or_test "or" and_test
and_test ::=  not_test | and_test "and" not_test
not_test ::=  comparison | "not" not_test

Grammar agrees “and” has higher precendence.
Grammar gives no hint that “BoolOp with 3 children” is a possibility, but I’m not sure if precise AST structure must match grammar, as long as it’s equivalent?

Anyway, that grammar is left-recursive. IIUC it implies this parsing:

x "or" y "or" z
and_test(x) "or" and_test(y) "or" and_test(z)
or_test(and_test(x) "or" and_test(y)) "or" and_test(z)
or_test(or_test(and_test(x) "or" and_test(y)) "or" and_test(z))

So if I read that right, documented grammar requires (x or y) or z associativity which contradicts x or (y or z) execution.
If optimization is desired, it should be at least documented.

cben · April 4, 2024, 10:11am

Well if x is falsey, I do expect x or x to call __bool__ twice.
AFAICT, what we see here is not about x or x happening to have same object on both sides (and I wouldn’t want the language to special-case that).

When (x or y) or z is executed and x.__bool__() is True on first call, it reduces to x or z so x.__bool__() gets called again.
Whereas x or (....) execution, when x.__bool__() is True, that’s the final answer.

Topic		Replies	Views
Problem with or operator Python Help	2	335	January 16, 2021
I need help understanding how to verify an input. What is the proper syntax? Whenever I try to run this code no matter the input the print functions print. Why is this the case? Python Help	2	456	November 23, 2023
Augmented boolean algebra assignments Ideas	6	508	May 8, 2023
Basic Doubt in syntax Python Help	2	399	July 21, 2021
"OR" operator won't seem to work, why? Python Help help	4	458	December 21, 2021

`x or x or x` versus `(x or x) or x`

Well, AST surprised me

What about grammar?

Related Topics