Can `ast.BoolOp` use `left` and `right` instead of a `values` list?

tusharsadhwani · June 4, 2023, 1:59pm

There’s a comment in the Python grammar present in the ast module documentation right now:

    -- BoolOp() can use left & right?
    expr = BoolOp(boolop op, expr* values)

BoolOp uses op and a list of values, but BinOp does a pure binary tree with left and right. This is something I’ve wondered quite a few times before:

Why is it designed like this?
Why is the comment there?
And if it can be changed, is the standardization worth the breaking change?

ntessore · June 4, 2023, 2:16pm

I think that is because, much like Compare, a BoolOp isn’t limited to a left and a right value:

>>> print(ast.dump(ast.parse('a and b and c', mode='eval'), indent=2))
Expression(
  body=BoolOp(
    op=And(),
    values=[
      Name(id='a', ctx=Load()),
      Name(id='b', ctx=Load()),
      Name(id='c', ctx=Load())]))

tusharsadhwani · June 4, 2023, 2:28pm

You can say the same about 1 + 2 + 3 though. They’re semantically the same.
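For comparison, here is a minimal sketch of how the two expressions parse with the standard `ast` module. The variable names `arith` and `logic` are only illustrative.

```python
import ast

arith = ast.parse("1 + 2 + 3", mode="eval").body
logic = ast.parse("a and b and c", mode="eval").body

# Arithmetic nests to the left: BinOp(left=BinOp(1 + 2), right=3).
print(type(arith).__name__, type(arith.left).__name__)  # BinOp BinOp

# The boolean chain is one node holding all three operands.
print(type(logic).__name__, len(logic.values))  # BoolOp 3
```

So the chain of `+` becomes a binary tree, while the chain of `and` becomes a single flat node.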

ntessore · June 4, 2023, 2:32pm

They are not the same because the boolean operators can short-circuit, I suppose? Good question, actually!

Rosuav · June 4, 2023, 3:11pm

I’m not entirely sure. Are there any situations in which `(a and b) and c` is not the same as `a and b and c`? The disassembly of both looks identical. Or is it something that can happen with mixed `and` and `or` operators that means it’s better to group all the `and`s and all the `or`s?
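Whatever the bytecode looks like, the AST keeps the two spellings apart, and mixed operators nest by precedence. A small sketch, assuming Python 3.9+ for `ast.unparse`; the helpers `describe` and `shape` are hypothetical names made up for this illustration.

```python
import ast

def describe(node):
    """Render BoolOp nodes as Op(operands...) and anything else as source text."""
    if isinstance(node, ast.BoolOp):
        op = type(node.op).__name__
        return f"{op}({', '.join(describe(v) for v in node.values)})"
    return ast.unparse(node)

def shape(source):
    return describe(ast.parse(source, mode="eval").body)

print(shape("a and b and c"))       # And(a, b, c)
print(shape("(a and b) and c"))     # And(And(a, b), c)
print(shape("a and b or c and d"))  # Or(And(a, b), And(c, d))
```

Only an unparenthesized run of the same operator is collected into one `values` list.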

effigies · June 5, 2023, 12:03pm

At least in pyparsing, a left-associative rule generates a list of tokens like [test1, test2, test3] while a right-associative rule generates [test1, [test2, test3]]. It’s more work to create and implement short-circuiting on a left-associative binary tree than on a list.
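To illustrate why a list is convenient here, this is a toy evaluator for `ast.BoolOp` that short-circuits with a plain loop. It is only a sketch and not how CPython compiles these nodes; `evaluate` and `RecordingEnv` are hypothetical names.

```python
import ast

class RecordingEnv(dict):
    """Dict that remembers which names were looked up, in order."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.seen = []

    def __getitem__(self, key):
        self.seen.append(key)
        return super().__getitem__(key)

def evaluate(node, env):
    if isinstance(node, ast.Name):
        return env[node.id]
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BoolOp):
        # `or` stops at the first truthy value, `and` at the first falsy one.
        stop_on = isinstance(node.op, ast.Or)
        result = None
        for value in node.values:
            result = evaluate(value, env)
            if bool(result) is stop_on:
                break
        return result
    raise NotImplementedError(type(node).__name__)

env = RecordingEnv(a=1, b=0, c=2)
tree = ast.parse("a and b and c", mode="eval").body
print(evaluate(tree, env), env.seen)  # 0 ['a', 'b']
```

With a flat `values` list, short-circuiting is a single `break`; `c` is never looked up.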

ajoino · June 5, 2023, 1:14pm

I think the mathematical properties of the BoolOp operators `or` and `and` and of the BinOp operators `|` and `&` are identical, so there is no inherent reason to implement them one way or the other. The ast module seems to have been built by many different people (essentially every time new syntax is added, IIRC) over a long time, so inconsistencies are to be expected.

tusharsadhwani · June 5, 2023, 1:28pm

Not really, booleans do short circuiting.

tusharsadhwani · June 5, 2023, 1:30pm

OK, I figured it out. What @effigies said makes perfect sense, and that’s why only Compare and BoolOp have lists: they both short circuit.
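For reference, a chained comparison parses the same way: one `Compare` node with parallel `ops` and `comparators` lists. A quick sketch (the variable name `chain` is illustrative):

```python
import ast

chain = ast.parse("a < b < c", mode="eval").body

print(type(chain).__name__)                    # Compare
print(chain.left.id)                           # a
print([type(op).__name__ for op in chain.ops]) # ['Lt', 'Lt']
print([n.id for n in chain.comparators])       # ['b', 'c']
```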

ajoino · June 5, 2023, 1:30pm

I’d still say it’s the same property, if you replace 1 and 0 with “any truthy” and “any falsy”.

storchaka · June 5, 2023, 4:19pm

x = a and b and c

and

x = (a and b) and c

produce different bytecodes. The truth value of a is always evaluated once in the former case, but can be evaluated two times in the latter case.

Boolean expressions in boolean context are optimized. Both

if a and b and c: x

and

if (a and b) and c: x

produce the same bytecode. The truth value of a is always evaluated once.

This is an implementation detail, and your code should not rely on it.
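One way to observe this on your own interpreter is to count `__bool__` calls on a falsy operand. This is a sketch only: `Probe` and `count_calls` are hypothetical helpers, and the counts for the parenthesized forms depend on the CPython version, so no fixed output is shown for them.

```python
class Probe:
    """Falsy object that counts how often its truth value is tested."""
    def __init__(self):
        self.calls = 0

    def __bool__(self):
        self.calls += 1
        return False

def count_calls(source):
    """Run `source` with a fresh falsy `a` and report how often it was tested."""
    a = Probe()
    namespace = {"a": a, "b": True, "c": True, "x": None}
    exec(source, namespace)
    return a.calls

flat_assign = count_calls("x = a and b and c")
nested_assign = count_calls("x = (a and b) and c")
flat_if = count_calls("if a and b and c: x = 1")
nested_if = count_calls("if (a and b) and c: x = 1")

print("assignment:", flat_assign, nested_assign)
print("if statement:", flat_if, nested_if)
```

If the nested assignment reports 2 where the flat one reports 1, you are seeing the double evaluation described above.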

Other than this, the difference between a multi-argument BoolOp and nested two-argument BoolOps is that the former is processed by iteration and the latter by recursion. Recursion depth is limited (otherwise it could overflow the stack), so a nested representation would limit the number of sequential `and`s and `or`s, the same way the number of sequential arithmetic operators is currently limited.
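The iteration versus recursion point can be seen in the shape of the trees. A small sketch with a deliberately short chain, so that no recursion limit comes into play; `N` and `left_depth` are names chosen for this example.

```python
import ast

N = 50  # kept small on purpose

bool_chain = ast.parse(" and ".join(["a"] * N), mode="eval").body
arith_chain = ast.parse(" + ".join(["a"] * N), mode="eval").body

def left_depth(node):
    """Count how many BinOp nodes are nested along the left spine."""
    depth = 0
    while isinstance(node, ast.BinOp):
        depth += 1
        node = node.left
    return depth

print(type(bool_chain).__name__, len(bool_chain.values))  # BoolOp 50
print(left_depth(arith_chain))                            # 49
```

The `and` chain stays one level deep no matter how long it gets, while the `+` chain grows one level per operator.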

Rosuav · June 5, 2023, 4:32pm

Serhiy Storchaka:

x = a and b and c
and
x = (a and b) and c
produce different bytecodes. The truth value of a is always evaluated once in the former case, but can be evaluated two times in the latter case.

Hmm. I’m looking at my current build of 3.12, and dis.dis() is showing the same values. At what point does this get optimized?

>>> dis.dis('x = a and b and c')
  0           0 RESUME                   0

  1           2 LOAD_NAME                0 (a)
              4 JUMP_IF_FALSE_OR_POP     3 (to 12)
              6 LOAD_NAME                1 (b)
              8 JUMP_IF_FALSE_OR_POP     1 (to 12)
             10 LOAD_NAME                2 (c)
        >>   12 STORE_NAME               3 (x)
             14 RETURN_CONST             0 (None)
>>> dis.dis('x = (a and b) and c')
  0           0 RESUME                   0

  1           2 LOAD_NAME                0 (a)
              4 JUMP_IF_FALSE_OR_POP     3 (to 12)
              6 LOAD_NAME                1 (b)
              8 JUMP_IF_FALSE_OR_POP     1 (to 12)
             10 LOAD_NAME                2 (c)
        >>   12 STORE_NAME               3 (x)
             14 RETURN_CONST             0 (None)

It’s also possible that something’s changed since March when I last built from source, or that my Python installation is broken.
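A quick way to check a given build without reading two listings by eye is to compare the instruction streams programmatically. This is a sketch using `dis.get_instructions`; `summarize` is a hypothetical helper, and the result depends on the interpreter version, so no expected output is given.

```python
import dis
import sys

def summarize(source):
    """Return (opname, argval) pairs; argval carries jump targets for jumps."""
    return [(ins.opname, ins.argval) for ins in dis.get_instructions(source)]

flat = summarize("x = a and b and c")
nested = summarize("x = (a and b) and c")

print(sys.version.split()[0])
print("identical bytecode:", flat == nested)
print("instruction counts:", len(flat), len(nested))
```

Comparing `argval` as well as `opname` matters here, because the two versions can use the same opcodes and differ only in where a jump lands.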

storchaka · June 5, 2023, 6:36pm

Current main and 3.12.

$ echo 'x = a and b and c' | ./python -m dis
  0           0 RESUME                   0

  1           2 LOAD_NAME                0 (a)
              4 COPY                     1
              6 POP_JUMP_IF_FALSE        6 (to 20)
              8 POP_TOP
             10 LOAD_NAME                1 (b)
             12 COPY                     1
             14 POP_JUMP_IF_FALSE        2 (to 20)
             16 POP_TOP
             18 LOAD_NAME                2 (c)
        >>   20 STORE_NAME               3 (x)
             22 RETURN_CONST             0 (None)

$ echo 'x = (a and b) and c' | ./python -m dis
  0           0 RESUME                   0

  1           2 LOAD_NAME                0 (a)
              4 COPY                     1
              6 POP_JUMP_IF_FALSE        2 (to 12)
              8 POP_TOP
             10 LOAD_NAME                1 (b)
        >>   12 COPY                     1
             14 POP_JUMP_IF_FALSE        2 (to 20)
             16 POP_TOP
             18 LOAD_NAME                2 (c)
        >>   20 STORE_NAME               3 (x)
             22 RETURN_CONST             0 (None)

Furthermore, the new bytecode is larger and slower. It is a regression.

sunmy2019 · June 6, 2023, 5:17am

github.com/faster-cpython/ideas

Remove `JUMP_IF_FALSE_OR_POP` and `JUMP_IF_TRUE_OR_POP`

opened 12:09PM - 20 Mar 23 UTC

closed 07:51AM - 06 Apr 23 UTC

markshannon

These instructions are used for boolean operations, either explicit `a or b` or implicit `a < b < c`.

We should remove these instructions for a few reasons:

* These two bytecodes represent only 0.2% of all instructions.
* They add quite a lot of complexity to the compiler
* PEP 669 requires an instrumented version of all jumps, so removing them will remove 4 instructions, not just 2.
* They can be trivially replaced: `JUMP_IF_FALSE_OR_POP` becomes `COPY 1; POP_JUMP_IF_FALSE; POP_TOP`.

We can avoid generating much, if any, extra code by better peephole optimization, or more sophisticated code generation.

### Possible optimizations

#### Code generation

If there are no walrus expressions (`x := ...`) in the statement, then `LOAD_FAST` instructions can be freely re-ordered and duplicated. `a < b < c` can be implemented as `(a < b) or (b < c)` without the need to store `b` on the stack.

Currently we generate the sequence:

```
LOAD_FAST 0 (a)
LOAD_FAST 1 (b)
SWAP 2
COPY 2
COMPARE (<)
POP_JUMP_IF_FALSE
LOAD_FAST 2 (c)
COMPARE (<)
POP_JUMP_IF_FALSE
```

We could generate:

```
LOAD_FAST 0 (a)
LOAD_FAST 1 (b)
COMPARE (<)
POP_JUMP_IF_FALSE
LOAD_FAST 1 (b)
LOAD_FAST 2 (c)
COMPARE (<)
POP_JUMP_IF_FALSE
```

#### CFG optimization

The sequence:

```
LOAD_FAST 0 (a)
LOAD_FAST 1 (b)
SWAP 2
```

can be changed to:

```
LOAD_FAST 1 (b)
LOAD_FAST 0 (a)
```

The sequence:

```
LOAD_FAST a
COPY 1
POP_JUMP_IF_FALSE label
POP_TOP
...
label:
...
```

can be changed to:

```
LOAD_FAST a
POP_JUMP_IF_FALSE label
...
label:
LOAD_FAST a
...
```

(Provided that `label` has only one predecessor)
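To make the replacement in the issue concrete, here is a toy stack model of the old instruction and the three-instruction sequence that replaces it. It is an illustration only, not CPython's implementation; both function names are invented, and each returns whether the jump would be taken.

```python
def jump_if_false_or_pop(stack):
    """Old single instruction: jump and keep the value if falsy, else pop it."""
    if not stack[-1]:
        return True
    stack.pop()
    return False

def copy_jump_pop(stack):
    """Replacement: COPY 1; POP_JUMP_IF_FALSE; POP_TOP."""
    stack.append(stack[-1])  # COPY 1: duplicate the top of the stack
    if not stack.pop():      # POP_JUMP_IF_FALSE: consume the copy and test it
        return True          # jump taken, the original value is still on the stack
    stack.pop()              # POP_TOP: fall through and discard the original
    return False

for value in (0, "", None, 1, "x", [0]):
    old_stack, new_stack = [value], [value]
    assert jump_if_false_or_pop(old_stack) == copy_jump_pop(new_stack)
    assert old_stack == new_stack
print("both forms leave the same stack and take the same branch")
```

In this model the two forms agree on every input; the cost is only the extra instructions executed.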

Rosuav · June 6, 2023, 5:55am

Ah, thanks Steven, that explains it. Guess it’s time for me to build a new Python!
