Revisiting PEP 505 – None-aware operators

This is certainly true for attribute access.

But there are already short-circuit operators in Python which do not obey this property. Feel free to read the lengthy discussion above if you haven’t already, but one point brought up was that breaking up

e = (a and b) and c

into

t1 = a and b
e = t1 and c

has different semantics - a.__bool__() is called twice in the second form, but only once in the first (depending on the Python version). The same is true for or.
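A minimal sketch of how this can be observed. The Probe class and both function names are made up for illustration. The count for the single expression is deliberately not pinned down, since it depends on the CPython version.

```python
class Probe:
    """Falsy object that counts how often its truth value is requested."""

    def __init__(self):
        self.bool_calls = 0

    def __bool__(self):
        self.bool_calls += 1
        return False


def single_expression():
    a = Probe()
    e = (a and 1) and 2
    return a.bool_calls  # 1 or 2, depending on the CPython version


def split_expression():
    a = Probe()
    t1 = a and 1  # bool(a) is requested here ...
    e = t1 and 2  # ... and again here, because t1 is the same object as a
    return a.bool_calls
```

In the split form the second request is unavoidable: t1 is bound to a itself, so the second and has to ask for its truth value again.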

This gives precedent for making ?. a short-circuit operator - you already can’t break down some Python expressions accurately because of their short-circuiting nature - so adding another one, especially when the short-circuiting is a major use-case, seems reasonable to do.

1 Like

I’ve just updated my draft so short-circuiting is limited to the end of a group. You can find the updated sections here:

The implementation is also updated now: https://pep505-demo.pages.dev/

1 Like

I still feel that what’s happening for and/or is an optimization or a quality-of-implementation issue, as opposed to the semantics being proposed for ‘a?.b.c’ and ‘(a?.b).c’ . The ‘?’ semantics does not depend on a normally read-only operation being able to log a side-effect: It is the difference between raising an AttributeError and returning None.

4 Likes

I’m not convinced (yet) that using such a different AST for None-aware access operators makes sense. Yes, it would simplify the short-circuiting, at what cost though?

In addition to the NoneAwareAttribute and NoneAwareSubscript nodes, we’d likely need to add at least three more AttributeTail, SubscriptTail and CallTail. If I didn’t miss something, the expression a.b?.c[0].func() would then be parsed roughly like this:

ast.NoneAwareAttribute(
    base=ast.Attribute(
        expr=ast.Name(identifier="a"),
        identifier="b"
    ),
    tail=[
        ast.AttributeTail(identifier="c"),
        ast.SubscriptTail(slice=ast.Constant(value=0)),
        ast.AttributeTail(identifier="func"),
        ast.CallTail(args=[], keywords=[])
    ]
)

Reusing parts of the Attribute, Subscript and Call nodes might be challenging as one is left-recursive whereas the *Tail variants are right-recursive. This will not only affect CPython itself for the bytecode generation where we might need to duplicate or refactor code but also every other consumer of the AST. It might also be at least a bit surprising that a normal attribute access (.) can be parsed as two separate nodes depending on the context.
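For reference, the left-recursive shape of the existing nodes can be inspected with the current ast module. This is only a sketch: access_chain is a hypothetical helper, and it looks at today's Attribute, Subscript and Call nodes, not at any of the proposed None-aware ones.

```python
import ast


def access_chain(source):
    """Return the node type names of an access chain, outermost node first."""
    node = ast.parse(source, mode="eval").body
    chain = []
    while True:
        chain.append(type(node).__name__)
        if isinstance(node, ast.Call):
            node = node.func  # the callee is nested inside the Call
        elif isinstance(node, (ast.Attribute, ast.Subscript)):
            node = node.value  # the base is nested inside the access
        else:
            return chain


print(access_chain("a.b.c[0].func()"))
```

The last operation in the source (the call) ends up as the outermost node and the name a as the innermost one, which is the opposite nesting direction from a tail list.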

What I like about just adding primary '?' '.' NAME and primary '?' '[' slices ']' to the primary rule is that it shows these can be used almost interchangeably with the other access operators. Yes, consumers will need to adjust their Attribute, Subscript and Call node handling to add support for short-circuiting, but IMO that’s a fairly minor ask compared to the alternative mentioned above.


In comparison, adding a group attribute to all expression nodes is a bit tedious but it does make a surprising amount of sense actually. In a way it’s similar to the other location attributes. It also helps that a group will only ever have exactly one topmost expression node.

My implementation can be found here: Comparing python:main...cdce8p:syntax-none-aware-access-operators · python/cpython · GitHub
In particular the last commit: Add group attribute for expressions · python/cpython@fb46569 · GitHub

And you didn’t try adding flags to the attribute and subscript node?

Being amenable to decomposition is a desirable quality, but not a universal or guaranteed one. Your example is wrong - it draws on a bug in a specific CPython version - but nonetheless I will concede that it is certainly possible to compromise on decomposability, if something sufficiently valuable is gained in return.

I’m not sure that it is. ?? and ??= are straightforward, normal binary operators. ?. is subtle and hard to explain.

3 Likes

The behavior is sufficiently different that I think having separate nodes for the None-aware access operators makes more sense.

Feel free to compare these if you like
Python/codegen.c → codegen_visit_expr (for Attribute_kind)
Python/codegen.c → codegen_none_aware_attribute

1 Like

If you prefer, you could only use a shortCircuiting flag:

Attribute(
  value=NoneAwareAttribute(
    value=Name(id='a'),
    attr='b',
    shortCircuiting=1),
  attr='c',
  shortCircuiting=1) # 0 for (a?.b).c

The new grouping section says that:
(a?.b).c
is equivalent to
_t = a?.b
_t.c
Which makes perfect sense.

But the short circuiting section still says that
a?.b.c is not equivalent to
_t = a?.b
_t.c

Consequently,
a?.b.c is not equivalent to (a?.b).c which seems confusing.
No other operators change their meaning in parentheses like this.

2 Likes

>>> -7 < -6 < -5
True
>>> (-7 < -6) < -5
False

10 Likes

Good point. The implicit chaining of comparisons is a bit different, as the middle expression is duplicated.
Once you’ve expanded -7 < -6 < -5 to (-7 < -6) and (-6 < -5) it is much clearer what is going on and parentheses can be added, or removed, without changing the meaning.
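A small sketch to check that expansion in today's Python. The three function names are only illustrative.

```python
def chained(a, b, c):
    return a < b < c


def expanded(a, b, c):
    # What the chained form means (b is only evaluated once in the real thing).
    return (a < b) and (b < c)


def grouped(a, b, c):
    # The parentheses turn this into a comparison between a bool and c.
    return (a < b) < c


print(chained(-7, -6, -5), expanded(-7, -6, -5), grouped(-7, -6, -5))
```

With -7, -6, -5 the first two agree, while the grouped form ends up comparing True with -5.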

Is there an expansion of chained ?. and . operators that is robust to having parentheses added or removed? If there were it would make the semantics a lot clearer and easier to reason about.

The transformation I keep in mind when thinking about ?. is this

base?.tail

becomes

base.tail if (base is not None) else None

In practice the lookup of base is cached so you’ll need to add a temporary variable but that can make it difficult to see what’s actually going on. Especially for more complex cases.

(_t.tail) if ((_t := base) is not None) else None

While base can be replaced with any number of expressions, including groups, tail is limited to attribute access, subscript, their none-aware variants and calls. That’s due to the left-recursive grammar in the primary rule and similar to base.tail.
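To make the caching point concrete, here is a sketch in current Python. CountingFactory and both function names are hypothetical. base is modelled as a callable so that each evaluation can be counted.

```python
from types import SimpleNamespace


class CountingFactory:
    """Stands in for the base expression and counts its evaluations."""

    def __init__(self, value):
        self.value = value
        self.evaluations = 0

    def __call__(self):
        self.evaluations += 1
        return self.value


def naive_expansion(base):
    # base.tail if (base is not None) else None
    return base().tail if (base() is not None) else None


def cached_expansion(base):
    # (_t.tail) if ((_t := base) is not None) else None
    return (_t.tail) if ((_t := base()) is not None) else None


naive = CountingFactory(SimpleNamespace(tail=42))
cached = CountingFactory(SimpleNamespace(tail=42))
print(naive_expansion(naive), naive.evaluations)
print(cached_expansion(cached), cached.evaluations)
```

Both give the same result, but only the variant with the temporary evaluates base exactly once.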

Does this help?

7 Likes

With the concrete example of a?.b.c vs (a?.b).c, the expansions work out to:

  • a?.b.c => _t.b.c if (_t := a) is not None else None
  • (a?.b).c => (_t.b if (_t := a) is not None else None).c

Similarly to how a < b < c and (a < b) < c expand out differently to:

  • a < b < c => (a < b) and (b < c)
  • (a < b) < c => (_t := (a < b)) < c

In both cases, the version with the parentheses is probably not what the person writing it actually wanted, but that doesn’t mean it’s invalid code.
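The two ?. expansions can be tried out in current Python. This is a sketch: the function names are made up, and SimpleNamespace just stands in for arbitrary objects.

```python
from types import SimpleNamespace


def short_circuit(a):
    # a?.b.c
    return _t.b.c if (_t := a) is not None else None


def grouped(a):
    # (a?.b).c
    return (_t.b if (_t := a) is not None else None).c


obj = SimpleNamespace(b=SimpleNamespace(c=1))
print(short_circuit(obj), grouped(obj))
print(short_circuit(None))
try:
    grouped(None)
except AttributeError as exc:
    print("grouped(None) raised:", exc)
```

For a non-None a both forms agree. For a = None the first returns None, while the grouped one evaluates None.c and raises.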

4 Likes

Another discussion point the last few days was if the short-circuiting behavior should be removed. I’ve added a new section, “Short circuiting is difficult to understand”, to my draft to (a) capture this argument and (b) point out why I think the short-circuiting behavior as a whole should not be removed from the draft.

To reiterate the arguments here

  1. A lot of the recent discussion has centered around corner cases in the short-circuiting behavior. They are important to get right, for sure (and I believe we have done that now) but seeing all these posts here can make the behavior seem more complicated than it actually is in practice.
    Even if you know nothing about ?. and ?[ ], all you really need to know is that if the LHS / base subexpression evaluates to None, the RHS / tail is skipped and the result is None. If it is any other value, a “normal” attribute access or subscript lookup is done. In the majority of cases this should be more than enough information to understand how the operators will work. See also the “How to Teach This” section in the draft.
  2. On a technical level, removing short-circuiting means each subsequent attribute access or subscript would need to be changed to its None-aware variant. Similar to the maybe a.b proposal, this will hide potential error cases if a subsequent non-optional attribute suddenly starts to return None as well. Having short-circuiting allows the developer to be more explicit. More details can be found in the “Remove short-circuiting” section in rejected ideas.
  3. The short-circuiting behavior as it’s specified (edited from “implemented”) right now is identical to that of other major languages like JS, TS and C#.
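A sketch of point 2 in current Python, with hypothetical function names. It assumes a may legitimately be None, while b is expected to always be set.

```python
from types import SimpleNamespace


def with_short_circuit(a):
    # a?.b.c: only a is treated as optional
    return _t.b.c if (_t := a) is not None else None


def all_none_aware(a):
    # a?.b?.c: what would have to be written without short-circuiting
    if (_t := a) is None:
        return None
    if (_u := _t.b) is None:
        return None
    return _u.c


broken = SimpleNamespace(b=None)  # b was never supposed to be None
print(all_none_aware(broken))  # the unexpected None is silently passed on
try:
    with_short_circuit(broken)
except AttributeError as exc:
    print("with_short_circuit raised:", exc)
```

With short-circuiting the unexpected None in b surfaces as an AttributeError. The all-None-aware spelling returns None, which cannot be told apart from a being None.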
14 Likes

I hope you meant specified. Implementation should follow specification.

1 Like

Yes, thanks for pointing it out! Though the implementation / demo is also updated to follow the spec now. https://pep505-demo.pages.dev/

5 Likes

Looks like the discussion around the None-aware access operators has quieted down, at least for the moment. I took some time to write another draft, this time for the coalescing operators, ?? and ??=.

The draft PR on my fork: PEP XXX: Coalescing operators [DRAFT] by cdce8p · Pull Request #3 · cdce8p/python-peps · GitHub

Changes compared to PEP 505

I decided to change the precedence of ?? to match other implementations in C# and JS. For more details, please read “Precedence” in the specification and “Add ?? as a binary operator” in the rejected ideas section.

(Minor) open questions

AST node for ??

From the implementation side of things, ?? shares a lot with the boolean operators and and or. So much so that there isn’t a strict need to add a separate CoalesceOp node. A simpler alternative would be to modify boolop = And | Or and just add on a third Coalesce option. At the moment I’m thinking that it might be confusing for some if ?? is a BoolOp, so I kept the separate node. Though, I could be convinced to change it.
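To illustrate the similarity with or, here is a rough emulation of ?? in current Python. coalesce is a hypothetical helper, not part of the draft. The right-hand side is passed as a zero-argument callable so that it stays lazy, like the right operand of and / or.

```python
def coalesce(left, right):
    """Sketch of `left ?? right()`: only None triggers the fallback."""
    return left if left is not None else right()


print(coalesce(0, lambda: 5))  # 0 is kept, because it is not None
print(0 or 5)  # or falls back for every falsy value
print(coalesce(None, lambda: 5))
print(coalesce("", lambda: 1 / 0))  # the right-hand side is never evaluated
```

Like or, the result is one of the operands and the right one is only evaluated when needed. The difference is the test: is not None instead of truthiness.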

PEP 505 suggested adding it as a BinOp, which doesn’t make sense to me. See the rejected idea section I linked to earlier.

AST node for ??=

PEP 505 also suggested that it should be an AugAssign. However, so far those are only used for binary operators, so this could be confusing. Furthermore, AugAssign statements always evaluate the left and right hand side, without any short-circuiting. So while the implementation could reuse major parts, I chose to add a new CoalesceAssign instead. If there are strong opinions that it should be an AugAssign, I could change that too.
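The difference in evaluation can be sketched like this. All names are hypothetical, and the if statement is only an approximation of what x ??= rhs would do.

```python
class CountingValue:
    """Callable right-hand side that counts how often it is evaluated."""

    def __init__(self, value):
        self.value = value
        self.calls = 0

    def __call__(self):
        self.calls += 1
        return self.value


def coalesce_assign(x, rhs):
    # x ??= rhs()
    if x is None:
        x = rhs()
    return x


def plus_assign(x, rhs):
    # Existing augmented assignment: the right-hand side is always evaluated.
    x += rhs()
    return x


rhs = CountingValue(10)
print(coalesce_assign(5, rhs), rhs.calls)  # right-hand side skipped
print(coalesce_assign(None, rhs), rhs.calls)  # right-hand side evaluated
print(plus_assign(5, rhs), rhs.calls)
```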

Implementation + demo

I’ve updated the demo to match the specification in the draft.

Summary

Both drafts together should cover the contents of PEP 505. While there is certainly a synergy between both draft proposals, from a technical standpoint each addresses a different issue, has its own motivation and could theoretically be implemented independently of one another. Though of course both share the common goal of making it easier to work with None / optional values.

8 Likes

Could you please open a separate thread for this? This one is already 400 posts long and very unwieldy.

1 Like

It’s still on topic though. I’d like to avoid spreading the discussion even further for the moment. Once these drafts are officially added to the PEPs repo, I’ll open new and separate discussion threads for each one.

Still looking for a PEP sponsor at the moment though.

1 Like

I think I can sponsor both. IIRC I withdrew as a sponsor when you announced going just for ‘?.’ That’s addressed now.

16 Likes