I’m not directing this comment at anyone in particular. I’m just noting that I raised the issue of not being able to short-circuit the entire chain, or the entire line, many posts ago, and others did as well.
The above comment may have been interpreted as being about typing, but it was actually about None not having a do_something attribute.
Okay, that’s a deviation from the spec, and the implementation needs to be fixed. (Though not urgently, since there is not much point in writing that.)
I’m currently working on improving my draft, just wanted to respond quickly to it. Where exactly is it a deviation from the spec? I explicitly called out Trivial groups being equal here: PEP 999 – None-aware access operators | peps.python.org
Unless we completely change how groups work in the parser, I’m not sure there is another way. If the AST is equal, the bytecode has to be too.
That means we can’t support the explicit scoping of the ?. operator present in JS / TS, but as I pointed out earlier, that might be a fair limitation. The much more useful cases for groups, like (a.b?.c ?? d).e?.func(), will still work fine.
Oh, it’s worse. You made this part of the spec. Sorry, that has to change. I’m about to disappear for a day, but I know how to change the grammar to make this work.
If there is a good way, I’m more than happy to change it. I just put it to the side as I’m not sure there is actually a good use case for it which can’t be achieved by directly using try/except.
Regardless of the usefulness, it just doesn’t make sense if the effect of ? can escape parentheses. (And I happen to agree with @beauxq about this issue.)
This grammar makes sense to me:
primary:
    atom primary_tail*
primary_tail:
    | '.' NAME
    | &'(' genexp
    | '(' [arguments] ')'
    | '[' slices ']'
    | '?' '.' ~ NAME
    | '?' '[' slices ']'
Bytecode generation should have a helper that just loops over the primary_tail elements, and for ? elements, the jump-if-None should go to the very end and return None.
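A minimal sketch of that helper idea, written as a plain-Python evaluator rather than real bytecode generation. The tuple encoding of the tail elements and the name `eval_primary` are assumptions made up for illustration; nothing here is taken from the actual compiler.

```python
from types import SimpleNamespace

# Hypothetical encoding of primary_tail elements, for illustration only:
#   ("attr", name), ("item", key), ("call", args), ("?attr", name), ("?item", key)

def eval_primary(value, tails):
    """Evaluate an atom's value followed by a flat list of tail elements."""
    for kind, arg in tails:
        if kind.startswith("?"):
            if value is None:
                # The "jump-if-None" goes to the very end of this primary.
                return None
            kind = kind[1:]
        if kind == "attr":
            value = getattr(value, arg)
        elif kind == "item":
            value = value[arg]
        elif kind == "call":
            value = value(*arg)
        else:
            raise ValueError(f"unknown tail kind: {kind!r}")
    return value

a = SimpleNamespace(b=SimpleNamespace(c=[10, 20]))
chain = [("?attr", "b"), ("attr", "c"), ("item", 0)]  # models a?.b.c[0]
print(eval_primary(a, chain))     # 10
print(eval_primary(None, chain))  # None
```

In this model a parenthesized `(a?.b)` would be its own atom, so `(a?.b).c` corresponds to `eval_primary(eval_primary(a, [("?attr", "b")]), [("attr", "c")])`, and the jump cannot escape the parentheses.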
You could also refactor it like this:
primary:
    | atom &('.' | '(' | '[' | '?') primary_tail_items
    | atom
primary_tail_items: primary_tail+
primary_tail: # As above
I’m trying to understand your proposal. Unless I’m missing something, my understanding is that the parser first generates the AST which is then converted to a CST / bytecode. Are you suggesting that we change the AST for the primary rule? Currently that is left-recursive, so primary_tail_items might not really work. For a?.b.c this is the current AST
That would be a huge change, so not sure I got it right.
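For reference, the left-recursive nesting can be inspected in today's Python with the `ast` module. This sketch only covers plain `a.b.c`, since `?.` does not parse in released versions; it is not the AST of the draft implementation.

```python
import ast

tree = ast.parse("a.b.c", mode="eval")

outer = tree.body    # the last access, `.c`, is the outermost node
inner = outer.value  # `.b` is nested inside it
base = inner.value   # the atom `a` sits at the bottom

print(type(outer).__name__, outer.attr)  # Attribute c
print(type(inner).__name__, inner.attr)  # Attribute b
print(type(base).__name__, base.id)      # Name a

# Full dump; the exact formatting varies between Python versions.
print(ast.dump(tree.body))
```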
–
If we do change the AST, we might as well just introduce a new Group expression node instead. Though all that’s actually needed would be some way to keep track of the group level for each expression.
–
Given these options, I’m not sure the change is actually worth it. Maybe it helps to reframe the issue. It isn’t so much about escaping as a group is just something to help during parsing. If it can be optimized, it will be, e.g. (a.b).c == a.b.c and (a)?.b == a?.b. This should certainly be mentioned in the PEP, but it isn’t a dealbreaker IMO.
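The claim that groups disappear after parsing can be checked in current Python. This is a small sketch using only the standard `ast` module; `same_ast` is a helper name chosen here, and the `?.` case cannot be tested because it is not valid syntax yet.

```python
import ast

def same_ast(left, right):
    """Compare two expressions by their dumped AST (positions are ignored)."""
    dump_left = ast.dump(ast.parse(left, mode="eval"))
    dump_right = ast.dump(ast.parse(right, mode="eval"))
    return dump_left == dump_right

print(same_ast("(a.b).c", "a.b.c"))          # True: the group leaves no trace
print(same_ast("(a + b) - c", "a + b - c"))  # True: same nesting either way
print(same_ast("a - (b + c)", "a - b + c"))  # False: the group changed the nesting
```

Parentheses only influence how nodes nest; there is no node that records the group itself.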
Yes, I am proposing a change to the AST, even when no ‘?’ is present. Do you think that is a show-stopper? We make no guarantees about AST stability across feature releases IIUC. If it is, introducing a group might be a good alternative, if we can make sure only to add a group around an atom containing a parenthesized primary containing a ‘?’.
Yes it will, absolutely! You’re right that we don’t make any stability guarantees, however in practice the AST has more or less been stable for a lot of versions, aside from new additions. Just as a point of comparison, ast.Str and ast.Bytes were removed just in 3.14 while the deprecation happened all the way back in 3.8. Yes, the ast.Constant node was first used in 3.9 I believe, the point still stands though.
Changing how the primary rule is translated will be a major breaking change for basically anyone who is using the AST. The obvious examples include type checkers and linters like mypy, flake8, pylint and most certainly others as well.
Even just adding an additional Group node would still be a major breaking change though most consumers could just add it as a no-op.
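A rough sketch of what "add it as a no-op" could look like for an AST consumer. The `Group` class below is purely hypothetical and defined locally for illustration; no such node exists in the `ast` module, and the real shape of any wrapper node is undecided.

```python
import ast

class Group(ast.expr):
    """Hypothetical wrapper node; NOT part of today's ast module."""
    _fields = ("value",)

class StripGroups(ast.NodeTransformer):
    """Consumer-side no-op: replace every Group by the expression it wraps."""
    def visit_Group(self, node):
        return self.visit(node.value)

# Build the tree for `(a.b).c` by hand, with the group made explicit.
inner = ast.parse("a.b", mode="eval").body
wrapped = ast.Attribute(value=Group(value=inner), attr="c", ctx=ast.Load())

stripped = StripGroups().visit(wrapped)
plain = ast.parse("a.b.c", mode="eval").body
print(ast.dump(stripped) == ast.dump(plain))  # True
```

Tools that care about the grouping would handle the node instead of stripping it; tools that do not would still need a change like this one, which is the breaking part.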
Not sure. Just wrapping the expressions at the group grammar level would require us knowing that it contains a ?. in the first place.
group[expr_ty]:
    | '(' a=(yield_expr | named_expression) ')' { a }
Maybe it would work if we duplicated all rules, i.e. yield_expr and yield_expr_without_none_aware. Otherwise we would have to set a flag somehow.
However, even if it can be done elegantly, this would still be unusual at the very least. Why should a Group node be generated for (a?.b).c and not (a.b).c or (a + b) - c ?
Because the semantics in case (1) are different, while for case (2) and (3) there is no semantic difference.
I think we can generate some kind of wrapper node for a parenthesized ‘?.’ operator. It may not be pretty given that the AST for expressions without ’?’ must not change, but it can and should be done.
Type checkers should be able to infer that ‘(a.b?.c.d).repr()’ has type ‘str’ while ‘a.b?.c.d.repr()’ is ‘str|None’.
(a.b?.c.d).repr() just screams that it will raise an error at some point. Any reasonable case should use a fallback, either (a.b?.c.d or e).repr() or better (a.b?.c.d ?? e).repr().
If I’d see it during a code review, I’d almost certainly want it changed. If that’s really what the developer intends, they can still use assignment expressions or just split it up into two statements. Yes, I’m repeating myself here (and this will be my last comment for it) but I still fail to see the point in adding support for it.
I just updated my draft. In particular, I rewrote the Short-circuiting and Grouping sections in the spec to make it clearer what’s happening. (The actual behavior didn’t change though.)
Copy of the updates (refer to the draft for the up-to-date version)
Specification
=============
Short-circuiting
****************
If the left hand side for ``?.`` or ``?[ ]`` evaluates to ``None``, the
remaining expression is skipped and the result will be set to ``None``
instead. The ``AttributeError`` for accessing a member of ``None`` or
``TypeError`` for trying to subscript ``None`` are omitted. It is
therefore not necessary to change ``.`` or ``[ ]`` on the right hand side
just because a ``?.`` or ``?[ ]`` is used prior.
::
    >>> a = None
    >>> print(a?.b.c[0].some_function())
    None
The ``None``-aware access operators will only short-circuit expressions
containing name, attribute access, subscript, their ``None``-aware
counterparts and call expressions. As a rule of thumb, short-circuiting
is broken once a (soft-) keyword is reached.
::
    >>> a = None
    >>> print(a?.b.c)
    None
    >>> print(a?.b.c or "Hello")
    Hello
    >>> 2 in a?.b.c
    Traceback (most recent call last):
      File "<python-input>", line 1, in <module>
        2 in a?.b.c
    TypeError: argument of type 'NoneType' is not a container or iterable
    >>> 2 in (a?.b.c or ())
    False
Grouping
********
Grouping is an implicit property of the `Short-circuiting`_ behavior.
If a group contains a non short-circuiting expression, i.e. one that
is not either a name, attribute access, subscript, their ``None``-aware
counterparts or a call expression, the short-circuiting chain will be
broken. The rule of thumb still applies: short-circuiting is broken once
a (soft-) keyword is reached.
In the example below the group contains a ``BoolOp`` (``or``) expression
which breaks the short-circuiting chain into two: ``a.b?.c`` inside
the group which is evaluated first and ``(...).e?.func()`` on the
outside.
::
    (a.b?.c or d).e?.func()
    # a.b?.c
    _t2 = _t1.c if ((_t1 := a.b) is not None) else None
    # (... or d)
    _t3 = _t2 if _t2 else d
    # (...).e?.func()
    _t4.func() if ((_t4 := _t3.e) is not None) else None
In contrast, the example below only consists of a name, one attribute
access and two ``None``-aware attribute access expressions. As such the
grouping does not break the short-circuiting chain. The brackets can
safely be removed.
::
    # Trivial groups
    (a?.b).c?.d == a?.b.c?.d
Rejected Ideas
==============
Remove short-circuiting
-----------------------
It was suggested to remove the `Short-circuiting`_ behavior completely
because it might be too difficult to understand. Developers should
instead change any subsequent attribute access or subscript to their
``None``-aware variants.
::
    # before
    a.b.optional?.c.d.e
    # after
    a.b.optional?.c?.d?.e
The idea has some of the same challenges as `Add a maybe keyword`_.
By forcing the use of ``?.`` or ``?[ ]`` for attributes which are
``not-optional``, it will be difficult to know if the ``not-optional``
attributes ``.c`` or ``.d`` suddenly started to return ``None`` as well.
The ``AttributeError`` would have been silenced.
Another issue especially for longer expressions is that **all**
subsequent attribute access and subscript operators need to be changed
as soon as just one attribute in a long chain is ``optional``. Missing
just one can instantly cause a new ``AttributeError`` or ``TypeError``.
Limit scope of short-circuiting with grouping
---------------------------------------------
Some languages like JS [#js_short_circuiting]_ and C# [#csharp]_ limit the
scope of the `Short-circuiting`_ via explicit grouping::
    a = None
    x = (a?.b).c
    #   ^^^^^^
In the example above short-circuiting would be limited to just ``a?.b``,
thus with ``a = None`` the expression would raise an ``AttributeError``
instead of setting ``x`` to ``None``.
Even though other languages have implemented it that way, this kind of
explicit grouping for short-circuiting does have its disadvantages.
The ``None``-aware access operators are explicitly designed to return
``None`` at some point. Directly limiting the scope of the
short-circuiting behavior almost guarantees that the code will raise
an ``AttributeError`` or ``TypeError`` at some point. Type checkers
would also have to raise an error for trying to access an attribute
or subscript on an ``optional`` variable again.
As such, breaking the short-circuiting chain only makes sense if a
fallback value is provided at the same time. For example::
    (a?.b.c or fallback).e.func()
If it is known that ``a`` will never be ``None`` but it is still typed
as optional, better options include adding an ``assert a is not None``
or, if it is ever proposed, a ``Not-None`` assertion operator ``a!``
(out of scope for this PEP). Developers also always have the option of
splitting the expression up again like they do today.
Hi, all, I’m glad to see some minds have changed since the original post and I’m excited that we’re making (a little bit of) forward progress on a PEP. I’ve been following the discussion passively, but wanted to chip in my own agreement with @beauxq (if I’m understanding their position correctly):
I firmly believe that a?.b.c should not short-circuit. That is, it should throw an AttributeError if a is None. I agree with the parentheses/atomicity argument. Moreover, I believe short-circuiting would add a (mostly) needless complication to the language, both in its implementation and use. Explicit is better than implicit, and I think it’s perfectly fine to have users write ?. for each sequential access. I feel we’re overestimating how many long sequences of attribute accesses are in normal code, and are therefore over-fitting an edge case.
I would be happy to help incorporate this feedback into the PEP. I’d also be happy if we separated the short-circuiting behavior into a follow-up PEP, as I don’t think there would be any reasonable backwards-compatibility concerns with adding it in a later version of the language.
To me the new operator wouldn’t be worth the trouble of a new syntax if a?.b.c is no more than syntactic sugar for (None if a is None else a.b).c.
If short-circuiting is allowed it would help save a much larger boilerplate of a whole if block. And you can still easily get your desired behavior of throwing an AttributeError when a is None by simply adding parentheses, i.e. (a ?. b).c.
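To make the two readings of a?.b.c concrete, here is a small emulation in today's Python. The function names and the `SimpleNamespace` objects are illustrative assumptions; neither function is the proposed implementation.

```python
from types import SimpleNamespace

def without_short_circuit(a):
    # a?.b.c read as plain sugar: only the `?.b` access is guarded.
    return (None if a is None else a.b).c

def with_short_circuit(a):
    # a?.b.c read with short-circuiting: the rest of the chain is skipped.
    return None if a is None else a.b.c

obj = SimpleNamespace(b=SimpleNamespace(c=1))
print(without_short_circuit(obj), with_short_circuit(obj))  # 1 1

print(with_short_circuit(None))  # None
try:
    without_short_circuit(None)
except AttributeError as exc:
    print("AttributeError:", exc)
```

Both readings agree whenever `a` is not `None`; they only differ in what happens to the trailing `.c` once `a` is `None`.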
In the preceding example, B isn’t evaluated and C() isn’t called if A is null. However, if the chained member access is interrupted, for example by parentheses as in (A?.B).C(), short-circuiting doesn’t happen.
I would expect
(a.b?.c.d).repr()
to function the same as
_v = a.b?.c.d
_v.repr()
Many languages already have the safe navigation operator implemented in this way. Please don’t diverge from their spec if possible.