Revisiting PEP 505

In JavaScript, uninitialized variables and object properties are automatically assigned undefined. That’s not the case in Python. In Python, you need to explicitly assign a variable the value None if you want to indicate the absence of a value. If you choose to use None, then you should handle it accordingly. These None values don’t just appear out of nowhere. That applies to every sentinel value.

13 Likes

I find the rules pretty clear if ? only ever “looks left”. So the null-coalescing aspect of a?.b should be completely ignorant of the fact that the attribute being looked up is b. The mental model seems pretty clear too:

Expressions a?.b, a?[b], and a?(b) are equivalent to replacing a? with a while appending if a is not None else None to the end of the expressions.
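To make the “looks left” rule concrete, here is a minimal sketch in today’s Python. The helper name none_aware_attr is made up for illustration; it only emulates a?.b under the rule described above and is not part of any proposal.

```python
from types import SimpleNamespace


def none_aware_attr(a, name):
    """Emulate `a?.<name>`: only the left operand `a` is compared to None."""
    return getattr(a, name) if a is not None else None


user = SimpleNamespace(email=None)

left_is_none = none_aware_attr(None, "email")   # short-circuits to None
attr_is_none = none_aware_attr(user, "email")   # normal lookup, value is None

try:
    none_aware_attr(user, "emial")              # misspelled attribute
except AttributeError:
    typo_raises = True                          # the typo is not hidden
else:
    typo_raises = False
```

Note that the right-hand side plays no part in the None check, so a misspelled attribute still fails loudly.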

11 Likes

It’s probably because using exceptions like this isn’t unheard of, e.g., StopIteration, etc. While it might be a shift for some to have AttributeError, IndexError, or KeyError get swallowed by some syntax, you can replace AttributeError with hasattr(), IndexError with len(), and KeyError with in, so I wouldn’t get too hung up on the “catching exceptions implicitly with syntax feels icky” part and just view it as a way to communicate intent for those of us advocating for data structure traversal instead of promoting None to a very special place that I don’t know if any other object holds in Python.
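For reference, the three “look before you leap” replacements mentioned here can be sketched as follows. The helper names are hypothetical and only illustrate the hasattr() / len() / in checks; they say nothing about how any syntax would be implemented.

```python
def attr_or_none(obj, name):
    """hasattr() check instead of catching AttributeError."""
    return getattr(obj, name) if hasattr(obj, name) else None


def item_or_none(seq, index):
    """len() check instead of catching IndexError (negative indexes allowed)."""
    return seq[index] if -len(seq) <= index < len(seq) else None


def key_or_none(mapping, key):
    """`in` check instead of catching KeyError."""
    return mapping[key] if key in mapping else None
```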

Since the original PEP author doesn’t like either idea I don’t view that as a selling point. :wink:

Sorry, I just grabbed the first nested getattr() call at the end of a long day; obviously was more tired than I realized. :sweat_smile:

3 Likes

Could you elaborate on your apprehension about “promoting None to a very special place”? This is something else that’d never occurred to me that I’m curious about. None is our only named sentinel (I think?), and it’s used everywhere we need an orthogonal value or default since it’s the only value of its type. That already seems pretty special.

4 Likes

We have Ellipsis too. It is even more special than None in that it can be referred to using two syntaxes, one global name and one non-keyword literal, but it is hardly ever used as a sentinel.

None became so prevalent probably because people were already too familiar with other languages’ null (especially Java, where types are all nullable). To think out loud, I wonder what would have happened if Python never had None; would int | Ellipsis become the norm instead?

And True and False.

Off the top of my head, I can’t think of any syntax where the type of something is so important and restricted. Almost everywhere else in the language there’s some way to coalesce to what the syntax ultimately wants. But in this instance it is very explicitly None under your proposal and nothing else; no way to convert an object to None, no way to subclass, etc.; it’s an is None check.

Sure, but that’s by convention, not by language definition. It’s much different to add dedicated syntax that says, “this operator does something in the face of None and something in all other instances”. Compare that to everything else wanting a “truthy” value or anything that’s iterable. In the vast majority of cases, Python doesn’t operate based on the type of something in such a strict manner (exceptions are probably the most obvious thing where that doesn’t hold, but even then you can subclass to control things a bit more).
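A small sketch of the contrast being drawn, with a made-up class Empty: truthiness can be customized by any object, while an is None check cannot be influenced, and NoneType cannot be subclassed.

```python
class Empty:
    """An object that opts in to being falsy via __bool__."""

    def __bool__(self):
        return False


e = Empty()
fallback = e or "default"   # `or` consults __bool__, so the fallback is used
identity = e is None        # identity check: nothing an object can do about it

try:
    class MyNone(type(None)):   # NoneType is not an acceptable base type
        pass
except TypeError:
    can_subclass_none = False
else:
    can_subclass_none = True
```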

2 Likes

I can tell you that this is historically inaccurate. I added None to the language very early on in analogy with NULL in C. I wanted everything user-visible to be “safe” in the sense that the implementation would always be able to use obj->ob_type and obj->ob_refcnt without needing a NULL check every time (or worse, trying to eliminate redundant NULL checks and getting it wrong). But I really wanted something like NULL meaning “nothing here”. I think it’s held up well for over three decades, and the new static type system does a great job with it too.

Ellipsis was a much, much later addition to the language, specifically to allow the cute syntax “…” in various places. Which conveys something like “and more of the same” – it’s not a sentinel.

I think that for ?? and ??= after all I can live with only checking for None, since that’s what the or version does too. But for ?. and ?[...] I’m still not convinced using .get(key) is just too verbose.

BTW the precedence for ?? should be lower or equal to that of or. And it should bind in the same way, so that a ?? b ?? c means (a ?? b) ?? c, i.e., the first of a, b and c that’s not None.
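As a rough illustration of “the first of a, b and c that’s not None”, here is a sketch using a hypothetical coalesce() helper. One difference from a real operator is an assumption worth flagging: a function evaluates all of its arguments eagerly, whereas ?? would presumably short-circuit like or does.

```python
def coalesce(*values):
    """Return the first value that is not None, or None if there is none."""
    for value in values:
        if value is not None:
            return value
    return None


timeout = 0

with_coalesce = coalesce(timeout, 30)   # 0 is kept: only None is skipped
with_or = timeout or 30                 # 0 is falsy, so `or` discards it

# Grouping as (a ?? b) ?? c gives the same answer as scanning left to right.
grouped = coalesce(coalesce(None, "b"), "c")
flat = coalesce(None, "b", "c")
```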

6 Likes

Thanks for starting the discussion again here @noahbkim! I’ve been looking into it as well, since I’ve seen multiple cases where ?. in particular would be useful, and I would like it to move forward.

Limiting the scope

One of the main issues I see with PEP 505 is that it attempted to add all four operators (?., ?[], ??, ??=) at once which made the discussion complex and hard to follow. Especially as they have different arguments for and against them. What became clear though is that ?[] was unpopular, so it’s the right call to drop it for now IMO.

Judging by the discussion so far, it might be worth separating ?. and ?? / ??= as suggested before.

Mental model for ?.

This came up already as well. What should the ?. operator actually do? I find the comparison to Dart the most helpful: ?. just replaces an additional if expression. These two should be equivalent:

_t.c if (_t := a.b) is not None else None
a.b?.c

Edit: An earlier version stated the equivalence between a.b.c if a.b is not None else None and a.b?.c. That missed the fact that the latter should only evaluate a.b once ideally. Updated the byte code example below as well.
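The difference between the two spellings can be observed in today’s Python with a property that counts how often it is read. The classes Counting and Inner are made up for this sketch.

```python
class Inner:
    c = 42


class Counting:
    """Counts how many times attribute `b` is read."""

    def __init__(self, b):
        self._b = b
        self.reads = 0

    @property
    def b(self):
        self.reads += 1
        return self._b


# Earlier equivalence: a.b is evaluated in the condition and again in the result.
a = Counting(Inner())
naive = a.b.c if a.b is not None else None
naive_reads = a.reads

# Updated equivalence: a.b is evaluated once and reused via the walrus operator.
a = Counting(Inner())
once = _t.c if (_t := a.b) is not None else None
once_reads = a.reads
```

Both expressions produce the same value here; only the number of reads of a.b differs (two versus one).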

An example of what the bytecode could look like:
 1             LOAD_NAME                1 (a)
               LOAD_ATTR                4 (b)
               COPY                     1
               STORE_NAME               3 (_t)
               POP_JUMP_IF_NONE        12 (to L1)
               LOAD_NAME                3 (_t)
               LOAD_ATTR                8 (c)
               JUMP_FORWARD             1 (to L2)
       L1:     LOAD_CONST               2 (None)

  --   L2:     POP_TOP

 2             LOAD_NAME                1 (a)
               LOAD_ATTR                4 (b)
               COPY                     1
               POP_JUMP_IF_NONE        10 (to L3)
               LOAD_ATTR                6 (c)

  --   L3:     POP_TOP

That also means that a.b?.c short-circuits to None if, and only if, a.b is None. Without that guarantee, it’s IMO just way too easy to misspell an attribute and always get None because of that. Relying on linters shouldn’t be the answer for that.

class A1:
    b = None
a = A1()
assert a.b?.c is None

class A2: ...
a = A2()
a.b?.c  # AttributeError: 'A2' object has no attribute 'b'

Chaining

The ?. also returns early, meaning that each subsequent attribute access / method call / subscript is skipped if one evaluated to None previously. So these two examples are also equivalent and won’t raise an AttributeError:

res = None
if a is not None:
    if a.b is not None:
        res = a.b.do_something()
return res

return a?.b?.do_something()

Await

This isn’t really any different, so I wouldn’t handle it differently.

func: Coroutine[Any, Any, int] | None = None
await func
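To spell out what happens today, here is a small asyncio sketch. The function names compute, main and unchecked are made up; the point is only that awaiting None raises TypeError, so an explicit check is needed regardless of ?. being available.

```python
import asyncio


async def compute() -> int:
    return 1


async def main(func):
    # `await` binds tighter than the conditional expression, so this reads
    # as `(await func) if func is not None else None`.
    return await func if func is not None else None


async def unchecked():
    try:
        await None          # None is not awaitable
    except TypeError:
        return "TypeError"
    return "no error"


result_none = asyncio.run(main(None))
result_value = asyncio.run(main(compute()))
outcome = asyncio.run(unchecked())
```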

Where is it useful?

PEP 505 includes some good examples, but I’d like to highlight the two IMO most relevant cases here again. PEP 505 – None-aware operators | peps.python.org

1. Dealing with some kind of outside data, most likely JSON

An example from mypy:

    def serialize(self) -> dict[str, Any]:
        # serialize class to json dict for caching
        return {
            ...
            "type_guard": self.type_guard.serialize() if self.type_guard is not None else None,
            "type_is": (self.type_is.serialize() if self.type_is is not None else None),
        }

Could be rewritten to

    def serialize(self) -> dict[str, Any]:
        return {
            ...
            "type_guard": self.type_guard?.serialize(),
            "type_is": self.type_is?.serialize(),
        }

2. Or data coming from an API with optional keys to be backwards compatible

from typing import TypedDict

class Values(TypedDict):
    date: int | None

class Keys(TypedDict):
    value: Values | None

class Data(TypedDict):
    key: Keys


def get_value(data: Data) -> int | None:
    # Option 1
    if data["key"].get("value") is not None:
        res = data["key"]["value"].get("date")  # type error: union-attr

    # Option 2 - now with :=
    if (c := data["key"].get("value")) is not None:
        res = c.get("date")

    # Option 3 - inline if
    res = c.get("date") if (c := data["key"]["value"]) is not None else None

    # Option 4 - try... except
    try:
        res = data["key"]["value"]["date"]  # type error: index
    except KeyError:
        res = None

    return res

All options would work but each has its drawbacks. 1 doesn’t work for type checkers, 4 hides potential misspellings, and 2 / 3 are difficult to get right: the brackets need to be set correctly and knowledge about := is required. There might be other options, like using get defaults intelligently, but these aren’t always suitable.

With none-aware attribute access, this would become

def get_value(data: Data) -> int | None:
    return data["key"].get("value")?.get("date")
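For comparison, a runnable version in current Python that should behave the same way as the one-liner under the “only looks left” semantics. It is a sketch that uses a plain dict[str, Any] instead of the TypedDicts above, and the name get_value_today is made up.

```python
from typing import Any, Optional


def get_value_today(data: dict[str, Any]) -> Optional[int]:
    # data["key"] must exist (KeyError otherwise); only "value" may be
    # missing or None, which is where ?. would short-circuit.
    value = data["key"].get("value")
    return value.get("date") if value is not None else None
```

With this, a missing or None "value" yields None, while a misspelled "key" still raises KeyError instead of being hidden.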

Side note on nested TypedDicts

Writing nested TypedDicts is quite cumbersome currently, as can be seen in the example. There are open discussions on how to make that easier, but for the time being many would simply type

data: dict[str, Any]

Implementation

For anyone interested, an example implementation (just) for ?. can be found here: Comparing python:main...cdce8p:none-aware-attr · python/cpython · GitHub
The byte code above (in the Mental model section) was generated using this branch.

The dedicated operator allows us to skip an additional attribute lookup which would happen after the if statement, so it’s even faster (about 40% for simple examples) than current methods, although that probably doesn’t matter much.

9 Likes

Regarding the notion that None is not that special: it is special, intentionally. Many places in the standard API use None to mean the absence of something better – not just d.get(k) but also there are literally tons of APIs that use None to indicate “nothing instead of something”. This convention is deeply ingrained in the language and intentionally so. (And it’s not in conflict with the notion that None itself is still an object – it’s better as an object, so that parts of anybody’s code that don’t care about it being special don’t have to special-case it.)

4 Likes

This equivalence would be problematic if a overrides __getattribute__ in some expensive or non-deterministic way. I would expect a.b?.c to evaluate a.b exactly once, and save the result. The equivalent code would then be (PEP 572 to the rescue :-):

__tmp.c if (__tmp := a.b) is not None else None

I also find it interesting that you start with “let’s drop ?[]” and then claim (quoting PEP 505) that important use cases are JSON and dict with optional keys.


FWIW in order to support my preferred semantics without literally catching exceptions, at least for a[k]?.c we can define this as syntactic sugar for

__tmp.c if k in a and (__tmp := a[k]) is not None else None

Similarly we can see a.b?.c as sugar for

__tmp.c if (__tmp := getattr(a, "b", None)) is not None else None

Admittedly, getattr(a, "b", None) may in some cases have no choice but to evaluate a.b catching AttributeError (same as using hasattr()), but it stands a chance of getting optimized for common cases (e.g. if __getattribute__ is not overridden, which CPython can see by checking a slot in the type object).
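To compare the two readings side by side, here is a sketch with hypothetical helpers. strict_attr follows the “only looks left” reading of a.b?.c, while forgiving_attr and forgiving_item follow the sugar described in this post, written as plain functions.

```python
from types import SimpleNamespace


def strict_attr(a, first, second):
    """`a.<first>?.<second>` where a missing `first` still raises."""
    tmp = getattr(a, first)
    return getattr(tmp, second) if tmp is not None else None


def forgiving_attr(a, first, second):
    """Same, but a missing `first` is treated like None via getattr's default."""
    return getattr(tmp, second) if (tmp := getattr(a, first, None)) is not None else None


def forgiving_item(a, k, second):
    """`a[k]?.<second>` where a missing key is treated like None."""
    return getattr(tmp, second) if k in a and (tmp := a[k]) is not None else None


a = SimpleNamespace(b=SimpleNamespace(c=1))

same_when_present = (strict_attr(a, "b", "c"), forgiving_attr(a, "b", "c"))
typo_forgiving = forgiving_attr(a, "bb", "c")    # silently None

try:
    strict_attr(a, "bb", "c")
except AttributeError:
    typo_strict = "AttributeError"               # the typo surfaces
else:
    typo_strict = "no error"
```

The two readings only diverge when the attribute or key is absent, which is exactly the case being debated.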

3 Likes

If you are unsure whether a and b are None, what makes you certain that the do_something attribute exists in object b?

That will simply raise a TypeError, whereas await a?.b(c).d?[e] may fail at runtime. Should we use a try/except block to catch TypeError here? Isn’t the whole point of the proposal to avoid using a try/except block?

I like the cleanliness, but it seems a bit too magical to me how because there is a None-aware operator ?. in the parent AST node all of a sudden a[k] should no longer raise KeyError when k is missing from a. Should the code above not be written as a?[k]?.c?

2 Likes

Good point! This is what my implementation does actually :sweat_smile: but I failed to consider it for the examples. Not sure if it makes a difference though, as much of the code in question is probably deterministic.

IMO using get works fine for now. AFAIR there was just considerable pushback to ?[], so it might be “better” to start with ?. and see how that works out before attempting to go for ?[]. There is also __getitem__ which should work, albeit not as elegantly

a?["key"]
a?.__getitem__("key")

What happens if for some reason a never has an attribute b? Wouldn’t the expression always return None? I can understand the desire for it, but continuing to use getattr seems to be the better option to me.

That fully depends on the actual code. Just because something is ... | None doesn’t mean it’s unknown. I.e. if we know it’s not None, why not call a method on it.

It’s a simplified example. func may very well be the return type of some complex method which only in rare cases returns None. Point being you’ll still need to check if it’s not None before doing the await. That doesn’t change with ?. Theoretically you could use ?. even if all attribute accesses return a non-None value. You probably shouldn’t, but we’re all adults. For me, this would be something a linter could highlight.

2 Likes

That only handles cases where a is None. Not if a is a dict but doesn’t have a k key.

To be clear. I wouldn’t expect ?[] / ?. to handle that.

2 Likes

Ah yes you’re right, but I still don’t see how there can be k in a and in @guido’s equivalent code. That is totally unexpected from the expression a[k].

I would expect a[k]?.c to be more equivalent to:

__tmp.c if (__tmp := a[k]) is not None else None

Similarly, I’d expect a.b?.c to be more equivalent to:

__tmp.c if (__tmp := a.b) is not None else None

Again I do like the cleanliness of Guido’s preferred semantics but only if the magic can be logically explained.

EDIT: It may work if there’s a forgiving version of the . operator such as -> where a->b is equivalent to getattr(a, "b", None) although it may belong to a different proposal.

4 Likes

On second thought, how about making such an operator .? instead so that a.?b is understood to be equivalent to getattr(a, "b", None) because the question mark is next to b, implying that b may be None when missing.

To further generalize the idea, we can also allow a?.?b, which would be equivalent to:

getattr(a if a is not None else None, "b", None)

1 Like

Please don’t. That will just be confusing; checking to see on which side of the . the question mark is. ?. makes sense as an operator since that’s what a couple other languages already use for none-/null-aware attribute access.

5 Likes

Yeah that doesn’t read well. How about a?->b in that case? I’m just trying to make Guido’s idea work here because the status quo of getattr(a, "b", None) for a forgiving a.b does look rather verbose.

Or maybe the ?. operator may include an implicit AttributeError catcher to make a.b?.c work when b is missing from a, though it may be hard to catch AttributeError raised only from the attribute access of b but not from somewhere further upstream or from within an overridden __getattribute__. Or maybe it’s okay to ignore any AttributeError in this context. Something like a.b.c?.d would not raise an error if either b or c is missing because ?. catches and ignores them all. But then this gives ?. too much meaning outside of None-awareness and still doesn’t solve the problem of simplifying getattr(a, "b", None) itself.

Since the purpose of PEP-505 is all about safe navigation, at this point I think it actually makes sense to make the operator ?. perform double duty, such that a?.b returns None when either a is None or when getattr(a, "b") raises AttributeError.

My main use case is to make testing an optional feature in a configuration as painless and as readable as possible:

if options?.experimental?.feature_a:
    ... # experimental feature A

as opposed to:

if getattr(getattr(options, "experimental", None), "feature_a", None):
    ...

or a try block.

Allowing ?. to do both will satisfy the use case. PEP-505 as it is right now doesn’t.
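Until something like that exists, the “double duty” behaviour can be approximated with a small helper. This is only a sketch; safe_nav is a made-up name and the configuration objects are illustrative.

```python
from types import SimpleNamespace


def safe_nav(obj, *names):
    """Follow attribute names, giving None if any step is None or missing."""
    for name in names:
        if obj is None:
            return None
        obj = getattr(obj, name, None)   # missing attribute is treated as None
    return obj


options = SimpleNamespace(experimental=SimpleNamespace(feature_a=True))
bare_options = SimpleNamespace()          # no `experimental` section at all

enabled = safe_nav(options, "experimental", "feature_a")
missing = safe_nav(bare_options, "experimental", "feature_a")
no_options = safe_nav(None, "experimental", "feature_a")
```

The trade-off discussed elsewhere in the thread applies here too: a misspelled name is indistinguishable from a feature that is switched off.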

IMO it’s not that common (try grepping all of your own code for it), so let’s not try to invent special syntax for that.

My idea was for a?.b to suppress AttributeError, but not by wrapping a.b.c? in a try/except block. Instead, (as I wrote before) this should be done by the existing primitive getattr(a, "b", None), or (equivalently) by only wrapping the .c part in a try/except. Certainly I would object to a.b.c? suppressing an AttributeError raised by a.b!

I have to admit that in this case I don’t care about the “purity” of catching an AttributeError raised by the immediate implementation of __getattribute__ but not in operations invoked by that implementation. It is my opinion that if a __getattribute__ operation raises AttributeError, whether intentionally or accidentally, it should be interpreted the same – the attribute does not exist, for all intents and purposes. If you want to debug why you’re getting an unexpected “does not exist”, don’t catch the exception and see what the traceback gives you.
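This matches how getattr() with a default and hasattr() already behave today: an AttributeError raised while computing the attribute is treated the same as the attribute being absent. A sketch with a made-up class Widget whose property has a bug:

```python
class Widget:
    @property
    def size(self):
        # Bug: `_sized` was never assigned, so this raises AttributeError
        # from inside the property rather than from the lookup of `size`.
        return self._sized * 2


w = Widget()

with_default = getattr(w, "size", None)   # the inner AttributeError is swallowed
has_it = hasattr(w, "size")               # reports the attribute as absent

try:
    w.size                                # plain access shows the traceback
except AttributeError as exc:
    message = str(exc)
```

Dropping the default (or the hasattr() check) is what surfaces the real cause, as suggested above.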

3 Likes