Safe navigation operators by way of expression result queries

(Creating a separate thread specifically for the “query expressions” idea I first posted in Introducing a Safe Navigation Operator in Python - #228 by ncoghlan. This iteration on the idea does a better job of distinguishing the exception-as-value and exception-as-error cases.)

A recurring point of confusion in the safe navigation operator discussion is which (if any) of the following obj?.attr translates to:

  • obj.attr if obj is not None else None (the PEP 505 meaning)
  • obj.attr if hasattr(obj, "attr") else obj (a common, but incorrect, guess as to what it means)
  • obj.attr if hasattr(obj, "attr") else None (an alternative incorrect guess about the meaning)
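
To make the difference between those three readings concrete, here is a small sketch that writes each one as a plain function and compares them on a few inputs. The function names are made up for illustration and are not part of any proposal.

```python
from types import SimpleNamespace


def pep505_meaning(obj):
    # The PEP 505 reading: only None is special
    return obj.attr if obj is not None else None


def hasattr_else_obj(obj):
    # First incorrect guess: fall back to the object itself
    return obj.attr if hasattr(obj, "attr") else obj


def hasattr_else_none(obj):
    # Second incorrect guess: fall back to None
    return obj.attr if hasattr(obj, "attr") else None


with_attr = SimpleNamespace(attr=42)
without_attr = SimpleNamespace()

# All three readings agree when the attribute exists...
assert pep505_meaning(with_attr) == 42
assert hasattr_else_obj(with_attr) == 42
assert hasattr_else_none(with_attr) == 42

# ...and when obj is None (None has no "attr" attribute)
assert pep505_meaning(None) is None
assert hasattr_else_obj(None) is None
assert hasattr_else_none(None) is None

# They only diverge for a non-None object that lacks the attribute
try:
    pep505_meaning(without_attr)
except AttributeError:
    pep505_raised = True
else:
    pep505_raised = False
```

The readings only disagree in the "object exists, attribute doesn't" case, which is presumably why the confusion survives casual testing.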

The “query expression” idea results from asking the question “What if safe navigation could fail gracefully for both None values and missing attributes, let you easily query to see which of those actually happened, and let you query to see exactly where in the code that result was introduced?”.

Foundation: expression results

The base level of this idea is being able to split the result of evaluating every expression into three categories:

  • Error: evaluating the expression throws an exception
  • Missing: evaluating the expression produces a sentinel value that means “no result”
  • Value: any other result that isn’t Missing or an Error

The default Missing sentinel value (and the only one with syntactic support) would be None.
Exceptions returned from an expression would be regular Value instances - only caught exceptions would be considered Error instances.

All of these types would inherit from a common base type for implementation purposes, but that would be considered an implementation detail - referencing them collectively should always use union types rather than the implementation base class.

The following also assumes the existence of a CodeLocation type, as a convenient way of passing around the same level of code location detail as the compiler already reports on SyntaxError instances. (The various types in Python that provide code location information could potentially standardise on the __location__ attribute suggested here, but actually doing that is NOT part of this specific suggestion)

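from typing import Any, Self  # Self requires Python 3.11+

class _BaseQueryResult:
    """Base class for common query result behaviour"""
    _location: CodeLocation

    def __new__(cls, location:CodeLocation) -> Self:
        # object.__new__ needs the class to instantiate
        self = super().__new__(cls)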
        self._location = location
        return self

    @property
    def __location__(self) -> CodeLocation:
        return self._location

    def __bool__(self) -> bool:
        return self.has_value

    @property
    def has_value(self) -> bool:
        return False

    @property
    def is_missing(self) -> bool:
        return False

    @property
    def is_error(self) -> bool:
        return False

class Value(_BaseQueryResult):
    """Any Python object that isn't missing or an error. Considered true."""
    _value: Any

    def __new__(cls, value:Any, location:CodeLocation=None) -> Self:
        self = super().__new__(cls, location)
        self._value = value
        return self

    def resolve(self) -> Any:
        return self._value
   
    @property
    def has_value(self) -> bool:
        return True

    # Support lookup chaining
    def __getattr__(self, attr:str) -> Value|Missing|Error:
        # This query function is defined in the next section
        return query_try_getattr(self._value, attr)

    def __getitem__(self, subscript:Any) -> Value|Missing|Error:
        # This query function is defined in the next section
        return query_try_getitem(self._value, subscript)

class Missing(_BaseQueryResult):
    """A missing result. Considered false."""
    _sentinel: Any

    def __new__(cls, sentinel:Any=None, location:CodeLocation=None) -> Self:
        self = super().__new__(cls, location)
        self._sentinel = sentinel
        return self

    def resolve(self) -> Any:
        return self._sentinel

    @property
    def is_missing(self) -> bool:
        return True

    # Support lookup chaining
    def __getattr__(self, _:str) -> Self:
        return self

    def __getitem__(self, _:Any) -> Self:
        return self

    def __call__(self, *args, **kwds) -> Self:
        return self

class Error(_BaseQueryResult):
    """An error result. Considered false.

       Re-raised on resolution.
    """
    _exception: BaseException
    def __new__(cls,
        exception:BaseException,
        location:CodeLocation=None
    ) -> Self:
        self = super().__new__(cls, location)
        self._exception = exception
        return self

    def resolve(self) -> Any:
        # Can presumably work out something better to do here,
        # but this is the simplest way to avoid changing the type
        raise self._exception

    @property
    def exception(self) -> BaseException:
        return self._exception

    @property
    def is_error(self) -> bool:
        return True

    # Support lookup chaining
    def __getattr__(self, _:str) -> Self:
        return self

    def __getitem__(self, _:Any) -> Self:
        return self

    def __call__(self, *args, **kwds) -> Self:
        return self

class ExpectedError(_BaseQueryResult):
    """An expected error result. Considered false.

       Returns sentinel on resolution.
    """
    _sentinel: Any
    def __new__(cls,
        exception:BaseException,
        sentinel:Any=None,
        location:CodeLocation=None
    ) -> Self:
        # Arg order intentionally chosen so `ExpectedError(None)` will
        # trigger a type error (since `None` is not an exception)
        # The base class only accepts the location, so keep the exception here
        self = super().__new__(cls, location)
        self._exception = exception
        self._sentinel = sentinel
        return self

    def resolve(self) -> Any:
        return self._sentinel

    # Support lookup chaining
    def __getattr__(self, _:str) -> Self:
        return self

    def __getitem__(self, _:Any) -> Self:
        return self

    def __call__(self, *args, **kwds) -> Self:
        return self

Doing anything fancier than the above would be left to third party libraries like result or returns (both available on PyPI).

Foundation: expression result query functions

The following functions would be added to the operator module, each encapsulating a specific check that produces an expression query result. Note that only the query_try_* functions can produce new Error results (since returned exceptions are considered values), but query_expr may still pass through previously created Error instances unmodified.

def query_expr(obj: Any, sentinel=None) -> Value|Missing|Error:
    """Classify an object as missing, an error, or some other value"""
    if obj is sentinel:
        # Do the fastest check first
        return Missing(sentinel)
    if isinstance(obj, (Value, Missing, Error)):
        # Expression query results are passed through as they are
        return obj
    # Wrap anything else as a regular value
    return Value(obj)

def query_try_call(obj: Any, sentinel=None, catch=Exception) -> Value|Missing|Error:
    """Classify callee and its result as missing, an error, or some other value"""
    # Use lambda or functools.partial when the callable takes parameters
    obj_expr_result = query_expr(obj, sentinel)
    if not obj_expr_result.has_value:
        # Short circuit the call request for errors and missing values
        return obj_expr_result
    if obj is obj_expr_result:
        obj = obj.resolve()
    try:
        returned_value = obj()
    except catch as exc:
        return Error(exc)
    return query_expr(returned_value, sentinel)
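
As a rough illustration of the "use lambda or functools.partial" comment, here is a stripped-down stand-in for the types and for query_try_call. It drops code locations and sentinel customisation, so treat it as a simplification of the sketch above rather than the proposed API.

```python
from functools import partial


class Value:
    has_value = True

    def __init__(self, value):
        self.value = value

    def resolve(self):
        return self.value


class Missing:
    has_value = False

    def resolve(self):
        return None


class Error:
    has_value = False

    def __init__(self, exception):
        self.exception = exception

    def resolve(self):
        raise self.exception


def query_expr(obj):
    if obj is None:
        return Missing()
    if isinstance(obj, (Value, Missing, Error)):
        return obj
    return Value(obj)


def query_try_call(obj, catch=Exception):
    result = query_expr(obj)
    if not result.has_value:
        return result  # short circuit: nothing to call
    if obj is result:
        obj = obj.resolve()  # unwrap a passed-in Value
    try:
        returned = obj()
    except catch as exc:
        return Error(exc)
    return query_expr(returned)


# Arguments are bound up front, since query_try_call passes none itself
parsed = query_try_call(partial(int, "42"))
failed = query_try_call(lambda: int("not a number"))
skipped = query_try_call(None)                   # callee itself is missing
empty = query_try_call(partial({}.get, "key"))   # call succeeds, returns None
```

Note that skipped and empty both end up as Missing, even though only one of them actually made a call.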

query_try_await, query_try_yield, and query_try_yield_from would share a similar structure to query_try_call, but the setup details would be different to make the scopes of the try blocks as narrow as possible:

async def query_try_await(obj: Any, sentinel=None, catch=Exception) -> Value|Missing|Error:
    """Classify awaitable and its result as missing, an error, or some other value"""
    obj_expr_result = query_expr(obj, sentinel)
    if not obj_expr_result.has_value:
        # Short circuit the wait request for errors and missing values
        return obj_expr_result
    if obj is obj_expr_result:
        obj = obj.resolve()
    try:
        returned_value = await obj
    except catch as exc:
        return Error(exc)
    return query_expr(returned_value, sentinel)

def query_try_yield(obj: Any, sentinel=None, catch=Exception) -> Value|Missing|Error:
    """Classify the result of yielding the item as missing, an error, or some other value"""
    obj_expr_result = query_expr(obj, sentinel)
    if obj_expr_result.is_error:
        # Only shortcircuit yield for unhandled errors
        # (since yielded values aren't required to have any
        # particular behaviour, yielding the sentinel is OK)
        return obj_expr_result
    if obj is obj_expr_result:
        obj = obj.resolve()
    try:
        returned_value = yield obj
    except catch as exc:
        return Error(exc)
    return query_expr(returned_value, sentinel)

def query_try_yield_from(obj: Any, sentinel=None, catch=Exception) -> Value|Missing|Error:
    """Classify the result of yielding from the iterable as missing, an error, or some other value"""
    obj_expr_result = query_expr(obj, sentinel)
    if not obj_expr_result.has_value:
        # Short circuit the yield from request for errors and missing values
        return obj_expr_result
    if obj is obj_expr_result:
        obj = obj.resolve()
    itr = iter(obj) # Note: exceptions here are allowed to escape
    try:
        returned_value = yield from itr
    except catch as exc:
        return Error(exc)
    return query_expr(returned_value, sentinel)

Finally, we get to the query functions that correspond to the new safe navigation operators:

def query_try_getattr(obj: Any, attr:str, sentinel=None) -> Value|Missing|Error:
    """Classify an attribute lookup as missing, an error, or some other value"""
    obj_expr_result = query_expr(obj, sentinel)
    if not obj_expr_result.has_value:
        # Short circuit the attribute lookup for errors and missing values
        return obj_expr_result
    if obj is obj_expr_result:
        obj = obj.resolve()
    _getattr = getattr # This would be looked up directly, not via builtins
    try:
        returned_value = _getattr(obj, attr)
    except AttributeError as exc:
        # Could check exc.obj and exc.name here, but
        # `getattr` doesn't do that, so this doesn't either
        return ExpectedError(exc, sentinel)
    return query_expr(returned_value, sentinel)

def query_try_getitem(obj: Any, subscript:Any, sentinel=None) -> Value|Missing|Error:
    """Classify a subscript lookup as missing, an error, or some other value"""
    obj_expr_result = query_expr(obj, sentinel)
    if not obj_expr_result.has_value:
        # Short circuit the subscript lookup for errors and missing values
        return obj_expr_result
    if obj is obj_expr_result:
        obj = obj.resolve()
    try:
        returned_value = obj[subscript]
    except LookupError as exc:
        return ExpectedError(exc, sentinel)
    return query_expr(returned_value, sentinel)

In the operator module API, the sentinel used to determine Missing values, and the caught exception used to determine Error values for try_call and friends, can be customised.

Expression result query expressions

Semantically, the following syntax would then map to the following calls to the above operator module functions:

  • ?EXPR → query_expr(EXPR)
  • ?try EXPR → similar to query_try_call(lambda: EXPR), but see comments below
  • ?await EXPR → query_try_await(EXPR)
  • ?yield EXPR → query_try_yield(EXPR)
  • ?yield from EXPR → query_try_yield_from(EXPR)
  • obj?.attr → similar to query_try_getattr(obj, "attr"), but see comments below
  • obj?[subscript] → similar to query_try_getitem(obj, subscript), but see comments below

Presumably in an actual implementation of this idea, the compiler would emit inline code for all of those, since it could do that more efficiently than if it actually made the calls to the corresponding operator module functions. For safe navigation in particular, it would be able to avoid the indirection through the intermediate result objects when continuing on with additional attribute and subscript lookups.

That said, even a function call based implementation would be able to do the right thing for almost everything except ?try EXPR - that has to be implemented inline to avoid changing the evaluation scope for the contents of EXPR. For safe navigation, an inline implementation can also ensure lookups don’t inadvertently retrieve Value attributes instead of attributes of the contained object.
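
The scope problem with a lambda-based ?try can be seen in today's Python without any new syntax. This sketch uses a class body, where wrapping an expression in a lambda changes which names it can see. The Settings class is purely illustrative.

```python
class Settings:
    default_timeout = 30

    # Evaluated inline: names from the class body are visible
    inline_result = default_timeout * 2

    # Wrapped in a lambda: the body runs in a new function scope,
    # and function scopes skip the enclosing class namespace
    deferred = lambda: default_timeout * 2


try:
    Settings.deferred()
except NameError as exc:
    # The lambda could not see the class-level name
    deferred_error = exc
```

The same kind of mismatch applies to await and yield inside EXPR, which is why ?try would need to be compiled inline rather than desugared to a call.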

Regardless of how they were implemented, the expression query results would then need to be interrogated via either pattern matching or the has_value, is_missing and is_error properties to determine whether or not an exception had been thrown.

In the dedicated syntax, the sentinel used to determine Missing values, and the caught exception used to determine Error values for try_call and friends, can NOT be customised (they’re always None and Exception respectively).

Defining narrow exception handling scopes

A common bug when writing generators and coroutines is to use overly broad exception handling scopes like the following:

async def cr():
    try:
        value = await define_request()
    except Exception as e:
        ...  # Do something with exception
    else:
        ...  # Do something with value

This code catches exceptions from both await and define_request(), which probably isn’t the intended behaviour. A suitably narrow exception scope instead looks like this:

async def cr():
    awaitable = define_request()
    try:
        value = await awaitable
    except Exception as e:
        ...  # Do something with exception
    else:
        ...  # Do something with value
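
Here is a runnable version of the two shapes above, so the difference in behaviour is visible. The body of define_request is a hypothetical stand-in that validates eagerly and then returns an awaitable that fails.

```python
import asyncio


def define_request(url):
    # Hypothetical helper: setup errors are raised before anything is awaited
    if not url:
        raise ValueError("url must not be empty")

    async def send():
        raise ConnectionError(f"could not reach {url}")

    return send()


async def broad(url):
    try:
        await define_request(url)  # catches setup errors AND await errors
    except Exception as exc:
        return f"handled {type(exc).__name__}"


async def narrow(url):
    awaitable = define_request(url)  # setup errors propagate to the caller
    try:
        await awaitable
    except Exception as exc:
        return f"handled {type(exc).__name__}"


async def main():
    results = {}
    results["broad, bad url"] = await broad("")
    results["narrow, good url"] = await narrow("https://example.invalid")
    try:
        await narrow("")
    except ValueError:
        results["narrow, bad url"] = "ValueError propagated"
    return results


results = asyncio.run(main())
```

With the broad scope, the programming error (an empty URL) is handled as if it were a network failure. With the narrow scope it escapes as a regular exception.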

?await, ?yield, and ?yield from similarly define a narrower exception handling scope than the corresponding ?try ... expressions:

  • ?try await EXPR catches exceptions from EXPR;
    ?await EXPR does not
  • ?try yield EXPR catches exceptions from EXPR;
    ?yield EXPR does not
  • ?try yield from EXPR catches exceptions from iter(EXPR);
    ?yield from EXPR does not

They also differ in how Value, Missing, Error and None results from EXPR are handled: the ?try based forms will suspend the frame unconditionally, while all 3 dedicated forms will pass Error results straight back without suspending the current frame (since they short circuit based on the given expression). ?await and ?yield from will also avoid suspending the frame for Missing and None values. All 3 dedicated forms will unwrap Value results before awaiting or yielding, and ?yield will also unwrap Missing values.

There’s no way to obtain this behaviour with only ?try, although the following comes closest:

  • ?try await (?EXPR).resolve()
  • ?try yield (?EXPR).resolve()
  • ?try yield from (?EXPR).resolve()

Missing values from EXPR will become Error values for await and yield from (since they’re not awaitable or iterable) rather than being passed through unchanged, and exceptions from EXPR will still be caught by the outer ?try and become Error returns rather than being raised as regular exceptions.

Coalescing values

Basing __bool__ on the has_value property means that using or on the expression query results implements None-coalescing semantics. All of Value, Missing, and Error define a resolve() method, which means coalescing values can be written as:

coalesced = (?a or ?b or ?c).resolve()

The existing or short-circuiting semantics would apply. If ?? was defined, it would just be syntactic sugar for the above.

Coalescing assignment could be defined as a ?= b translating to a = (?a or ?b).resolve() regardless of whether or not a dedicated binary coalescing operator is defined.
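
The coalescing behaviour can be emulated today with a helper function standing in for the ? prefix. In this sketch, q() is a hypothetical substitute for ?EXPR, and the two classes are cut down to just __bool__ and resolve().

```python
class Value:
    def __init__(self, value):
        self._value = value

    def __bool__(self):
        return True  # any wrapped value counts as "present", even 0 or ""

    def resolve(self):
        return self._value


class Missing:
    def __bool__(self):
        return False

    def resolve(self):
        return None


def q(obj):
    """Hypothetical stand-in for the proposed ?EXPR syntax."""
    return Missing() if obj is None else Value(obj)


a, b, c = None, 0, 5

# Plain `or` tests truthiness, so the falsy-but-present 0 is skipped
plain_or = a or b or c

# Query results test presence, so 0 is kept
coalesced = (q(a) or q(b) or q(c)).resolve()

# If everything is missing, the last Missing resolves back to None
all_missing = (q(None) or q(None)).resolve()

# The suggested `a ?= b` translation
a = (q(a) or q(b)).resolve()
```

One limitation of the emulation: q(c) here still evaluates c as an ordinary argument expression, whereas real syntax would only evaluate the operand when `or` reaches it.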

Edit (multiple): fixed assorted bugs in the code sketch, including one where the various operator functions failed to unwrap passed in Value and Missing instances when necessary.
Edit: added note about the difference between ?try await EXPR and ?await EXPR (and friends)
Edit: added ExpectedError subclass with different .resolve() behaviour (returning a sentinel value), changed query_try_getattr and query_try_getitem to use it (and added notes on why query_try_getattr doesn’t use an even tighter exception check)


There are some elements of this proposal that correspond to rejected ideas in PEP 505:

With the idea in this thread, the expectation is that query results will be used to simplify local calculations by avoiding repeated None checks and try/catch blocks, and then resolved back to their underlying values before the function returns (as discussed in the coalescing subsection).

Needing the final .resolve() call to get back to the unwrapped value (or raise an unhandled exception) is an unavoidable downside of taking the “wrapped results” approach. It’s just deemed worthwhile here for the potential benefits it brings (such as distinguishing Missing results from Error results rather than implicitly conflating them, and being able to track exactly where in the code a particular result was generated).

From the short circuiting point of view, that’s part of the ?. and ?[...] syntactic sugar definitions, and this idea recommends keeping that feature - the lookup chaining support on the types is just there so the function-based API still works even without the short-circuiting syntax.

BTW, I should be explicit that I don’t personally intend to take this proposal any further myself. It isn’t the first time I’ve tried to come up with a conceptual foundation for the PEP 505 safe navigation operators that better integrates them with the rest of the language, so this is a “can it be done?” exercise for me, rather than an “it should be done” proposal that I would personally advocate for.

(I do quite like this variant though, since it not only covers PEP 505 (None-aware operators) but also the rejected PEP 463 (Exception-catching expressions), in a way that helps avoid some particularly common cases of using overly broad try/except statements in combination with await, yield, and yield from.)


Read your initial comment on the other thread and will need to do a more detailed look at the additional parts here (something like the .resolve() method was actually the only recommendation I had from there, so already covered)

Overall though, I think this does a great job of capturing all of the various requirements and disagreements there (I know it fully covers my use case where I wanted just the None handling and nothing else as well as my objections to the other non-PEP 505 approaches)

Curious to see if anyone else will be able to poke any holes in this idea, because it looks really solid to me

The biggest objection I could see is that the various possible answers to the question “what does obj?.attr mean?” would now all be correct and that could be even more confusing[1]


  1. It makes perfect sense to me, but I’ve also spent the past week reading and thinking about PEP 505, PEP 463, and a lot of the discussions around them


Yeah, there would need to be significant thought put into explaining the following approaches to post-query checks:

  • just call .resolve(): exceptions are re-raised, anything else goes back to its underlying value (the PEP 505 use case)
  • check .has_value (or the boolean value): excludes missing values and errors. May need to be careful that unexpected exceptions aren’t silently lost.
  • check .is_error: equivalent to the except clause in the expression that produced the result. Inspect .exception if specific handling is needed, otherwise log it and/or call .resolve() to re-raise it.
  • check .is_missing: excludes populated values and errors. Assuming .resolve() is called on the other branch, won’t silently lose exceptions.

The property checks can all be replaced by the corresponding match statement cases.

Given that, we may want to add a runtime warning to error objects if they are discarded without either being resolved or having their exception property accessed. (similar to the warning that is emitted when coroutines aren’t awaited)
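
One possible shape for that warning, purely as a sketch: track whether the error was ever inspected and complain from __del__ if not. The _inspected flag and the warning text are invented for this example, and the immediate finalisation on del relies on CPython's reference counting.

```python
import warnings


class Error:
    def __init__(self, exception):
        self._exception = exception
        self._inspected = False  # hypothetical bookkeeping flag

    @property
    def exception(self):
        self._inspected = True
        return self._exception

    def resolve(self):
        self._inspected = True
        raise self._exception

    def __del__(self):
        # Similar in spirit to the "coroutine was never awaited" warning
        if not self._inspected:
            warnings.warn(
                f"error result was never inspected: {self._exception!r}",
                RuntimeWarning,
            )


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")

    ignored = Error(ValueError("lost"))
    del ignored  # discarded without a look: warns

    checked = Error(KeyError("seen"))
    checked.exception  # accessing the exception counts as handling it
    del checked  # no warning
```

Whether checking is_error alone should count as "inspected" is an open design question that this sketch doesn't try to answer.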

To simplify dealing with that, we might need to distinguish between “expected” exceptions and unexpected ones, such that resolve reports the sentinel value for expected exceptions.

query_try_getattr and query_try_getitem would then set the resolution result only for the case where the raised exception was for the relevant attribute or item.

Alternatively, those exception handler definitions could be improved to immediately re-raise unexpected exceptions and never convert them into error results in the first place. They would return Missing for expected exceptions instead of Error.

Updated the first post with some fixes and clarifications from the last couple of comments:

  • added an ExpectedError subclass and updated query_try_getattr and query_try_getitem to use it. Resolving the safe navigation operators will now return None for missing attributes instead of re-raising the AttributeError
  • made a note in query_try_getattr that checking exc.obj and exc.name for exact matches would differ from the way getattr itself works

However, fixing the first problem highlighted an issue with having any non-dunder methods or properties on expression result instances: they’re ambiguous when used in combination with the safe navigation operators.

Consider obj?.attr.resolve() vs (obj?.attr).resolve(). The first one short circuits the resolve() call for missing and error results, while the second one doesn’t.

This problem goes away if resolve(), has_value, is_missing, and is_error all become function calls or some other kind of check instead of methods or properties.
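
The name clash is easy to reproduce with the function-style wrappers. In this sketch, Hostname is a made-up payload type that happens to define its own resolve() method, and Value is reduced to the parts that matter.

```python
class Hostname:
    """Hypothetical payload type with its own resolve() method."""

    def __init__(self, name):
        self.name = name

    def resolve(self):
        return f"address-of-{self.name}"


class Value:
    def __init__(self, value):
        self._value = value

    def resolve(self):
        return self._value

    def __getattr__(self, attr):
        # Only reached when normal lookup on the wrapper itself fails
        return Value(getattr(self._value, attr))


wrapped = Value(Hostname("example"))

# The wrapper's own method shadows the payload's method of the same name
unwrapped = wrapped.resolve()

# Names the wrapper doesn't define are forwarded to the payload
forwarded = wrapped.name.resolve()

# Reaching the payload's resolve() takes an explicit unwrap first
payload_result = wrapped.resolve().resolve()
```

A resolve_query(...) builtin sidesteps this, since the wrapper types would no longer need any public non-dunder names that could collide with the payload's.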

Of those operations, I think resolve_query would be worthy of being a builtin function (since it pairs with the ? query operator).

Unwrapping a safe navigation result would then be unambiguously written as resolve_query(obj?.attr) - the closing parenthesis on the builtin function call indicates where the short-circuiting ends. A builtin name like that also gives people something to search for when trying to understand code that uses query expressions, beyond just the cryptic ? symbol.

Everything else (including the type definitions), I’d suggest putting in a new dedicated module rather than adding them to the builtins or directly to the operator module. (Tentative name queryexpr, since some of the more obvious options like querylib would conflict with existing PyPI modules, and this library would specifically support querying Python expressions rather than anything else)

The initial set of names in that module (taking advantage of the dedicated queryexpr namespace to drop the common query prefix):

  • Value
  • Missing
  • Error
  • ExpectedError
  • query_expr
  • try_call
  • try_await
  • try_yield
  • try_yield_from
  • try_getattr
  • try_getitem
  • resolve_expr (may not be here if it’s a builtin)
  • has_value
  • is_missing
  • is_error

The public methods and properties defined on results in the initial post would become:

  • resolve → __resolve_expr__ (may not need this, see below)
  • has_value → removed (rely on __bool__ or isinstance(obj, queryexpr.Value) instead)
  • is_error → removed (rely on isinstance(obj, queryexpr.Error) instead)
  • is_missing → removed (rely on isinstance(obj, queryexpr.Missing) instead)
  • exception → __cause__ (has the same meaning as the exception usage, so we can reuse it)

The result types would also be defined in such a way that they play nice with match statements, so code like the following would work:

from queryexpr import Error, ExpectedError, Missing, Value

def handle_result(result:Value|Missing|Error) -> Any:
    match(result):
        case Value(value):
            ... # Do something with value
        case Missing(sentinel):
            ... # Do something with sentinel
        case ExpectedError(exc, sentinel):
            ... # Do something with exc and/or sentinel
        case Error(exc):
            ... # Do something with exc

Potentially, resolve_expr could be defined in terms of pattern matching, rather than in terms of a polymorphic method:

def resolve_expr(result:Value|Missing|Error) -> Any:
    match(result):
        case Value(value):
            return value
        case Missing(sentinel):
            return sentinel
        case ExpectedError(_, sentinel):
            return sentinel
        case Error(exc):
            raise exc

Just noticed a typo in the original post: ExpectedError would be a Missing subclass rather than directly inheriting from the base query type.

(it resolves the same way Missing does, so this would make more sense than subclassing Error)