Introducing a Safe Navigation Operator in Python

A semi-baked idea that has been kicking around in my brain for the past few days is something I’ve been thinking of as “query expressions”. The sketch below omits assorted potential enhancements, such as capturing the same code location information on the various query result instances as the compiler captures when raising SyntaxError, and offering common has_value, is_missing, and is_error properties for easier runtime introspection. Distinguishing between raised exceptions and returned exceptions is also a topic that would need further consideration:

from typing import Any


class Value:
    """Any Python object that isn't missing or an error. Considered true."""
    def __init__(self, value):
        self._value = value

    @property
    def value(self):
        return self._value

    def __bool__(self):
        return True


class Missing:
    """A missing value. Considered false."""
    def __init__(self, sentinel=None):
        self._sentinel = sentinel

    @property
    def sentinel(self):
        return self._sentinel

    def __bool__(self):
        return False


class Error:
    """An error value. Considered false."""
    def __init__(self, exception):
        self._exception = exception

    @property
    def exception(self):
        return self._exception

    def __bool__(self):
        return False


def query(obj: Any, *, sentinel=None) -> Value | Missing | Error:
    """Classify a Python object as missing, an error, or some other value"""
    if obj is sentinel:
        # Do the fastest check first
        return Missing(sentinel)
    if isinstance(obj, (Value, Missing, Error)):
        # Avoid nesting the result wrapper objects
        return obj
    if isinstance(obj, BaseException):
        # Exceptions are always considered errors,
        # even when returned instead of being raised
        return Error(obj)
    # Wrap anything else as a regular value
    return Value(obj)   

Given the above foundation, ? would then be defined as a unary prefix operator that was just a shorthand for operator.query, so ?expr would give the same result as query(expr) (aside from the latter lacking code location information).
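Since ?expr is not valid Python today, the classification can only be tried out by calling query() directly. The following is a minimal sketch that condenses the wrapper classes above into dataclasses (my shorthand for brevity, not part of the proposal) and shows how a few inputs are classified:

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Value:
    value: Any

    def __bool__(self):
        return True


@dataclass(frozen=True)
class Missing:
    sentinel: Any = None

    def __bool__(self):
        return False


@dataclass(frozen=True)
class Error:
    exception: BaseException

    def __bool__(self):
        return False


def query(obj, *, sentinel=None):
    """Same classification order as the sketch above."""
    if obj is sentinel:
        return Missing(sentinel)
    if isinstance(obj, (Value, Missing, Error)):
        return obj  # already a query result, so don't nest it
    if isinstance(obj, BaseException):
        return Error(obj)
    return Value(obj)


results = [query(0), query(None), query(KeyError("k"))]
kinds = [type(r).__name__ for r in results]
truthiness = [bool(r) for r in results]
print(kinds)       # ['Value', 'Missing', 'Error']
print(truthiness)  # [True, False, False]
```

Note that query(0) is truthy even though 0 itself is falsy: the wrapper reports whether a value is present, not whether that value is truthy.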

On its own, that wouldn’t be interesting (aside from potentially being a way to capture code location information for arbitrary expressions), but where I think it has more potential is as a concept that underlies safe navigation in a way that handles the x is not None vs hasattr(x, "attr") discrepancy by saying it means both (and represents those potential results differently).

Firstly though, ?try could be an exception catching query expression, such that:

result = ?try some_call()

translated to:

    try:
        _inner_value = some_call()
    except Exception as _e:
        _inner_value = _e
    _query_result = ?_inner_value
    result = _query_result

and ?await some_call() (along with ?yield and ?yield from) could be defined as a way to avoid a common bug where coroutine and generator exception handlers inadvertently cover more than just the exceptions thrown in when the frame resumes execution:

    _awaitable = some_call() # Note: outside the scope of the try block!
    try:
        _inner_value = await _awaitable
    except Exception as _e:
        _inner_value = _e
    _query_result = ?_inner_value
    result = _query_result

(If you do want to cover both parts of the expression for some reason, then ?try await ..., ?try yield ..., and ?try yield from ... would all remain available)
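The ?try and ?await translations above can be approximated today with ordinary helper functions. This is only a sketch: query_try and query_await are hypothetical names, the _Result hierarchy is a condensed stand-in for the Value/Missing/Error classes, and payload is my own attribute name.

```python
import asyncio


class _Result:
    """Condensed stand-in for the Value/Missing/Error wrappers."""
    truthy = False

    def __init__(self, payload=None):
        self.payload = payload

    def __bool__(self):
        return self.truthy


class Value(_Result):
    truthy = True


class Missing(_Result):
    pass


class Error(_Result):
    pass


def query(obj):
    if obj is None:
        return Missing()
    if isinstance(obj, _Result):
        return obj
    if isinstance(obj, BaseException):
        return Error(obj)
    return Value(obj)


def query_try(func, *args, **kwargs):
    """Hypothetical stand-in for `?try func(*args, **kwargs)`."""
    try:
        inner = func(*args, **kwargs)  # the call itself is guarded
    except Exception as exc:
        inner = exc
    return query(inner)


async def query_await(awaitable):
    """Hypothetical stand-in for `?await expr`."""
    # The awaitable was created by the caller, outside this try block,
    # so only exceptions surfacing from the await are captured.
    try:
        inner = await awaitable
    except Exception as exc:
        inner = exc
    return query(inner)


async def fails():
    raise RuntimeError("boom")


parsed = query_try(int, "42")
rejected = query_try(int, "not a number")
awaited = asyncio.run(query_await(fails()))

print(type(parsed).__name__, parsed.payload)
print(type(rejected).__name__, type(rejected.payload).__name__)
print(type(awaited).__name__, type(awaited.payload).__name__)
```

Passing an already-created awaitable to query_await mirrors the translation’s point that some_call() is evaluated outside the try block.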

Returning to the original topic of safe navigation (in this conceptual framework: “attribute query expressions” and “item query expressions”):

result = obj?.attr.subattr

would translate to:

  _lhs = obj
  if _lhs is None:
      _inner_value = _lhs
  else:
      try:
          _inner_value = _lhs.attr
      except AttributeError as _e:
          _inner_value = _e
      else:
          # Resolve any trailing parts of the expression
          # This clause would be omitted when not needed
          _inner_value = _inner_value.subattr
  _query_result = ?_inner_value
  result = _query_result

Item lookup (such as result = obj?[some_calculated_item()].attr) would translate to:

  _lhs = obj
  if _lhs is None:
      _inner_value = _lhs
  else:
      _lookup_key = some_calculated_item() # Outside the try/except!
      try:
          _inner_value = _lhs[_lookup_key]
      except LookupError as _e:
          _inner_value = _e
      else:
          # Resolve any trailing parts of the expression
          # This clause would be omitted when not needed
          _inner_value = _inner_value.attr
  _query_result = ?_inner_value
  result = _query_result

With these definitions, obj?.attr would potentially replace a lot of hasattr and getattr usage with a type-safe alternative, and obj?[item] would provide a convenient way to attempt optional item lookups without having to handle KeyError or IndexError yourself (LookupError is their common parent exception).

Due to the way __bool__ is defined in the query result objects, ?? wouldn’t be needed - you would use or in combination with query expressions instead.
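To make the attribute and item translations and the or fallback concrete, here is a rough emulation using plain functions. query_attr and query_item are hypothetical helpers covering only the first ?. or ?[ step, the _Result hierarchy is a condensed stand-in for the wrapper classes, and unwrapping through a payload attribute is my assumption (the proposal does not say how results get unwrapped).

```python
class _Result:
    """Condensed stand-in for the Value/Missing/Error wrappers."""
    truthy = False

    def __init__(self, payload=None):
        self.payload = payload

    def __bool__(self):
        return self.truthy


class Value(_Result):
    truthy = True


class Missing(_Result):
    pass


class Error(_Result):
    pass


def query(obj):
    if obj is None:
        return Missing()
    if isinstance(obj, _Result):
        return obj
    if isinstance(obj, BaseException):
        return Error(obj)
    return Value(obj)


def query_attr(obj, name):
    """Hypothetical stand-in for the first step of `obj?.name`."""
    if obj is None:
        return Missing()
    try:
        inner = getattr(obj, name)
    except AttributeError as exc:
        inner = exc
    return query(inner)


def query_item(obj, key):
    """Hypothetical stand-in for the first step of `obj?[key]`."""
    if obj is None:
        return Missing()
    try:
        inner = obj[key]
    except LookupError as exc:  # covers both KeyError and IndexError
        inner = exc
    return query(inner)


config = {"name": "demo", "retries": 0}

found = query_item(config, "name")      # Value
absent = query_item(config, "port")     # Error wrapping KeyError
too_short = query_item(["only"], 1)     # Error wrapping IndexError
no_object = query_attr(None, "attr")    # Missing
no_attr = query_attr(object(), "attr")  # Error wrapping AttributeError

# `or` picks the fallback for Missing and Error results...
port = (query_item(config, "port") or Value(8080)).payload
# ...but keeps a falsy value that is actually present.
retries = (query_item(config, "retries") or Value(3)).payload
print(port, retries)  # 8080 0
```

The retries case is where this differs from using or on raw values: a stored 0 is kept rather than replaced by the default.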

This approach would presumably be a bit slower than the simpler definition in PEP 505, but not that much slower:

  • try/except is essentially free these days when no exception is raised (CPython 3.11 introduced “zero-cost” exception handling)
  • the actual implementation would presumably avoid querying objects when it already knows the result (such as for caught exceptions, or values that have just been checked against None)
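Anyone who wants to check the first point on their own interpreter can time a lookup with and without a surrounding try block. This is just a rough sketch; the function names are made up for illustration and the numbers will vary by machine and Python version, so no expected output is shown.

```python
import timeit

data = {"key": 1}


def plain_lookup():
    return data["key"]


def guarded_lookup():
    # Same lookup, wrapped the way the item query translation wraps it
    try:
        return data["key"]
    except LookupError as exc:
        return exc


plain = timeit.timeit(plain_lookup, number=200_000)
guarded = timeit.timeit(guarded_lookup, number=200_000)
print(f"plain:   {plain:.4f}s")
print(f"guarded: {guarded:.4f}s")
```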

Edit: I started a dedicated thread for further iteration on this idea: Safe navigation operators by way of expression result queries


Bumping this thread since I ran into another real world example where this would be useful.

def download_video(self, videoId, destinationDir, playlistIndex=None):
    #Check if video already exists
    #Format can be "index|id|uploader|name" or "id|uploader|name"
    #utils.SEPARATOR is "|"
    for file in os.listdir(destinationDir):
        if file.split(utils.SEPARATOR)[0] == videoId or file.split(utils.SEPARATOR)?[1] == videoId:
            print("Video '%s' is already downloaded" % (videoId))
            return
    #...

Splitting on the separator might not yield two elements, as there may be files with no separator at all.
In JavaScript I would just write file.split(utils.SEPARATOR)?.[1], which evaluates to undefined when there is no second element, and the comparison would still work.
Here, I either have to add a length check, which is quite verbose:
if file.split(utils.SEPARATOR)[0] == videoId or len(file.split(utils.SEPARATOR)) > 1 and file.split(utils.SEPARATOR)[1] == videoId:

Or I can use any, but that makes the code harder to read, since it’s not obvious at first glance that I’m only checking the 1st and 2nd fields:
if any([x == videoId for x in file.split(utils.SEPARATOR)[:2]]):


Why not just use videoId in file.split(utils.SEPARATOR)[:2]?

Your version does the split twice, which isn’t ideal (but is easily fixable). This version (and the any() version) doesn’t. I don’t know why you say “it’s not obvious that I’m just checking the first and second fields” - that’s exactly what [:2] says. If it’s because it’s hidden in the comprehension, the version above makes it more obvious.

By the way, in the any() version you don’t need to build a list:

any(x == videoId for x in file.split(utils.SEPARATOR)[:2])

(note one set of [...] has been removed).
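For anyone who wants to convince themselves that the three spellings agree, here is a quick sketch. The sample file names and function names are invented for illustration, and SEPARATOR stands in for utils.SEPARATOR.

```python
SEPARATOR = "|"  # stands in for utils.SEPARATOR


def matches_with_length_check(file, video_id):
    parts = file.split(SEPARATOR)  # split once and reuse the result
    return parts[0] == video_id or (len(parts) > 1 and parts[1] == video_id)


def matches_with_any(file, video_id):
    return any(x == video_id for x in file.split(SEPARATOR)[:2])


def matches_with_in(file, video_id):
    # Slicing never raises, even when there are fewer than two fields
    return video_id in file.split(SEPARATOR)[:2]


samples = [
    "3|abc123|uploader|name",  # index|id|uploader|name
    "abc123|uploader|name",    # id|uploader|name
    "no-separator.txt",        # no separator at all
    "abc123",                  # bare id, single field
    "1|other|abc123|name",     # id only appears in the third field
]

outcomes = [
    (
        matches_with_length_check(f, "abc123"),
        matches_with_any(f, "abc123"),
        matches_with_in(f, "abc123"),
    )
    for f in samples
]
for f, row in zip(samples, outcomes):
    print(f, row)
```

All three agree on every sample, and only the first two fields are ever inspected.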

IMO, this is actually an example of where the ?[] operator is an attractive nuisance - it leads you towards a worse solution, rather than helping you write a good solution.


Because actually these are not equivalent:

if obj and obj.attribute:
    print(obj.attribute)

only checks whether the values are ‘falsy’, not whether they are None. For example, if obj.attribute == "", then it wouldn’t print anything. Also, obj itself could be an empty container such as [] or {}, which is falsy but may still be a perfectly valid object to read an attribute from. So that is not always the desired behavior.

What I would want to ‘improve’ is this:

if obj is not None and obj.attribute is not None:
    print(...)

which is much more verbose.
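A small sketch of where the two checks disagree. Holder and Bag are made-up classes for illustration only:

```python
class Holder:
    def __init__(self, attribute):
        self.attribute = attribute


class Bag(list):
    """An empty Bag is falsy, yet it still has an attribute."""
    attribute = "x"


def truthy_check(obj):
    return bool(obj and obj.attribute)


def none_check(obj):
    return obj is not None and obj.attribute is not None


cases = {
    "text": Holder("x"),
    "empty string": Holder(""),
    "zero": Holder(0),
    "None attribute": Holder(None),
    "empty container": Bag(),
    "no object": None,
}

table = {label: (truthy_check(obj), none_check(obj)) for label, obj in cases.items()}
for label, (truthy, not_none) in table.items():
    print(f"{label:16} truthy={truthy!s:5} not_none={not_none}")
```

The two checks disagree for the empty string, zero, and the empty container, which are exactly the cases where a value is present but falsy.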

if expression checks if the expression is truthy, whereas if not expression checks if the expression is falsy.

By definition, None is an object frequently used to represent the absence of a value. If a user chooses to rely on a somewhat vague distinction between the absence of a value and an empty value, it can make coding more complicated by requiring explicit checks to determine if a given value or attribute is None, as you demonstrated. Technically, though, you are correct.

However, you can be more explicit when handling empty values:

obj = {'a': 2}
# obj = {}
# obj = None

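# Truthiness check: None and {} both fall through to the else branch
if obj and (keys := set(obj.keys())):
    print(keys)
else:
    print('Empty:', set())

# Explicit None check: only None is rejected up front, but an empty
# dict still yields an empty (falsy) key set and reaches the else branch
if obj is not None and (keys := set(obj.keys())):
    print(keys)
else:
    print('Empty:', set())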

I don’t see any added value in using optional variables. Using Optional makes falsy values redundant. But, ultimately, it comes down to code style.


Current language features do not favor the extensive use of None. However, if a None-aware operator were introduced, it would make working with optional variables much easier. You might be interested in the new thread discussing a similar proposal.

If the moderators agree, this thread may be closed.