Introducing a Safe Navigation Operator in Python

In some type systems, A|B reflects an untagged union that can’t be safely narrowed to a more specific type at runtime.

That isn’t the case for Python (the required runtime type information for safe narrowing is always available), so you’re correct to see that argument as inapplicable.

I think you’re overlooking a subtlety in the original argument and in the way tagged unions work. The idea behind a tagged union (or disjoint union, or sum type, or any of the many other names the concept has) isn’t just that access to it is type safe, but also that it keeps track of which “side” of the union an entry came from. While Python does keep track of type information, so that C-style bugs such as accessing an int in an int | str as though it’s a str can’t happen, it does not explicitly keep track of which side of the union the value came from. Obviously, we can infer that in examples like int | str, but in cases like int | int we cannot.

Of course, explicitly writing out int | int doesn’t really happen in real-world code. What does happen is a library writing generic code with T | None, and user code then substituting int | None for T. In Python this leads to issues, because when the code sees a None value it can’t tell whether it’s a “user None” or a “library None”. Proper tagged unions prevent this issue, since there the value would be either None (library None) or Some None (user None). This is the issue the original commenter was getting at, and it is why heavy use of Python-style unions, in particular with None, can be considered bad design. As a side note, this behaviour is also why Python library code often uses ad-hoc singletons, or even the ellipsis, to mark missing values.
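To make the ambiguity concrete, here is a minimal sketch. The helper names `first_or_none` and `first_or_missing` and the `MISSING` sentinel are hypothetical, invented only for this illustration; they are not taken from any library mentioned in the thread.

```python
from typing import Optional, TypeVar, Union

T = TypeVar("T")


class _Missing:
    """Hypothetical sentinel type; its single instance means "nothing there"."""

    def __repr__(self) -> str:
        return "MISSING"


MISSING = _Missing()


def first_or_none(items: "list[T]") -> Optional[T]:
    # "Library None": returned when the list is empty.
    return items[0] if items else None


def first_or_missing(items: "list[T]") -> Union[T, _Missing]:
    # The sentinel stays distinguishable even when T itself includes None.
    return items[0] if items else MISSING


user_values = [None, 3]  # here T is effectively int | None

# The first element is a deliberate "user None"; the empty list yields a
# "library None". The caller cannot tell the two results apart.
print(first_or_none(user_values) is None, first_or_none([]) is None)

# With an ad-hoc singleton the two cases differ again.
print(first_or_missing(user_values) is None, first_or_missing([]) is MISSING)
```

The sentinel plays the role that the tag plays in a tagged union, at the cost of every caller having to know about `MISSING`.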

That’s a very interesting and nuanced observation, @ImogenBits, thank you!

One thing to note, though, is that the lack of distinction between “user None” and “library None”, and between None and “missing” when dealing with an Optional[T], is a pre-existing condition in Python. It is not really germane to the decision on whether or not to support “none propagation” in Python.

Basically, what I’m trying to get at is: we’re already in this “mess”, and have been for decades. Adding support for none-propagation is not going to appreciably increase this “mess”, but it will make this “mess” easier to deal with.

I don’t think this is an accurate assessment. I’ve taken care for years not to use None this way in the APIs I’ve designed. If people assume that using None will eventually be easier, they will create worse and worse APIs where every single access needs this, as it bleeds into more things because “well, it’s easy”. The things we make easy at the language level drive how people write code. This shouldn’t be an encouraged pattern for designing function APIs, so we shouldn’t make it easier; we should nudge library authors towards providing a better API.

That’s true, but another use case for none-aware operators is that they would make it easier to deal with existing APIs that use some variation of None. Lots of existing external APIs also don’t care whether Python specifically has none-aware operators or not.

Sure, none-aware operators give designers a tool they can abuse, but they also give consumers a tool to deal with existing real-world APIs that do as they please because they do not care about a single language’s features. And I would assume the consumers outnumber the designers. For those API consumers who have no choice but to deal with None values, none-aware operators are all pros with no cons.

One can also do

match obj:
    case object(attribute=value) if value is not None:
        print(value)

This does not look better for the trivial case, but it generalizes nicely to deeper structures.

match obj:
    case object(attribute=object(field=object(measurement=value))) if value is not None:
        print(value)

Note that using object means you don’t need to care about the type of the nested fields.

If you parsed JSON into a dict, you can use an even more concise notation:

match obj:
    case {'attribute': {'field': {'measurement': value}}} if value is not None:
        print(value)

If you know the type you are looking for, you can make it even more concise:

match obj:
    case object(attribute=object(field=object(measurement=int(value)))):
        print(value)

and

match obj:
    case {'attribute': {'field': {'measurement': int(value)}}}:
        print(value)

I’m wondering if I could propose something a little left-field?

What if the proposal were simplified into a single operator that silences exceptions into None, as Swift’s “try?” keyword does?

For example

if name := try? object.name:
    print(name)

print(try? object.name[0])

This “try?” keyword would be functionally the same as having written:

try:
    name = object.name
    if name is not None:
        print(name)
except Exception:
    pass

try:
    print(object.name[0])
except Exception:
    print(None)

Quite often I find you either use “?” for the whole statement or you don’t, so the repeated ? becomes noise in many cases. Python already has operators like | or := which pretty much do the same thing as many of the ? operators in other languages.

So I think this could be a good way of doing the same thing, especially as the reason languages like Swift have both ? and try? is that they treat application errors and access errors as different things.

In Python all errors are exceptions, so “try?” can cover both. IMHO it also matches the fact that a lot of Python’s control operators are words such as “and” and “else” rather than symbols like “?”.

I also think it makes it easier to see that you are using a bad pattern. Often you can see the “try” at the top of a function and go “phew, at least we are handling it”, while ignoring the fact that the except block just has “pass” in it.

“try?” at least stares you in the face and says “you are doing bad things here”
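Python has no “try?” syntax, but the proposed behaviour can be approximated today with a small helper. This is only a sketch: `attempt` is a hypothetical name, and the lambda is required because Python would otherwise evaluate the expression before the helper gets a chance to catch anything.

```python
from types import SimpleNamespace


def attempt(thunk, default=None):
    """Hypothetical stand-in for `try? expr`: call thunk(), swallow any Exception."""
    try:
        return thunk()
    except Exception:
        return default


user = SimpleNamespace(name="Ada")
anonymous = SimpleNamespace()  # has no .name attribute

print(attempt(lambda: user.name[0]))       # A
print(attempt(lambda: anonymous.name[0]))  # None

# The catch-all also hides errors that have nothing to do with missing data.
print(attempt(lambda: 1 / 0))              # None
```

The last line shows the trade-off: a blanket silencer cannot distinguish “the attribute is absent” from a genuine bug in the expression.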

Sure, external APIs may not care about the syntax features of specific languages, but a Python library wrapping that external API should give you useful objects and useful types.

It’s clearly an error if you expect some_response.some_user.id to exist as a non-None value and any of some_response, some_user, or id is None. Each of these being None is likely a different failure mode and, more likely than not, shouldn’t be None at all but a specific error-handling condition based on which one failed and how.

As such, any API that you would use with this PEP’s proposed handling (i.e. some_response?.some_user?.id) should be addressed by changing whatever generates this object so that it errors out rather than setting you up to be in an invalid program state. msgspec and pydantic both make handling external APIs in this way easy (and these aren’t the only options).

There are significant cons. As I already said, language design drives how people write code. If you’re finding None to be a problem in libraries you use, contribute to not having them in places that cause problems. What this would do is encourage the kind of design that forces you to use this everywhere because everyone is pushing the responsibility of handling missing data to the next person rather than handling it at the earliest place possible to eliminate the need for the checks in the first place.
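As a rough, standard-library-only illustration of handling missing data at the earliest place possible: the class and function names below are hypothetical, and libraries such as msgspec or pydantic do this far more thoroughly than this sketch.

```python
from dataclasses import dataclass


class ResponseError(Exception):
    """Base class for the hypothetical failure modes below."""


class MissingUserError(ResponseError):
    pass


class MissingIdError(ResponseError):
    pass


@dataclass(frozen=True)
class User:
    id: int


def parse_user(response: dict) -> User:
    """Turn raw decoded JSON into a User, or raise an error naming what was absent."""
    user = response.get("some_user")
    if user is None:
        raise MissingUserError("response has no some_user")
    user_id = user.get("id")
    if user_id is None:
        raise MissingIdError("some_user has no id")
    return User(id=user_id)


print(parse_user({"some_user": {"id": 7}}))  # User(id=7)

try:
    parse_user({"some_user": {}})
except ResponseError as exc:
    print(type(exc).__name__, exc)  # MissingIdError some_user has no id
```

Code downstream of `parse_user` can then rely on `user.id` being present, with no None checks.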

That’s… usually not an option. When you’re consuming someone else’s API, you don’t get to say “hey, Python doesn’t have a safe navigation operator so YOU need to change YOUR output”. That has barely more chance of happening than a future version of Python switching from indentation to braces.

You might as well go stand in a river with a signpost and expect it to change its course.

You’ll notice I said to change “whatever is generating the object”, not to change some external API. I also referenced multiple libraries that parse, validate, and can handle defaults on missing data when that’s expected.

If we’re talking about Python library APIs, that circles back to my earlier point: language design shouldn’t encourage ignoring every place where you expect to have data and don’t. But if you actually want that, rather than handling each way this data could violate your program’s expectations, you can already write:

try:
    some_id = some_response.some_user.id
except AttributeError:
    some_id = None

That just shifts the problem around, it doesn’t solve it. Instead of taking a blob of JSON, decoding it, and then navigating through it, you’re proposing that we first create an entire schema for the object, and then validate everything. That’s a HUGE job just to avoid running into a None while descending through the object, and it assumes that the API you’re getting this from is completely perfectly stable - which, in my experience, is unrealistically hopeful.

I say again, you’re trying to change the wrong thing here.

If the API is giving back data you don’t expect (for any reason, not just stability), it is a good sign that the program shouldn’t be doing anything with the data, and should error, as the program’s understanding of the API is incorrect in some way and any assumptions about what the data is and means should no longer be valid. Without actually keeping track of what the mismatch of expectations to the data you got was, you won’t get a meaningful error here either.

Okay, but how much of the data are you going to validate before you try using ANY of it?

Here. Grab this output:

It is big. But you could easily delve into it at a superficial level and read through it to find the handful of emotes you might be interested in. Great! But if you’re going to validate the entire schema first, you need to design something that will pass validation for EVERY emote. And some of them are unusual in ways that you might not care about.

So now you have two options: either you design something that validates absolutely everything (a massive job, and a huge waste of time), or you use a sloppy validation that lets things through if they aren’t the part you’re looking for.

And if you’re going to let things through… why not just… not validate, and navigate the tree using a method that doesn’t mind variations?

  1. Your specific example has already been done, and each way it can fail has its own exception: Twitch API — twitchAPI v4.3.1 documentation
  2. If you’re intentionally fine with extracting only the data you expect to exist in the response, using methods that are forgiving, dict.get exists. Failing to reach the emotes on the expected JSON path (even after parsing into a Python dict) should still fail in the permissive case. Safe navigation doesn’t help when you expect data that isn’t there; it just erases context for the same lack of data.

That’s not an unreasonable point. Many of my use cases are such that I’m grabbing some data from a JSON source and extracting information from it. I very rarely care enough to write a full parsing library for the data; I just want to get on with analyzing the items I care about.

But just because that’s a common need, doesn’t mean it needs a language feature. Libraries like glom exist to do precisely the sort of navigation of unstructured data we’re talking about here.

The point @mikeshardmind is making, which I agree with, is that if Python has this sort of none-aware navigation operator, it will encourage Python libraries that produce data for consumption by Python code to think it’s OK to use None as a marker for missing data, which in turn will be difficult to consume with anything other than none-aware operators. So it becomes a self-reinforcing design, which is both awkward to handle cleanly (at some point you’ll need to deal with the None values that are filtering up through your code) and clumsy to make strict (the only way to do so is with the explicit checks that people are claiming are so inconvenient to use that we need new none-aware operators).

This all feels to me like we’re making the same mistake that “quiet NaN” values do in floating point.
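A tiny illustration of the analogy, using only built-in floats: a quiet NaN passes through arithmetic without raising, so the problem only surfaces at the end of the computation, far from where it was introduced.

```python
import math

readings = [1.0, float("nan"), 2.0]  # one bad value somewhere upstream

total = sum(readings)          # no exception is raised here
average = total / len(readings)

print(math.isnan(average))  # True
print(average == average)   # False: NaN does not even compare equal to itself
```

A None that is propagated through a chain of none-aware accesses behaves similarly: by the time it is noticed, the step that produced it is no longer visible.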

Yep, now, what are you going to do when Twitch updates something and that module hasn’t been updated yet? Are you completely unable to do ANYTHING since it won’t validate? Or are you still able to read the parts of the response that you’re interested in (assuming that those parts haven’t changed)? Changes happen all the time, usually in backward-compatible ways.

And we’re right back to the start: dict.get exists, but when you try to chain it, it gets VERY verbose. So we’d like a much much easier way to do this.
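For reference, this is roughly what chaining dict.get looks like for a three-level path. The function names and sample data are made up for illustration.

```python
def measurement_naive(d):
    # A default of {} covers a missing key, but not a key that is present
    # with the value None.
    return d.get("attribute", {}).get("field", {}).get("measurement")


def measurement_tolerant(d):
    # `or {}` also replaces an explicit None (and any other falsy value).
    return ((d.get("attribute") or {}).get("field") or {}).get("measurement")


full = {"attribute": {"field": {"measurement": 3}}}
explicit_null = {"attribute": None}

print(measurement_tolerant(full))           # 3
print(measurement_tolerant(explicit_null))  # None
print(measurement_naive({}))                # None

try:
    measurement_naive(explicit_null)
except AttributeError as exc:
    print("naive version failed:", exc)
```

The tolerant form works, but the nesting of parentheses grows with every level, which is the verbosity being complained about.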

Maybe? I really don’t see that as being a particularly likely scenario. It would only be an issue if, as you say, you’re producing data for consumption by other Python code, and you don’t care about anything else. JavaScript has a none-aware navigation operator - do we see APIs written for consumption by JavaScript code that do this? I can’t say I’ve seen a lot of it, but maybe I don’t look in the right places.

I’m not saying it’s impossible, just that it doesn’t seem nearly as prevalent as would be significant here.

glom exists (and I’ve used it for cases similar to what you’re describing, when I truly only cared about some of the data).

If you don’t want a library for this, it’s really not that hard to chain, and it doesn’t need to be verbose.

It’s the kind of thing I’ve inlined before, though some people would prefer to wrap it in a named function so that they only have to think about reduce when reading that function:

from functools import reduce

def nested_get(d, key: str):
    return reduce(dict.get, key.split("."), d)

# or

def nested_get(d, *keys: str):
    return reduce(dict.get, keys, d)

Arguably all JSON-producing APIs that include missing keys are examples of this. And because those APIs became useful outside of the JavaScript world, everyone is now under pressure to add none-aware options like JavaScript’s.

That doesn’t work, though:

>>> reduce(dict.get, ["a", "b"], {"c": 1})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: descriptor 'get' for 'dict' objects doesn't apply to a 'NoneType' object
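The failure comes from the second step calling dict.get on the None returned by the first. One possible repair, sketched here under the hypothetical name `nested_get_safe`, is to stop descending as soon as the current value is not a dict:

```python
from functools import reduce


def nested_get_safe(d, *keys):
    """Follow keys through nested dicts; return None once a step is missing."""

    def step(current, key):
        # Anything that is not a dict (including None) ends the descent.
        return current.get(key) if isinstance(current, dict) else None

    return reduce(step, keys, d)


print(nested_get_safe({"c": 1}, "a", "b"))         # None
print(nested_get_safe({"a": {"b": 2}}, "a", "b"))  # 2
print(nested_get_safe({"a": None}, "a", "b"))      # None
```

Like the other approaches in this thread, it cannot tell a missing key from a key whose value is None.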