Add None coalescing operator in Python

I read code in many languages and honestly I am not sure this is true. Assuming you understand what the ?? symbol combo means (which is a big assumption, and IMO a legit reason to reject the syntax addition on its own), it is much more readable to me than the if x is None else combo, since the latter is more easily misread (as, say, if x else) and requires more mental attention. I assume this is quite subjective, however.

That may be true. But then, many people coming to Python from other languages claim that our ternary if operator value if cond else altvalue is “unreadable”. So maybe readability depends on what you are used to.

With other languages using ?? I expect that people will soon consider x ?? y to be every bit as readable as x + y and f(*args, **kw).

In the 1990s, I was a heavy user of Apple’s Hypertalk language, which used syntax like

add 1 to x

put the value of background field "Address" into address

Do you think that’s worth the extra typing? That’s not a rhetorical question. The Hypertalk community loved Apple’s verbose syntax, but even they used abbreviated syntax sometimes.

Instead of comparing fake examples with variables x and y, we should look at examples from real code.

Indeed. I mean that with “x ?? y” you have to learn what “??” is, and that knowledge is useful only there, whereas “if else” applies to a wider range of problems.

For the same reason I wouldn’t have liked to see “:=” in the language, but it’s there anyway :wink: Though “??” faces the additional challenge that “?” is not already used in the language.

Yes, but it is not concise. What would be even more readable is a concise statement similar to conditional expression, which does not require repetition of identifiers (x if x). This can be done with introducing a built-in like maybe (x = maybe(x, [])), but if there could be a proposal which avoids new syntax introduction, that would be quite interesting.

Thanks for the examples.

Reading them reassures me that “if is None …” is superior in readability to the subtle “??”, “?.”, etc.

The difference from e.g. C# is that there, null references are so common that ?. is frequently used not to express intended business logic but to handle unexpected input: the well-known billion-dollar mistake. None in Python, on the other hand, is often used to mean “default value”, and therefore an explicit test against None reveals the intention more clearly.

“Readability” in theoretical discussions is almost completely a useless stat. Everyone has their own definition, never backed by any sort of actual studies, and nobody ever changes their mind based on other people’s examples.

I personally suspect that some people consider something “readable” on the basis that “I can understand what it does based on my pre-existing knowledge of what Python can already do” (meaning that new syntax is ALWAYS less ‘readable’ than a verbose form that already exists), and other people consider something “readable” on the basis that it is compact and expresses a thought that can be fit into a larger “sentence” or “paragraph” (meaning that a new syntax is almost always more ‘readable’ than the more-verbose form that already exists).

It’s nearly as bad as “explicit”, which means “code that I like”. Can we all agree to just stop arguing about whether something is readable based solely on our subjective views, please? It’s nothing but noise in a discussion.

I agree “readability” is largely subjective. But I thought this topic was inviting opinions of personal preference. After all, whether or not to have “?” is more art than science.

“Readability” is very much science. If you want to say “I don’t like x ?? y”, then say “I don’t like”, not “it is less readable”.

(And if you don’t believe me about readability being a science, look at the web accessibility guidelines and how extremely precise they can be regarding font size, colour contrast, and other matters. They have based their recommendations on actual studies, studies in which the readability of something HAS been thoroughly tested.)

Statistics is not science.

A poll is needed to decide whether “??” etc. is more readable or less. (Of course there are plenty of problems with respect to survey design.) So I take this thread as a “mini-poll” and express my vote.

Statistics is not the definition of readability either.

As per this study (accessible via Sci-Hub), full word identifiers are hypothesized to improve code readability:

I think it’s hard for us common folk to gauge readability on the basis of science and not prudence. We don’t really possess means such as randomized controlled experiments (“A/B testing”), and I am not particularly well-versed in the state-of-the-art, well-cited, peer-reviewed literature on this niche subject. Even then, at most I would be able to read and understand studies using something simple like OLS, and anything more advanced would fly over my head. “Get a PhD or stop talking” basically excludes 99% of people from the conversation.

That’s fair. I’m not asking people to stop talking. All I’m saying is to be clear about what you’re expressing. Are you saying that you don’t like this syntax? Then say “I don’t like this syntax”. That’s a perfectly valid viewpoint - you don’t have to reword it as “this is more implicit” or “this is less readable” just to make it valid. Why can’t people express opinions as opinions instead of trying to appeal to facts that don’t exist?

Often, design considerations DO come down to whether people like something or not (although more commonly it’s the opinions of core devs rather than random people discussing on mailing lists, but still). So go ahead and say that you don’t like it.

BTW, it’s also perfectly reasonable to say “I love the idea of a None-coalescing operator, but I hate the x ?? y syntax”. Opinions are allowed to be complex :slight_smile:

Simply saying “I don’t like it” is not helpful to the audience because it doesn’t explain why I don’t like it.

On the other hand it’s helpful to elaborate that I find it less readable than “is None” so that the original proposal author gets some meaningful feedback.

To add another angle to the discussion:

I’m not sure whether the discussions have already mentioned this (the PEP 505 doesn’t list this), but SQL has a function called COALESCE(), which is commonly used for handling NULL values: see e.g. the PostgreSQL docs: PostgreSQL: Documentation: 16: 9.18. Conditional Expressions

Making this a builtin in Python would solve many of the situations listed in PEP 505 in an explicit and elegant way.

The function would also go beyond just checking one value for None. It returns the first non-None argument, so you don’t have to chain operators and you can use the builtin in a functional way with iterators.

Furthermore, we could optionally extend this to also accept N/A values (math.isnan()), empty strings, empty lists/tuples, etc. to address other areas where “this value is not available/usable” pops up. Here’s a sketch:

coalesce(*args, check=None, logic=False)

Return the first value from args which is not the check object (defaults to None). If check is a callable, return the first value from args for which check(value) is false. logic may be set to True to invert the test.
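A minimal Python sketch of how such a function might behave, under stated assumptions: check and logic are keyword-only because they follow *args, and returning None when no argument qualifies is my assumption (mirroring SQL's COALESCE), since the description above does not say what happens in that case.

```python
import math


def coalesce(*args, check=None, logic=False):
    """Return the first argument that is not rejected by the check.

    Assumption: returns None when every argument is rejected.
    """
    for value in args:
        if callable(check):
            # A callable check rejects values for which it returns a true result.
            rejected = bool(check(value))
        else:
            # A non-callable check rejects values identical to the check object.
            rejected = value is check
        if logic:
            # logic=True inverts the test.
            rejected = not rejected
        if not rejected:
            return value
    return None


first_set = coalesce(None, None, 3, 4)                 # skips the None values
first_number = coalesce(math.nan, 1.5, check=math.isnan)  # skips NaN
first_finite = coalesce(math.inf, 2.0, check=math.isfinite, logic=True)
```

Note that every argument is evaluated before coalesce() runs, so this version cannot short-circuit.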

We could then add a few handy operators for the check function, e.g.

  • tuple.isempty() - check for empty tuples
  • list.isempty() - check for empty lists
  • str.isempty() - check for empty strings
  • etc.
    or a more generic isempty() function, which checks the length and the type of an object.

Other functions which come in handy as check function:

  • math.isnan()
  • math.isfinite(), with logic set to True
  • math.isinf()
  • len()
  • bool()
  • operator.attrgetter()
  • operator.itemgetter()
  • cmath.isnan()

Yeah, same syntax is in R/Tidyverse as well.

The limitation with COALESCE in Python (without additional language support) is that it does not lazily evaluate the arguments, which I imagine will be a necessary requirement.
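The eager-evaluation limitation can be seen in a small sketch. The names coalesce and expensive_fallback here are hypothetical stand-ins; the point is only that the fallback runs even when it is not needed.

```python
calls = []  # records each time the fallback is actually computed


def expensive_fallback():
    calls.append("fallback")
    return "computed"


def coalesce(*args):
    """Hypothetical plain-function coalesce: first non-None argument."""
    for value in args:
        if value is not None:
            return value
    return None


x = "present"

# Function call: the argument expression runs before coalesce() is entered.
result_eager = coalesce(x, expensive_fallback())
eager_calls = len(calls)

calls.clear()

# Conditional expression: the else branch is skipped because x is not None.
result_lazy = x if x is not None else expensive_fallback()
lazy_calls = len(calls)
```

Both spellings produce the same result, but only the function call pays for the fallback.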

If a special case is made in the Python language to lazily evaluate the arguments of COALESCE, it would be an interesting approach. It also raises the question of whether to support lazy evaluation / short-circuiting in more contexts. (I feel a “macro” system is knocking at the door!)

What does “less readable” mean for you? Without understanding your personal sense of readability, it is difficult to know how to interpret your claims about readability.

On its own, “less readable” carries about as much information as “I don’t like it”. What makes it less readable?

  • Is it too verbose?

  • Too terse?

  • Too ambiguous? E.g. Python uses the * symbol for nearly a dozen different things, if we include the stdlib as well as the syntax itself.

  • Full of weird symbols that have to be memorised by rote?

  • Are the symbols visually hard to distinguish? E.g. in many fonts $ and S look very similar.

  • How does it compare to other syntax in Python?

Regarding the last point, is x + y less readable than add(x, y)? How about x**y compared to pow(x, y)?

If you answered “Yes”, then maybe you just don’t like symbolic operators, and prefer words, so of course you will dislike the ?? symbolic operator.

But if you find the + and ** operators more readable than the named function calls, and yet find the ?? operator less readable, that possibly means you are confusing familiarity with readability. You find + and ** readable because you are used to them, while ?? is unfamiliar.

Never underestimate the difference familiarity makes to readability. The first time I tried to read Python code, I found it an unreadable mess. It was full of weird symbols like [:] and {x: y} and I had no idea what was going on. Now I find Python so readable that every time I try to read code in another language, I cry :slight_smile:

My personal feelings are:

  • I think that dealing with None is a minor pain point. It would be nice to have a better (easier, more terse) way to deal with it.

  • I like the look of the ?? operator. It feels right to me, it’s not too weird, and is easier to remember.

  • I expect that as other languages introduce the same operator, it will get more familiar and more people will come to expect it.

  • I’m neutral towards the ?. and ?[] symbols. They don’t look as nice, but I can’t think of a better alternative.

  • I don’t think it is worth implementing just the ?? and not the other two.

So I guess that overall I’m positive towards the PEP.

Short-circuiting is natural for operators (as it already exists in or etc.) and useful for lazy evaluation in:
x ?? calculate_expensive_fallback().

Old broken x or default constructs can easily be fixed to x ?? default without introducing more bugs, whereas the rewrite to coalesce(x, default) requires editing in three places instead of one.
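For readers who have not hit the x or default pitfall, here is a short sketch of why it is considered broken. The function names and the default of 30 are invented for illustration.

```python
def timeout_with_or(timeout=None):
    # "or" falls back on ANY falsy value, including a legitimate 0.
    return timeout or 30


def timeout_with_none_check(timeout=None):
    # Only None triggers the fallback; this is what x ?? default would spell.
    return timeout if timeout is not None else 30


or_result = timeout_with_or(0)             # the caller's 0 is silently replaced
none_result = timeout_with_none_check(0)   # the caller's 0 is preserved
```

The same problem appears with empty strings, empty containers, and False.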

Chaining is much clearer and less error prone with operators:
(override ?? fallback).name ?? default
coalesce(coalesce(override, fallback).name, default)
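For comparison, here is roughly how that chain has to be spelled in today's Python, assuming ?? means "use the left operand unless it is None". The objects and the "anonymous" default are made up for the example.

```python
from types import SimpleNamespace

# Hypothetical inputs: no override, and a fallback whose name is unset.
override = None
fallback = SimpleNamespace(name=None)
default = "anonymous"

# (override ?? fallback)
source = override if override is not None else fallback

# (...).name ?? default
name = source.name if source.name is not None else default
```

Each step repeats the tested expression, which is the repetition the operator is meant to remove.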

The word coalesce is difficult to remember and spell.

A coalesce function cannot replace ?. etc., I think.

A keyword-based coalesce operator x coalesce default could be plausible, but seems implausible for ?. etc.

Overall ?? wins IMO. (And is already more familiar from other languages.)

This cannot be done with a function, because function arguments have to be evaluated before the function is called. In other words, they are eagerly evaluated.

Like the ternary if, and and or, this has to be lazy and only evaluate the right operand if the left operand is None.
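A function can only approximate this laziness if the caller wraps the right operand in a callable. The helper below, coalesce_lazy, is a hypothetical name used purely to illustrate the workaround and its extra ceremony.

```python
def coalesce_lazy(value, make_fallback):
    """Hypothetical helper: call make_fallback() only when value is None."""
    return value if value is not None else make_fallback()


evaluated = []  # records each time the fallback is actually built


def build_default():
    evaluated.append(True)
    return []


# 0 is not None, so build_default is never called.
first = coalesce_lazy(0, build_default)

# None triggers exactly one call to build_default.
second = coalesce_lazy(None, build_default)
```

The caller has to pass build_default (or a lambda) instead of a plain expression, which is the cost an operator would avoid.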

PEP 505 suggests three new operators, ??, “maybe dot” and “maybe subscript”. If you want to propose an alternative spelling that doesn’t use a question mark, you need to propose an alternative for all three, not just ??.

Python uses words for and, or, and x if cond else y where a lot of C-like languages use symbols &&, ||, and cond ? x : y. So the ?? operator is a bit of an odd fit here. On the other hand, Python uses punctuation for . and [], so it makes full sense to go with ?. and ?[]. If someone wants to propose an alternate spelling for ??, go for it; personally, I can’t think of anything better, so the slight oddity won’t be all that bad. Also, there’s no concept of x and= y but we do have x *= y, so being able to write x ??= y is a win for the punctuation spelling.

That said, though, PEP 505’s semantics are NOT the same as the semantics in other languages, so this is still going to be something to learn, regardless of the spelling used.