Add None coalescing operator in Python

uranusjr · December 14, 2022, 9:33am

I read code in many languages and honestly I am not sure this is true. Assuming you understand what the ?? symbol combo means (which is a big assumption, and IMO a legit reason to reject the syntax addition on its own), it is much more readable than the if x is None else combo to me, since the latter can be more easily misread (as, say, if x else) and requires more mental attention. I assume this is quite subjective, however.

steven.daprano · December 14, 2022, 11:11am

That may be true. But then, many people coming to Python from other languages claim that our ternary if operator value if cond else altvalue is “unreadable”. So maybe readability depends on what you are used to.

With other languages using ?? I expect that people will soon consider x ?? y to be every bit as readable as x + y and f(*args, **kw).

In the 1990s, I was a heavy user of Apple’s Hypertalk language, which used syntax like

add 1 to x

put the value of background field "Address" into address

Do you think that’s worth the extra typing? That’s not a rhetorical question. The Hypertalk community loved Apple’s verbose syntax, but even they used abbreviated syntax sometimes.

Instead of comparing fake examples with variables x and y, we should look at examples from real code.

fancidev · December 14, 2022, 11:04pm

Indeed, I mean that with “x ?? y” you have to learn what “??” is, and that knowledge is useful only there. “if else”, on the other hand, applies to a wider range of problems.

But for the same reason I wouldn’t like to see “:=” in the language, but it’s there anyway. Though “??” faces an additional challenge: “?” is not already used in the language.

vovavili · December 15, 2022, 12:00am

Yes, but it is not concise. What would be even more readable is a concise statement similar to a conditional expression, which does not require repetition of identifiers (x if x). This can be done by introducing a built-in like maybe (x = maybe(x, [])), but if there could be a proposal which avoids new syntax introduction, that would be quite interesting.

fancidev · December 15, 2022, 12:00am

Thanks for the examples.

Reading them reassures me that “if is None …” is superior in readability to the subtle “??”, “?.”, etc.

The difference vs e.g. C# is that in the latter, null references are so common that ?. is frequently used not to express intended business logic but to handle unexpected input: the well-known billion dollar mistake. On the other hand, None in Python is often used to mean “default value”, and therefore an explicit test against None reveals intention more clearly.
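The “None as default value” idiom is most visible in function signatures. A minimal sketch, where the `append_item` function and its parameter names are invented purely for illustration:

```python
def append_item(item, bucket=None):
    """Append item to bucket, creating a fresh list when none is given."""
    # None is the sentinel for "the caller did not supply a list"; the
    # explicit identity test states that intent directly.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


existing = []
# An empty list passed by the caller is kept and mutated, not replaced.
assert append_item(1, existing) is existing
# Each call without a bucket gets its own new list.
assert append_item(1) == [1]
assert append_item(2) == [2]
```

Writing `bucket = bucket or []` instead would silently replace the caller's empty list, which is why the test is against None rather than truthiness.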

Rosuav · December 15, 2022, 2:56am

“Readability” in theoretical discussions is almost completely a useless stat. Everyone has their own definition, never backed by any sort of actual studies, and nobody ever changes their mind based on other people’s examples.

I personally suspect that some people consider something “readable” on the basis that “I can understand what it does based on my pre-existing knowledge of what Python can already do” (meaning that new syntax is ALWAYS less ‘readable’ than a verbose form that already exists), and other people consider something “readable” on the basis that it is compact and expresses a thought that can be fit into a larger “sentence” or “paragraph” (meaning that a new syntax is almost always more ‘readable’ than the more-verbose form that already exists).

It’s nearly as bad as “explicit”, which means “code that I like”. Can we all agree to just stop arguing about whether something is readable based solely on our subjective views, please? It’s nothing but noise in a discussion.

fancidev · December 15, 2022, 3:13am

I agree “readability” is largely subjective. But I thought this topic was inviting opinions of personal preference. After all, whether or not to have “?” is more art than science.

Rosuav · December 15, 2022, 4:37am

“Readability” is very much science. If you want to say “I don’t like x ?? y”, then say “I don’t like”, not “it is less readable”.

(And if you don’t believe me about readability being a science, look at the web accessibility guidelines and how extremely precise they can be regarding font size, colour contrast, and other matters. They have based their recommendations on actual studies, studies in which the readability of something HAS been thoroughly tested.)

fancidev · December 15, 2022, 5:25am

Statistics is not science.

A poll is needed to decide whether “??” etc is more readable or less. (Of course there’re plenty of problems with respect to survey design.) So I take this thread as a “mini-poll” and express my vote.

Rosuav · December 15, 2022, 5:29am

Statistics is not the definition of readability either.

vovavili · December 15, 2022, 7:52am

As per this study (accessible via Sci-Hub), full word identifiers are hypothesized to improve code readability:

[Album] Imgur: The magic of the Internet

I think it’s hard for us common folk to gauge readability on the basis of science and not prudence. We don’t really possess means such as randomized controlled experiments (“A/B testing”), and I am not particularly well-versed in state-of-the-art, well-cited, peer-reviewed literature on this niche subject. Even then, at most I would be able to read and understand studies using something simple like OLS, and anything more advanced would fly over my head. “Get a PhD or stop talking” basically excludes 99% of people from the conversation.

Rosuav · December 15, 2022, 8:47am

That’s fair. I’m not asking people to stop talking. All I’m saying is to be clear about what you’re expressing. Are you saying that you don’t like this syntax? Then say “I don’t like this syntax”. That’s a perfectly valid viewpoint - you don’t have to reword it as “this is more implicit” or “this is less readable” just to make it valid. Why can’t people express opinions as opinions instead of trying to appeal to facts that don’t exist?

Often, design considerations DO come down to whether people like something or not (although more commonly it’s the opinions of core devs rather than random people discussing on mailing lists, but still). So go ahead and say that you don’t like it.

BTW, it’s also perfectly reasonable to say “I love the idea of a None-coalescing operator, but I hate the x ?? y syntax”. Opinions are allowed to be complex.

fancidev · December 15, 2022, 8:54am

Simply saying “I don’t like it” is not helpful to the audience because it doesn’t explain why I don’t like it.

On the other hand it’s helpful to elaborate that I find it less readable than “is None” so that the original proposal author gets some meaningful feedback.

malemburg · December 15, 2022, 9:59am

To add another angle to the discussion:

I’m not sure whether the discussions have already mentioned this (PEP 505 doesn’t list it), but SQL has a function called COALESCE(), which is commonly used for handling NULL values; see e.g. the PostgreSQL docs (PostgreSQL: Documentation: 16: 9.18. Conditional Expressions).

Making this a builtin in Python would solve many of the situations listed in PEP 505 in an explicit and elegant way.

The function would also go beyond just checking one value for None. It returns the first non-None argument, so you don’t have to chain operators and you can use the builtin in a functional way with iterators.

Furthermore, we could optionally extend this to also accept N/A values (math.isnan()), empty strings, empty lists/tuples, etc. to address other areas where “this value is not available/usable” pops up. Here’s a sketch:

coalesce(*args, /, check=None, logic=False)

Return the first value from args which is not the check object (defaults to None). If check is a callable, return the first value from args for which check(value) is False. logic may be set to True to swap the test logic.
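A rough sketch of how that description could be implemented in pure Python. Two things here are assumptions rather than part of the proposal: the `/` is dropped from the signature because Python does not allow it after `*args` (so `check` and `logic` are simply keyword-only), and the function returns None when every argument is rejected, mirroring SQL's COALESCE.

```python
import math


def coalesce(*args, check=None, logic=False):
    """Return the first argument that passes the test, or None if none does.

    Sketch only: the all-rejected return value (None) is an assumption.
    """
    for value in args:
        if callable(check):
            # A callable check rejects values for which it returns true.
            rejected = bool(check(value))
        else:
            # Otherwise reject values that are the check object itself.
            rejected = value is check
        if logic:
            rejected = not rejected  # swap the test logic
        if not rejected:
            return value
    return None


# Default: skip None, but keep falsy values such as 0.
assert coalesce(None, 0, 5) == 0
# Callable check: skip NaN values.
assert coalesce(float("nan"), 2.5, check=math.isnan) == 2.5
# Swapped logic: return the first value for which isfinite() is True.
assert coalesce(float("inf"), 1.0, check=math.isfinite, logic=True) == 1.0
# Nothing usable: falls through to None.
assert coalesce(None, None) is None
```

Note that a check such as `math.isnan` raises TypeError when handed None, so a real design would need to say how None interacts with a custom check.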

We could then add a few handy operators for the check function, e.g.

tuple.isempty() - check for empty tuples
list.isempty() - check for empty lists
str.isempty() - check for empty strings
etc.
or a more generic isempty() function, which checks the length and the type of an object.

Other functions which come in handy as check function:

math.isnan()
math.isfinite(), with logic set to True
math.isinf()
len()
bool()
operator.attrgetter()
operator.itemgetter()
cmath.isnan()

vovavili · December 15, 2022, 10:06am

Yeah, same syntax is in R/Tidyverse as well.

fancidev · December 15, 2022, 10:16am

The limitation with COALESCE in Python (without additional language support) is that it does not lazily evaluate the arguments, which I imagine will be a necessary requirement.
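The eager-evaluation problem can be seen with a simplified stand-in. In this sketch both `coalesce` (reduced to the plain not-None case) and `expensive_fallback` are hypothetical names used only for the demonstration:

```python
calls = []


def expensive_fallback():
    """Stand-in for a costly computation; records each invocation."""
    calls.append("called")
    return "fallback"


def coalesce(*args):
    """Simplified eager coalesce: first argument that is not None."""
    for value in args:
        if value is not None:
            return value
    return None


x = "present"

# The fallback runs before coalesce() is even entered, although x is not None.
result_eager = coalesce(x, expensive_fallback())
assert result_eager == "present"
assert calls == ["called"]

# The conditional expression only evaluates the branch it needs.
calls.clear()
result_lazy = x if x is not None else expensive_fallback()
assert result_lazy == "present"
assert calls == []
```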

If a special case is made in the Python language to lazily evaluate the arguments of COALESCE, it would be an interesting approach. It also raises the question of whether to support lazy evaluation / short circuit in more contexts. (I feel a “macro” system is knocking at the door!)

steven.daprano · December 15, 2022, 11:30am

What does “less readable” mean for you? Without understanding your personal sense of readability, it is difficult to know how to interpret your claims about readability.

On its own, “less readable” carries about as much information as “I don’t like it”. What makes it less readable?

Is it too verbose?
Too terse?
Too ambiguous? E.g. Python uses the * symbol for nearly a dozen different things, if we include the stdlib as well as the syntax itself.
Full of weird symbols that have to be memorised by rote?
Are the symbols visually hard to distinguish? E.g. in many fonts $ and S look very similar.
How does it compare to other syntax in Python?

Regarding the last point, is x + y less readable than add(x, y)? How about x**y compared to pow(x, y)?

If you answered “Yes”, then maybe you just don’t like symbolic operators, and prefer words, so of course you will dislike the ?? symbolic operator.

But if you find the + and ** operators more readable than the named function calls, and yet find the ?? operator less readable, that possibly means you are confusing familiarity with readability. You find + and ** readable because you are used to them, while ?? is unfamiliar.

Never underestimate the difference familiarity makes to readability. The first time I tried to read Python code, I found it an unreadable mess. It was full of weird symbols like [:] and {x: y} and I had no idea what was going on. Now I find Python so readable that every time I try to read code in another language, I cry.

My personal feelings are:

I think that dealing with None is a minor pain point. It would be nice to have a better (easier, more terse) way to deal with it.
I like the look of the ?? operator. It feels right to me, it’s not too weird, and is easier to remember.
I expect that as other languages introduce the same operator, it will get more familiar and more people will come to expect it.
I’m neutral towards the ?. and ?[] symbols. They don’t look as nice, but I can’t think of a better alternative.
I don’t think it is worth implementing just the ?? and not the other two.

So I guess that overall I’m positive towards the PEP.

petersuter · December 15, 2022, 11:35am

Short-circuiting is natural for operators (as it already exists in or etc.) and useful for lazy evaluation in:
x ?? calculate_expensive_fallback().

Old broken x or default constructs can easily be fixed to x ?? default without introducing more bugs in the rewrite to coalesce(x, default) which requires editing in three places instead of one.

Chaining is much clearer and less error prone with operators:
(override ?? fallback).name ?? default
coalesce(coalesce(override, fallback).name, default)

The word coalesce is difficult to remember and spell.

A coalesce function can not replace ?. etc. I think?

A keyword-based coalesce operator x coalesce default could be plausible, but seems implausible for ?. etc.

Overall ?? wins IMO. (And is already more familiar from other languages.)
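For reference, the “old broken x or default” pattern and its fix in today's Python look roughly like this. The function names and the default of 30 are made up for the example:

```python
def timeout_with_or(timeout=None):
    # Buggy: 0 is falsy, so an explicit 0 is replaced by the default.
    return timeout or 30


def timeout_with_none_check(timeout=None):
    # Only None triggers the default; an explicit 0 is preserved.
    return timeout if timeout is not None else 30


# Both agree when the argument is missing or truthy.
assert timeout_with_or() == 30
assert timeout_with_none_check() == 30
assert timeout_with_or(5) == 5
assert timeout_with_none_check(5) == 5

# They differ on falsy values that are not None.
assert timeout_with_or(0) == 30
assert timeout_with_none_check(0) == 0
```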

steven.daprano · December 15, 2022, 11:37am

This cannot be done with a function, because function arguments have to be evaluated before the function is called. In other words, they are eagerly evaluated.

Like the ternary if, and and or, this has to be lazy and only evaluate the right operand if the left operand is None.

PEP 505 suggests three new operators, ??, “maybe dot” and “maybe subscript”. If you want to propose alternative spelling that doesn’t use a question mark, you need to propose an alternative for all three, not just ??.
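As a rough guide to what the three operators would replace, here are hand-written equivalents in current Python for the simplest cases only. The `user` object and variable names are invented for the example, and the PEP's rules for longer chains are not reproduced here:

```python
from types import SimpleNamespace

user = SimpleNamespace(name="Ada", tags=["admin"])
missing = None

# a ?? b  ->  a if a is not None else b
label = missing if missing is not None else "anonymous"
assert label == "anonymous"

# a?.name  ->  None if a is None else a.name
name_present = None if user is None else user.name
name_missing = None if missing is None else missing.name
assert name_present == "Ada"
assert name_missing is None

# a?[0]  ->  None if a is None else a[0]
first_tag = None if user.tags is None else user.tags[0]
no_tag = None if missing is None else missing[0]
assert first_tag == "admin"
assert no_tag is None
```

One difference worth noting: the hand-written forms name the left operand twice, so an expression with side effects would need a temporary variable to be evaluated only once.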

Rosuav · December 15, 2022, 11:42am

Python uses words for and, or, and x if cond else y where a lot of C-like languages use symbols &&, ||, and cond ? x : y. So the ?? operator is a bit of an odd fit here. On the other hand, Python uses punctuation for . and [], so it makes full sense to go with ?. and ?[]. If someone wants to propose an alternate spelling for ??, go for it; personally, I can’t think of anything better, so the slight oddity won’t be all that bad. Also, there’s no concept of x and= y but we do have x *= y, so being able to write x ??= y is a win for the punctuation spelling.

That said, though, PEP 505’s semantics are NOT the same as the semantics in other languages, so this is still going to be something to learn, regardless of the spelling used.