Add None coalescing operator in Python

fancidev · December 15, 2022, 2:17pm

By lack of readability I mean “??” is too terse. I prefer the more verbose (current) version.

The comparison to (x + y) is not helping because addition is a well-known operator that everyone learned in primary school.

While ?? and ?. are familiar from some other languages (I personally got to know them from a “recent” edition of C#, maybe C# 6?), I find the operation too specific to IT professionals, and not friendly to the casual reader. I believe Python’s advantage (and beauty) lies in the fact that one doesn’t have to be an IT professional to read and write Python (compared to e.g. C or JavaScript). I hope Python can keep this advantage as it evolves.

vovavili · December 15, 2022, 2:35pm

I think ?. and ?[] could be a parameter of sorts:

override = coalesce(override, fallback, atom=True).name

Method chaining is a fair point. R solves this problem with a pipe operator %>% which does not exist in Python.

coalesce(override, fallback, atom=True).name %>% coalesce(., default)
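For reference, a function-based spelling could be sketched roughly as follows. The name coalesce comes from the post above, but the signature is an assumption, and the atom=True parameter is left out because its intended semantics are not spelled out in the thread. One difference from an operator is worth noting: a function receives all of its arguments already evaluated, so there is no short-circuiting.

```python
def coalesce(*values):
    """Return the first argument that is not None, or None if all are.

    Hypothetical helper: the signature is an assumption, not an existing API.
    """
    for value in values:
        if value is not None:
            return value
    return None


override = None
fallback = "default-name"
print(coalesce(override, fallback))  # default-name
print(coalesce(0, 5))                # 0, because 0 is not None
```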

Rosuav · December 15, 2022, 3:04pm

Okay, finally something that can be reasonably debated!

So what you’re saying is that you prefer x if x is not None else y over x ?? y. That’s perfectly reasonable, but I disagree, partly because THAT much verbosity is extremely annoying, and partly because it forces you to write x twice - not a big deal if it’s a simple variable lookup, but it does make it harder to use when the left side is a function call or something.
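The point about writing x twice can be illustrated with a small sketch. Here lookup() is a hypothetical stand-in for any function call on the left side, and the counter exists only to show how many times it runs.

```python
calls = 0


def lookup():
    """Hypothetical stand-in for an expensive or side-effecting call."""
    global calls
    calls += 1
    return 42


default = 0

# Conditional expression: the left side is written, and evaluated, twice.
result = lookup() if lookup() is not None else default
calls_conditional = calls  # lookup() ran twice

# An assignment expression (Python 3.8+) avoids the second call,
# at the cost of a temporary name.
calls = 0
result_walrus = tmp if (tmp := lookup()) is not None else default
calls_walrus = calls  # lookup() ran once

print(result, calls_conditional)
print(result_walrus, calls_walrus)
```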

Maybe, but I have seldom seen a programmer have trouble with extending that to x ** y, which I don’t recall learning in primary school (exponentiation was done with superscripts, but never a double asterisk), and even multiplication and division aren’t spelled the way I learned them in my youth (x * y vs x × y or simply xy). The “modulo” or “remainder” operator in programming, which varies in meaning from language to language, doesn’t really even exist in mathematics - but it’s not a problem to have x % y with an operator.
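As a small illustration of how the remainder operator varies, Python's % takes the sign of the divisor, while math.fmod follows the C convention and takes the sign of the dividend. The variable names are just for this sketch.

```python
import math

python_mod = -7 % 3           # 2: result has the sign of the divisor
c_style_mod = math.fmod(-7, 3)  # -1.0: result has the sign of the dividend
power = 2 ** 10               # 1024: exponentiation spelled with **

print(python_mod, c_style_mod, power)
```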

So I suspect the “familiarity” argument is far less about grade-school mathematics (which really only covers addition and subtraction), and more about what we’re accustomed to from other programming languages.

I’m not sure there’s as much difference as you might think. Simple features are pretty easy to use (“Python as a calculator” is a great tool - just fire up the REPL and type expressions to be evaluated, no programming knowledge needed), but to be able to read and write arbitrary Python code, you still need to be at least broadly familiar with a good number of concepts. The barrier-to-entry is notably higher in C (though I wouldn’t say it’s all that much higher in JS), but the upper reaches of the language are going to still need some programming skill. For instance, I wouldn’t expect a non-programmer to understand this:

await asyncio.gather(*[cancel_task(t) for t in tasks])

A None-coalescing operator wouldn’t be something that you need for Python-as-a-calculator, and it’s far FAR less to get your head around than all the concepts of async/await (and asynchronicity in general).

I would reword the strength you’re describing. Rather than being “one doesn’t have to be an IT professional to read and write Python”, I would say, instead, that “a non-programmer can become a Python programmer in less time” (than, say, a C programmer). If you take someone who isn’t a programmer (say, a research scientist) and invite him/her to learn some Python in order to be able to better analyze the raw data, how much time would that take? How many days of research get sacrificed to the initial learning process, in order to get this benefit?

Obviously it’s impossible to put a simple figure on this, as it depends on the person’s background and the level of code complexity needed, but I would say that Python still has a quite considerable advantage here - partly because of the immense expressiveness of the language. We are not restricted to just what we can intuitively understand from grade school; operators like matrix multiplication are utterly meaningless to someone who’s just finished fourth-grade arithmetic, but are incredibly useful to a scientist who expresses concepts in matrices because it’s the most natural form for them.

We have a symbol for matrix multiplication because it is useful, not because it is pre-known by every single potential programmer. A None-coalescing operator isn’t taught in grade school, but that doesn’t mean it’s not useful.

vovavili · December 16, 2022, 4:29am

Well, researchers in bioinformatics learn regular expressions for their research. Most people learn the basics of regex quickly and learn advanced regex concepts like lookahead assertions and backtracking shortly after. And yet, I would still consider regular expressions without f-strings and re.VERBOSE to be borderline unreadable write-only code. So ease of acquisition and readability are not necessarily coextensive.

Rosuav · December 16, 2022, 4:45am

Agreed; a regex without re.VERBOSE is an exercise in compactness, but not particularly readable. That said, though, the expressiveness of a simple regex is quite good - it’s only really when they get overly complicated that they become hard to read, and that’s what re.VERBOSE is great at handling.
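To make the comparison concrete, here is the same simple pattern written both ways. The date pattern and sample text are made up for illustration; only re.compile, re.VERBOSE, and search come from the standard library.

```python
import re

# Compact form: an ISO-style date such as 2022-12-16.
compact = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

# The same pattern with re.VERBOSE: whitespace in the pattern is ignored
# and "#" starts a comment, so each piece can be labelled.
verbose = re.compile(
    r"""
    (\d{4})   # year
    -
    (\d{2})   # month
    -
    (\d{2})   # day
    """,
    re.VERBOSE,
)

text = "posted on 2022-12-16"
print(compact.search(text).groups())
print(verbose.search(text).groups())
```

For a pattern this small the compact form is arguably fine; the verbose form starts to pay off as the pattern grows.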

steven.daprano · December 16, 2022, 9:44am

Is it more terse than **, //, <= and the other dozen or so operators we use in Python? Do you find them “unreadable” too?

If it is no more terse than the other operators you accept, then it isn’t the terseness that you object to. It must be something else.

How about operators like &, |, %, ^? I didn’t learn about bitwise operators in primary school, or even secondary school. They are twice as terse as the ?? operator. Do you dislike them twice as much?

Programming languages are not designed for the casual reader. The casual reader might, just barely, grok functions from maths class in school, but they won’t grok async, type declarations, globals and locals, closures, zip, map, regexes, Unicode, classes, imports, context managers, exceptions, etc.

The beauty of Python is that it is accessible to casual programmers. You don’t have to use null-coalescing operators any more than you have to write classes, or use closures, or use threads.

But we didn’t let those casual programmers stand in the way of Python getting classes, closures, threads, async, regexes etc. Let the casual programmers continue to write using the basic features, and the power users use the power features.

vovavili · December 16, 2022, 5:29pm

I feel like mathematical notation by convention is taught from very early on to be extremely terse, and bitwise operations are effectively mathematical binary operations with a modified syntax.

Rosuav · December 16, 2022, 7:28pm

Are you saying that, because of that, mathematical operators are readable while terse, but other operators are unreadable while terse? If so, please explain to me the readability of the Willans formula for the nth prime number, which uses standard mathematical notation - plug in a number n and you get back the nth prime number, guaranteed! Try converting that into Python code and tell me whether it’s more readable in the terse mathematical form, or in a wordier form. And then, based on that, explain your above statement and how it affects a None-coalescing operator that has nothing to do with mathematics.
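For anyone who wants to try the exercise, one possible conversion is sketched below. The function names are hypothetical, and there is one deliberate substitution: the formula's floor(cos²(π((j-1)! + 1)/j)) term is replaced by the exact integer test from Wilson's theorem that it encodes (it is 1 exactly when j is 1 or prime), since evaluating the cosine in floating point would be unreliable. The outer root uses floats, so this is only meant for small n.

```python
from math import factorial, floor


def is_one_or_prime(j):
    """Wilson's theorem: j divides (j-1)! + 1 exactly when j is 1 or prime."""
    return (factorial(j - 1) + 1) % j == 0


def willans(n):
    """Return the nth prime via the structure of the Willans formula."""
    total = 0
    count = 0  # running inner sum: how many j <= i are 1 or prime
    for i in range(1, 2 ** n + 1):
        count += is_one_or_prime(i)
        # The term is 1 while count <= n and 0 afterwards.
        total += floor((n / count) ** (1 / n))
    return 1 + total


print([willans(n) for n in range(1, 7)])
```

The loop runs 2**n times and computes a factorial at each step, so it is hopelessly slow beyond small n; it is a curiosity rather than a practical prime generator.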

I don’t think there’s anything more going on here than the common phenomenon of familiarity. What you are already accustomed to is ALWAYS going to sit better in your brain than something you are not accustomed to. There’s nothing wrong with that phenomenon; just, please acknowledge it for what it is. Unless there really is something magical about mathematics, that’s not the factor here.

vovavili · December 17, 2022, 2:52am

No, I feel like mathematical notation is unreadable by a very established convention that we are introduced to from early school arithmetic onward. This is why most people are afraid of college-level mathematics, and why popular culture has so much respect for the few physicists and mathematicians with natural mathematical talent, like Albert Einstein or Kurt Gödel. Capital-sigma notation, for example, is pretty much nothing but a terse for loop that would be a two- or three-liner in Python if spelled out in a readable way.
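As an illustration of that last point, here is the sum of i squared for i from 1 to n written out as a loop, next to the built-in sum(). The choice of i squared and n = 10 is arbitrary.

```python
n = 10

# Spelled out as a loop: add up i**2 for i = 1..n.
total = 0
for i in range(1, n + 1):
    total += i ** 2

# The built-in sum() with a generator expression stays closer to the notation.
total_builtin = sum(i ** 2 for i in range(1, n + 1))

print(total, total_builtin)
```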

A person who does not believe that mathematical notation is in any way sui generis is probably better off stating his case for Perl and Haskell than for Python.

Rosuav · December 17, 2022, 4:08am

So how far does that convention go?

Addition and subtraction: x+y
Multiplication: x*y
Division: x/y
Modulo/remainder operator: x%y
Exponentiation: x**y
Bitwise operators: x&y
Conditional operators: x||y
None-coalescing operator: x??y
Subscripting: x[y]
Comparisons: x == y
Assignment: x = y

Please tell me which of these are part of the “very established convention”, which ones are reasonable extensions to that convention, and which ones are not. All of them use terse punctuation in Python. I would say that addition is the only one that really counts as part of the established convention, with all the others being some degree of extension from that.

But mathematics is all about those kinds of extensions. We can count; but reversing the effect of counting might get us to meaningless numbers, so we extend numbers to include zero and negatives. We can multiply; but reversing that can lead to non-integers, so we extend numbers to include rationals. We can square numbers, but finding the square root might not work in all cases, so we extend numbers to include complex numbers. The Riemann zeta function diverges unless x>1, so we use analytic continuation to figure out what its value should be for other x.

I’m not sure about you, but I never learned about “the modulo operator” in grade school. Does that mean it’s bad, and we should instead have a mod(x, y) function? No! It’s a very useful operator. I certainly didn’t learn about assignment or comparison operators in algebra, yet we absolutely would not want to get rid of those. They are extensions to that original “very established convention”, but that convention is already a massive pile of extensions upon extensions just to get us the concepts that we grew up with.

vovavili · December 17, 2022, 4:49am

I am not saying that the notation that Python has is well established in mathematics, I am saying that terse notation in general is a well-established convention in mathematics. People are socialized from early on to express arithmetic operations concisely.

Rosuav · December 17, 2022, 4:54am

Yep. And I’m asking you: how far does that lead? See my list of examples. How many of them have their terseness justified by arithmetic?

vovavili · December 17, 2022, 5:46am

Addition and subtraction, multiplication, division, the modulo/remainder operator, and exponentiation are terse because arithmetic is terse; bitwise operators and conditional operators are terse because notation for algebraic binary operations is terse; comparisons and assignment have been ubiquitous since Fortran in almost all languages except R (<-). So most terse operators in Python have a very strong readability-related reason to be terse, namely that they extend a highly established convention, usually a mathematical one.

Rosuav · December 17, 2022, 6:30am

So arithmetic is terse, therefore the concept “if this is empty then use that” is allowed to be terse (spelled x || y), but “if this is None then use that” is not allowed to be terse. Am I understanding you correctly?
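The distinction between the two fallbacks can be shown directly. In Python the first one is spelled x or y; the second is written here as the conditional expression that x ?? y is described as replacing in this thread. The helper names are made up for the sketch.

```python
def or_default(x, y):
    """What `x or y` does: fall back whenever x is falsy (None, 0, '', [] ...)."""
    return x or y


def none_default(x, y):
    """Fall back only when x is None, as the proposed operator is described."""
    return x if x is not None else y


for value in (None, 0, "", [], 7):
    print(repr(value), or_default(value, 10), none_default(value, 10))
```

The two agree for None and for truthy values, and differ for falsy values such as 0 or an empty string, which `or` replaces and the None check keeps.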

Be honest here. Is it REALLY arithmetic that’s justifying your arguments, or is it actually familiarity?

vovavili · December 17, 2022, 6:51am

Nope, I am not thinking in terms of what is and is not allowed. All I claim is that terse arithmetic notation is usually perceived to be more readable, but this is because mathematical notation in general occupies a unique place. Outside this domain, people usually don’t find terse notation easy to read.

Rosuav · December 17, 2022, 7:35am

Okay. But then you claim that x || y is readable, but x ?? y isn’t? Or are you claiming that Python is already unreadable?

vovavili · December 17, 2022, 7:52am

I would imagine that people who do bitwise operations are quite at home with truth tables and mathematical logic notation, so there is no sacrifice of readability at all with these terse operators, both for the coder and the reader. A person who doesn’t understand Boolean algebra will not understand bitwise shifts, and a person who understands Boolean algebra will be at ease with using terse notation to describe bitwise shifts.

Of course, I am going from my gut feeling, so take my judgement with a pinch of salt.