None - a "billion dollar mistake", as null is?

alexpovel · February 20, 2024, 4:19pm

I just came across the returns package. Its documentation states:

None is called the worst mistake in the history of Computer Science.

which went against my understanding of things. I always saw Python’s None and other languages’ null (C++, C#, Java, …) as totally different, and would like to sanity check my understanding. As far as I know, the “billion dollar mistake” quote is strictly about null references, not about None-like language constructs. In that context, the documentation’s claim would be mistaken, and I’d love to hear other opinions on this!

So, None in Python is mostly sane and safe. A guard of e.g. isinstance(some_obj, SomeType) will reliably protect against “null dereference”: once passed (isinstance is True), there’s a guarantee the correct type is at hand. None is safely excluded, as it is entirely outside whatever type hierarchy we’re inside of:

x = 42
NoneType = type(None)
if not isinstance(x, NoneType):
    pass

Dereferencing anything that is not NoneType is safe in the sense that no AttributeError on NoneType is raised, which seems closest in spirit to null dereference exceptions (but is totally different still, as Python doesn’t have a concept of null pointers). In the null family of languages, null is a valid value common to all reference (not value) types; so even after checking for “is some_obj an instance of SomeType”, some_obj might still blow up as null on dereference. This is not the case for Python (although it might blow up for other reasons, like accessing a non-existing member); e.g., in a properly type-checked code base,

def set(x: SomeType) -> SomeType:
    x.some_member = 42
    return x

will never blow up for reasons of None, whereas similar constructs might very well in null languages (despite these usually being statically typed by nature already).
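To make that concrete, here is a rough sketch of how None has to be opted into and handled before such a function is called. The names `SomeType`, `set_member` and `maybe_set_member` are made up for illustration, and the comments about the type checker assume a checker such as mypy is actually run on the code:

```python
from typing import Optional


class SomeType:
    some_member: int = 0


def set_member(x: SomeType) -> SomeType:
    # The annotation excludes None: a type checker only accepts SomeType here.
    x.some_member = 42
    return x


def maybe_set_member(x: Optional[SomeType]) -> Optional[SomeType]:
    # None must be requested explicitly in the signature and handled before use.
    if x is None:
        return None
    # After the `is None` check, a checker treats x as plain SomeType.
    return set_member(x)


result = maybe_set_member(SomeType())
assert result is not None
print(result.some_member)      # 42
print(maybe_set_member(None))  # None
# set_member(None)  # a type checker would reject this call; at runtime it raises AttributeError
```

Without a type checker in the loop, nothing stops the commented-out call from running, so the guarantee is only as strong as the tooling around the code.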

So how do you feel about the “billion dollar mistake” quote in the context of Python and its None? Is it applicable?

Rosuav · February 20, 2024, 4:21pm

For starters, I disagree that null is a billion dollar mistake. Bad behaviour around dereferencing nulls could be considered that (although, frankly, I think the cost is overblown - there are FAR worse problems out there), but nulls themselves are definitely not the problem here.

So, that said, I agree with you that Python’s None is definitely not a major problem, but not that it’s materially different from nulls.

MegaIng · February 20, 2024, 4:35pm

“billion dollar mistake” only applies within the context of statically typed languages where languages like C and especially Java give you the illusion of safety. The fact that in Python anything might be anything requires you to have a lot more checks in place.

Don’t forget that within Python, isinstance(x, DataType) doesn’t guarantee you anything if DataType isn’t a builtin. You don’t know what attributes x might have: someone might have deleted them, or could have created an instance without calling __init__, … Python’s dynamic typing by itself is already the “billion dollar mistake”.

bschubert · February 20, 2024, 5:21pm

The “billion dollar mistake” isn’t the existence of null/None; it’s having all references be implicitly nullable.

As you pointed out, this is not the case in (statically type-checked) Python. None isn’t compatible with every other type in the same way that null is in other languages. So in that sense, Python didn’t make the “billion dollar mistake.”

p.s.: regarding isinstance(some_obj, SomeType) and isinstance(some_obj, NoneType), those actually aren’t reliable ways to check for None. It’s possible to have isinstance(None, SomeType) be True even when SomeType is unrelated to NoneType, and it’s possible to have isinstance(x, NoneType) be True even when x is not None. The proper way to check for None or other sentinels is with is/is not, as recommended in PEP 8.
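For illustration, both failure modes can be reproduced in a few lines. This is a contrived sketch: `AcceptsEverythingMeta`, `SomeType` and `Impostor` are made-up names, and code rarely does this on purpose (one real-world case is unittest.mock, where a mock created with a spec reports the spec's class as its `__class__`):

```python
NoneType = type(None)


class AcceptsEverythingMeta(type):
    # A metaclass can override what isinstance() reports for its classes.
    def __instancecheck__(cls, instance):
        return True


class SomeType(metaclass=AcceptsEverythingMeta):
    pass


class Impostor:
    # isinstance() also consults obj.__class__, which can be spoofed.
    __class__ = property(lambda self: NoneType)


impostor = Impostor()

print(isinstance(None, SomeType))      # True, although SomeType is unrelated to NoneType
print(isinstance(impostor, NoneType))  # True, although impostor is not None
print(impostor is None)                # False: identity cannot be fooled this way
```

The `is` comparison only looks at object identity, so neither trick affects it.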

barry-scott · February 20, 2024, 5:22pm

The quote is about NULL not None.

chepner · February 20, 2024, 8:21pm

The “billion dollar mistake” referred specifically to the use of an invalid reference as a special sentinel. From the inventor of said mistake (emphasis mine):

I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn’t resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.[29]

None is not an invalid reference; it’s just the single value of its type, defined specifically to refer to the kind of thing ALGOL used an invalid reference for.

Having a single type stand in for the lack of value of any type is arguably a mistake (though one that has more to do with arguing the merits and drawbacks of dynamic vs static typing), but not to the same extent as leaving open the possibility of, say, a segmentation fault.
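A small sketch of that distinction, using only the standard interpreter (the helper name `access_missing_member` is made up for illustration): None is an ordinary object with its own type, and misusing it raises a regular, catchable exception instead of touching invalid memory.

```python
NoneType = type(None)

print(NoneType() is None)        # True: the type has exactly one instance
print(isinstance(None, object))  # True: None is a regular object


def access_missing_member(obj: object) -> str:
    # Accessing an attribute that None lacks raises AttributeError,
    # which can be caught like any other exception.
    try:
        obj.some_member  # type: ignore[attr-defined]
    except AttributeError as exc:
        return f"AttributeError: {exc}"
    return "no error"


print(access_missing_member(None))
```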

USSX-Hares · February 20, 2024, 9:13pm

That’s why modern languages with powerful type systems (Rust, Scala, Haskell) avoid null wherever possible (Scala IDEs even show warnings when null is present in the code). They use a special type-guard Option[T] (or sometimes Optional[T]) which can be Some[T] or None[T], where the latter is a singleton constant. The key difference between these type guards and Python’s Optional is that they explicitly allow no-value, whereas the latter implicitly allows it everywhere and behaves more like Java’s @Nonnull and @CheckForNull.
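To show the shape of such a wrapper in Python terms, here is a hand-rolled sketch. `Some`, `Nothing`, `Option`, `RELEASE_YEARS`, `find_year` and `describe` are hypothetical names invented for this example; they are not the API of Rust, Scala, Haskell or any Python library:

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")


@dataclass(frozen=True)
class Some(Generic[T]):
    # Wraps a value that is present.
    value: T


@dataclass(frozen=True)
class Nothing:
    # Stands for "no value"; carries no data.
    pass


# The absence of a value is part of the type, so callers must unwrap explicitly.
Option = Union[Some[T], Nothing]

RELEASE_YEARS = {"Titanic": 1997}


def find_year(title: str) -> Option[int]:
    year = RELEASE_YEARS.get(title)
    return Nothing() if year is None else Some(year)


def describe(result: Option[int]) -> str:
    # The value is only reachable after checking which variant we hold.
    if isinstance(result, Some):
        return f"released in {result.value}"
    return "unknown"


print(describe(find_year("Titanic")))       # released in 1997
print(describe(find_year("Unknown Film")))  # unknown
```

The wrapped value cannot be used as an int by accident: it has to be taken out of `Some` first, which is the explicitness the languages above enforce at compile time.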

alexpovel · February 21, 2024, 9:12am

Thanks for all the replies! I think the core, yet simple insight is:

I do love me some Option<T> with its None, it feels great with pattern matching. The part (slightly offtopic to the original question…) I never fully grasped so far is why Option<T> is generally treated as the holy grail (don’t get me wrong, it’s great), when Python 3.10+ with comparatively simple, imperative tools gets most of the way there as well:

from dataclasses import dataclass


@dataclass
class Movie:
    name: str
    year: int


MaybeMovie = Movie | None  # A union of a type and a *concrete value* of a type (`NoneType`) 🤔
# type MaybeMovie = Movie | None  # This Python 3.12 syntax not yet supported in mypy 1.8


def check(maybe_movie: MaybeMovie) -> str:
    match maybe_movie:
        case Movie(name=name, year=year):
            return f"Got movie {name} from {year}"
        case None:
            return "Got nothing"


maybe_movies: list[MaybeMovie] = [
    None,
    Movie(name="Titanic", year=1997),
]

for maybe_movie in maybe_movies:
    print(check(maybe_movie))

This gets surprisingly close to e.g. Rust in style, looks, but also semantics. It passes mypy --strict. Removing a match arm causes type checks to fail (although adding bogus and unreachable cases also passes mypy; it’s also only a warning, not an error, in Rust by default, so seems fair enough).

Note that this is explicit as well! MaybeMovie and Movie are distinct, and we opted into accepting None in check’s signature, as we’d do with Option<Movie>. We cannot forget to check for None without failing type checks. Of course, guarantees in Rust and friends are much stronger; code will not compile at all if one is forgetful, whereas limitations or bugs in mypy (an optional, external tool no less) are always possible. Mypy lagging behind Python syntax (type statement) is a good example.

Anyway. I feel like Python doesn’t get enough credit in this area, and is facing an uphill battle. Maybe in another year or 10…