Why does Python have variable hoisting like JavaScript?

Python only creates lexical scopes for functions, classes and modules
(comprehensions get a scope of their own too).
As for why, I guess it is for homogeneity: unlike ALGOL-like languages,
there’s no way to create a block/scope on the fly anyway. The for
loop (and later with) is one of the few places where a name is bound
without an explicit assignment statement, so it behaves as if it is
assigning the name in the current scope on every iteration.
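
A minimal sketch of what that means in practice. The function names here are made up for illustration; the point is only that the loop target ends up as an ordinary local of the enclosing function:

```python
def last_seen(items):
    # The loop target `item` is bound in the function's scope, not in a
    # scope private to the loop body.
    for item in items:
        pass
    # Still bound here, holding the value from the final iteration.
    return item


def loop_locals():
    for i in range(3):
        doubled = i * 2  # assigned inside the loop body
    # Both names are plain locals of the function once the loop is done.
    return sorted(locals())


print(last_seen([10, 20, 30]))  # 30
print(loop_locals())            # ['doubled', 'i']
```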

As for why it stays this way: backward compatibility.
A lot of code would break if we changed the semantics now;
static analysis would be a more suitable way to enforce what you want.

It is not the job of the compiler to catch every imaginable error. There
are many circumstances where the human programmer knows better than the
compiler that something that looks like an error is not. For example:

I may know that a always contains at least one prime number. I might
know that f is only called inside a try…except block that will catch
the exception.

Or I might know that the function can fail, but don’t care, because the
code is only used in circumstances where a failure doesn’t matter.

Or I might be working under a development model based on the idea of
software contracts: the function requires a non-empty list containing
at least one prime, and if the caller provides something else, the bug
is in the caller’s code, and is not the responsibility of f.

So it is not clear at all that the interpreter should refuse to
compile the above function, or refuse to allow it to execute. It would
be a poor user experience for the interpreter to insist that I fix
non-bugs that can never happen before I can run the code.

In a very practical sense, I often write and run code which I know is
“buggy” in some purely pedantic sense. I have the interactive
interpreter open, I write a quick function to test some behaviour, or to
compute some result, and run it once only to discard it afterwards. Code
just like Guido’s f above. I know it is “buggy” in the sense that it
doesn’t handle an empty list or a list with no primes correctly, and I
literally don’t care because it works under the circumstances I am
calling it, and that’s all that matters.

Fundamentally, Python is a scripting language which is excellent for
this sort of thing, and making the interpreter strictly refuse to
compile or execute “buggy” code would be a huge regression.

Anyway, on my bike ride today I realized that the key here is that Python uses function scopes – unlike C++, any variable defined anywhere in a function has that whole function as its scope.

Here is another common pattern where Python benefits from this:

def f():
    if foo():
        x = bar()
    else:
        x = baz()
    do_something(x)

Supporting this is a necessary part of Python not having separate variable declarations. You just assign it, and it’s a local you can use :magic_wand:.
It doesn’t matter to the compiler on which code path the assignment happens, or whether it happens at all (but in the latter case UnboundLocalError catches the bug).
If Python had block scope, like C / C++, the code above would define two separate x variables, neither of them accessible from do_something(x) :confused:. In such a language you’re forced to have a separate declaration line to clarify which scope you intended:

    SomeType x;  // separate declaration line needed!
    if (foo()) {
        x = bar();
    } else {
        x = baz();
    }
    do_something(x);

Consider also Go, which is a good example of a block-scoped language where this is mostly painless thanks to syntax sugar that declares and initializes in the same line, x := bar().
But semantically that’s still declaration + assignment, and logic like the above forces you to split them up:

    var x SomeType  // separate declaration, as in the C version
    if foo() {
        x = bar()  // `=` is assignment to an existing variable
    } else {
        x = baz()
    }
    do_something(x)
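
For comparison, back in Python: a small sketch of the "assignment never happens" case. The name pick is hypothetical. The name x is a local of the function because it is assigned somewhere in the body, and the missing binding only surfaces at runtime, on the path that skips the assignment:

```python
def pick(flag):
    # `x` is a local of `pick` because it is assigned somewhere in the body,
    # even though the assignment only runs on one code path.
    if flag:
        x = "assigned"
    return x


print(pick(True))  # assigned

try:
    pick(False)
except UnboundLocalError as exc:
    # The name is known to be local, but nothing was ever bound to it.
    print(type(exc).__name__)  # UnboundLocalError
```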

[P.S. Technically, all forms of “binding” (assignment, for loops etc.) also tell the compiler “this name is a local”. Arguably that’s also syntax sugar for a top-of-function declaration, just sweeter :yum:. And in the rare case when you want to rebind a variable that lives outside the function, you do need an “anti-declaration”: global / nonlocal.]