I’m currently reading the Class Definition Syntax section of the Tutorial to learn some basic Python grammar, especially how Python creates and resolves the binding between a name and an object.
Please look at the following code:
def C():
    breakpoint()  # 1
    class C:
        breakpoint()  # 2
        C
C()
According to the tutorial:
When a class definition is entered, a new namespace is created … …
When a class definition is left normally, a class object is created.
This is basically a wrapper around the contents of the namespace created by the class definition.
The original local scope (the one in effect just before the class definition was entered) is reinstated, and the class object is bound here to the class name given in the class definition header.
When Python arrives at breakpoint-2, it needs to resolve the name C.
Here’s what I understand about scope (environment) chains at that time:
The local scope of function C points to the global scope, so although there’s no name C in the local scope when Python reached breakpoint-1, it can still find C in the global scope.
When Python reached breakpoint-2, the class object was not bound to its name yet (according to the tutorial).
Python wanted to resolve the name C, so
it first consulted the current local scope,
then it looked in the outer scope created by the invocation of the function C,
finally the C was found in the global scope. It turned out to be the name of function C.
However, Python instead throws an error when I run the above code:
NameError: cannot access free variable 'C' where it is not associated with a value in enclosing scope
Am I missing something?
I asked this question to learn about Python’s scope and name resolution mechanism, not to solve a practical problem, so that’s all the background of this question.
Python 3.12.2 (tags/v3.12.2:6abddd9, Feb 6 2024, 21:26:36) [MSC v.1937 64 bit (AMD64)] on win32
So… a name (function C) can be shadowed by one (class C) that was bound to nothing (?)
This sounds strange, really.
The name C is first registered in the local scope created by the invocation of the function C, then Python executes the class definition, and finally it binds the name to the class object.
Is this how Python works?
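A small sketch may make this concrete. Scoping is decided when the function is compiled, so it can be inspected without running the body. The names D and E below are made up purely for contrast and are not from the original code; `co_cellvars` lists a function's locals that nested scopes refer to.

```python
def C():
    class C:          # the class statement assigns to `C` inside the function
        C             # so this refers to the function's own, still unbound, `C`


def D():
    class E:          # hypothetical contrast: nothing in D assigns to `C`
        found = C     # so here `C` resolves to the global function
    return E.found


# Inspect the compiled code objects instead of calling C().
is_cell_in_C = "C" in C.__code__.co_cellvars
is_cell_in_D = "C" in D.__code__.co_cellvars
print("C is a cell variable of function C:", is_cell_in_C)
print("C is a cell variable of function D:", is_cell_in_D)

# Calling C() fails inside the class body; calling D() succeeds.
try:
    C()
    raised = False
except NameError as exc:
    raised = True
    print("NameError:", exc)

print("D() returned the global function:", D() is C)
```

So the name is not "registered" at run time by the invocation; the compiler has already marked it as belonging to the function's scope because the class statement assigns to it there.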
In Python, variables that are only referenced inside a function are implicitly global. If a variable is assigned a value anywhere within the function’s body, it’s assumed to be a local unless explicitly declared as global.
So a distinction should be made between referencing a name and assigning to it. In the case of def f(): a = a + 1, the assignment to a (local) is computed using a (local), which is not bound yet.
As for why it’s so (from the referenced documentation):
Though a bit surprising at first, a moment’s consideration explains this. On one hand, requiring global for assigned variables provides a bar against unintended side-effects. On the other hand, if global was required for all global references, you’d be using global all the time. You’d have to declare as global every reference to a built-in function or to a component of an imported module. This clutter would defeat the usefulness of the global declaration for identifying side-effects.
Of course, Python doesn’t prevent shooting yourself in the foot so one can be “clever” with this:
>>> a = 1
>>> def change():
...     a = a + 1
...
>>> a
1
>>> change()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in change
UnboundLocalError: cannot access local variable 'a' where it is not associated with a value
>>> a = [1]
>>> def change():
...     a[0] += 1
...
>>> a
[1]
>>> change()
>>> a
[2]
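For comparison, here is a rough sketch of the three cases side by side: only referencing, assigning with an explicit global declaration, and assigning without one. The names counter, read_only, rebinding and broken are invented for illustration. The code objects show that the decision is made at compile time: a name the function assigns to ends up in `co_varnames`, while a name it only references ends up in `co_names`.

```python
counter = 1


def read_only():
    return counter + 1       # only referenced: looked up as a global


def rebinding():
    global counter           # explicit declaration: the assignment targets the global
    counter = counter + 1


def broken():
    counter = counter + 1    # assigned without `global`: local for the whole body


# The compiler has already classified the name in each function.
print("read_only locals:", read_only.__code__.co_varnames)
print("rebinding locals:", rebinding.__code__.co_varnames)
print("broken locals:   ", broken.__code__.co_varnames)

try:
    broken()
except UnboundLocalError as exc:
    print("UnboundLocalError:", exc)
```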
def C():
    def __class_scope_C__():
        C
    C = __class_scope_C__()
C()
And this code does throw the same NameError. The Python compiler sees that C is assigned in the outer function scope and therefore makes it a local variable there. Then inside the class scope the name C isn’t local, so it’s looked up in the enclosing lexical scope, where it finds that local variable, which has not been assigned a value yet at that point.
An almost exact desugaring of a class definition in this case is something like this:
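As a minimal sketch of the same effect with plain functions (the names outer_unbound, outer_bound and body are made up for illustration), the only thing that changes between the failing and the working version is whether the enclosing local has been bound before the inner scope reads it:

```python
def outer_unbound():
    def body():
        return C             # free variable: refers to outer_unbound's local C
    C = body()               # body() runs before C has a value -> NameError
    return C


def outer_bound():
    C = "bound first"        # the enclosing local is bound before body() runs
    def body():
        return C
    return body()


try:
    outer_unbound()
except NameError as exc:
    print("NameError:", exc)

print(outer_bound())
```

In neither function is a global C ever consulted, because the assignment in the outer function already made C a local of that function.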
def C():
    def C():
        __module__ = __name__
        __qualname__ = 'C.<locals>.C'
        C
    C = __build_class__(C, 'C')
Where __build_class__ is a builtin function that does actually exist (but is undocumented and internal to CPython). It calls the first callable passed in to run the class body, with the corresponding class namespace as the locals. This transformation can’t be expressed with normal Python code, because the compiler treats the class scope slightly differently. The above code doesn’t quite work: the inner assignments aren’t put into the class namespace, because they would need to use the opcode STORE_NAME (which stores into the locals dictionary) instead of STORE_FAST (which uses a faster index-based system), and that can’t be forced with normal Python code.
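The opcode difference can be observed with the dis module. The following is only a sketch: the class K and function f are hypothetical, and exact opcode names are a CPython implementation detail that can vary between versions, so the check for the function body only looks for opcodes starting with STORE_FAST.

```python
import dis
import types

source = """
class K:
    x = 1

def f():
    x = 1
"""

# Compile without executing, then pull out the nested code objects by name.
module_code = compile(source, "<sketch>", "exec")
code_objects = {
    const.co_name: const
    for const in module_code.co_consts
    if isinstance(const, types.CodeType)
}

class_ops = {ins.opname for ins in dis.get_instructions(code_objects["K"])}
func_ops = {ins.opname for ins in dis.get_instructions(code_objects["f"])}

# Class bodies store names into a namespace mapping; function bodies use fast locals.
class_uses_store_name = "STORE_NAME" in class_ops
class_uses_store_fast = any(op.startswith("STORE_FAST") for op in class_ops)
func_uses_store_fast = any(op.startswith("STORE_FAST") for op in func_ops)

print("class body uses STORE_NAME:", class_uses_store_name)
print("class body uses STORE_FAST:", class_uses_store_fast)
print("function body uses STORE_FAST:", func_uses_store_fast)
```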