The behavior is documented under Naming and binding:
The scope of names defined in a class block is limited to the class block; it does not extend to the code blocks of methods
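As a quick illustration of that rule, here is a minimal sketch; the method name via_class and the string values are made up for this example and are not part of the question:

```python
a = "module-level"


class A:
    a = "class-level"

    def m(self):
        # The class body's `a` is not visible here; the bare name
        # resolves to the module-level global instead.
        return a

    def via_class(self):
        # Class attributes have to be reached explicitly, through the
        # class or through an instance.
        return A.a, self.a


result_bare = A().m()
result_attr = A().via_class()
print(result_bare, result_attr)
```

The bare name in m() finds the global, while attribute access on the class or instance finds the value bound in the class body.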
Scoping rules are resolved at compile time, not at runtime, so there are no code object flags for it. Specifically, this is a function of the symbol table, which is where the compiler looks for names in enclosing scopes when a name is not bound in the current function.
You can access Python’s symbol tables with the symtable module:
>>> from symtable import symtable
>>> tab = symtable("""\
... a = 1
... class A:
...     a = 2
...     def m():
...         assert a == 1
...     m()
...     assert a == 2
... """, "<stdin>", "exec")
>>> tab
<SymbolTable for top in <stdin>>
>>> tab.get_type()
'module'
>>> tab.lookup("a")
<symbol 'a'>
>>> tab.lookup("A").is_namespace() # class A has a namespace
True
>>> A_tab = tab.lookup("A").get_namespace()
>>> A_tab.get_type()
'class'
>>> A_tab.lookup("m").is_namespace() # so does the A.m function
True
>>> A_m_tab = A_tab.lookup("m").get_namespace()
>>> A_m_tab.get_type()
'function'
You can use the SymbolTable.lookup() method to find where names come from when looked up in different scopes. This returns a Symbol instance which you can use to figure out where a name came from, or to find nested scopes (with the Symbol.get_namespace() method I used above).
For a closure to be created, a child scope must reference a variable that is sourced from a parent scope other than the module-level (global) scope. Such references are marked as free, because this needs to be recorded in the code object for that scope. You’ll note that the name a in the method body is not marked as free, even though it is not a local variable:
>>> A_m_tab.lookup("a").is_free()
False
>>> A_m_tab.lookup("a").is_local()
False
If it is not a free variable and it is not a local, it must be a global; the A.a name is not available to the nested scope of the m function. And indeed, "a" in that scope is a global instead:
>>> A_m_tab.lookup("a").is_global()
True
In fact, the namespace of m is marked as not nested; free variables can only exist in nested scopes:
>>> A_m_tab.is_nested()
False
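The same checks can be scripted rather than typed at the prompt. The following is a minimal sketch that walks every block of a small source string and reports how the name a is classified in each; the helper scope_report and the label strings are my own, not part of the symtable API:

```python
import symtable

SOURCE = """\
a = 1
class A:
    a = 2
    def m():
        return a
"""


def scope_report(table, name, path=""):
    """Map each block's dotted path to a coarse scope label for `name`."""
    here = f"{path}.{table.get_name()}" if path else table.get_name()
    report = {}
    if name in table.get_identifiers():
        sym = table.lookup(name)
        if sym.is_free():
            label = "free"
        elif sym.is_local():
            label = "local"
        elif sym.is_global():
            label = "global"
        else:
            label = "other"
        report[here] = label
    # Recurse into nested blocks (classes, functions).
    for child in table.get_children():
        report.update(scope_report(child, name, here))
    return report


top = symtable.symtable(SOURCE, "<example>", "exec")
report = scope_report(top, "a")
for block, label in report.items():
    print(block, "->", label)
```

In the class block a is a local, while in the method block it is classified as a global, matching the interactive session above.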
To give you access to this information, the symtable module simply uses the same PySymtable_BuildObject() function that the Python compiler uses. The normal compilation steps are:
- Parse source code into an AST (using PyParser_ASTFromFile() or PyParser_ASTFromString()), then call PyAST_CompileObject() with the result.
- PyAST_CompileObject() uses PySymtable_BuildObject() to produce a symbol table from the AST.
- PyAST_CompileObject() then uses the AST and symbol table to produce bytecode, grouped into code objects per namespace, by calling the compile_mod() function.
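These stages can be loosely mirrored from Python itself. This is only a sketch using the public ast, symtable and compile() entry points as stand-ins; they are not the C functions named above, and symtable.symtable() builds its table from the source text rather than from the AST object created here:

```python
import ast
import symtable
import types

SOURCE = """\
a = 1
class A:
    a = 2
"""

# Stage 1: parse the source text into an AST.
tree = ast.parse(SOURCE, filename="<example>", mode="exec")

# Stage 2: build a symbol table (here from the source text, as a stand-in
# for what the compiler does internally with the AST).
table = symtable.symtable(SOURCE, "<example>", "exec")

# Stage 3: compile the AST into code objects.
code = compile(tree, "<example>", "exec")

# Nested namespaces show up as code objects among the constants of the
# enclosing code object.
nested = [c for c in code.co_consts if isinstance(c, types.CodeType)]
print([c.co_name for c in nested])

# Running the module code object creates the class.
namespace = {}
exec(code, namespace)
```

The class body gets its own code object, stored as a constant of the module's code object.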
It is PySymtable_BuildObject() that determines scopes; it walks the AST that the parsing stage has produced and calls various functions in a visitor pattern to build the symbol table. For classes, the symtable_visit_stmt() function calls symtable_enter_block() with the _Py_block_ty block parameter set to ClassBlock, which informs how it records names.
If you want to track how the symbol table treats the class scope, you could start by searching through symtable.c for ste->ste_type == ClassBlock tests, and focus on the analyze_block() function. The symtable.c source code has helpful comments such as:
/* Analyze raw symbol information to determine scope of each name.
The next several functions are helpers for symtable_analyze(),
which determines whether a name is local, global, or free. In addition,
it determines which local variables are cell variables; they provide
bindings that are used for free variables in enclosed blocks.
There are also two kinds of global variables, implicit and explicit. An
explicit global is declared with the global statement. An implicit
global is a free variable for which the compiler has found no binding
in an enclosing function scope. The implicit global is either a global
or a builtin. Python's module and class blocks use the xxx_NAME opcodes
to handle these names to implement slightly odd semantics. In such a
block, the name is treated as global until it is assigned to; then it
is treated as a local.
The symbol table requires two passes to determine the scope of each name.
The first pass collects raw facts from the AST via the symtable_visit_*
functions: the name is a parameter here, the name is used but not defined
here, etc. The second pass analyzes these facts during a pass over the
PySTEntryObjects created during pass 1.
When a function is entered during the second pass, the parent passes
the set of all name bindings visible to its children. These bindings
are used to determine if non-local variables are free or implicit globals.
Names which are explicitly declared nonlocal must exist in this set of
visible names - if they do not, a syntax error is raised. After doing
the local analysis, it analyzes each of its child blocks using an
updated set of name bindings.
The children update the free variable set. If a local variable is added to
the free variable set by the child, the variable is marked as a cell. The
function object being defined must provide runtime storage for the variable
that may outlive the function's frame. Cell variables are removed from the
free set before the analyze function returns to its parent.
During analysis, the names are:
symbols: dict mapping from symbol names to flag values (including offset scope values)
scopes: dict mapping from symbol names to scope values (no offset)
local: set of all symbol names local to the current scope
bound: set of all symbol names local to a containing function scope
free: set of all symbol names referenced but not bound in child scopes
global: set of all symbol names explicitly declared as global
*/
where we learn that names that are not bound to in a scope start as implicit globals, and
/* Allocate new global and bound variable dictionaries. These
dictionaries hold the names visible in nested blocks. For
ClassBlocks, the bound and global names are initialized
before analyzing names, because class bindings aren't
visible in methods. For other blocks, they are initialized
after names are analyzed.
*/
and
/* Class namespace has no effect on names visible in
nested functions, so populate the global and bound
sets to be passed to child blocks before analyzing
this one.
*/
These tell us that any additions to the global and bound sets made in the class body are not passed on to child scopes; they are never used to help determine the scope of names in child scopes.
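One way to see the effect of this is to compare the same inner function under a class parent and under a function parent. This is a sketch only; the names Outer and inner and the helper inner_scope are invented for the comparison:

```python
import symtable

CLASS_PARENT = """\
class Outer:
    a = 2
    def inner():
        return a
"""

FUNCTION_PARENT = """\
def Outer():
    a = 2
    def inner():
        return a
"""


def inner_scope(source):
    """Report how `a` is classified inside Outer.inner for `source`."""
    top = symtable.symtable(source, "<example>", "exec")
    outer = top.lookup("Outer").get_namespace()
    inner = outer.lookup("inner").get_namespace()
    sym = inner.lookup("a")
    return {
        "free": sym.is_free(),
        "global": sym.is_global(),
        "nested": inner.is_nested(),
    }


under_class = inner_scope(CLASS_PARENT)
under_function = inner_scope(FUNCTION_PARENT)
print("class parent:   ", under_class)
print("function parent:", under_function)
```

With a function as the parent, the binding of a is passed down and becomes a free variable in inner; with a class as the parent, it is not passed down, so a stays a global.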
And finally:
/* Check if any local variables must be converted to cell variables */
if (ste->ste_type == FunctionBlock && !analyze_cells(scopes, newfree))
    goto error;
else if (ste->ste_type == ClassBlock && !drop_class_free(ste, newfree))
    goto error;
So A.a in the class scope is not passed on to the recursive analyze_block() calls, and names in class scopes can’t be closure cells. The code above distinguishes between function scopes and class scopes, and marks applicable free variables from child scopes as cells in function scopes only. Names in class bodies can therefore never become closures: A.m can’t find A.a because the analysis excludes locals in class bodies, so the a reference in an expression in A.m remains an implicit global, and lookups find the global a instead of A.a.
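The outcome is also visible in the compiled bytecode. The sketch below assumes CPython opcode names (LOAD_NAME and LOAD_GLOBAL), which are implementation details; the helpers child_code and load_ops and the extra b = a line are mine, added only to give the class body a read of a:

```python
import dis
import types

SOURCE = """\
a = 1
class A:
    a = 2
    b = a
    def m():
        return a
"""

module_code = compile(SOURCE, "<example>", "exec")


def child_code(code, name):
    """Find the nested code object called `name` among `code`'s constants."""
    for const in code.co_consts:
        if isinstance(const, types.CodeType) and const.co_name == name:
            return const
    raise LookupError(name)


def load_ops(code, name):
    """Collect the opcode names used to read `name` in `code`."""
    return {
        ins.opname
        for ins in dis.get_instructions(code)
        if ins.argval == name and ins.opname.startswith("LOAD")
    }


class_code = child_code(module_code, "A")
method_code = child_code(class_code, "m")

class_loads = load_ops(class_code, "a")
method_loads = load_ops(method_code, "a")
print("class body reads a via:", class_loads)
print("method reads a via:    ", method_loads)
print("method free variables: ", method_code.co_freevars)
```

The class body reads a with a xxx_NAME opcode, as the symtable.c comment describes, while the method reads it as a global and has no free variables to close over.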