How name resolution works in Python

I want to understand name resolution from the perspective of the AST. So far, my understanding is that name resolution occurs in two steps:

  1. The first step fills the scope table by visiting every declaration AST node (variables, functions, classes, etc.) and also resolves occurrences of identifiers.
  2. The second step resolves identifiers in function calls, etc.

My question is how Python resolves variables: is every enclosing scope, all the way up the chain of nested scopes, open for searching the name, or is there any restriction? Does name resolution work the same way for both variable names and function names?

For access, Python will check the local scope, any enclosing scopes in order, then global and builtin scopes. Binding (assignment) makes the variable local unless either nonlocal or global keywords are used. global will make it use the global scope directly, and nonlocal will check enclosing scopes in order to choose which is assigned. (It is not possible to assign to the builtin scope).
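
As a rough illustration of the lookup order and of the two keywords described above, here is a minimal sketch. The function names (outer, read_only, rebinding, set_global, use_builtin) are made up for this example.

```python
import builtins

x = "global x"

def outer():
    x = "enclosing x"

    def read_only():
        # x is not bound in this function, so the lookup moves outward
        # and finds outer()'s x before the global one.
        return x

    def rebinding():
        nonlocal x              # assignment now targets outer()'s x
        x = "rebound by inner"

    before = read_only()
    rebinding()
    return before, x

def set_global():
    global x                    # assignment targets the module-level x
    x = "rebound global x"
    return x

def use_builtin():
    # len is bound neither locally nor globally here, so the builtins
    # scope is searched last.
    return len is builtins.len

print(outer())        # ('enclosing x', 'rebound by inner')
print(set_global())   # rebound global x
print(use_builtin())  # True
```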

So, when the code is compiled, Python checks for a global or nonlocal statement for the name, and then checks for any kind of name binding (this includes assignment statements, function and class definitions, the loop variable in for loops, with and except statements, and probably a few other things I’m forgetting). At this point, if it knows the name should be nonlocal (because of such a statement), then it searches the enclosing scopes to find the right one (we know this happens ahead of time, because we will get SyntaxError if there is no enclosing scope that has the name). If it knows that the name should be local (because it was assigned with no global or nonlocal statement), or that it should be global (because of a global statement), those cases are straightforward. Otherwise, it has to search every scope, and then creates the bytecode for the correct kind of access.
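
One way to see that this classification happens at compile time is the standard library's symtable module, which exposes the compiler's symbol tables without running the code. The snippet below is only a sketch: SOURCE and the names inside it (outer, inner, a, b, g, total, broken) are invented for the example.

```python
import symtable

SOURCE = '''
def outer():
    a = 1
    def inner():
        b = a + g
        return b
    return inner
'''

# Build the symbol tables only; nothing in SOURCE is executed.
module_table = symtable.symtable(SOURCE, "<example>", "exec")
outer_table = module_table.get_children()[0]
inner_table = outer_table.get_children()[0]

a_in_inner = inner_table.lookup("a")   # found in an enclosing function
b_in_inner = inner_table.lookup("b")   # assigned in inner, so local
g_in_inner = inner_table.lookup("g")   # bound nowhere, so treated as global

print(a_in_inner.is_free(), b_in_inner.is_local(), g_in_inner.is_global())

# A nonlocal statement with no matching enclosing binding is rejected
# when the code is compiled, before anything runs.
nonlocal_rejected = False
try:
    compile("def f():\n    nonlocal missing\n", "<example>", "exec")
except SyntaxError:
    nonlocal_rejected = True

# Because a binding anywhere in the function makes the name local,
# reading it before the assignment fails at run time.
total = 10

def broken():
    total += 1      # binds total, so total is local in this function
    return total

unbound = False
try:
    broken()
except UnboundLocalError:
    unbound = True
```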

Function names are variable names, so yes. They have the same namespace; if I write x = 1 then I can also write def x(): pass in the same scope, and that makes x stop being 1 and start being the function.
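
A small sketch of that point, using the same name x as in the sentence above (first and second are just helper names for the example):

```python
x = 1
first = x            # x currently refers to the integer 1

def x():             # def is a binding too: it rebinds the same name
    return "now a function"

second = x           # x now refers to the function object

print(first)         # 1
print(second())      # now a function
```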

Note that class bodies allow a nonlocal to get dynamically shadowed by the locals mapping. For example:

>>> def f():
...     x = 1
...     class A:
...         print('class x, 1st:', x)
...         locals()['x'] = 2
...         print('class x, 2nd:', x)
...     print('func x:', x)
...
>>> f()
class x, 1st: 1
class x, 2nd: 2
func x: 1

The compiler classifies “x” as a free variable (nonlocal) in the class body:

>>> f.__code__.co_consts[2].co_name
'A'
>>> f.__code__.co_consts[2].co_freevars
('x',)

But it gets referenced using the special opcode LOAD_CLASSDEREF.

>>> import dis
>>> dis.dis(f.__code__.co_consts[2])
              0 COPY_FREE_VARS           1

  3           2 RESUME                   0
              4 LOAD_NAME                0 (__name__)
              6 STORE_NAME               1 (__module__)
              8 LOAD_CONST               0 ('f.<locals>.A')
             10 STORE_NAME               2 (__qualname__)

  4          12 PUSH_NULL
             14 LOAD_NAME                3 (print)
             16 LOAD_CONST               1 ('class x, 1st:')
             18 LOAD_CLASSDEREF          0 (x)
             20 CALL                     2
             30 POP_TOP

  5          32 LOAD_CONST               2 (2)
             34 PUSH_NULL
             36 LOAD_NAME                4 (locals)
             38 CALL                     0
             48 LOAD_CONST               3 ('x')
             50 STORE_SUBSCR

  6          54 PUSH_NULL
             56 LOAD_NAME                3 (print)
             58 LOAD_CONST               4 ('class x, 2nd:')
             60 LOAD_CLASSDEREF          0 (x)
             62 CALL                     2
             72 POP_TOP
             74 RETURN_CONST             5 (None)

Man, usually I just tell myself “remember that classes don’t actually create a scope”. But actually it’s a lot more complicated than that. Thanks for drilling deep into that.

A class body executes as an unoptimized function, which uses a locals mapping instead of fast locals. The locals mapping is ultimately copied into the __dict__ of the newly created type object. A class body also creates a __class__ closure for methods that either use super() without arguments or explicitly reference __class__. The cell object for the closure is stored in the class namespace under the name __classcell__, which is consumed when the class object is created, via the __build_class__ builtin that the LOAD_BUILD_CLASS opcode loads. Thus manually assigning to __classcell__ isn’t supported. For example:

>>> class A:
...     __classcell__ = 1
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __classcell__ must be a nonlocal cell, not <class 'int'>
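
The __class__ closure itself can be observed from the outside. This is just a sketch with a made-up class A and hypothetical method names; it inspects the methods' code objects and closures rather than any bytecode.

```python
class A:
    def uses_super(self):
        # Zero-argument super() needs the implicit __class__ cell.
        return super().__repr__()

    def uses_dunder_class(self):
        # An explicit reference to __class__ creates the same closure.
        return __class__

    def plain(self):
        # No super() and no __class__, so no closure is created.
        return 1

print(A.uses_super.__code__.co_freevars)                # ('__class__',)
print(A.uses_super.__closure__[0].cell_contents is A)   # True
print(A().uses_dunder_class() is A)                     # True
print(A.plain.__closure__)                              # None
```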

A name that’s assigned to in a class body is implicitly declared local (i.e. to use LOAD_NAME and STORE_NAME), unless it’s explicitly declared global (i.e. to use LOAD_GLOBAL and STORE_GLOBAL) or nonlocal (i.e. to use LOAD_CLASSDEREF and STORE_DEREF).
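
The visible effect of these three cases can be checked without looking at opcodes. In this sketch the names outer, A, B and z_from_class are invented for the example.

```python
def outer():
    x = 1

    class A:
        nonlocal x
        x = 2      # rebinds outer()'s x; no class attribute is created
        y = 3      # ordinary class-body assignment: becomes A.y

    return x, hasattr(A, "x"), A.y

class B:
    global z_from_class
    z_from_class = 5   # stored in the module globals, not on B

print(outer())                     # (2, False, 3)
print(hasattr(B, "z_from_class"))  # False
print(z_from_class)                # 5
```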

Referencing an unoptimized local variable via LOAD_NAME has always been a bit peculiar in Python, because it falls back on checking globals and builtins. It’s set up this way to allow temporarily shadowing a global or builtin name. For example:

>>> x = 1
>>> class A:
...     print('x, 1st:', x)
...     x = 2
...     print('x, 2nd:', x)
...     locals()['x'] = 3
...     print('x, 3rd:', x)
...     del x
...     print('x, 4th:', x)
... 
x, 1st: 1
x, 2nd: 2
x, 3rd: 3
x, 4th: 1

Because locals in a class body is just a mapping, a local variable can be assigned dynamically, such as locals()['x'] = 3.
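
Since that mapping is what ends up in the class __dict__, names added through locals() in a class body show up as ordinary class attributes. A minimal sketch, with a hypothetical Config class and made-up attribute names:

```python
class Config:
    # Names created through the locals mapping become class attributes.
    for _name, _value in (("host", "localhost"), ("port", 8080)):
        locals()[_name] = _value
    # Remove the loop variables so they do not become attributes as well.
    del _name, _value

print(Config.host)                  # localhost
print(Config.port)                  # 8080
print("_name" in Config.__dict__)   # False
```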

For a free variable from an outer function scope that’s referenced in a class body, the LOAD_CLASSDEREF opcode tries to capture the dynamism and non-local fallback of the classic LOAD_NAME opcode. However, I think it’s a mistake that the compiler doesn’t switch to using the regular LOAD_DEREF opcode when a variable in the class body is explicitly declared nonlocal. The latter would be consistent with how the compiler switches to using LOAD_GLOBAL instead of LOAD_NAME when a variable in a class body is declared global. For example:

>>> x = 1
>>> class A:
...     global x
...     print('x, 1st:', x)
...     locals()['x'] = 2 # ignored because x is global
...     print('x, 2nd:', x)
... 
x, 1st: 1
x, 2nd: 1