I’m curious about Python’s handling of duplicate class attributes. In the following code snippet:
class Foo:
    name = 'one'
    name = 'two'
foo = Foo()
print(foo.name)
The attribute name is initially set to 'one' but is then overridden by 'two'. As a result, foo.name prints 'two'.
It seems like it would be beneficial for Python to raise an error or warning in such cases to prevent accidental overwrites. Could someone explain why Python allows this behavior and what implications enforcing uniqueness of class attribute names might have?
I can’t speak to your whole question but think I can address at least part of it. Someone else could provide a more complete answer.
Unlike many other languages you may be familiar with, Python doesn’t separate declaration and assignment of variables and attributes.
For example, in Java or C (and many other languages) you can write this:
// declares variable “foo”
int foo;
// foo now exists, but we never assigned it a value.
// Some languages will give it a default, some won’t.
// assigns a value, now we can definitely use “foo”
foo = 1;
// often combined:
int bar = 2;
Python doesn’t separate these concepts. Instead, variables are defined automatically on first assignment. They are less like a bucket you store a value in, and more like a label you can put on a value.
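To make the "label" idea a bit more concrete, here is a small sketch. The variable names are made up for illustration; the point is only that assignment attaches a name to an object and never copies it.

```python
first = ["hyrax"]          # the name "first" is attached to a new list
second = first             # a second label on the very same list
second.append("dassie")    # mutating through one label is visible through the other

same_object = first is second   # True while both names point at one object

first = "something else"   # moves the label "first"; the list itself is untouched
print(same_object, first, second)
```

After the last assignment the list is still reachable through second, which is why rebinding a name is not the same thing as changing or destroying a value.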
For classes and objects in particular, you could think of their attributes as just variables attached to the object. These two are more or less equivalent:
class Example:
    name = "hyrax"
# now I can use it:
print(Example.name)

class Example:
    pass
Example.name = "hyrax"
# now I can use it:
print(Example.name)
If you see what looks like a class definition where the attributes have types but no values (like an @dataclass), it is more like syntax sugar.
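A rough illustration of that last point, assuming a reasonably recent Python 3 and no `from __future__ import annotations`. The class names Plain and Record are invented for this example. An annotation without a value records the type but does not bind the name as a class attribute.

```python
from dataclasses import dataclass


class Plain:
    name: str              # annotation only: nothing is assigned to "name"


@dataclass
class Record:
    name: str              # no default, so still no class attribute
    size: int = 0          # a default is an ordinary class-level assignment


plain_has_attr = hasattr(Plain, "name")          # the name was never bound
plain_annotated = "name" in Plain.__annotations__
record = Record("hyrax")                         # the generated __init__ sets it per instance
print(plain_has_attr, plain_annotated, record.name, Record.size)
```

So the dataclass decorator reads the annotations and writes an __init__ for you; the attribute itself only comes into existence when something is actually assigned.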
If you apply a bit of introspection, you can see that there is only one attribute with this name / label:
class Foo:
    name = 'one'
    name = 'two'
foo = Foo()
count = 0
for attrs in foo.__dir__():
    if attrs == 'name':
        count += 1
print(count)
print(foo.__dir__())
follow up … I don’t believe that it is a duplicate class attribute that you are creating. It is basically a reassignment of the original value (changing the value of the variable). The first instance is an assignment by which you are creating the new variable. The second instance is changing its original value.
Here is a snapshot of debugging in PyCharm. Notice how the value has been changed to two. It did not create a ‘new’ attribute.
As a simple test, step through the script in PyCharm's debugger: attribute values are updated as they change and are shown to the right of each attribute / variable. Note that only one value is shown for the attribute name, namely name: 'two'.
… by the same logic, this same principle can be applied to a function. Notice that a duplicate variable is not being created. Its value is merely being modified.
def some_function():
    var1 = 100 # create local variable by assignment
    var1 = 500 # change its value - duplicate is NOT being created
    print(var1)
Of course, this is not something that you would do in practice; it is shown for demonstration purposes only.
There’s a general attitude that emitting warnings about suspicious code is a job for static analysis tools / linters, not the runtime interpreter. This is explained a bit in this post from a related thread about emitting a warning for code like x == 2 or 3:
Even if a warning was added for this, it likely wouldn’t end up being that useful, since:
It couldn’t be enabled by default, as that would be disruptive to end users[1]
Major linters already have rules that detect re-binding names without usage (I know PyCharm and Ruff do), so there’d be little benefit to experienced users
Inexperienced users that don’t know how to use a linter also likely wouldn’t know how to enable an optional warning (case in point, do you run Python with warnings enabled, e.g. with -Wall or -X dev? I know I rarely do)
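For a feel of what such a static check might look like, here is a deliberately naive sketch using the standard ast module. The function name find_rebound_names is made up, and this is not how PyCharm or Ruff implement their rules: real linters also look at whether the name is used between the two assignments, which this sketch ignores.

```python
import ast

SOURCE = '''
class Foo:
    name = 'one'
    name = 'two'
'''


def find_rebound_names(source):
    """Return (class name, attribute name, line) for each repeated plain assignment."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        seen = set()
        # Only look at direct "name = value" statements in the class body.
        for stmt in node.body:
            if not isinstance(stmt, ast.Assign):
                continue
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    if target.id in seen:
                        findings.append((node.name, target.id, stmt.lineno))
                    seen.add(target.id)
    return findings


findings = find_rebound_names(SOURCE)
print(findings)
```

Because the source is only parsed and never executed, a check like this can run in an editor or in CI without touching the running program's output.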
This behavior is possible because the body of a class statement is just an “ordinary” namespace where statements are executed. The body of the class is executed like any other chunk of code, and whatever names are bound at the end of execution become attributes of the class object. You’re free to bind, re-bind, and even un-bind[2] names as you like, just like in any other scope.
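A small sketch of what that means in practice (the class name Demo and its contents are invented for illustration): the body below contains a loop, a del, and a conditional, and only the names still bound when the body finishes end up on the class.

```python
class Demo:
    counter = 0
    for _i in range(3):      # ordinary loop, executed once while the class is created
        counter += _i        # re-binds "counter" on every iteration
    del _i                   # un-binds the loop variable again

    if counter > 2:          # ordinary conditional
        label = "big"
    else:
        label = "small"


print(Demo.counter, Demo.label, hasattr(Demo, "_i"))
```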
You’re also not limited to simple assignment statements like a = b by the way. All sorts of statements are allowed, including many that perform assignments in exactly the same way that a classic assignment statement does (obligatory: Lots of things are assignments).
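As a rough demonstration (the class Bindings is hypothetical), none of the statements in the body below is an assignment statement, yet each one binds a name that becomes a class attribute:

```python
import math


class Bindings:
    import math as m             # import binds "m"

    def area(self, radius):      # def binds "area"
        return self.m.pi * radius ** 2

    class Inner:                 # a nested class statement binds "Inner"
        pass

    for index in range(2):       # a for loop binds "index"
        pass


public_names = sorted(n for n in vars(Bindings) if not n.startswith("__"))
print(public_names)
```

A rule against re-binding would therefore have to cover all of these statement kinds, not just the a = b form.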
Assuming that by uniqueness you mean not allowing names in a class body to be rebound after their first assignment, then enforcing uniqueness would break a lot of code.
Parts of the standard library depend on the ability to re-assign names in class bodies. For example, property and typing.overload both make use of repeated assignments to the same name using def statements. It’s also not entirely unusual to use “multi-step initialization” as a way of simplifying complex assignments. For example, rewriting this:
class Foo:
    name = a(b(c))
as this:
class Foo:
    name = c
    name = b(name)
    name = a(name)
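If you want to see the breakage for yourself, uniqueness can be enforced on an opt-in basis with a metaclass. This is only a sketch: NoRebindDict and UniqueNames are hypothetical names, not anything from the standard library. It relies on `__prepare__`, which lets a metaclass supply the namespace object that the class body is executed in.

```python
class NoRebindDict(dict):
    """Class-body namespace that rejects a second binding of the same name."""

    def __setitem__(self, key, value):
        if key in self:
            raise TypeError(f"name {key!r} is already bound in this class body")
        super().__setitem__(key, value)


class UniqueNames(type):
    """Hypothetical metaclass: every name in the class body may be bound once."""

    @classmethod
    def __prepare__(mcs, name, bases, **kwargs):
        return NoRebindDict()

    def __new__(mcs, name, bases, namespace, **kwargs):
        return super().__new__(mcs, name, bases, dict(namespace), **kwargs)


class Point(metaclass=UniqueNames):
    x = 1                      # fine: each name is bound exactly once
    y = 2


def try_define_with_property():
    """Return the error message raised by a perfectly ordinary property setter."""
    try:
        class Temperature(metaclass=UniqueNames):
            @property
            def celsius(self):
                return self._celsius

            @celsius.setter
            def celsius(self, value):   # second binding of "celsius"
                self._celsius = value
    except TypeError as exc:
        return str(exc)
    return None


message = try_define_with_property()
print(message)
```

The getter/setter pair is the standard way to write a property, and it is rejected here because the setter's def re-binds the same name. The multi-step initialization above would be rejected for the same reason.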
[1] Writing a warning to stdout would risk corrupting the output of a program, and many CI environments interpret anything written to stderr as a job failure. Both are undesirable.