Should I still use an instance attribute in this scenario?

Slurms_MacKenzie · September 20, 2023, 2:29pm

Hello all. I’ve got a question that might be a bit trivial, but I’m wondering if there is a best practice in this situation.

Some preamble: when creating a class, attributes that should be inherited by every instance of the class are defined outside of the __init__ method - so-called class attributes. Not only does every instance inherit all the class attributes, but each attribute is initialized to a set value for every instance of the class.

Attributes that are not necessarily inherited by each instance, or those that are inherited by each instance but whose value may differ between instances, are defined within the __init__ method - so-called instance attributes.

So my question is this: if I have an attribute that should be inherited by all instances of a class, and for each instance it will be initialized to the same value, and its data type is not mutable (in this case just a boolean), but its value is intended to be modified by the program on an instance-by-instance basis, should I use an instance variable or a class one?

Functionally I don’t think there is a difference, and it seems a class attribute can be used, as its initial value is static: it is neither determined by an argument passed to the class constructor nor the result of an expression evaluated when an instance is created.

However, since in this case the attribute is fated to be modified on a per-instance basis, I wondered if logically it made more sense that it should be an instance attribute?

jagerber · September 20, 2023, 3:59pm

I would say if you need per-instance control of the attribute it should be an instance attribute.

This would be my suggested approach:

class Foo:
    def __init__(self):
        self.bar = True


foo_1 = Foo()
foo_2 = Foo()

print(f'{foo_1.bar=}')
# True
print(f'{foo_2.bar=}')
# True

foo_2.bar = False

print(f'{foo_1.bar=}')
# True
print(f'{foo_2.bar=}')
# False

Compare with

class Foo:
    bar = True


foo_1 = Foo()
foo_2 = Foo()

print(f'{foo_1.bar=}')
# True
print(f'{foo_2.bar=}')
# True

foo_2.bar = False

print(f'{foo_1.bar=}')
# True
print(f'{foo_2.bar=}')
# False

This has the same programmatic behavior. But compare that with

class Foo:
    bar = True


foo_1 = Foo()
foo_2 = Foo()

print(f'{foo_1.bar=}')
# True
print(f'{foo_2.bar=}')
# True

Foo.bar = False

print(f'{foo_1.bar=}')
# False
print(f'{foo_2.bar=}')
# False

This third code block shows how class attributes are intended to be used. That is, they are meant to be modified at the class level, not the instance level. The modification at the class level then modifies the result that will be read out at the instance level.

The second code block demonstrates, confusingly, that it IS possible to modify a class attribute at the instance level. This seems to “convert” the class attribute to an instance attribute on that instance.
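One way to see what the assignment in code block 2 actually does is to inspect each instance's own namespace with vars(). This is only a minimal sketch that reuses the Foo class from above; the del step at the end is an addition for illustration.

```python
class Foo:
    bar = True  # class attribute, stored on Foo itself


foo_1 = Foo()
foo_2 = Foo()

# Neither instance has its own 'bar' yet; lookups fall through to Foo.
print(vars(foo_1), vars(foo_2))  # {} {}

# Assigning through the instance creates a new instance attribute
# that shadows the class attribute; Foo.bar itself is untouched.
foo_2.bar = False
print(vars(foo_2))  # {'bar': False}
print(Foo.bar)      # True

# Deleting the instance attribute uncovers the class attribute again.
del foo_2.bar
print(foo_2.bar)    # True
```

So nothing is really converted: the class attribute stays where it was, and the instance simply gains an attribute of the same name that is found first.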

The fundamental issue is that if you use a pattern like code block 2 then it is very possible for you or other readers/writers of the code to confuse the behavior with that in code block 3. For example, when answering this question, I thought code block 2 would behave like code block 3 but I had to test it out and look up how it works. With code block 1 it is straightforward to understand how the code will work and this is the idiomatic approach to setting up instance attributes. I would consider code block 2 to be an anti-pattern.

kknechtel · September 20, 2023, 5:09pm

This conceptual model is wrong. Class attributes are not “inherited” at all, and they aren’t “set for” instances. They’re called class attributes because they actually belong to the class itself, and not to instances. In Python, everything is an object, including the classes themselves. A class is a real, first-class object - not just something that can have some object representing it using some “reflection” library. You can assign the class to a variable, pass it to and return it from a function… and modify and inspect its attributes.

When you use a class attribute, this is not “initializing” the value of that attribute for the instances. It’s assigning a value to an attribute of an entirely separate object. Python has a special rule that when you look up an attribute in an object and it isn’t found there, the object’s class also gets checked (and if something is found there, Python first looks for a __get__ method on that object and calls it if found, and otherwise gives the found object itself). This is the system that allows methods, properties and other descriptors to work; it’s why if you try to assign a “per-instance method” it won’t have a self passed to it implicitly (it was found directly in the object, so the descriptor protocol wasn’t used); and it’s why reassigning a class attribute through an instance doesn’t “affect” all instances (a new attribute is assigned into the instance, which shadows the class attribute) but mutating it does (it’s the same object being “seen by” all instances).
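The lookup rules above can be sketched in a few lines. Greeter, names and standalone are hypothetical names made up for this illustration; the behavior shown is standard attribute lookup.

```python
class Greeter:
    names = []  # one list object, stored on the class

    def greet(self):
        return f"hello from {type(self).__name__}"


def standalone(*args):
    # Reports how many positional arguments it received.
    return len(args)


a = Greeter()
b = Greeter()

# Found on the class: the function's __get__ binds it, so self is passed.
print(a.greet())  # hello from Greeter

# Found directly on the instance: no descriptor call, so no implicit self.
a.standalone = standalone
print(a.standalone())  # 0

# The same function stored on the class does get bound when looked up via b.
Greeter.standalone = standalone
print(b.standalone())  # 1

# Mutation changes the one shared list; every instance sees it.
a.names.append("x")
print(b.names)  # ['x']

# Assignment through an instance creates a shadowing instance attribute.
a.names = ["only a"]
print(b.names, Greeter.names)  # ['x'] ['x']
```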

Class attributes are confusing and rarely what you want. What you propose will work as described, and there is indeed a difference - it can save memory, as long as there’s more than one instance of the class in your system that never gets its own value assigned (and it can do even better if there’s something preventing per-instance values from being interned). (Of course, if every instance gets a corresponding attribute assigned, the class attribute is simply redundant.)

But it’s confusing.

Slurms_MacKenzie · September 20, 2023, 10:49pm

Thanks, what you propose makes sense. As @kknechtel pointed out, using a class attribute will save some memory if there is more than one instance of the class for which an overriding instance attribute has not been set.

In my program there are 3000 instances of the class, so it’s relevant in my case. However, I also agree with the assessment that code block 2 is something of an anti-pattern; some might even describe it as maverick.

I think in my case I’ve got to take into account that I’m writing the program to run on my desktop, and memory isn’t a constraint. For that reason I don’t think it makes sense to write some slightly unconventional code to save a few booleans worth of memory in this case (metaphorically speaking). Of course if I was working with a little single board computer, etc., then it could be different.
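For a rough sense of scale, the two layouts can be compared with tracemalloc from the standard library. This is only a sketch: SharedDefault, PerInstance and measure are hypothetical names, and the numbers depend on the Python version and platform, so no expected output is given.

```python
import tracemalloc


class SharedDefault:
    flag = True  # single value stored on the class


class PerInstance:
    def __init__(self):
        self.flag = True  # one attribute entry per instance


def measure(cls, n=3000):
    """Return the bytes allocated while creating n instances of cls."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    instances = [cls() for _ in range(n)]  # kept alive until we measure
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


shared = measure(SharedDefault)
per_instance = measure(PerInstance)

print(f"class attribute only: {shared} bytes for 3000 instances")
print(f"instance attribute:   {per_instance} bytes for 3000 instances")
```

Whatever the exact figures are on a given machine, for 3000 small objects the total is unlikely to matter on a desktop.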

Slurms_MacKenzie · September 20, 2023, 11:14pm

Ah, sorry. As I was typing it out it did occur to me that it was likely about to get shot down in flames. It’s probably something akin to a kiddie’s mental model that allows one to start writing code, but one that doesn’t reflect Python’s internal workings.

To be honest, your own explanation makes much more sense, especially re: checking an object’s class if a given attribute isn’t found in the object itself. It also makes sense because, if everything in the class were copied directly into each instance, I’m assuming the amount of memory taken up by all the methods in the class would be consumed again for each instance?
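A quick check seems to confirm that methods are not copied into instances. Widget is a hypothetical class used only for this sketch.

```python
class Widget:
    def method(self):
        return "called"


w1 = Widget()
w2 = Widget()

# The function object lives once, in the class namespace.
print("method" in vars(Widget))  # True
print(vars(w1), vars(w2))        # {} {}

# Each access wraps that one shared function in a bound method.
print(w1.method.__func__ is w2.method.__func__)      # True
print(w1.method.__func__ is vars(Widget)["method"])  # True
```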

Another thing I did wonder is whether, if using a class attribute can save memory under certain conditions, it would also be a little more expensive in terms of CPU time, as I would imagine there are more steps involved in checking an instance and then also checking its class.
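Rather than guess, the two kinds of read can be timed with timeit. This is a sketch under assumptions: WithClassAttr and WithInstanceAttr are hypothetical names, and the results vary by machine and Python version, so the numbers should only be compared against each other.

```python
import timeit


class WithClassAttr:
    flag = True  # stored on the class


class WithInstanceAttr:
    def __init__(self):
        self.flag = True  # stored on the instance


c = WithClassAttr()
i = WithInstanceAttr()

# Time one million reads of each kind.
t_class = timeit.timeit("c.flag", globals={"c": c}, number=1_000_000)
t_inst = timeit.timeit("i.flag", globals={"i": i}, number=1_000_000)

print(f"class attribute read:    {t_class:.3f} s per million")
print(f"instance attribute read: {t_inst:.3f} s per million")
```

I would expect any difference to be small next to whatever else the program does, but that is an assumption to verify on the interpreter actually in use.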

jagerber · September 20, 2023, 11:23pm

I see. Saving memory is a different issue. In that case you might try something like:

from dataclasses import dataclass


@dataclass
class HeavyData:
    heavy_bool: bool = True


class UnitializedSentinel:
    pass


class Foo:
    class_default_data = HeavyData(True)

    def __init__(self):
        print('Starting initialization...')
        self.data = UnitializedSentinel
        print('Initialized')

    @property
    def data(self):
        if self._data is UnitializedSentinel:
            print('Using class default data')
            return self.class_default_data
        else:
            print('Using instance data')
            return self._data

    @data.setter
    def data(self, val):
        print('Setting instance data')
        self._data = val


foo_1 = Foo()
foo_2 = Foo()

print(f'{foo_1.data.heavy_bool=}')
print(f'{foo_2.data.heavy_bool=}')

foo_2.data = HeavyData(heavy_bool=False)

print(f'{foo_1.data.heavy_bool=}')
print(f'{foo_2.data.heavy_bool=}')

Which results in

Starting initialization...
Setting instance data
Initialized
Starting initialization...
Setting instance data
Initialized
Using class default data
foo_1.data.heavy_bool=True
Using class default data
foo_2.data.heavy_bool=True
Setting instance data
Using class default data
foo_1.data.heavy_bool=True
Using instance data
foo_2.data.heavy_bool=False

We configure the data attribute to be a property so that we can make the decision to fall back on the class default option if the instance data has not been set.