On The Topic of Extending Type by Subclassing

Hello Pythonistas!

I am currently on the topic of extending types by subclassing. In the example below, the built-in list type is extended by the subclass Set via inheritance. My understanding is that self represents the instance object when the class is instantiated. Here, however, it apparently also represents the list class that is inherited from. In a roundabout way, self represents both the instance and the list class. Can someone please shed some light on how this works? Generally, I am accustomed to changing an attribute’s value with the following syntax (a.k.a. qualifying): self.some_attribute.append(some_value). In the sample code below, it is instead done as:

self.append(x)   # directly to self

Here is the sample code:

class Set(list):  # 'Set' is subclassed to built-in 'list'
    
    def __init__(self, value = []):   # Constructor
        print('__init__')
        list.__init__([])             # Customizes list - clear it each time
        self.concat(value)            # Copies mutable defaults
        

    def intersect(self, other):       # other is any sequence
        print('intersect')
        res = []                      # self is the subject

        for x in self:
            if x in other:            # Pick common items
            
                res.append(x)
        print('res = ',res)
        return Set(res)               # Return a new Set - becomes reference for next operation

    def union(self, other):           # other is any sequence
        print('\nunion')
        
        res = Set(self)               # Copy me and my list first
        res.concat(other)
        
        return res
    
    # Append only if not already in original list
    def concat(self, value):          # value: list, Set, etc.
        print('concat')
        for x in value:               # Removes duplicates
        
            if not x in self:
                
                self.append(x)


    def __and__(self, other): return self.intersect(other)
    def __or__(self, other):  return self.union(other)
    def __repr__(self):       return '\nIn __repr__ Set:' + list.__repr__(self)

if __name__ == '__main__':
    
    x = Set([1,3,5,7])   # Create instance 'x' object
    y = Set([2,1,4,5,6]) # Create instance 'y' object

    print('\nBegin actions!\n')
    print(x.intersect(y), y.union(x))

Any insight would be appreciated.

You instantiate a Set instance:

 x = Set([1,3,5,7])   # Create instance 'x' object

Now x is a reference to the instance. An object is… whatever it is,
with a reference to its type (class). The class/type specifies where
Python looks for methods: on the class itself, and then if not found, on
the classes in the method resolution order (MRO), which is available as
the class’ .__mro__ attribute. The MRO is computed when you define the
class.

So if you like, you can imagine this as a list which looks for a
method in the Set class before it looks in the list class.

So when you go eg:

 x.append(3)

Python needs to locate x.append. It looks for append in Set, and
then in list. That’s also how overriding a method works: if you
defined an append method in the Set class it would be found there
and used, and Python wouldn’t get as far as looking for it in list.

You can look at the MRO BTW:

 print(Set.__mro__)
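As a rough sketch of that lookup order, here are two stripped-down stand-ins for the book's class (DemoSet and LoudSet are made-up names, not from the book). The first inherits append untouched, the second overrides it:

```python
class DemoSet(list):              # hypothetical stand-in for the book's Set
    pass


class LoudSet(list):              # hypothetical subclass that overrides append
    def append(self, item):
        print('LoudSet.append called')
        list.append(self, item)   # hand off to the inherited behaviour


print(DemoSet.__mro__)            # DemoSet first, then list, then object

plain = DemoSet()
plain.append(3)                   # not defined on DemoSet, so list.append is used

loud = LoudSet()
loud.append(3)                    # found on LoudSet, so the lookup stops there
```

Both objects end up holding [3]; the only difference is which class supplied the append that ran.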

Cheers,
Cameron Simpson cs@cskk.id.au

Hi,

thank you for responding to my query. However, my misunderstanding has to do more with self.append(some_value) as opposed to self.some_attribute.append(some_value) within the class method concat.

Can you please elaborate?

No it doesn’t. self always refers to the instance, not the class. What makes you think that?

Let’s say you have a Set instance bound to your variable x.

When you call x.append(some_value) Python does this in 2 steps:

  • compute x.append, which becomes what’s called a “bound method”,
    which is a partial function being the append method bound to your
    x object.
  • call the bound method with (some_value) and return the result.

When you call a bound method, the source object is provided as the first
parameter to the method itself. So the method gets called as:

 Set.append(x, some_value)

Inside the append method these are bound to the parameters self and
value, and you use those named there.
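The two steps can be pulled apart by hand. This is only a sketch, with DemoSet as a made-up stand-in for the book's Set:

```python
class DemoSet(list):              # hypothetical stand-in for the book's Set
    pass


x = DemoSet([1, 2])

bound = x.append                  # step 1: look up append, giving a bound method
print(bound.__self__ is x)        # the bound method remembers the object it came from
bound(3)                          # step 2: call it; x is supplied as the first argument

DemoSet.append(x, 4)              # the explicit spelling of the same kind of call
list.append(x, 5)                 # DemoSet.append is simply list.append, found via the MRO

print(x)
```

All three calls modify the same object, x, which is what self would name inside the method.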

This isn’t really any different to a normal function:

 def add2(a, b):
     return a + b

 x = 1
 y = 2
 print(add2(x, y))

Outside the function you’re talking about x and y. inside the
function, those values are given the names a and b.

It’s the same with a method: your source object is provided as self
(that name is just a convention, but it is universal) and the
some_value is provided as value.

self only ever means the object, not the class. But attribute lookup does not always give you something that is “physically” part of the object (i.e. stored in its __dict__). When that lookup fails, Python checks in the object’s class as well. Further, if that process finds something with its own __get__, it calls that (this is the descriptor protocol - the hook that allows both methods and properties to work).
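To make that lookup visible, here is a small sketch (DemoSet is a made-up stand-in for the book's Set). It checks where append actually lives, then performs the descriptor step manually:

```python
class DemoSet(list):              # hypothetical stand-in for the book's Set
    pass


x = DemoSet()

print('append' in vars(x))        # False: not stored on the instance
print('append' in vars(DemoSet))  # False: not defined on the subclass either
print('append' in vars(list))     # True: it lives on the base class

descriptor = vars(list)['append']         # the raw object stored on the class
bound = descriptor.__get__(x, DemoSet)    # what attribute lookup does behind the scenes
bound(1)                                  # same effect as x.append(1)

print(x)                          # [1]
```

Nothing here turns self into the class; the class is only where Python goes to find the method.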

Using self.append inside the class definition, instead of list.append etc., is the same as how you write mylist.append(1) instead of list.append(mylist, 1) in ordinary code (although the latter also works).

I can’t understand how you came up with the x.append possibility. Within your concat code, x means the thing that will be “appended” to the Set. It doesn’t mean the thing that offers append functionality. That’s what self is.

This part has nothing to do with inheritance. It’s the same as when you write any ordinary class and have its methods call each other. In your own method’s code, self means the instance of the class, with which you are currently working. def concat(self, value): is called, and self becomes an instance of your Set; when you use self.append, that means to use the append method that applies to the same instance.

It’s just that inheritance allows “its methods” to also mean methods that were implemented by the base (such as append).

Just some nitpicks

The list.__init__([]) is not doing anything. You can just write

def __init__(self, value):
    self.concat(value)

Note that I did not assign the default argument value=[]. Never, ever use a mutable object as a function’s default argument; it leads to lots of unexpected behavior. In general, for this kind of logic you should write it as

def __init__(self, value=None):
    if value is None:
        value = []
    # proceed as normal

But in this particular case, you could “cheat” like this

def __init__(self, value=()): # default argument is an empty tuple
    self.concat(value)

This works because a tuple is immutable, and the concat method does not depend on its argument being a list, so a tuple works too.
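For anyone who hasn't been bitten by this yet, here is a small illustration with made-up functions (remember and remember_safely are not from the book). The book's Set happens to avoid the problem because concat only reads from value and never modifies it, but the pattern is still risky:

```python
def remember(item, seen=[]):      # the default list is created once, at def time
    seen.append(item)
    return seen


first = remember('a')
second = remember('b')
print(second)                     # ['a', 'b'] - the earlier call leaked in
print(first is second)            # True - both calls returned the same list


def remember_safely(item, seen=None):
    if seen is None:              # build a fresh list on every call instead
        seen = []
    seen.append(item)
    return seen


print(remember_safely('a'))       # ['a']
print(remember_safely('b'))       # ['b']
```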


TLDR: Maybe this deviates too much from what you’re asking, but I hope it helps. I think part of your confusion is the unnecessary list.__init__([]) line, so I must point it out. The rest are just nitpicks I found along the way.

When I include the following command in the concat method right after the self.append(x), it prints the list:

        print(self)

>>> [2, 1, 4, 5, 6]  # result - I commented out the __repr__  method for the moment

Yes, I understand that x is the instance (object). However, the part that is a bit confusing to me at the moment, as I stated above, is that the test code is using self.append(x).

Sorry, a misunderstanding, I was eating /cooking (multi-tasking if you will) - wires must have gotten crossed. I fixed it. Please re-reference the post above. :blush:

Yes, I figured as much after commenting it out and observing the results. But, this is code from the book, so to honor the code that was presented to me, I did not make any edits. I wanted to provide you with the exact code, without edits, that was presented to me.

This is a textbook example code. The book has already been published, so not much that I can do there. :wink:

Thank you for the insight. I will tuck this away into my Python toolbag.

That’s expected behavior. self is the instance object, so print(self) does the same as a = ["any", "random", "list"]; print(a). It’s not the same as print(list), so it shows that self is not the class object.

The fact that you commented out __repr__ doesn’t matter. Your class inherits from list so it will have list’s default __repr__, you only need to write your own repr if you want to change it.
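A quick way to see both points, sketched with a made-up subclass (DemoSet) that has no __repr__ of its own:

```python
class DemoSet(list):              # hypothetical stand-in with no __repr__ defined
    def show(self):
        print(self)               # prints the instance, using list's __repr__
        print(type(self))         # the class is a separate object


s = DemoSet([2, 1, 4])
s.show()

print(list)                       # <class 'list'> - the class, not any instance
print(DemoSet.__repr__ is list.__repr__)   # True - the repr is inherited as-is
```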

I wouldn’t go that far:

But yes, it’s generally confusing and error-prone.

Why not?

:grin:

Yes you can: :open_book::arrow_right::wastebasket:. From what I’ve seen this is horrendous code and not fit as teaching material.

Well, from what I have seen so far, I have yet to find a book as comprehensive as this one in covering the fundamentals of Python. Others generally breeze by certain topics, or do not include them at all. Sure, this book is not perfect, but it covers just about all of the topics in Python in one book. So, I don’t need to look for information from many sources. If I do, it’s as a supplement.

Here is an arbitrary example:

class ArbClass:
    
    def __init__(self, value = None):
        
        if value is None:
            value = []
            
        self.value = value
        
    def print_me(self):
        
        print(self)

x = ArbClass([1,2,3,4])
print(x.value)

x.print_me()

The result, after running the script is:

[1, 2, 3, 4]
<__main__.ArbClass object at 0x00000118680D0890>

Note that the print(self) within the print_me() method provides some memory location and not the list value used during instantiation. In the book example, self is both the instance of the object as well as the list.

I only did this so that the string from the __repr__ overloading method did not have to be printed. I only wanted the contents of self, that is all.

Comparing my simple test code above and the sample code from the book, the self from my test code is explicitly the instance object. The self from the book sample code is both the instance and the list.

It’s not “both the instance and the list”, it’s an instance of the list type.

The difference between ArbClass (your new example) and Set (from the book) is that the latter inherits from list (by defining itself as class Set(list): ...) while the former doesn’t. Inheritance makes your class behave as if it’s the parent class – you can do self.append() even though you never defined the append method, you can print(self) and the result looks like a list, etc – otherwise it’s just a normal class.

If you want to define a class that behaves like a list but doesn’t inherit from list, see UserList in the collections module (stdlib) (at the time of writing it’s on line 1213). As you can see, UserList has to define a lot more methods to make it behave like a list. It’s more difficult to write and usually unnecessary, but sometimes you need that level of customization. But because the book doesn’t take this route, it may be outside of its scope.
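As a rough sketch of that alternative, here is the book's concat idea rebuilt on UserList (WrappedSet is a made-up name). Note that in this version there really is a separate list stored on the instance, in the data attribute:

```python
from collections import UserList


class WrappedSet(UserList):       # hypothetical name; wraps a list instead of being one
    def concat(self, values):
        for v in values:
            if v not in self:     # UserList forwards "in" to the wrapped list
                self.append(v)    # UserList.append forwards to self.data.append


w = WrappedSet([1, 3])
w.concat([3, 5])

print(w)                          # [1, 3, 5]
print(isinstance(w, list))        # False - it only behaves like a list
print(type(w.data))               # <class 'list'> - the real list it wraps
```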

I do not have an issue with the append() method or with inheritance in general. I understand that append() is a method of the list class. As such, a subclass has access to it just as if it were its own through inheritance. In the following test code, I have added a super class to highlight arbitrary inheritance features (using the same class from the previous post).

class Super:
    
    misc_var = 500
    
    def add_nums(self, x, y):
        
        sum_result = x + y
        
        return sum_result

class ArbClass(Super):
    
    def __init__(self, value = None):
        
        if value == None:
            value = []
            
        self.value = value
        
    def print_me(self):

        print(self.misc_var)        
        print(self)

x = ArbClass([1,2,3,4])
print(x.value)

x.print_me()

print(x.add_nums(5, 19))
print(x.misc_var)
        

Note that every time that I want access to an attribute from the Super class, I have to qualify it (object.attribute or self.misc_var, etc.). What is throwing me off a bit is that for the case of extending type by subclassing, no qualifying is required. We get access to the list simply by self.

Okay, and why exactly is that surprising?

“The list” means the same object that self names.

It’s not as if “the list” were a component of the Set instance. Rather, it is a kind of list, in the same way that Nala, an instance of Felis catus, is a pet. You don’t take the pet part of Nala to the vet; you take Nala to the vet, who simply is qualified to look after all sorts of pets.
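Here is the same contrast in code, using two made-up classes (HasAList and IsAList). The first stores a list as a component, so you have to go through an attribute to reach it. The second is itself a kind of list, so there is nothing to go through:

```python
class HasAList:                   # hypothetical: the list is a component
    def __init__(self):
        self.items = []

    def add(self, x):
        self.items.append(x)      # reach the list through an attribute


class IsAList(list):              # hypothetical: the instance itself is a list
    def add(self, x):
        self.append(x)            # self already is the list


h = HasAList()
h.add(1)
i = IsAList()
i.add(1)

print(isinstance(h, list), h.items)   # False [1]
print(isinstance(i, list), i)         # True [1]
```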

I associate self with representing the ENTIRE instance and not just an argument of the class instance. But here, apparently, self, as you stated, means the same object as list.

Generally, if you want access to an attribute within a class, you prefix it with self and with the object name (qualify it) if referencing an attribute outside of a class. This is the part which is a bit odd for me. It is a new way of thinking when it comes to extending type by subclassing.