Hi. First post, be gentle!
I’d like to be able to find all current subclasses of a given class. The obvious answer is __subclasses__, but that feature is not kept up to date in the face of class redefinition and scoping. For example:
At this point, Foo.__subclasses__() is still a list of length 1.
In the second example, Foo.__subclasses__() has length 3, with three copies of a local class.
Presumably it’s too difficult to keep __subclasses__ up to date, otherwise I assume it would be done. But as things stand it changes only at garbage collection; if I append gc.collect() to either of the examples above, Foo.__subclasses__() then returns an empty list.
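The behaviour described above can be reproduced with a small script. This is only a sketch under assumptions, not the code from the original post: the names Bar, Baz and make_local_subclass are placeholders, it assumes CPython (where class objects sit in reference cycles), and gc.disable() is there only so that an automatic collection cannot interfere with the counts.

```python
import gc


class Foo:
    pass


def make_local_subclass():
    # The subclass is local; nothing refers to it by name once the call returns.
    class Baz(Foo):
        pass


gc.disable()  # keep the automatic collector out of the way for the demo
try:
    class Bar(Foo):
        pass

    del Bar  # the name is gone, but the class object is not freed yet
    after_del = len(Foo.__subclasses__())

    for _ in range(3):
        make_local_subclass()
    after_calls = len(Foo.__subclasses__())

    gc.collect()  # an explicit full collection frees the unreachable classes
    after_collect = len(Foo.__subclasses__())
finally:
    gc.enable()

print(after_del, after_calls, after_collect)
```

On CPython this should print 1 4 0: the deleted Bar and the three local Baz classes stay listed until the explicit gc.collect() call.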
My question is, what might I do? Is there another mechanism I can use? Is it worth asking the Python Gods whether __subclasses__() can be made accurate? Should I redefine the function myself to call gc each time, just to be certain? Any ideas appreciated.
From the reference documentation for __subclasses__():
Each class keeps a list of weak references to its immediate subclasses. This method returns a list of all those references still alive. The list is in definition order.
The give-away is “weak”. It is obviously meant to not be up-to-date, so I expect your wish to make it so will not be granted.
When you look at the list when it is up-to-date, there may be surprises. I altered your first example a little:
>>> class Foo():
...     pass
>>> class Bar(Foo):
...     a = "Old foo"
>>> class Baz(Bar):
...     pass
>>> class Bar():
...     pass
>>> bazzie = Baz()
>>> barrie = Bar()
>>> bazzie.a
'Old foo'
>>> barrie.a
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Bar' object has no attribute 'a'
The “old” Bar is not reachable through the name Bar any more, but it does still exist. It is referenced from Baz, so not GC-ed. Two Bar entries in __subclasses__, and rightfully so.
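To check what actually survives in this altered example, here is a minimal sketch, assuming CPython and the class definitions as shown (the second Bar does not inherit from Foo). The variable names subclasses and old_bar are mine, introduced only for illustration.

```python
import gc


class Foo:
    pass


class Bar(Foo):
    a = "Old foo"


class Baz(Bar):
    pass


class Bar:  # rebinds the name Bar to a new class that does not inherit from Foo
    pass


gc.collect()  # even after a full collection the first Bar survives

subclasses = Foo.__subclasses__()
old_bar = Baz.__bases__[0]  # the first Bar, reachable only through Baz

print(len(subclasses))           # number of live direct subclasses of Foo
print(subclasses[0] is old_bar)  # is the surviving entry the first Bar?
print(subclasses[0] is Bar)      # is it the class currently named Bar?
print(old_bar.a)
```

With these definitions Foo.__subclasses__() holds a single entry, the first Bar, kept alive by Baz. There are two class objects called Bar, but only the one that inherits from Foo appears in the list.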
I would try to stay out of keeping track of subclasses; it is hardly ever worthwhile.
As said before, don’t bother.
No, when you use the entries in __subclasses__, be aware they may be dead and catch the TypeError.
Yes, I agree with much of what you say, but I don’t see why in your example you say “Two Bar entries in __subclasses__, and rightfully so.” Why would there be two? Did you mean for the second definition of Bar also to inherit from Foo? (In that case I agree that there should be two!)
Also, you say: “when you use the entries in __subclasses__, be aware they may be dead and catch the TypeError”. This I don’t get: what TypeError? Certainly it would suffice to have a way to tell that a class is dead! How do I do that?
That is stupid of me.
Sorry about that. There are two objects, one with the reference from the name Bar and the other with a reference from the class Baz.
You want those references to the classes to do something to them. So you do something classy - I used instantiating as an example - and you will probably get a TypeError (check before use) as I got from trying to call it. The easiest and fastest way is trying the operation and if it fails with the expected error, it is probably dead. What is wrong with that?
There must be a way, otherwise GC would not work. This Stack Overflow question shows you how, and indeed the accepted answer uses the gc module. Is try…except not much easier in this context?
Hm, are you sure? Here’s what I get in my tests:
Python 3.8.2+ (heads/3.8:3e72de9e08, Apr 16 2020, 12:25:15)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> class Foo:
...     pass
>>> def f():
...     class Baz(Foo):
...         pass
>>> f(); f(); f()
>>> Foo.__subclasses__()
[<class '__main__.f.<locals>.Baz'>, <class '__main__.f.<locals>.Baz'>, <class '__main__.f.<locals>.Baz'>]
>>> a = Foo.__subclasses__()[0]()
>>> a
<__main__.f.<locals>.Baz object at 0x7fd654f7acd0>
>>> import gc
It seems that as long as the class objects are not collected, nothing prevents you from referencing them and instantiating them – or how would GC work? The reason why the objects are not collected immediately is that they are part of a cycle. As far as I understand, objects involved in a reference cycle are not seen as dead by the interpreter until the garbage collector is run. Efficiency is why it does not run all the time.
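The cycle can be made visible with a weak reference. The following is a rough sketch, assuming CPython; make_class and Local are hypothetical names, and gc.disable() only rules out an automatic collection in the middle of the demonstration.

```python
import gc
import weakref


def make_class():
    class Local:
        pass

    # Hand back only a weak reference, so nothing keeps Local alive by name.
    return weakref.ref(Local)


gc.disable()  # rule out an automatic collection during the demo
try:
    ref = make_class()
    alive_before = ref() is not None     # the class has not been freed yet
    in_own_mro = ref() in ref().__mro__  # the class refers to itself via its MRO
    gc.collect()
    alive_after = ref() is not None      # checked again after the collection
finally:
    gc.enable()

print(alive_before, in_own_mro, alive_after)
```

On CPython this should print True True False: reference counting alone cannot free the class because it points back at itself, so it lingers until the cycle collector runs.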
If you want to kill dead objects, call gc.collect(). I don’t see a way around it.
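If collecting before every lookup is acceptable, the idea could be wrapped in a small helper. This is a sketch only: current_subclasses is a hypothetical name, not an existing API, and Foo, Bar and Keep are placeholder classes.

```python
import gc


def current_subclasses(cls):
    """Return the direct subclasses of cls that survive a full collection."""
    gc.collect()  # drop classes that are only kept alive by reference cycles
    return cls.__subclasses__()


class Foo:
    pass


class Bar(Foo):
    pass


del Bar  # unreachable from now on


class Keep(Foo):
    pass


print(current_subclasses(Foo))
```

Only Keep should remain in the result. A full collection on every call has a cost, so this is probably only sensible where lookups are rare.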
I tried Foo.__subclasses__()[0]() after I recreated the Bar reference so it referred to another class and it failed with a TypeError. That is why I mentioned it. Either I messed up or there is a difference between the two. Guess the way forward is indeed to run GC just before the call to __subclasses__. My argument:
doesn’t cut it
Thanks for correcting.