How to annotate a method call from within a dataclass' __init__?

I have stumbled upon an interesting problem that I can’t seem to solve on my own. I need your help.

So, let me give you a code example first and then I’ll explain what I want help with…

from dataclasses import dataclass

@dataclass
class GameManager:
    self.reset()

    def reset(self):
        pass

How does one call the reset() method from within the GameManager dataclass so that it actually gets called? All I get is errors, and I have tried several variations, such as dropping the self. prefix. The @dataclass decorator should generate an __init__ for me, so I am baffled as to how I can call reset().

Also, how on Earth can I annotate such a method call by using the Callable type from the typing module while the method must actually be called?

This is something I haven’t found anywhere on the Internet, not even on StackOverflow. Please help me out.

I’m not entirely sure what you’re trying to do. But if you want to do something at instance __init__() time, but still want to use the automatically created __init__() for the dataclass, you can add it to a __post_init__() method.

post-init processing
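A minimal sketch of that suggestion, applied to the GameManager from the question. The level and score fields are hypothetical, added only so that reset() has something visible to do:

```python
from dataclasses import dataclass, field


@dataclass
class GameManager:
    # Hypothetical fields, not part of the original question.
    level: int
    score: int = field(init=False)  # not an __init__ parameter; set by reset()

    def __post_init__(self) -> None:
        # The generated __init__ calls this after assigning the fields.
        self.reset()

    def reset(self) -> None:
        self.score = 0


manager = GameManager(level=3)
print(manager)  # GameManager(level=3, score=0)
```

The generated __init__ still handles the field assignment; __post_init__ only holds the one extra call.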

Yeah, but that defeats the purpose of not needing to define the __init__() method of the GameManager dataclass, as dataclasses do that by default. Having to define a __post_init__() method now is the same as defining an __init__() method. Where’s the boilerplate reduction from using dataclasses in my case? There isn’t any.

So, I decided that I won’t use dataclasses at all, ever. Especially, I won’t use annotations either, because they make a developer’s life a living hell.

I don’t see how adding a line in post-init is the same as defining a full init. The generated code still keeps track of all the data elements. You would just add the single call you want to make.

I think dataclasses were designed primarily to be used as data containers, and those classes rarely need to call something like a setup() method. Also, as @BowlOfRed mentions, using __post_init__ in your case would still result in a net reduction of boilerplate. In fact, __post_init__ is very much not boilerplate, because it contains only the logic that is specific to your class.
As an example:

# Full class with boilerplate
class with_boilerplate:
    def __init__(self, a, b):
        self.a = a
        self.b = b
        
        self.setup()

    def setup(self):
        ...

# Dataclass with __post_init__
from dataclasses import dataclass

@dataclass
class without_boilerplate:
    a: int
    b: float

    def __post_init__(self):
        self.setup()

    def setup(self):
        ...

Also, answering your original question: all the code within a class block is run when the class itself (as an instance of type or of another metaclass) is built. Note that this does not include the bodies of functions; only the def foo(...) statement is run, which constructs a function within the class namespace. At this time there is no self, since self would be an instance of a class which doesn’t exist yet.
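A small sketch that makes the timing visible. The names Demo, Broken and events are hypothetical and exist only for this illustration:

```python
events = []


class Demo:
    # This statement runs immediately, while the class body is executed.
    events.append("class body executed")

    def method(self):
        # This line runs only when method() is called on an instance.
        events.append("method called")


# No instance exists yet, but the class body has already run.
print(events)  # ['class body executed']

Demo().method()
print(events)  # ['class body executed', 'method called']

# Referring to self directly in a class body fails for the same reason.
try:
    class Broken:
        self.reset()
except NameError as exc:
    print(exc)  # name 'self' is not defined
```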

I ran into this issue myself when I tried to create a decorator that would configure a method. If I created that decorator within the class, it had to be applied like this

class badly_decorated:
    @self.decorator  # NameError: no self exists while the class body runs
    def foo(self):
        ...

    def decorator(self, meth):
        ...

which is run at class construction time and no self exists at that point. I solved this by creating a decorator factory outside that class which returned a descriptor. For your use-case it would be overkill though, so I suggest using __post_init__.
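The original code for that fix is not shown, so the following is only one possible reading of "a decorator factory outside the class which returns a descriptor". The names configured, ConfiguredMethod, WellDecorated and label are all hypothetical:

```python
import functools


def configured(label):
    """Hypothetical decorator factory; 'label' stands in for any setting."""

    class ConfiguredMethod:
        def __init__(self, func):
            self.func = func

        def __get__(self, instance, owner=None):
            if instance is None:
                return self  # accessed on the class, not on an instance

            @functools.wraps(self.func)
            def bound(*args, **kwargs):
                # The instance is only known here, at attribute-access time.
                return self.func(instance, label, *args, **kwargs)

            return bound

    return ConfiguredMethod


class WellDecorated:
    @configured("greeting")  # module-level name, so no self is needed
    def foo(self, label, name):
        return f"{label}: hello {name}"


print(WellDecorated().foo("world"))  # greeting: hello world
```

Because configured lives at module level, it is available while the class body runs, and the descriptor defers anything that needs the instance until the method is looked up.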

Ah, now I understand. I need to define the __post_init__() special method in order to call my reset() method, as there seems to be no other way if the class is defined as a dataclass. So, I’ll do just that, thanks.

Now, I also need to know how I can annotate my reset() method. Say I have this dataclass setup:

from dataclasses import dataclass

@dataclass
class GameManager:
    ...

    def __post_init__(self):
        self.reset()

    def reset(self):
        ...

How would you annotate the reset() method? Say that the method doesn’t define any parameters and that it returns None.

I know about the Callable type from the typing module which is what I need. Does the annotation involve both the self.reset() call in the __post_init__() special method and the reset() method definition?

I’m not a typing expert, but if your method takes no arguments and never explicitly returns you shouldn’t have to annotate it in any way. The type checkers will be smart enough to realize what is going on.

You’ll only need to annotate the definition: def reset(self) -> None:. The call in __post_init__() needs no annotation. Type checkers already know about self, so annotating that is usually redundant. You do need to annotate the return type though, because if the function is totally unannotated, by default the type checker suppresses any errors for the function, to allow you to gradually type your codebase.
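To show where Callable fits in, here is a short sketch. The score field and the run_twice helper are hypothetical additions for illustration; Callable describes a function that is passed around as a value, not a call site:

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GameManager:
    score: int = 10  # hypothetical field

    def __post_init__(self) -> None:
        # A call is not annotated; the checker reads the type from reset().
        self.reset()

    def reset(self) -> None:
        # Only the definition carries the annotation.
        self.score = 0


def run_twice(action: Callable[[], None]) -> None:
    # Callable[[], None]: takes no arguments, returns None.
    action()
    action()


manager = GameManager()
run_twice(manager.reset)  # the bound method matches Callable[[], None]
print(manager.score)  # 0
```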

You all have been very wonderful in helping me out. So, thank you all.

(I feel like annotating my whole codebase is like developing everything from scratch. So much work and so many nuances to deal with.)

This could just be a shot in the dark, Boštjan, since it may not apply to dataclasses, but the quote exactly describes a situation I had with a method in a standard class definition. The solution was to use __call__ as a proxy method like this:

class CoolStuff:
    def do(self, argIN):
        ...
    __call__ = do

The __call__ at the same nest level as def do() allowed me to use coolStuff.do() on the object instance.
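For reference, a runnable version of that snippet. The body of do() is a placeholder I made up; as far as I can tell, the alias is what makes the instance itself callable, while instance.do() works with or without it:

```python
class CoolStuff:
    def do(self, arg_in):
        # Placeholder body, only here to make the result visible.
        return f"did {arg_in}"

    # Alias: calling the instance now runs the same function as do().
    __call__ = do


cool_stuff = CoolStuff()
print(cool_stuff.do("x"))  # did x
print(cool_stuff("x"))     # did x
```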

Wow, that’s a very cool hack! Thanks.

While using __call__ is an interesting hack, it’s often more confusing than creating a regular method. Consider this code

foo = Foo()

foo()

bar = Bar()

bar.setup()

Here it is more obvious that you are performing some kind of setup with bar.setup(), whereas one cannot be sure what is happening when foo() is called without reading the documentation. Creating a __call__ should be done only after much deliberation, imo.

I’m curious if the __call__ binding worked for you. Have you tried it?

This isn’t a proper use of __call__ but it did exactly what I needed.

(Normally I don’t use the word “hack” but in this case it does apply. My apologies to the purists, which is usually me.)

100% agreed. I used a class in that case because I use multiple instances of the function with distinct self parameters in each. Some of those function instances took a method just fine without the __call__. Others required me to bind the ~.do to explicitly assert the call.

Another question. What if in a dataclass definition you inherit from another class where you then, obviously, must call super().__init__()? Is __post_init__() the place to do it? Also, must the parent class also be defined as a dataclass for the inheritance to work?

from dataclasses import dataclass

class MyParentClass:
    ...

@dataclass
class MyChildClass(MyParentClass):
    ...

I am wondering whether calling super().__init__() in __post_init__() of MyChildClass is too late for the __init__() of MyParentClass to take effect when creating an instance of MyChildClass. Does one still obtain full access to all of the attributes & methods of MyParentClass?

When @dataclass runs, it goes through the inheritance tree and collects all the field definitions (with subclass fields overriding superclass fields of the same name), then takes all of them and creates __init__ etc. So if the parents are also dataclasses, there’s no need to call the superclass __init__, because the newly generated version already does the assignment. If the parent isn’t a dataclass, then yeah, you’ll likely need to call the superclass __init__ manually. __post_init__ isn’t too late, assuming it’s fine for the generated attribute assignments to happen before the superclass __init__ is called. Potentially you might need some init-only fields, so that the superclass parameters are present on the generated __init__ and can be passed along.
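A sketch of both cases, under assumptions: the attributes name, level, x and y, the greet() method, the init-only field parent_name, and the Base/Derived classes are all hypothetical and not from the question:

```python
from dataclasses import InitVar, dataclass


# Case 1: the parent is a plain class, so its __init__ is called by hand.
class MyParentClass:
    def __init__(self, name):
        self.name = name

    def greet(self):
        return f"hello from {self.name}"


@dataclass
class MyChildClass(MyParentClass):
    level: int
    # Init-only field: accepted by the generated __init__ and handed to
    # __post_init__, but not stored as a dataclass field.
    parent_name: InitVar[str]

    def __post_init__(self, parent_name):
        super().__init__(parent_name)


child = MyChildClass(level=2, parent_name="parent")
print(child.level)    # 2
print(child.greet())  # hello from parent


# Case 2: the parent is a dataclass too, so no super().__init__() is needed.
@dataclass
class Base:
    x: int


@dataclass
class Derived(Base):
    y: int


print(Derived(1, 2))  # Derived(x=1, y=2)
```

In the first case the parent's methods are inherited as usual; only the attributes set in the parent's __init__ depend on the manual call.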