Subclass of pd.DataFrame returning its own type from pandas methods

I want to create a subclass of a pandas DataFrame, and then further subclasses of that class, such that any method called on an instance, whether it is defined on the base class, on a subclass, or inherited from pd.DataFrame itself, returns an object of the same type as the instance it was called on. The code below seems to do what I want, but I would like to confirm that this is the correct way to do it, or find out whether it could cause a problem I'm not aware of and whether there is a better approach.

import pandas as pd

class Base(pd.DataFrame):
    def __init__(self, data=None, *args, **kwargs):
        super().__init__(data, *args, **kwargs)

    @property
    def _constructor(self):
        # return Base  <- this doesn't work as I want
        return type(self)

    def method1(self):
        # columns= renames a column; a bare mapping would target index labels
        return self.rename(columns={"old": "new"})

class subBase(Base):
    def method2(self):
        # some other method
        pass

When I call method1 on a subBase instance, I want it to return a subBase instance, not a Base instance.
Any advice appreciated.

Naming a class after a Python built-in, e.g. super, is not the correct way to do anything other than deliberately shadow that built-in.

As to the goal, it all depends on what pd.DataFrame.copy returns.

IIRC from digging into the pandas and numpy sources some months ago,
they try quite hard to return the same type of object as the one the
method was called on, I believe to aid interoperability between
libraries without each having to rewrite the whole universe.

Apologies, that was pseudocode for demonstration purposes. I’ve edited the post to make it more clear.

No problem.

Just try it, and have a look at the type of object that .copy returns. If it returns the caller's own type (Self) in a simple test on all the different subclasses, it is unlikely that the pandas devs have added code that makes it behave differently only in some special circumstances that could trap you.
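A minimal sketch of such a test, assuming the Base and subBase classes from the question (trimmed to the parts that matter). The helper result_types and the sample column names are made up for illustration and are not part of pandas; the operations checked are only a small sample, so passing here does not prove every pandas method behaves the same way.

```python
import pandas as pd


class Base(pd.DataFrame):
    @property
    def _constructor(self):
        # Use the runtime class so subclasses get their own type back.
        return type(self)

    def method1(self):
        return self.rename(columns={"old": "new"})


class subBase(Base):
    pass


def result_types(cls):
    """Hypothetical helper: run a few operations on an instance of cls
    and report the class name of each result."""
    frame = cls({"old": [1, 2, 3], "other": [4, 5, 6]})
    results = {
        "copy": frame.copy(),
        "method1": frame.method1(),
        "head": frame.head(2),
        "filter": frame[frame["old"] > 1],
    }
    return {name: type(obj).__name__ for name, obj in results.items()}


for cls in (Base, subBase):
    print(cls.__name__, result_types(cls))
```

If every reported name matches the class that was passed in, the type(self) approach is holding up for those operations. Note that selecting a single column (e.g. frame["old"]) still gives a plain pd.Series, because that path goes through _constructor_sliced, which this sketch does not override.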