True constructors

This idea sketches an idealized version of classes in Python. There are some benefits, but I recognize that there may not be enough justification for it yet. What I’m hoping is to start a conversation about how we could make classes in Python more perfect, so that one day we will have collected a motivating enough set of justifications to change things.

Background

There are a variety of warts that we hope to address.

Calling super in the initializer

In multiple inheritance, sending parameters to various base classes is brittle:

from typing import Any

class Y:
    def __init__(self, y: int, **kwargs: Any):
        super().__init__(**kwargs)
        self.y = y

class Z:
    def __init__(self, z: int, **kwargs: Any):
        super().__init__(**kwargs)
        self.z = z

class X(Y, Z):
    def __init__(self, x: int, **kwargs: Any):
        super().__init__(**kwargs)  # Hopefully send y to Y and z to Z.
        self.x = x

This relies on:

  • every initializer for every class delegating all unknown parameters to super, and
  • that there are no collisions between parameter names in the inheritance chain (see the sketch below).
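For instance, a minimal sketch reusing the classes above: the cooperative scheme works until a parameter name collides. A hypothetical class W that also takes a parameter named y starves Y of its argument:

x = X(x=1, y=2, z=3)  # works: **kwargs threads y to Y and z to Z
assert (x.x, x.y, x.z) == (1, 2, 3)

class W:
    def __init__(self, y: str, **kwargs: Any):
        super().__init__(**kwargs)
        self.label = y  # consumes the y that Y needed

class Broken(W, X): ...

Broken(x=1, y=2, z=3)  # TypeError: Y.__init__ never receives y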

Initializers, class factories, and __replace__ disobey LSP

Consider the above classes X, Y, Z. We can see that __init__ does not obey LSP. In many circumstances, we can’t even know the signature of __init__. Even in the above code, super().__init__ does not have a known signature since you can inherit from X and insert all kinds of classes into the MRO list.

This problem gets even worse for class factories. If you wanted to add a class factory like:

from __future__ import annotations  # allows the Y annotation below

class Y:
    @classmethod
    def create(cls, y: int) -> Y:
        return Y(y)

Then a subclass X < Y cannot also define create with incompatible parameters without violating LSP.
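For example, a hypothetical subclass whose factory needs an extra required parameter would be flagged by a type checker such as pyright:

class X(Y):
    @classmethod
    def create(cls, y: int, x: int) -> X:
        # error: "create" incompatibly overrides "Y.create"; the extra
        # required parameter breaks substitutability (LSP)
        ...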

This problem plagued __replace__, which was finally exempted from LSP.

Dataclasses do not easily support multiple inheritance

__post_init__ is necessary if you want your dataclass to have member variables that are not passed to the constructor. But if you want __post_init__ to support multiple inheritance, you need to call super, and you can’t call super without checking whether the superclass even defines __post_init__. Even if it does, there’s no easy way to find out what the super parameters are!

from dataclasses import dataclass

@dataclass
class Y:
    y: int
    def __post_init__(self) -> None:
        if hasattr(super(), '__post_init__'):
            # Hopefully, there are no parameters on this!!
            super().__post_init__()  # pyright: ignore
        assert self.y > 0

@dataclass
class X(Y):
    x: int
    def __post_init__(self) -> None:
        if hasattr(super(), '__post_init__'):
            # Hopefully, there are no parameters on this!!
            super().__post_init__()  # pyright: ignore
        assert self.x > 0

Redundancy in dataclass definition

Consider this real code for a mixer:

class Mixer2d(Module):
    input_size: InitVar[int]
    height: InitVar[int]
    width: InitVar[int]
    patch_size: InitVar[int]
    hidden_size: InitVar[int]
    mix_patch_size: InitVar[int]
    mix_hidden_size: InitVar[int]
    num_blocks: InitVar[int]
    t1: JaxArray
    conv_in: eqx.nn.Conv2d = eqx.field(init=False)
    conv_out: eqx.nn.ConvTranspose2d = eqx.field(init=False)
    blocks: list[MixerBlock] = eqx.field(init=False)
    norm: eqx.nn.LayerNorm = eqx.field(init=False)

    def __post_init__(self,  # noqa: PLR0917
                      streams: Mapping[str, RngStream],
                      input_size: int,
                      height: int,
                      width: int,
                      patch_size: int,
                      hidden_size: int,
                      mix_patch_size: int,
                      mix_hidden_size: int,
                      num_blocks: int,
                      ) -> None:
        ...

This illustrates three kinds of member variables: true members (like t1), InitVars that must be specified twice (once in the class body and once in __post_init__), and fields marked init=False to indicate that they are not constructor parameters. It would be a lot simpler to have only true members in the class body.

Proposal

Consider instead adding two special decorators, @constructor and @transformer, along with special language and typing support for such constructors and transformers. These constructs can only be used on a special form of dataclass, which we’ll call a C-class.

All constructors are class methods that return Self. All transformers are regular methods that return Self. These work as usual: X.some_constructor(...) calls the constructor and always builds an X, and likewise x.some_transformer(...) always builds an X. init is a special constructor that can be called as X.init(...) or just X(...).

All C-classes are dataclasses. Therefore, init is automatically generated unless specified.

Constructors and transformers are never inherited. And calling super().init (or any other constructor or transformer) is not allowed.

You would only ever choose to define __post_init__ if you left the init constructor unspecified (otherwise, you can do whatever you want in init). Therefore, __post_init__ accepts whatever init accepts, and cannot call super (since that has an unknown signature).
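A sketch of how that might look, assuming the generated init forwards its parameters to __post_init__ (hypothetical semantics):

@c_class
class Interval:
    lo: int
    hi: int

    # init is left unspecified, so it is generated as
    # init(cls, lo: int, hi: int) -> Self, and __post_init__ receives
    # exactly those parameters for validation.
    def __post_init__(self, lo: int, hi: int) -> None:
        assert lo <= hi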

C-classes disallow InitVar since the init constructor can be specified instead with whatever parameters you want. It would also complicate inheritance too much since you would be inclined to forward these to super. Instead, we discuss inheritance below.

The field parameter init=False is likewise disallowed since you can simply not specify it in a constructor if you don’t want that constructor to accept it.

Example

@c_class
class Point:
    x: int
    y: int = field(default=1, converter=int)

    @constructor
    def init(cls, x: int, y: int = 1) -> Self:
        return cls.__new__(x, y)

    @constructor
    def on_diagonal(cls, z: int, /) -> Self:
        return cls.__new__(z, z)


@c_class
class TDPoint(Point):
    _: KW_ONLY
    z: int

    @constructor
    def init(cls, x: int, y: int = 1, *, z: int) -> Self:
        point = Point(x, y)
        return cls.__new__(***point, z=z)

    @constructor
    def on_diagonal(cls, z: int, /) -> Self:
        return cls.__new__(z, z, z)

    @constructor
    def on_diagonal_alternative(cls, z: int, /) -> Self:
        point = Point.on_diagonal(z)
        return cls.__new__(***point, z=z)

Here, Point.init and TDPoint.init are provided just for illustration. They would be automatically generated had they not been defined.

Inheritance

Inheritance is illustrated in on_diagonal_alternative. *** unpacks positional-only arguments as positional arguments, and the rest of the arguments as keyword arguments. The rules for collisions are the same as the rules for collisions of members with base classes. If a parameter is explicitly given, it overrides anything provided by a ***-unpacking. This makes it easy to construct a class with many base classes: just *** each of them, and fill in what you want to override and what’s missing.
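For instance, a hypothetical subclass that adjusts an inherited member could rely on the override rule (proposed syntax, invented names):

@c_class
class OffsetPoint(Point):
    dy: int

    @constructor
    def init(cls, x: int, y: int = 1, dy: int = 0) -> Self:
        point = Point(x, y)
        # ***point supplies x and y; the explicit y= below wins over the
        # y coming from the unpacking, per the collision rule.
        return cls.__new__(***point, y=point.y + dy, dy=dy)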

The default init

In the context of inheritance, the default generated init is as follows:

  • Collect all of the parameters of all superclass init functions (collect all positional parameters first, and then keyword-only parameters, etc.) This is your init parameter list. If there are any collisions, that’s a definition error.
  • In the body, call all init constructors for all your parent classes and produce an object for each.
  • Glue all of these objects together, along with any parameters that are unique to this class, using the *** operator (sketched below).
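As a sketch, for disjoint bases the generated constructor might be roughly equivalent to this (hypothetical syntax, invented names):

@c_class
class A:
    a: int

@c_class
class B:
    b: int

@c_class
class AB(A, B):
    c: int
    # With init unspecified, the generated constructor would be roughly:
    #
    # @constructor
    # def init(cls, a: int, b: int, c: int) -> Self:
    #     return cls.__new__(***A(a), ***B(b), c=c)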

To address the enormous computational cost of generating and running this code, CPython would accelerate the common cases (e.g., when parent classes also don’t specify init).

Closing remarks

This is intended to be an outline for repairing some of the warts with classes in Python. It probably doesn’t have enough benefits to pay for the cost of adding the various decorators and the *** operator. But if people identify other warts with classes there may eventually be enough to justify such changes.

Interesting ideas Neil. I agree Dataclass inheritance could be improved, but it’s a minefield.
What’s the difference between @constructor and @transformer? Is anything preventing a third party implementing those two now, using a metaclass?

I’m all in favour of phasing out super().__init__().
But are you just replacing super().__init__() with explicit calls to the parent class constructors?
And doesn’t super().__init__() do something clever when you have a diamond inheritance pattern, to ensure that the common parent is only called once, and how would you replace that functionality?


Also I’m curious what context you work in where inheritance and ISP is actually a good idea. Could you link to any code where building an inheritance tree is good design?

My own experience is with:

  • one library that (over)used inheritance, and created an unparsable on-fire dumpster truck. I spent many, many hours trying to change a little detail of an object I had created by instantiating a class of the library, and eventually had to give up.
  • one library that required me to sub-class to use its API, which made it unnecessarily unclear what was actually happening.
  • some very clean code where classes might inherit from a Protocol or ABC
  • and the occasional defensible fudge where a class inherits from another class that inherits from a Protocol, for the purpose of easyish code reuse, as for example in frozendict (github link).
5 Likes

Request for clarification: Is the purpose of this idea to make it easier to design dataclasses for multiple inheritance (MI), or to make it easier to use dataclasses not designed for MI in MI applications?

MI is almost always messy, except when it isn’t, and then only because someone thought very carefully about their software architecture design. The brittleness of MI cannot be avoided at the level of language design, only in the design of an individual piece of software.

Making it easier for developers to write classes intended for MI could be a good thing. On the other hand, trying to accommodate MI with classes not specifically designed for it is just handing developers a massive footgun.

1 Like

I did not fully understand the problem, yet two remarks:
1 - This works:

class X(Y, Z):
    def __init__(self, x, **kwargs):
        Y.__init__(self, kwargs['y'])
        Z.__init__(self, kwargs['z'])
        self.x = x

2 - Reading this (long but interesting) blog has made me understand that multiple inheritance is actually a bad practice:
https://python-patterns.guide/

1 Like

Could we remove the need for a new operator by not allowing positional-only arguments? How big is that need?

I also wanted to be able to define __replace__, which needs to accept self. Since constructors are class methods, I figured that transformers could be the ordinary-method equivalent. So, __replace__ is an example of a transformer.
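A hypothetical sketch, assuming *** can unpack an instance’s members the same way it unpacks any C-class value (invented semantics):

@c_class
class Point:
    x: int
    y: int

    @transformer
    def __replace__(self, /, **changes: Any) -> Self:
        # start from the current members; the explicit changes override
        # the ***-unpacked ones per the proposal's collision rule
        return type(self).__new__(***self, **changes)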

You could probably get all of the runtime stuff working (since you can do practically anything in Python), but there would be no static typing support. For example, constructors aren’t supposed to be inherited.

You don’t have to call the parent constructors if you don’t want to. In many cases of multiple inheritance, you probably will want to delegate some construction to your immediate parents, and I wanted that to be as simple as possible.

What it does is this: your first superclass (one arm of the diamond) needs to call super, and that eventually calls the other arm of the diamond. This relies on every init method calling super, which means that everyone needs to accept keyword arguments and forward what they don’t understand—and there cannot be any collisions.

Calling super is fine for instance methods since instance methods obey LSP. An instance method can call super with confidence since it knows the parameter list. But the initializer cannot do that in general.

The idea is to replace the super-call mechanism with explicit calls to superclasses. Yes, that means that both arms of the diamond pattern are initialized separately, and the common parent is initialized once for each. To consolidate the common parent’s data members, the *** operator needs to arbitrarily choose one arm as having priority. You can always override this choice if you want.
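To make that concrete, a hypothetical sketch (invented names, proposed syntax):

@c_class
class Base:
    a: int

@c_class
class Left(Base):
    b: int

@c_class
class Right(Base):
    c: int

@c_class
class Bottom(Left, Right):
    @constructor
    def init(cls, a: int, b: int, c: int) -> Self:
        # Each arm constructs its own Base carrying a; the *** collision
        # rule gives one arm (presumably the first unpacking) priority,
        # and an explicit a= here would override both.
        return cls.__new__(***Left(a, b), ***Right(a, c))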

I’m not sure what you mean by inheritance and LSP. LSP is a fundamental, necessary property of polymorphism. It doesn’t make sense to talk about inheritance without LSP.

I guess you’re asking when multiple inheritance is good design. The most prevalent kind of multiple inheritance is multiple interface inheritance. We see that with a lot of ABCs, which themselves inherit from multiple ABCs. For example, a collection is a sized, iterable, container.
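The ABCs in collections.abc reflect this directly:

from collections.abc import Collection, Container, Iterable, Sized

assert issubclass(Collection, Sized)
assert issubclass(Collection, Iterable)
assert issubclass(Collection, Container)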

More general multiple inheritance is rarer. Here’s an example from some of my code: GammaEP inherits from ExpToNat, which contains an optional minimizer in order to do the parameter transformation. The ExpToNat base class should not (imo) be provided by composition (rather than inheritance) since it implements the abstract method to_nat (the conversion to the natural parametrization), and it wants to expose the data member of the optimizer (and might want to expose other things in the future, e.g., settings related to optimization).

The purpose of the proposal is to address the four drawbacks from the background:

  • calling super is brittle,
  • initializers, class factories, and __replace__ disobey LSP,
  • dataclasses do not easily support multiple inheritance, and
  • dataclass specification contains unnecessary redundancy.

So, to answer your question, it’s both.

100% agree.

The problem is that some people intend for their class to be able to be inherited from, but they forget to call super (e.g., networkx.DiGraph), so if you try to multiply inherit, it won’t work. With the above design, they don’t have to call super or do anything special. They can design their class in their little sandbox.

That pattern is broken. Consider:

class W(X, V): ...

Now, V won’t be initialized, because X hard-codes calls to Y.__init__ and Z.__init__ instead of following W’s MRO.

It’s not “bad practice” according to actual OO textbooks. However, some care needs to be taken to respect how it’s supposed to work in Python. I could respond to his points in sequence, if you want, when I find time. In general, you should avoid it unless it makes logical sense (composition over inheritance, unless you have a true is-a relationship; the data members of the parents need to be reflected on the child class; the parent classes change independently of the child class; etc.)

Also, I’m not sure how good of a source that is. His next section is “Dodge: mixins”. Python is chock full of mixins, and the mixins in collections.abc are considered extremely Pythonic and a best practice.

Yes, 100%.

2 Likes

If your ideas of multiple inheritance are defined primarily by Java, then yes, all of this is true. MI doesn’t work the same way in Python as it does in Java, and has to be thought about differently.

It looks to me like what you’re doing here would be better served by composition than inheritance, anyway.

3 Likes

I meant “inheritance and LSP” as a unit. So, the way you understand inheritance.
It’s a clarification for people who don’t believe in LSP as a design principle, but who still use the word “inheritance” to refer, for example, to inheriting from multiple Protocols.

No, I agree multiple interface inheritance is good design. To me the gold standard is Rust, which goes so far that it actually allows you to implement an interface for a class after the class has been defined.
In Python, if you want a ‘union’ of Protocols like Collection, inheriting from multiple Protocols can be a good way to implement this.

What I refer to as an inheritance tree is loosely something where you can’t see all the names that one class inherits from within one screen.
I think it’s confusing when I have to traverse multiple branches of an inheritance tree in order to find where a method was defined. I lose oversight and forget which of the ancestors of the class I’ve already viewed. And even when you’ve found a method, you can’t be sure that it won’t be overridden by one of the other ancestor classes.

First of all: Respect for making and maintaining the a (Edit: Typo) library for (use with) JAX. I have used JAX and I like JAX, and this library looks like a lot of work.
But I don’t think I could ever work on this library, or use the library in a way that goes beyond your intended design, because of the ISP-based design.
I do think composition would be better design for the particular example you name. One could implement composition via a pattern like

from dataclasses import InitVar, dataclass, field
from typing import Generic, Protocol, Self, TypeVar

from jax import jit

NP = TypeVar('NP')  # remaining names come from the library under discussion

class ToNat(Protocol, Generic[NP]):
    def convert(self) -> NP:
        raise NotImplementedError
class ExpectationParametrization(Distribution, Generic[NP]):
    to_nat: ToNat[NP]
    # Now you can use .to_nat.convert() instead of .to_nat()
    # ... rest of class below ...
class ExpToNat(ExpectationParametrization[NP], SimpleDistribution, Generic[NP]):
    @jit
    def convert(self) -> NP:
        flattener, flattened = Flattener[Self].flatten(self)  # type: ignore[arg-type]
    # ... rest of class below ...
@dataclass
class GammaEP(HasEntropyEP[GammaNP],
              Samplable,
              ExpectationParametrization[GammaNP],
              SimpleDistribution):
    minimizer: InitVar[ExpToNatMinimizer | None] = field(default=None, repr=False)
    to_nat: ExpToNat[GammaNP] = field(init=False)

    def __post_init__(self) -> None:
        if hasattr(super(), '__post_init__'):
            super().__post_init__()  # type: ignore # pyright: ignore
        self.to_nat = ExpToNat(self.minimizer)

This way the abstract class ExpectationParametrization has access to the method it needs, and any data members of the optimizer you want to expose are exposed via the .to_nat attribute.

Furthermore, any attributes of ExpToNat that are used in GammaEP are explicitly accessed via the .to_nat attribute, which means readers of the code don’t have to traverse the inheritance tree.


At the same time this inheritance-based code pattern apparently works for you.
Designing code via composition can be hard because it forces you to have one-way relations. Owners request methods and attributes from their properties, not the other way around. I think that effort is worth it because of the added clarity of the structure. (And I prefer explicit over ‘clean’ obfuscation.) But maybe for you it isn’t. I’m generally in favour of enabling Python to work for as many people as possible, so I’m glad it works for you too.

LSP isn’t a “design principle”. LSP is a necessary aspect of inheritance. When should A inherit from B? Answer: When A is-a B, which implies LSP. It doesn’t make any sense to “not believe in LSP”.

You should use the @override decorator, and ask a linter/type-checker like PyRight to verify that you’re always using it. Then, you can easily identify the root method since it will be undecorated.
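A minimal sketch (typing.override is available from Python 3.12, with a backport in typing_extensions; pyright’s reportImplicitOverride setting can enforce its use):

from typing import override

class Base:
    def f(self) -> None: ...

class Sub(Base):
    @override  # the checker errors if Base.f is renamed or removed
    def f(self) -> None: ...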

I don’t maintain Jax! I maintain some libraries that use Jax.

What do you mean by ISP-based design? Is this a typo for LSP? I ask because you’ve said it a couple times this way and I’m worried I’m misunderstanding you.

If you mean LSP, like I said, all inheritance demands respecting LSP.

I explained to you why it would not:

Your implementation missed providing the abstract method (which is the whole point of the mixin). It missed exposing the data member of the optimizer. And it missed providing access to self in the conversion function, which is necessary.

And what if in the future I want to make the optimizer common to various mixins? Then your composition solution becomes quite complicated since I would need to manage all of that. Similarly, what if the mixin wants to implement other methods or provide other methods on the distribution class?

This is a classic mixin. If I had removed the data member, there would be no question that it should be a base class and not composed. Therefore, adding the data member should not change that relationship.

I don’t think you can say that there would be “no question that it should be a base class”. Evidently this style of inheritance is something that seems quite natural to you. I can’t say how I would have written the code without studying it carefully but I am fairly certain that I would have not used inheritance for this.

Anything that you want to expose, like methods, attributes, etc., can be exposed in some way just by adding some code to the class that would otherwise inherit those methods from the mixin. Using composition over inheritance sometimes means having a bit of boilerplate code like:

class Foo:
    def __init__(self, bar):
        self._bar = bar

    def bar(self):
        return self._bar.bar()  # delegate to the composed object

There can be ways to avoid/reduce it but it is at least clear that with possibly some boiler-plate composition can always achieve any effect that inheritance can.

There are really two reasons for using inheritance:

  1. It determines what isinstance returns.
  2. It provides code sharing across classes.

Looking through some of my own code the only time I use multiple inheritance is for exception classes and that is because with those I am very explicitly concerned about isinstance. Other reasons for inheritance like polymorphism don’t apply in Python where protocols, duck-typing etc can be used instead.

The question is really how the benefits of code sharing vs a bit of boiler-plate stack up against having simple class hierarchies. In my experience it is better to have self-contained classes with all their code in one place rather than indirection through different superclasses (this also applies to deep hierarchies with single inheritance). I have sometimes written classes with a lot of boiler-plate methods but I generally prefer that over any nontrivial inheritance schemes.

3 Likes

There’s no question that it shouldn’t be a class.

I think people should stop attaching methods to data. Classes should be things that are stateful, and methods should need or manipulate that state in a managed way. If it’s just operating on data or the public state of an object, you don’t need inheritance: write a function, not a method.

I can empathize with the fact that there’s decades of code that’s doing it via inheritance, but that code would have to change to use this idea too.

4 Likes

Not quite. Inheritance is always different in one fundamental way, which is the is-a relationship. It answers differently to isinstance/issubclass calls. That’s why the gold test for whether something should be inherited or composed is whether you have an is-a relationship.

I think protocols should be reserved for situations in which true inheritance is impossible or difficult to achieve. For example, if people have added methods like __hash__ to their code, and it’s too much work to expect them to inherit from Hashable. Or it may be impossible for them to inherit if they have some subclasses that they don’t control. Or they may not want all client code to depend on base class definition.

Using protocols as a substitute for inheritance is otherwise, I think, a bad idea since

  • You make isinstance more brittle (it checks member names—not signatures, which is also quite a bit slower; see the sketch after this list).
  • If the protocol or a subclass changes, the inheritance hierarchy changes out from under you.
  • You have to verify invariants after definitions rather than doing so at definition time.
  • The error produced by verifying is-subclass is significantly poorer than the error you get when declaring that the subclass inherits from the base.
  • You can’t add behaviour to protocols; nor can you add data members.
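On the first point, a runtime-checkable Protocol’s isinstance only verifies that the member names exist, not their signatures:

from typing import Protocol, runtime_checkable

@runtime_checkable
class SupportsClose(Protocol):
    def close(self) -> None: ...

class Bogus:
    def close(self, force: bool) -> str:  # wrong signature
        return 'not really closing'

assert isinstance(Bogus(), SupportsClose)  # True: only the name matched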

Protocols were a great idea, but they can be overused. In general, for the reasons above, you should prefer inheritance over protocols except when it’s too onerous to explicitly inherit.

The primary motivation for inheritance should be the is-a relationship—not code sharing. If you’re finding that you’re using inheritance as a way of sharing code alone, then you should recast to composition (I think we agree here). Inheritance should not just be a convenient way of sharing code.

And coming full circle, LSP errors are nearly always an indication that you’ve misused inheritance.

I don’t think we actually disagree on anything anyway. We might come to slightly different conclusions depending on how we imagine the code would evolve.

Anyway, none of this discussion about inheritance is really related to the idea of adding constructors. I only mentioned multiple inheritance in the idea to ensure that the idea was compatible with it. That said, the LSP problems the proposal addresses only come up when you use inheritance.

A strict reading of the principle makes subclassing utterly useless, because anything true of the superclass has to be true of the subclass. Thus it is a design principle, NOT a necessary fact. For example, we know that an integer is-an object, yet object.__repr__(42) != int.__repr__(42) and we are okay with that. Subclasses ARE allowed to make changes in behaviour.

4 Likes

There are different senses of the word here e.g. people have been using the “iterator protocol” since long before type annotations. With typing now I would use Protocols or ABCs as type annotations but I would not inherit from them or use them with isinstance.

You mention here that isinstance is made slower when using protocols but that is because Protocols are ABCs and inheriting from ABCs makes isinstance slower for real types:

In [1]: from collections.abc import Hashable

In [2]: class A(Hashable): pass

In [3]: class B: pass

In [6]: c = []

In [7]: %timeit isinstance(c, A)
340 ns ± 2.72 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [8]: %timeit isinstance(c, B)
67.7 ns ± 0.248 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

Why should inheriting Hashable change how isinstance works?

The problems discussed in the OP seem to be almost entirely about inheritance. It is not clear to me what problems you want to solve otherwise.

I think that’s a misunderstanding of what LSP promises. Yes, all superclasses can promise an API (which is both an interface and can promise some invariants). However, the object superclass does not promise that the string returned by __repr__ must have a certain form. It could, of course, have made that promise, but no one wants that promise.

In fairness, it doesn’t “change how it works”. It does make it very slightly slower, sure. But do you want to compare that with the time to check a protocol with a dozen methods versus a base class with a dozen methods?

Subtype Requirement: Let φ(x) be a property provable about objects x of type T. Then φ(y) should be true for objects y of type S, where S is a subtype of T.

“A property provable about objects”. Like I said, a strict reading of it means fungibility, which isn’t what we usually want. It’s a principle that we follow with APIs when it makes sense to do so, and which we don’t follow when it doesn’t make sense to do so. I think you’re overblowing its importance a bit by calling it a “necessary aspect of inheritance”.

1 Like

I think you’ve misunderstood this line. It doesn’t mean that anything that superclasses do, subclasses have to do as well. It means explicit promises (and consequences thereof) are inherited.

Substitutability, which is what LSP ensures, is a necessary aspect of inheritance. Any method that accepts A should be able to accept B < A. And B should be able to be written to any writeable member variable of type A.

It may be slower with Protocol than with an ABC but I would not want to use either with isinstance. To me isinstance is for checking real runtime types. If you use either a Protocol or an ABC as a superclass though then you incur this runtime cost when using isinstance with concrete types rather than abstract types:

In [26]: class P(Protocol):
    ...:     def f(self): pass
    ...: 

In [27]: class A(metaclass=ABCMeta):
    ...:     def f(self): pass
    ...: 

In [28]: class P2(P):
    ...:     def f(self): pass
    ...: 

In [29]: class A2(A):
    ...:     def f(self): pass
    ...: 

In [30]: %timeit isinstance(c, P2)
506 ns ± 18.7 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [31]: %timeit isinstance(c, A2)
333 ns ± 2.15 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [32]: %timeit isinstance(c, tuple)
68.4 ns ± 0.959 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

With static typing we should be able to get the benefits of Protocols and ABCs without any runtime overhead.

1 Like