Dataclass_transform - Inheriting class doesn't recognize kw_only=True from parent class

Following up on the post on the mypy issue tracker: "PEP-681 - Inheriting class doesn't recognize kw_only=True from parent class" (Issue #17375, python/mypy on GitHub).

I currently don’t see a reason why this shouldn’t be added to the typing spec and implemented. At runtime, the child class acts as if kw_only is set.

I’d like to propose a change to the spec. Let me know if you have any comments beforehand.

Thanks

Thanks for posting the idea.

After doing a bit of research, I found that the stdlib dataclass library apparently doesn’t work this way. If you specify kw_only=True in a dataclass and then derive another dataclass from it, the derived class does not inherit the kw_only property from its parent.

Here’s a code sample that demonstrates this:

from dataclasses import dataclass

@dataclass(kw_only=True)
class A:
    x: int

@dataclass
class B(A):
    x: int
    y: int

# Accepted at runtime: B's own fields x and y are positional, not keyword-only.
B(1, 2)

Likewise, it doesn’t appear to inherit other dataclass properties like init=False and order=True.

The dataclass_transform is intended to mirror the behavior of stdlib’s dataclass library. What you’re proposing here would deviate from this behavior. Based on that observation, I think this spec change would be problematic.

Thanks Eric, it seems like you’re right.

I think I also understand why I thought it would work.

Specifically, this raises an error when B is defined:

from dataclasses import dataclass
from typing import Optional

@dataclass
class A:
    x: Optional[int] = None


@dataclass
class B(A):
    y: int
    z: int

and this does not:

@dataclass(kw_only=True)
class A:
    x: Optional[int] = None


@dataclass
class B(A):
    y: int
    z: int

I believe I got confused because this makes it look as though B has kw_only=True, whereas B simply respects its parent's keyword-only fields. It is still a bit odd, because B’s own fields are positional whereas A’s are keyword-only:

B(1, 2, x=1)

@erictraut I just realized what confused me initially and still seems inconsistent now.

Looking at the dataclass_transform example using sqlalchemy (which is what I stumbled upon initially):

from typing import Optional
from sqlalchemy.orm import DeclarativeBase, Mapped, MappedAsDataclass, mapped_column


class Base(MappedAsDataclass, DeclarativeBase, kw_only=True):
    id: Mapped[int] = mapped_column(
        init=False, primary_key=True
    )

class User(Base):
    email: Mapped[str]
    favorite_band: Mapped[Optional[str]] = mapped_column(default=None)

    # Pyright (reportGeneralTypeIssues): Fields without default values cannot appear after fields with default values
    # Works at runtime
    password: Mapped[str]

The code above produces a type-checking error, but instantiating the model at runtime works, even though it seems like it shouldn’t given what we understand now.

Using plain dataclasses, it does not work at runtime:

@dataclass(kw_only=True)
class A:
    x: Optional[int] = None


# Throws error on definition
# TypeError: non-default argument 'z' follows default argument
@dataclass()
class B(A):
    y: int
    s: Optional[int] = None
    z: int

So I tried to reproduce it using dataclass_transform:

from typing import Any, Optional

import typing_extensions


def field(default: Any = None, init: bool = True, repr: bool = True) -> Any:
    pass


@typing_extensions.dataclass_transform(field_specifiers=(field,))
class ModelMeta(type):
    pass


class ModelBase(metaclass=ModelMeta):
    def __init_subclass__(cls, kw_only: bool = False) -> None:
        pass


class Base(ModelBase, kw_only=True):
    pass


class Vehicle(Base):
    b: Optional[str] = field(default=None)
    c: int


Vehicle(b="1", c=1)

With that, I didn’t get an error at definition time, but I get an unexpected error on instantiation:
TypeError: Vehicle() takes no arguments

I tried to look at sqlalchemy’s usage, but the code seems too convoluted for me to get into right now. I was hoping you could share your opinion on this matter: although my initial issue is no longer as relevant, this looks like an inconsistency (in the implementation?) that caused it.

Yes, the assumption with dataclass_transform is that the library that implements the dataclass-like semantics mirrors the behaviors of stdlib’s dataclass. Some libraries deviate from dataclass behaviors in subtle ways. It appears that you’ve discovered one such example in sqlalchemy’s implementation.

The dataclass_transform mechanism allows for some degree of customization (e.g. the kw_only_default and frozen_default parameters), but these are limited to a set of predefined knobs that we collectively decided were important to support at the time dataclass_transform was initially spec’ed.

The dataclass_transform call is extensible, so we can add more customization knobs in the future if there is consensus on the need for such knobs.