Type checking subclasses of dataclass

I’m trying to extend dataclass. I want to create something that is a dataclass, but with some extra features like being able to calculate its own size and serializing itself.

import struct
from dataclasses import astuple, dataclass, fields
from typing import Annotated

def datastructclass(cls=None, **kwargs):
    """Make a struct-like dataclass.

    Raises
    ------
    TypeError
        If any `field` does not contain `struct` metadata.
    """
    dcls = dataclass(cls)
    fmt_strs = []

    for f in fields(dcls):
        try:
            f.type.__metadata__
        except AttributeError as exc:
            msg = f"Field '{f.name}' missing annotation metadata"
            raise TypeError(msg) from exc

        try:
            fmt_strs.append(
                next(
                    filter(lambda m: isinstance(m, struct.Struct), f.type.__metadata__)
                ).format
            )
        except StopIteration as exc:
            msg = f"Field '{f.name}' has no Struct metadata"
            raise TypeError(msg) from exc

    serializer = struct.Struct("".join(fmt_strs))

    def pack(self) -> bytes:
        return bytes(self)

    @classmethod
    def unpack(cls, buffer):
        return cls(*serializer.unpack(buffer))

    def __bytes__(self) -> bytes:
        return serializer.pack(*astuple(self))

    dcls.format = serializer.format
    dcls.size = serializer.size
    dcls.pack = pack
    dcls.unpack = unpack
    dcls.__bytes__ = __bytes__
    return dcls

This works fine.

import struct
from typing import Annotated

from datastructclass import datastructclass

@datastructclass
class A:
    a: Annotated[int, struct.Struct("H")] = 0

aaa = A.unpack(b"12")
print(aaa.a)
print(aaa.pack())
# Output:
# 12849
# b'12'

But mypy doesn’t like it:

error: "type[A]" has no attribute "unpack"  [attr-defined]

type[A] does, in fact, have an attribute unpack, but how can I convince mypy of this?

Unfortunately, this isn’t well supported by Python’s type system. What you’d need is an intersection type to communicate that the return value of datastructclass is compatible both with A and with some protocol that describes the added unpack/pack functionality. Intersection types currently can’t be expressed in type annotations (though some type checkers support them internally).
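To make the limitation concrete, here is a rough sketch. The names SupportsStruct and add_pack are made up for illustration, and the pack body is only a placeholder. A decorator annotated as type[T] -> type[T] keeps the decorated class's own attributes visible to a checker but says nothing about the methods it adds at runtime; what you would want is a return type meaning "T and also SupportsStruct", which can't be written today.

```python
from typing import Protocol, TypeVar, runtime_checkable

T = TypeVar("T")


@runtime_checkable
class SupportsStruct(Protocol):
    """Hypothetical protocol describing what the decorator adds."""

    def pack(self) -> bytes: ...


def add_pack(cls: type[T]) -> type[T]:
    """Hypothetical decorator: attaches pack() at runtime.

    The annotation promises "the same class back", so a type checker
    has no way of learning that pack() now exists.
    """

    def pack(self) -> bytes:
        return b""  # placeholder body, not real serialization

    setattr(cls, "pack", pack)
    return cls


@add_pack
class A:
    a: int = 0


obj = A()
# At runtime the protocol is satisfied, because the method really is there.
print(isinstance(obj, SupportsStruct))
# Statically, a checker only knows about A.a; a call to obj.pack() is
# where it would be expected to report a missing attribute.
```

Annotating the decorator as returning type[SupportsStruct] instead has the opposite problem: the checker then knows about pack but forgets the fields declared on A.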

As far as convincing mypy to make this work, you may have luck with implementing a mypy plugin for your package.

See also this related thread for more on adding attributes with a decorator and intersection types (edit: d’oh, I just noticed you were the OP of that thread as well :slight_smile:):

Another good solution is to not use a decorator and instead use the base-class style of dataclass transforms, using __init_subclass__. The only drawback is that you can’t use slots=True since the class is already fully created by then, but OTOH adding methods is not an issue at all and typing tools will pick them up correctly.

Have you tried using attrs for this instead of dataclasses?

Right, I’d forgotten about that. It seems I encounter this particular problem, or a variation of it, every now and then and bash my head against it for a while.

Oh, nice, I had actually never encountered __init_subclass__ before! I’ll have to look at it more closely, but that looks like it might do the trick.

I haven’t. I’ve never actually used attrs, though I have encountered it a few times. Thanks for the suggestion, I’ll have a look to see if that could be a solution to my problem.

Unless I’m misunderstanding your suggestion, __init_subclass__ turns out to not work, unfortunately. __init_subclass__ is called too early in the class construction machinery, before the dataclass decorator, which means that the subclass’ fields aren’t set yet. The fields contain the type metadata needed in the serializer.

from dataclasses import dataclass, fields

@dataclass
class FieldsPrinter:
    def __init_subclass__(cls):
        print(fields(cls))

@dataclass
class MyDataclass(FieldsPrinter):
    a: int

mdc = MyDataclass(0)
# Output:
# ()

I’ll look into attrs next.

Don’t use the dataclass decorator explicitly at all; call it from the __init_subclass__ method. And decorate your base class with dataclass_transform so that type checkers know about it.

That gets me closer, thanks!

import struct
from dataclasses import astuple, dataclass, fields
from typing import Annotated, ClassVar, dataclass_transform

@dataclass_transform()
class StructLike:
    _serializer: ClassVar[struct.Struct]

    def __init_subclass__(cls):
        dataclass(cls)
        fmt_strs = []
    
        for f in fields(cls):
            try:
                f.type.__metadata__
            except AttributeError as exc:
                msg = f"Field '{f.name}' missing annotation metadata"
                raise TypeError(msg) from exc
    
            try:
                fmt_strs.append(
                    next(
                        filter(lambda m: isinstance(m, struct.Struct), f.type.__metadata__)
                    ).format
                )
            except StopIteration as exc:
                msg = f"Field '{f.name}' has no Struct metadata"
                raise TypeError(msg) from exc
    
        cls._serializer = struct.Struct("".join(fmt_strs))

    def pack(self) -> bytes:
        return bytes(self)

    @classmethod
    def unpack(cls, buffer):
        return cls(*cls._serializer.unpack(buffer))

    def __bytes__(self) -> bytes:
        return self._serializer.pack(*astuple(self))

class A(StructLike):
    a: Annotated[int, struct.Struct("H")] = 0

aaa = A.unpack(b"12")
print(aaa.a)
print(aaa.pack())

Now mypy says:

error: No overload variant of "astuple" matches argument type "StructLike"  [call-overload]
note: Possible overload variants:
note:     def astuple(obj: DataclassInstance) -> tuple[Any, ...]
note:     def [_T] astuple(obj: DataclassInstance, *, tuple_factory: Callable[[list[Any]], _T]) -> _T

I tried this:

from __future__ import annotations
from typing import TYPE_CHECKING, dataclass_transform

if TYPE_CHECKING:
    from _typeshed import DataclassInstance

@dataclass_transform()
class StructLike:
    def __init_subclass__(cls: type[DataclassInstance]):
        ...

but this (specifically the from __future__ import annotations line) makes it fail at runtime:

TypeError: Field 'a' missing annotation metadata
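The likely cause: with from __future__ import annotations, every annotation is stored as a string, so f.type is a str and has no __metadata__. One possible way around it is typing.get_type_hints with include_extras=True, which evaluates the strings and keeps the Annotated metadata. A minimal sketch, using a plain @dataclass to keep it short:

```python
from __future__ import annotations

import struct
from dataclasses import dataclass, fields
from typing import Annotated, get_type_hints


@dataclass
class A:
    a: Annotated[int, struct.Struct("H")] = 0


# With postponed evaluation, dataclasses keeps the annotation as a string,
# so there is no __metadata__ attribute to read.
field_type = fields(A)[0].type
print(type(field_type))  # <class 'str'>

# get_type_hints() evaluates the string annotations; include_extras=True
# preserves the Annotated[...] wrapper instead of stripping it to int.
hints = get_type_hints(A, include_extras=True)
fmt = next(m.format for m in hints["a"].__metadata__ if isinstance(m, struct.Struct))
print(fmt)
```

Inside __init_subclass__ the same idea would mean looking up hints[f.name] instead of f.type. Note that get_type_hints has to be able to resolve every name used in the annotations, so anything imported only under TYPE_CHECKING would still be a problem.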

Well, I managed to satisfy mypy with this:

    def __bytes__(self) -> bytes:
+       assert is_dataclass(self)
+       assert isinstance(self, StructLike)
        return self._serializer.pack(*astuple(self))

Feels a little hacky. Would welcome alternatives.

You might be able to apply @dataclass to StructLike directly to shut mypy up.

Yep, that works. Thanks!

@dataclass
@dataclass_transform()
class StructLike:
    ...
# mypy: Success: no issues found in 1 source file