When working with deeply nested data structures, @dataclass requires each level to be defined separately and wired together manually:
    from dataclasses import dataclass, field

    @dataclass
    class GrandChild:
        grandchild_str: str = "grandchild1"
        grandchild_num: int = 1

    @dataclass
    class Child:
        # Each nested level needs its own default_factory wiring.
        grandchild: GrandChild = field(default_factory=GrandChild)
        child_str: str = "child"

    @dataclass
    class Parent:
        child: Child = field(default_factory=Child)
        parent_str: str = "parent"
The nesting structure is only implicit — you have to read all three classes to understand the shape of the data. The field(default_factory=...) wiring is pure boilerplate.
PEP 712 proposed field(converter=...), which would help at the field level, but the structural verbosity would remain.
This gap became concrete while working on fargv, an argument parser that prefers dataclass definitions. Nested subcommands have no ergonomic definition syntax in vanilla dataclasses: the hierarchy is real, but expressing it readably requires either significant boilerplate or giving up on dataclasses altogether.
I’ve been experimenting with a decorator that lets you express the same hierarchy inline:
    @deep_dataclass
    class Parent:
        class child:
            class grandchild:
                grandchild_str: str = "grandchild1"
                grandchild_num: int = 1
            child_str: str = "child"
        parent_str: str = "parent"
The decorator recursively converts nested class blocks into proper @dataclass types and wires field(default_factory=...) automatically. The result is fully compatible with asdict(), ==, and all other stdlib dataclass tooling.
Notably, the motivation here is readability and expressiveness — not serialization. Libraries like dacite and pydantic solve dict-to-dataclass coercion well, but the problem of defining a nested hierarchy cleanly is separate and, I think, underserved.
I’ve published an early version on PyPI as deep-dataclasses, and the source is on GitHub at anguelos/deep_dataclasses (a decorator to create nested dataclasses from nested class definitions).
Questions I’d like community input on:
- Is the nested class syntax a natural fit, or does it feel like it takes too many liberties with class definition conventions?
- Is there a better way to express nested hierarchies that I’m missing?
- Is there appetite for something like this in the stdlib, perhaps as an addition to the dataclasses module?
Happy to discuss tradeoffs — this is early and I’m genuinely uncertain about the right direction.