In which case what is wrong with using an __init__ method for that?
Taking the example from the PEP:
def str_or_none(x: Any) -> str | None:
    return str(x) if x is not None else None

@dataclasses.dataclass
class InventoryItem:
    id: int = dataclasses.field(converter=int)
    skus: tuple[int, ...] = dataclasses.field(converter=tuple[int, ...])
    vendor: str | None = dataclasses.field(converter=str_or_none)
    names: tuple[str, ...] = dataclasses.field(
        converter=lambda names: tuple(map(str.lower, names))
    )
    stock_image_path: pathlib.PurePosixPath = dataclasses.field(
        converter=pathlib.PurePosixPath, default="assets/unknown.png"
    )
    shelves: tuple = dataclasses.field(
        converter=tuple, default_factory=list
    )
With __init__ that would be:

@dataclasses.dataclass
class InventoryItem:
    id: int
    skus: tuple[int, ...]
    vendor: str | None
    names: tuple[str, ...]
    stock_image_path: pathlib.PurePosixPath
    shelves: tuple

    def __init__(
        self,
        id: int | str,
        skus: Iterable[int | str],
        vendor: Vendor | None,
        names: Iterable[str],
        stock_image_path: str | pathlib.PurePosixPath = "assets/unknown.png",
        shelves: Iterable = (),
    ):
        self.id = int(id)
        self.skus = tuple(map(int, skus))
        self.vendor = str(vendor) if vendor is not None else None
        self.names = tuple(map(str.lower, names))
        self.stock_image_path = pathlib.PurePosixPath(stock_image_path)
        self.shelves = tuple(shelves)
Some might consider this boilerplate, but I don’t, because nothing here is really redundant: not the field types, not the signature of __init__ with its parameter types and defaults, and not the code in the body of __init__. The field names are repeated a few times, but no line of code is redundant. If there were no converters there would be redundancy, because the parameter types in the __init__ signature would match the field types and each line of the body would just be self.x = x. Without converters the __init__ method looks like redundant boilerplate, but as soon as you want actual code in __init__ it is not boilerplate any more.
The __init__ example has a few more lines of code, but that comes from including types in the __init__ signature. It might seem that the parameter types for __init__ are redundant, but they are not: the parameter of str_or_none is typed as Any, yet that does not necessarily mean you would want to accept Any for the vendor parameter of the InventoryItem constructor. I have guessed here that the type should be Vendor | None, but in the original code it is unclear what it is supposed to be.
I don’t think that making something that should usually be code in an __init__ method look declarative makes anything easier to understand or easier to write. It is better to put the code in an __init__ method, all in one place, rather than writing auxiliary functions like str_or_none and noun-ifying simple code into “converters” and “default factories”. It is definitely easier to understand the signature of __init__ if you can see the __init__ method rather than scanning through default factories and converter functions, and it is also easier to understand what actually executes in the constructor if you can see the body of __init__. The fact that behind the scenes the dataclass decorator will textually build the code for this __init__ method is a clear sign that maybe what you should be doing is just writing an __init__ method.
What does not quite work with __init__ is frozen dataclasses. It does not seem possible to use either __init__ or __new__ with a frozen dataclass without resorting to object.__setattr__, which is awkward. You can add an alternate classmethod constructor like InventoryItem.new(...), but then it cannot be used with the ordinary InventoryItem(...) syntax. Maybe there is a way to improve defining conversions or validation for frozen dataclasses.
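For completeness, the awkward workaround looks roughly like this (a sketch, not a recommendation; the class and field names are mine):

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class FrozenItem:
    id: int
    names: tuple[str, ...]

    def __init__(self, id, names=()):
        # A frozen dataclass raises FrozenInstanceError on normal
        # attribute assignment, even inside __init__, so conversions
        # have to bypass __setattr__ via object.__setattr__.
        object.__setattr__(self, "id", int(id))
        object.__setattr__(self, "names", tuple(names))


item = FrozenItem("3", ["a", "b"])
print(item)  # FrozenItem(id=3, names=('a', 'b'))
```

Assignment after construction still fails as expected: item.id = 5 raises dataclasses.FrozenInstanceError.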