Add converter to dataclass.field

thejcannon · January 19, 2023, 9:20pm

This already exists in attrs, and has great benefits for type-narrowing code (e.g. converter=tuple has an __init__ accepting any iterable, but the field type is tuple).

My own use case is one where we can’t use third-party packages, so attrs isn’t an option.

Additionally, this would help close some of the gap between type checkers and attrs, as even with dataclass_transform this isn’t supported.

This also helps free users from attempting to roll their own (possibly incorrect) __init__. E.g. did you get the right type for the unary arg to dict’s constructor for all your dict fields?
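To make the hand-rolled alternative concrete, here is a minimal sketch of what users currently write in place of a converter, using only the stdlib. The class name `Target` and its fields are hypothetical, chosen for illustration. Note that the `env` parameter annotation is already a simplification: dict's constructor also accepts an iterable of key/value pairs, which is exactly the kind of detail that is easy to get wrong.

```python
from dataclasses import dataclass
from typing import Iterable, Mapping, Tuple


@dataclass(frozen=True, init=False)  # init=False keeps our own __init__
class Target:
    # Hypothetical example class; field names are illustrative only.
    sources: Tuple[str, ...]
    env: Mapping[str, str]

    def __init__(self, sources: Iterable[str], env: Mapping[str, str]) -> None:
        # frozen=True blocks normal assignment, so go through object.__setattr__.
        object.__setattr__(self, "sources", tuple(sources))
        # dict() also accepts an iterable of pairs; this annotation ignores that.
        object.__setattr__(self, "env", dict(env))


t = Target(sources=["a.py", "b.py"], env={"X": "1"})
```

With a converter parameter on the field, the whole `__init__` above would presumably collapse to two field declarations.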

(I floated the idea on typing-sig to add this only to dataclass_transform and they suggested just adding it to the stdlib)

tmk · January 20, 2023, 9:45am

In the rejected ideas section of the dataclass_transform PEP it says this regarding the converter field:

This is tricky to support since the parameter type in the synthesized __init__ method needs to accept unconverted values, but the resulting field is typed according to the output of the converter.

[…]

There may be no good way to support this because there’s not enough information to derive the type of the input parameter. One possible solution would be to add support for a converter field specifier parameter but then use the Any type for the corresponding parameter in the __init__ method.

Is this still an issue? I’m actually a bit confused by this section, because it seems to work fine in attrs?

While we’re adding features from attrs to the stdlib, I would really like to have the alias field that was introduced in attrs 22.2.0 (and which is also supported by dataclass_transform!)

>>> import inspect
>>> from attrs import define, field
>>> @define
... class C:
...    _x: int = field(alias="x")
>>> inspect.signature(C.__init__)
<Signature (self, x: int) -> None>

You could then emulate the converter functionality like this:

from functools import cached_property
from attrs import define, field

@define(slots=False)
class C:
    _x: str = field(alias="x")

    @cached_property
    def x(self) -> int:
        return int(self._x)  # conversion

but this is of course much more verbose.

thejcannon · January 20, 2023, 2:58pm

Is this still an issue? …

I don’t think so. Jelle Zijlstra also didn’t think so on typing-sig. And it makes sense. The converter’s output type should be the field type, and the unary input becomes the __init__ param type.
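As a rough runtime illustration of that rule (not how any type checker actually implements it), the two types can be read off an annotated converter. The helper `converter_types` and the converter `to_names` are hypothetical names introduced here. This only works for converters that carry annotations; a builtin such as `tuple` has none at runtime, and a type checker would consult its stubs instead.

```python
import inspect
from typing import Any, Callable, Iterable, Tuple, get_type_hints


def to_names(values: Iterable[str]) -> Tuple[str, ...]:
    # Hypothetical converter: any iterable of str in, tuple of str out.
    return tuple(values)


def converter_types(converter: Callable[[Any], Any]) -> Tuple[Any, Any]:
    """Return (init_param_type, field_type) for an annotated unary converter."""
    hints = get_type_hints(converter)
    # The first (and only required) parameter is the unary input.
    first_param = next(iter(inspect.signature(converter).parameters))
    return hints[first_param], hints["return"]


param_type, field_type = converter_types(to_names)
```

Here `param_type` is what the synthesized `__init__` would accept, and `field_type` is what the attribute would be annotated as.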

My own specific use case here is a library that requires truly immutable dataclasses, so we use frozen=True dataclasses with immutable field types (like tuple). In that regard, the cached_property way of doing things is a double whammy.

I’d prefer to keep the resulting PEP focused on just converter.

Dutcho · January 21, 2023, 7:15am

I tried conversion in __post_init__, but:

- the field annotation then needs to be the Union of the pre- and post-conversion types, which confuses human readers and type checkers
- assigning the converted value in __post_init__ conflicts with frozen=True

This idea would solve both, so I’d be happy to see it implemented.
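Both problems can be reproduced with the stdlib alone. The sketch below uses hypothetical class names (`Naive`, `Workaround`): the first shows the plain assignment failing on a frozen dataclass, the second shows the usual `object.__setattr__` escape hatch, which works but still leaves the Union annotation in place.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Iterable, Tuple, Union


@dataclass(frozen=True)
class Naive:
    # The annotation has to cover both the input and the converted type.
    items: Union[Iterable[int], Tuple[int, ...]]

    def __post_init__(self) -> None:
        # Plain assignment is rejected on a frozen dataclass.
        self.items = tuple(self.items)


try:
    Naive([1, 2, 3])
    raised = False
except FrozenInstanceError:
    raised = True


@dataclass(frozen=True)
class Workaround:
    items: Union[Iterable[int], Tuple[int, ...]]

    def __post_init__(self) -> None:
        # Bypasses the frozen check; the annotation is still a Union.
        object.__setattr__(self, "items", tuple(self.items))


w = Workaround([1, 2, 3])
```

After construction `w.items` is always a tuple, but a type checker still sees the wider Union on every read of the attribute.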