Draft PEP - Adding "converter" dataclasses field specifier parameter

This toy class is not necessarily intended to take everything int can accept, even if some of it will work. If your intention is to explicitly accept everything int can then yes, that’s quite a long type signature.

Otherwise, in this implementation the input type is declared once, in the code you write, as the type of the argument in __post_init__. The type declared on the class is the instance variable type, which is no longer guaranteed to match the input type once conversion is involved. The __init__ method is still generated, but passes assignment off to __post_init__ for the fields declared in its signature. There’s a bit more to it than that, but I’ll note that I was not and am still not proposing this implementation.

The reason I have this __post_init__ implementation instead of converters is that I found myself writing lots of small helper functions to handle the special cases that a basic int or str wouldn’t know how to handle. Doing this kind of thing inline was cleaner for me to read than having converter=func arguments. Typing was not my primary concern.

from attrs import define, field

def str2int(x: int | str) -> int | None:
    # An empty string maps to None; anything else goes through int().
    return int(x) if x != "" else None

@define
class X:
    x: int | None = field(converter=str2int)

I agree with @DavidCEllis. Besides being counterintuitive, the fact that converters won’t process default values will make it impossible to safely use mutable objects as a default value. As an example, the following:

def to_list_or_empty(value: Iterable | None) -> Sequence[int]:
    if value is None:
        return []
    return list(value)

@dataclass
class SomeClass:
    sequence: Sequence[int] = Field(None, converter=to_list_or_empty)

Would not be possible, and:

  1. Many people would fall into that trap, since it is the intuitive way to work, and…
  2. Developers would have to call the converter at class initialization anyway, which is exactly what the enhancement is trying to prevent:

To work around this limitation, library authors/users are forced to choose
to:

  • Shuck conversion responsibility onto the caller of the dataclass
    constructor. This can make constructing certain dataclasses unnecessarily
    verbose and repetitive.
  • Provide a custom __init__ which declares “wider” parameter types and
    converts them when setting the appropriate attribute. This not only duplicates
    the typing annotations between the converter and __init__, but also opts
    the user out of many of the features dataclass provides.
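
For illustration, the second workaround in the quoted text might look like the sketch below with today's standard-library dataclasses. It reuses the SomeClass and to_list_or_empty names from the example above; passing init=False and giving the parameter a None default are assumptions made for this sketch, not something taken from the draft.

```python
from collections.abc import Iterable, Sequence
from dataclasses import dataclass
from typing import Optional


def to_list_or_empty(value: Optional[Iterable[int]]) -> Sequence[int]:
    # None becomes a fresh empty list; any other iterable is copied into a list.
    if value is None:
        return []
    return list(value)


# init=False tells dataclass not to generate __init__, so the
# hand-written one below is the only constructor.
@dataclass(init=False)
class SomeClass:
    # Type of the stored attribute (the "narrow" type).
    sequence: Sequence[int]

    # The parameter annotation is the "wider" input type. It has to be
    # written by hand and kept in sync with the converter's signature.
    def __init__(self, sequence: Optional[Iterable[int]] = None) -> None:
        self.sequence = to_list_or_empty(sequence)
```

The generated __repr__ and __eq__ still work here, but the class no longer benefits from the generated __init__, which is the cost the quoted passage describes.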

I’m not quite sure what point you are trying to make, but you don’t need mutable defaults; that’s what default_factory is for:

@dataclass
class SomeClass:
    sequence: Sequence[int] = field(default_factory=list)

The class author has the opportunity to use the converter to define an appropriate default value or an appropriate default factory. The converter would only need to be applied to values provided by the user of the class.
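
As a rough sketch of that idea using only what the standard library offers today, the class author can call the converter inside a default factory. The to_list_or_empty helper is taken from the earlier post; wrapping it in a lambda is an assumption of this sketch, and since dataclasses.field has no converter parameter at present, values passed by the caller are stored unconverted here.

```python
from collections.abc import Iterable, Sequence
from dataclasses import dataclass, field
from typing import Optional


def to_list_or_empty(value: Optional[Iterable[int]]) -> Sequence[int]:
    # None becomes a fresh empty list; any other iterable is copied into a list.
    if value is None:
        return []
    return list(value)


@dataclass
class SomeClass:
    # The factory is called once per instance, so each instance gets its
    # own list and no mutable default is shared between instances.
    sequence: Sequence[int] = field(
        default_factory=lambda: to_list_or_empty(None)
    )


first = SomeClass()
second = SomeClass()
```

Because the class author writes the default, it can be produced in its already-converted form, so nothing requires the converter to run on it a second time.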

I was trying to prove we needed to run the converter for the default value.
I was wrong hahah. I wasn’t aware that default_factory existed, or that there already is a Field class in the standard library.
Joke’s on me.

Discussion is moving to: PEP 712: Adding a "converter" parameter to dataclasses.field