Dataclasses.asdict - transformation of dict-fields - change type(obj) to dict directly

cschroeer · March 5, 2023, 2:25pm

Hello all,

I am referring to the current implementation of the public function asdict in the dataclasses module, which converts a dataclass instance into a dictionary.

Sometimes a dataclass itself has a dictionary as a field.
The current implementation handles such fields as follows (see cpython/dataclasses.py at 0a7936a38f0bab1619ee9fe257880a51c9d839d5 · python/cpython · GitHub):

...
    elif isinstance(obj, dict):
        return type(obj)((_asdict_inner(k, dict_factory),
                          _asdict_inner(v, dict_factory))
                         for k, v in obj.items())
...

Since it is already clear that obj is a dict, I would prefer to change this branch to

...
    elif isinstance(obj, dict):
        return dict((_asdict_inner(k, dict_factory),
                     _asdict_inner(v, dict_factory))
                    for k, v in obj.items())
...

Otherwise, when dataclasses are used together with SQLAlchemy, the current implementation of asdict raises an error, as already reported here:

github.com/sqlalchemy/sqlalchemy

TypeError _MKeyfuncMapped.__init__() when calling type() on mapped dict-field in a dataclass

opened 05:34PM - 03 Mar 23 UTC

cschroeer

bug orm awaiting info dataclasses

### Describe the bug

Having a dataclass like

```
@dataclass
class DataclassParent:
    id: UUID = field(default_factory=uuid7, init=False)
    name: str
    childs: dict[str, DataclassChild] = field(default_factory=dict, init=False)
```

and the corresponding child class as

```
@dataclass
class DataclassChild:
    id: UUID = field(default_factory=uuid7, init=False)
    name: str
```

with an imperative mapping like:

```
...
mapper_registry.map_imperatively(
    DataclassParent,
    dataclass_parent_table,
    properties={
        "childs": relationship(
            DataclassChild,
            collection_class=attribute_keyed_dict("name"),
        )
    },
...
```

leads to the following behavior:

Without mapping (only dataclasses), the following code works:

```
childs = {"child1": DataclassChild("child1"), "child2": DataclassChild("child2")}
parent = DataclassParent("parent1")
parent.childs = childs
type(parent.childs)(((k), (v)) for k, v in parent.childs.items())
```

However, with the SQLAlchemy mapping, the following code leads to an error:

```
map_entities()
childs = {"child1": DataclassChild("child1"), "child2": DataclassChild("child2")}
parent = DataclassParent("parent1")
parent.childs = childs
type(parent.childs)(((k), (v)) for k, v in parent.childs.items())
```

The following code works:

```
map_entities()
childs = {"child1": DataclassChild("child1"), "child2": DataclassChild("child2")}
parent = DataclassParent("parent1")
parent.childs = childs
dict(((k), (v)) for k, v in parent.childs.items())
```

No database connection is needed to reproduce this error.

The code

```
type(parent.childs)(((k), (v)) for k, v in parent.childs.items())
```

is used, e.g., in the asdict function of dataclasses (see https://github.com/python/cpython/blob/0a7936a38f0bab1619ee9fe257880a51c9d839d5/Lib/dataclasses.py#L1388).

### Optional link from https://docs.sqlalchemy.org which documents the behavior that is expected

https://docs.sqlalchemy.org/en/20/orm/collection_api.html

### SQLAlchemy Version in Use

2.0.4

### DBAPI (i.e. the database driver)

n.a.

### Database Vendor and Major Version

n.a.

### Python Version

3.10

### Operating system

Linux

### To Reproduce

```python
see. https://github.com/cschroeer/dataclasses-asdict-core
```

### Error

```
TypeError: _mapped_collection_cls.<locals>._MKeyfuncMapped.__init__() takes 1 positional argument but 2 were given
```

### Additional context

_No response_

What do you think?

Thanks and best,
Christoph

DavidCEllis · March 5, 2023, 4:21pm

It is not clear that the object is exactly a dict; the isinstance check also matches subclasses of dict.

>>> class MyDict(dict): ...
...
>>> isinstance(MyDict(), dict)
True

This change would alter the behaviour of asdict for any code that relies on dict subclasses being returned as the same subclass.

In the example below, asdict_altered is a copy of asdict with this change applied.


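from dataclasses import asdict, dataclass, field

# asdict_altered (not shown) is a copy of asdict with the proposed change applied.

class MyDict(dict):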
    """Assume this does something useful"""
    def __repr__(self):
        original_repr = super().__repr__()
        return f"MyDict({original_repr})"

@dataclass
class X:
    x: MyDict = field(default_factory=MyDict)


inst = X()
inst.x["Key"] = "value"

converted = asdict(inst, dict_factory=MyDict)
new_converted = asdict_altered(inst, dict_factory=MyDict)

print(f"{converted=}")
print(f"{new_converted=}")

Output:

converted=MyDict({'x': MyDict({'Key': 'value'})})
new_converted=MyDict({'x': {'Key': 'value'}})

In this case MyDict doesn’t do anything useful, but if it did, the change could break code that relies on asdict returning the subclass.
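
For anyone who wants to reproduce the comparison above, here is a minimal sketch of what asdict_altered could look like. It is not the stdlib implementation: the function body is a simplified, hypothetical reimplementation written only for illustration (for example, it does not special-case namedtuples), and the only intended difference from asdict is that the dict branch always builds a plain dict.

```python
import copy
from dataclasses import asdict, dataclass, field, fields, is_dataclass


def asdict_altered(obj, *, dict_factory=dict):
    """Hypothetical asdict variant: dict subclasses are rebuilt as plain dicts."""
    if is_dataclass(obj) and not isinstance(obj, type):
        # Dataclass instances are converted field by field via dict_factory.
        return dict_factory(
            [(f.name, asdict_altered(getattr(obj, f.name), dict_factory=dict_factory))
             for f in fields(obj)]
        )
    if isinstance(obj, (list, tuple)):
        # Simplification: namedtuples are not handled specially in this sketch.
        return type(obj)(asdict_altered(v, dict_factory=dict_factory) for v in obj)
    if isinstance(obj, dict):
        # The proposed change: dict(...) instead of type(obj)(...).
        return dict(
            (asdict_altered(k, dict_factory=dict_factory),
             asdict_altered(v, dict_factory=dict_factory))
            for k, v in obj.items()
        )
    return copy.deepcopy(obj)


class MyDict(dict):
    """Assume this does something useful"""
    def __repr__(self):
        original_repr = super().__repr__()
        return f"MyDict({original_repr})"


@dataclass
class X:
    x: MyDict = field(default_factory=MyDict)


inst = X()
inst.x["Key"] = "value"

converted = asdict(inst, dict_factory=MyDict)
new_converted = asdict_altered(inst, dict_factory=MyDict)

print(f"{converted=}")
print(f"{new_converted=}")
```

The outer container still comes from dict_factory in both cases; only the nested dict field loses its subclass in the altered version.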

cschroeer · March 5, 2023, 6:19pm

OK, understood. Thanks for the clarification, that makes sense.

However, after adding an __init__ method to the subclass, asdict behaves as follows.

Reproduce the error

from dataclasses import field, asdict, dataclass


class MyDict(dict):
    """Assume this does something useful"""

    def __init__(self):
        pass

    def __repr__(self):
        original_repr = super().__repr__()
        return f"MyDict({original_repr})"

@dataclass
class X:
    x: MyDict = field(default_factory=MyDict)


inst = X()
inst.x["Key"] = "value"

converted = asdict(inst, dict_factory=MyDict)

print(f"{converted=}")

Error

Traceback (most recent call last):
  File "/home/cschroeer/python-projects/dataclasses-asdict-core/mydict.py", line 22, in <module>
    converted = asdict(inst, dict_factory=MyDict)
  File "/usr/local/lib/python3.10/dataclasses.py", line 1238, in asdict
    return _asdict_inner(obj, dict_factory)
  File "/usr/local/lib/python3.10/dataclasses.py", line 1245, in _asdict_inner
    value = _asdict_inner(getattr(obj, f.name), dict_factory)
  File "/usr/local/lib/python3.10/dataclasses.py", line 1275, in _asdict_inner
    return type(obj)((_asdict_inner(k, dict_factory),
TypeError: MyDict.__init__() takes 1 positional argument but 2 were given

Is this also expected?

DavidCEllis · March 5, 2023, 6:28pm

Well, yes: you’ve changed the __init__, which is what’s being called by type(obj)(...). This usage requires the subclass to have the same signature as dict (or at least a compatible one). This MyDict would also fail as the argument to dict_factory even with the proposed change, since dict_factory is likewise called with a single positional argument.
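
As a rough illustration of what a compatible signature could look like, the sketch below forwards all arguments to dict.__init__ before doing its own setup. The created_by_init attribute is a made-up placeholder for whatever extra work the subclass needs; it is not part of any real API.

```python
from dataclasses import asdict, dataclass, field


class MyDict(dict):
    """dict subclass whose __init__ stays compatible with dict's signature."""

    def __init__(self, *args, **kwargs):
        # Forward everything so MyDict(iterable_of_pairs) keeps working,
        # which is how asdict rebuilds the field and calls dict_factory.
        super().__init__(*args, **kwargs)
        self.created_by_init = True  # hypothetical extra setup

    def __repr__(self):
        original_repr = super().__repr__()
        return f"MyDict({original_repr})"


@dataclass
class X:
    x: MyDict = field(default_factory=MyDict)


inst = X()
inst.x["Key"] = "value"

# Works both for the nested field and as dict_factory.
converted = asdict(inst, dict_factory=MyDict)
print(f"{converted=}")
```

With this signature, both type(obj)(...) inside asdict and the dict_factory call receive an object that accepts the pairs they pass in.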

cschroeer · March 6, 2023, 9:13am

Thanks for the clarification.