Why did the developers add deepcopy to asdict, but did not add it to _field_init (for safer creation of default values via default_factory)?

from typing import List
from dataclasses import dataclass, field, asdict

@dataclass
class Viewer:
    Name: str

Default_users = [Viewer('Admin')]

@dataclass
class Films:
    Title: str
    # The factory returns the same shared list object on every call.
    Viewers: List[Viewer] = field(default_factory=lambda: Default_users)

film_1 = Films('Satan-tango')
film_2 = Films('Mirror')
film_1_dict = asdict(film_1)  # asdict deep-copies the field values
film_2_dict = asdict(film_2)
film_1.Viewers.append(Viewer('Andrey'))         # mutates the shared list
film_1_dict['Viewers'].append(Viewer('Guido'))  # mutates only the copy held by film_1_dict
assert film_1 == Films(Title='Satan-tango', Viewers=[Viewer(Name='Admin'), Viewer(Name='Andrey')])
assert film_2 == Films(Title='Mirror', Viewers=[Viewer(Name='Admin'), Viewer(Name='Andrey')])
assert film_1_dict == {'Title': 'Satan-tango', 'Viewers': [{'Name': 'Admin'}, Viewer(Name='Guido')]}
assert film_2_dict == {'Title': 'Mirror', 'Viewers': [{'Name': 'Admin'}]}

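The viewer Andrey is added to both films, because both instances share the same list. The dicts produced by asdict are independent copies, so everything is correct in the dict version.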

If the corresponding line in _field_init is replaced with globals[default_name] = lambda: copy.deepcopy(f.default_factory()), then everything will be correct.

I think the current consensus is that we regret adding deepcopy to asdict; see python/cpython issue #88071, "dataclasses.astuple (and .asdict) do deepcopy on all fields".


The code example uses default_factory incorrectly; it is effectively the same as field(default=Default_users). The option was added so that a new value can be returned on each instantiation, which your code doesn’t do.
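To make the point concrete, here is a minimal sketch (reusing the class names from the question) that checks object identity. It only illustrates that the lambda hands back the same list every time; it is not a statement about how dataclasses is implemented internally.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Viewer:
    Name: str

Default_users = [Viewer('Admin')]

@dataclass
class Films:
    Title: str
    # The factory is called per instance, but it always returns Default_users itself.
    Viewers: List[Viewer] = field(default_factory=lambda: Default_users)

film_1 = Films('Satan-tango')
film_2 = Films('Mirror')

# True only if both instances and the module-level name refer to one list object.
shared = film_1.Viewers is film_2.Viewers is Default_users
```

Because all three names refer to one list, an append through any of them is visible through the others.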

I wasn’t involved with the design of this API, but one reason for not copying the return value of the default_factory callback is that this is not necessary in general (you can always do the copy yourself if that is convenient), and can be expensive when the default is calculated dynamically.
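As a sketch of "doing the copy yourself", assuming you really do want to start from a shared module-level template such as Default_users, the deepcopy can go inside the factory:

```python
import copy
from dataclasses import dataclass, field
from typing import List

@dataclass
class Viewer:
    Name: str

Default_users = [Viewer('Admin')]

@dataclass
class Films:
    Title: str
    # Each instance gets its own deep copy of the template list.
    Viewers: List[Viewer] = field(
        default_factory=lambda: copy.deepcopy(Default_users)
    )

film_1 = Films('Satan-tango')
film_2 = Films('Mirror')
film_1.Viewers.append(Viewer('Andrey'))  # affects film_1 only
```

The cost of the copy is then paid only by classes that opt in, which fits the reasoning above about copies being unnecessary in general.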


If you use field(default=Default_users), a ValueError is raised: "mutable default <class 'list'> for field Viewers is not allowed: use default_factory". One of the main purposes of default_factory is to avoid mutable default values (see the dataclasses documentation for Python 3.11.5). I deliberately built an unnatural example of sharing a mutable value via default_factory. If deepcopy were applied to the result of default_factory, such an example would be more difficult (or impossible) to build. I do agree, though, that in most cases default_factory is a safe API for creating default values.
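For reference, a small sketch that reproduces the error by catching it at class-definition time. The field type is simplified to List[str] here purely for brevity.

```python
from dataclasses import dataclass, field
from typing import List

error_message = None
try:
    @dataclass
    class Films:
        Title: str
        # A list passed as a plain default is rejected when the class is created.
        Viewers: List[str] = field(default=[])
except ValueError as exc:
    error_message = str(exc)
```

The check looks only at the default value itself, so it cannot detect a factory that returns a shared mutable object.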

For your example field(default_factory=lambda: [Viewer('Admin')]) is clearer and doesn’t have the problem you ran into.
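A minimal sketch of that suggestion, with the same class names as in the question. The list literal sits inside the lambda, so every call builds a new list:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Viewer:
    Name: str

@dataclass
class Films:
    Title: str
    # A new list (and a new Viewer) is created on every instantiation.
    Viewers: List[Viewer] = field(default_factory=lambda: [Viewer('Admin')])

film_1 = Films('Satan-tango')
film_2 = Films('Mirror')
film_1.Viewers.append(Viewer('Andrey'))  # film_2 is unaffected
```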

Automatically deep-copying the result of a default_factory callback would not be backward compatible and would introduce problems as described in the issue that @hauntsaninja linked to.
