Generalize replace() function

It is a common task to create a new object based on an existing object, but with some attributes changed. dataclasses.replace() provides this feature for dataclasses, named tuples have the _replace() method, and some concrete classes (date, time, datetime, inspect.Signature, inspect.Parameter, code objects) have the replace() method.
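For reference, here is a small sketch of the three existing spellings mentioned above. The Point and Pair classes are hypothetical and exist only for illustration.

```python
from collections import namedtuple
from dataclasses import dataclass, replace
from datetime import date


@dataclass
class Point:  # hypothetical dataclass
    x: int
    y: int


Pair = namedtuple("Pair", ["a", "b"])  # hypothetical named tuple

# Free function for dataclasses.
moved = replace(Point(1, 2), y=5)

# Underscore-prefixed method for named tuples.
swapped = Pair(1, 2)._replace(b=5)

# Plain method for date (time and datetime work the same way).
first_of_month = date(2020, 1, 31).replace(day=1)
```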

I propose to make dataclasses.replace() extensible, so that it works with all these classes and with user classes which support the corresponding protocol. It should simply call the __replace__() method. All classes mentioned above should provide this method (as an alias of the existing _replace() or replace() method), and user classes can implement it as well. Good candidates for adding the __replace__() method are SimpleNamespace and AttrDict (if we keep the latter).
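A minimal sketch of what such a protocol could look like, not the actual implementation. It assumes the method is looked up on the type, as is usual for special methods; the Point class and the error message are made up for illustration.

```python
def replace(obj, /, **changes):
    """Sketch of a generic replace(): delegate to the type's __replace__()."""
    cls = type(obj)
    # Assumption: look the method up on the class, not on the instance.
    func = getattr(cls, "__replace__", None)
    if func is None:
        raise TypeError(f"replace() does not support {cls.__name__} objects")
    return func(obj, **changes)


class Point:  # hypothetical user class opting in to the protocol
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __replace__(self, **changes):
        # Start from the current values, then override the requested ones.
        current = {"x": self.x, "y": self.y}
        current.update(changes)
        return type(self)(**current)


p = Point(1, 2)
q = replace(p, y=5)  # p is left untouched, q is a new Point
```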

The advantage of using replace(obj) over obj.replace() is that the method name does not conflict with an attribute name (especially important for dataclasses, named tuples and SimpleNamespace).

The advantage of using replace(obj) over obj._replace() is that the latter looks like a use of non-public API.
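To illustrate the name conflict, here is a sketch with a hypothetical dataclass that has a field called replace. A replace() method could not coexist with that field, while the free function is unaffected.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Edit:  # hypothetical class with a field that happens to be named "replace"
    pattern: str
    replace: str


e = Edit("colour", "color")
# The free function still works; the keyword names the field to change.
e2 = replace(e, replace="COLOR")
```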

Another advantage of the free function is the general object-oriented design principle that free functions should be preferred over methods when possible.

I really like this idea. Is the proposal to keep this in dataclasses? I feel like it would be nice to have somewhere more generic, but I can't think of anywhere more appropriate.

Yes, it is perhaps not the best place for such a function, but what would be better? At least dataclasses.replace() already exists.

Maybe the types module?

The types module contains:

  1. A number of built-in types which are not exposed in the builtins module.
  2. Some functions related to building new types.

replace() is not related to types.

Fair enough. I suppose it mostly applies to dataclass-like classes, in any case, as your internal fields-to-replace have to pretty closely map to your __init__ arguments for such a generic function to be useful.

If a __replace__ special method gets traction (and I very much like the idea), the replace call would (and could) naturally be a built-in function.

Although there are some functions that use dunders and are not built-ins, like copy.copy and deepcopy, and the whole pickle protocol, so it is not a given.

Since it looks like the default behavior for .__replace__ is to create a new shallow copy with the requested replacements, maybe the copy module itself could be a coherent place for it.

My two cents are that if this were added, copy seems a reasonable location since the method promises to return a new object, so it's basically a copy-with-replacing. Relatedly, are the implementations/protocol contract supposed to return a shallow or deep copy? Or should the API have an option for either?

I would expect it to be the same as calling __init__() with some identical values and some alternative values. If __init__ makes a (deep)copy, then __replace__ will make a (deep)copy. Any deviations from that I would want documented.

And if I wanted a deepcopy, regardless, I would deepcopy(replace(obj, **kwargs)) or replace(deepcopy(obj), **kwargs).
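A small sketch of that point, using today's dataclasses.replace() and a hypothetical Config class. Since replace() just passes the existing field values back to __init__(), mutable values are shared unless a deepcopy is added explicitly.

```python
import copy
from dataclasses import dataclass, field, replace


@dataclass
class Config:  # hypothetical class with a mutable field
    name: str
    tags: list = field(default_factory=list)


a = Config("a", ["x"])

# Plain replace(): the new object reuses the very same list.
b = replace(a, name="b")
shared = b.tags is a.tags

# Composing with deepcopy gives an independent list.
c = copy.deepcopy(replace(a, name="c"))
independent = c.tags is not a.tags
```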

Absorbing the replace() functionality into copy(obj, attr=newvalue) and deepcopy(obj, attr=newvalue) is not the worst interface either.

Yes, at first glance the copy module looks like the best candidate. I thought about this. And it would be nice to support a wider class of objects in replace() by falling back to copy() or the pickle protocol. But there are differences between copy() and replace() which make this difficult.

  1. copy() supports immutable objects. Setting attributes on the copy will fail later, but with the wrong exception. Some of these objects could be supported using the pickle protocol, but __copy__() and the global registry have priority.
  2. copy() treats classes and functions as atomic objects and returns the argument. Most attributes of Python classes and functions are mutable. Changing them will affect the original object.
  3. By default (when the pickle protocol is used) copy() bypasses __init__(). For replace() we usually want to call __init__() to set calculated attributes which depend on the specified attributes.
  4. By default copy() sets all attributes, including internal attributes which should not be specified by the user and should not be shared between instances (in dataclasses they are defined as fields with init=False).
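Points 2 to 4 can be sketched with today's copy.copy() and dataclasses.replace(). The Rect class is hypothetical; its area field is calculated in __post_init__() and declared with init=False.

```python
import copy
from dataclasses import dataclass, field, replace


@dataclass
class Rect:  # hypothetical class with a calculated, init=False field
    width: int
    height: int
    area: int = field(init=False)

    def __post_init__(self):
        self.area = self.width * self.height


r = Rect(2, 3)

# copy() does not go through __init__(), so changing an attribute
# afterwards leaves the calculated field stale.
c = copy.copy(r)
c.width = 10
stale_area = c.area

# replace() calls __init__(), so the calculated field is recomputed.
r2 = replace(r, width=10)
fresh_area = r2.area

# Classes are treated as atomic: copy() returns the argument itself.
same_class = copy.copy(Rect) is Rect
```

Passing the init=False field explicitly, as in replace(r, area=1), raises ValueError in dataclasses.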

The behavior of copy(obj) and replace(obj) would be too different to merge them into one function, and perhaps too different to have them in the same module. It is possible to add support for more general objects in replace(), but it would either be limited to a very narrow class of objects, or work incorrectly in many cases.

I think replace() should be its own function and not merged into copy(), but the copy module seems like a fine place for it: you are, essentially, copying an object, and then making a change to it.

I also think a replace that worked on immutable objects via __init__ could be useful when trying to write in a somewhat functional style. Imagine I had a list of namedtuples and I wanted to blank out a field. I could write [replace(t, big_secret="") for t in my_list] to map them.

edit: of course, I didn't use the proposed form :sweat_smile:
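A runnable sketch of that mapping with what exists today, the _replace() method of named tuples. The User type and its sample data are hypothetical; under the proposal the comprehension would call a generic replace(t, big_secret="") instead.

```python
from collections import namedtuple

User = namedtuple("User", ["name", "big_secret"])  # hypothetical record type

my_list = [User("ann", "hunter2"), User("bob", "swordfish")]

# Build new tuples with the field blanked out; the originals are unchanged.
redacted = [t._replace(big_secret="") for t in my_list]
```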

Maybe a pro, maybe a con, but right now I don't see mypy or pyright typechecking dataclasses.replace semantically. (opened an issue for pyright)

Perhaps __replace__ would make that easier? Hopefully not harder :sweat_smile:

+1 overall though, and +1 to copy.replace or similar.

I can add this to attrs if we decide to do it.

It might be more convincing if there were a type checking operator to convert the dataclass fields to a TypedDict. Then, the annotation for replace would be:

def replace[T: DataclassInstance](x: T,
                                  /,
                                  **kwargs: Unpack[DataclassFields[T]]) -> T:
    ...

I'm not sure about the more general replace requested here. Are all non-method attributes replaceable? If so, maybe replace DataclassFields with NonMethodAttributes in the above?

I guess that would need some special operator to make all fields optional?

Thank you all. Based on the results of the discussion, the new function will be added to the copy module.

Issue and PR:

That's awesome!!

Will the typeshed have an appropriate base class? Maybe something like:

class SupportsReplace[**P]:
    def __replace__(self, **kwargs: P.kwargs) -> Self:
        ...

# copy.replace:
def replace[T: SupportsReplace[P], **P](obj: T, /, **changes: P.kwargs) -> T:
    ...

I'm not sure if the latter annotation is allowed?
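Leaving that question open, a looser alternative that certainly runs today is a structural typing.Protocol. This is only a sketch and not what typeshed does: the SupportsReplace name is reused from the post above, the Version class is hypothetical, and the per-field keyword checking is given up in favor of Any.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class SupportsReplace(Protocol):
    """Hypothetical protocol: any object with a __replace__() method."""

    def __replace__(self, **changes: Any) -> Any:
        ...


class Version:  # hypothetical user class implementing the method
    def __init__(self, major: int, minor: int) -> None:
        self.major = major
        self.minor = minor

    def __replace__(self, **changes: Any) -> "Version":
        fields = {"major": self.major, "minor": self.minor}
        fields.update(changes)
        return Version(**fields)


# The runtime check only verifies that the method is present.
supported = isinstance(Version(1, 0), SupportsReplace)
unsupported = isinstance(object(), SupportsReplace)
```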