Generalize replace() function

storchaka · June 26, 2023, 9:42am

It is a common task to create a new object based on an existing object, but with some attributes changed. dataclasses.replace() provides this feature for dataclasses, named tuples have the _replace() method, and some concrete classes (date, time, datetime, inspect.Signature, inspect.Parameter, code objects) have a replace() method.
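For reference, a short sketch of three of the existing spellings mentioned above. The Point and Pair types are made up for illustration; the calls themselves are the current stdlib APIs.

```python
import dataclasses
from collections import namedtuple
from datetime import date


@dataclasses.dataclass(frozen=True)
class Point:  # hypothetical example class
    x: int
    y: int


Pair = namedtuple("Pair", ["left", "right"])  # hypothetical example type

# Dataclasses: a module-level function.
p2 = dataclasses.replace(Point(1, 2), y=5)

# Named tuples: an underscore-prefixed (but documented) method.
pair2 = Pair("a", "b")._replace(right="z")

# date: a plain replace() method.
d2 = date(2023, 6, 26).replace(day=27)

print(p2)     # Point(x=1, y=5)
print(pair2)  # Pair(left='a', right='z')
print(d2)     # 2023-06-27
```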

I propose to make dataclasses.replace() extensible so that it works with all these classes and with user classes which support the corresponding protocol. It should simply call the __replace__() method. All classes mentioned above should provide this method (as an alias of the existing _replace() or replace() method), and user classes can implement it as well. Good candidates for adding the __replace__() method are SimpleNamespace and AttrDict (if we keep the latter).
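A minimal sketch of how such a protocol could look, assuming the generic function does nothing more than delegate to the type's __replace__(). The replace() helper and the Version class below are hypothetical illustrations, not the stdlib implementation.

```python
def replace(obj, /, **changes):
    """Hypothetical generic helper: delegate to the type's __replace__()."""
    cls = type(obj)
    func = getattr(cls, "__replace__", None)
    if func is None:
        raise TypeError(f"replace() does not support {cls.__name__} objects")
    return func(obj, **changes)


class Version:  # hypothetical user class opting in to the protocol
    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def __replace__(self, **changes):
        # Start from the current values, then apply the overrides.
        values = {"major": self.major, "minor": self.minor}
        values.update(changes)
        return type(self)(**values)


v1 = Version(3, 12)
v2 = replace(v1, minor=13)
print(v2.major, v2.minor)  # 3 13
```

The method is looked up on the type rather than on the instance, which is how other dunder-based protocols usually behave.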

The advantage of using replace(obj) over obj.replace() is that the method name does not conflict with an attribute name (especially important for dataclasses, named tuples and SimpleNamespace).

The advantage of using replace(obj) over obj._replace() is that the latter looks like it is using a non-public API.
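To illustrate the name-conflict point, here is a small sketch with a dataclass whose field happens to be called replace. The Substitution class is a made-up example; dataclasses.replace() is the existing stdlib function.

```python
import dataclasses


@dataclasses.dataclass
class Substitution:  # hypothetical example class
    pattern: str
    replace: str  # a field that happens to be called "replace"


s1 = Substitution(pattern="colour", replace="color")

# s1.replace is the field value (a str), so it could not also be a method.
# The free function is unaffected by the field name.
s2 = dataclasses.replace(s1, replace="COLOR")

print(s2)  # Substitution(pattern='colour', replace='COLOR')
```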

NeilGirdhar · June 26, 2023, 12:18pm

Another advantage of the free function is the general object oriented design principle that free functions should be preferred over methods when possible.

effigies · June 26, 2023, 2:40pm

I really like this idea. Is the proposal to keep this in dataclasses? I feel like it would be nice to have somewhere more generic, but I can’t think of anywhere more appropriate.

storchaka · June 26, 2023, 4:38pm

Yes, it is perhaps not the best place for such a function, but what would be better? At least dataclasses.replace() already exists.

Jelle · June 26, 2023, 4:59pm

Maybe the types module?

storchaka · June 26, 2023, 6:00pm

The types module contains:

A number of built-in types which are not exposed in the builtins module.
Some functions related to building new types.

replace() is not related to types.

effigies · June 26, 2023, 6:14pm

Fair enough. I suppose it mostly applies to dataclass-like classes, in any case, as your internal fields-to-replace have to pretty closely map to your __init__ arguments for such a generic function to be useful.
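A rough sketch of why the mapping between attributes and __init__ arguments matters. The naive_replace() helper and both classes are hypothetical; the helper simply feeds instance attributes back into the constructor.

```python
def naive_replace(obj, **changes):
    """Hypothetical generic helper: rebuild obj from its instance attributes."""
    values = dict(vars(obj))
    values.update(changes)
    return type(obj)(**values)


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class Temperature:
    def __init__(self, celsius):
        # The stored attribute does not match the __init__ argument.
        self.kelvin = celsius + 273.15


# Attributes map one-to-one onto __init__ arguments, so this works.
p = naive_replace(Point(1, 2), y=5)

# Here they do not, so the constructor rejects the keyword.
error = None
try:
    naive_replace(Temperature(20.0), kelvin=300.0)
except TypeError as exc:
    error = exc

print(p.x, p.y)                   # 1 5
print(type(error).__name__)       # TypeError
```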

jsbueno · June 27, 2023, 2:59pm

If a __replace__ special method gets traction (and I very much like the idea), the replace call would (and could) naturally be a built-in function.

Although there are some functions that rely on dunder methods and aren't built-ins, like copy.copy and deepcopy, and the whole pickle protocol, so it is not a given.

Since it looks like the default behavior for .__replace__ is to create a new shallow copy with the requested replacements, maybe the copy module itself could be a coherent place for it.

a-reich · June 27, 2023, 4:50pm

My two cents are that if this were added, copy seems a reasonable location since the method promises to return a new object, so it’s basically a copy-with-replacing. Relatedly, are the implementations/protocol contract supposed to return a shallow or deep copy? Or should the API have an option for either?

effigies · June 27, 2023, 5:07pm

I would expect it to be the same as calling __init__() with some identical values and some alternative values. If __init__ makes a (deep)copy, then __replace__ will make a (deep)copy. Any deviations from that I would want documented.

And if I wanted a deepcopy, regardless, I would deepcopy(replace(obj, **kwargs)) or replace(deepcopy(obj), **kwargs).
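A small sketch of that behaviour using today's dataclasses.replace(), which passes unchanged field values to __init__() as they are. The Config class is a made-up example.

```python
import dataclasses
from copy import deepcopy


@dataclasses.dataclass
class Config:  # hypothetical example class
    name: str
    tags: list


original = Config(name="a", tags=["x"])

# Unchanged fields are passed through as-is, so the list is shared.
shallow = dataclasses.replace(original, name="b")

# Deep-copying the result detaches it from the original.
deep = deepcopy(dataclasses.replace(original, name="c"))

print(shallow.tags is original.tags)  # True
print(deep.tags is original.tags)     # False
```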

ntessore · June 27, 2023, 5:21pm

Absorbing the replace() functionality into copy(obj, attr=newvalue) and deepcopy(obj, attr=newvalue) is not the worst interface either.

storchaka · June 27, 2023, 5:58pm

Yes, at first glance the copy module looks like the best candidate. I thought about this. And it would be nice to support a wider class of objects in replace() by falling back to copy() or the pickle protocol. But there are differences between copy() and replace() which make this difficult.

copy() supports immutable objects. Setting attributes will fail later, but with the wrong exception. Some of these objects could be supported using the pickle protocol, but __copy__() and the global registry have priority.
copy() treats classes and functions as atomic objects and returns the argument itself. Most attributes of Python classes and functions are mutable. Changing them will affect the original object.
By default (when the pickle protocol is used) copy() bypasses __init__(). For replace() we usually want to call __init__() to set calculated attributes which depend on the specified attributes.
By default copy() sets all attributes, including internal attributes which should not be specified by the user and should not be shared between instances (in dataclasses they are defined as fields with init=False).
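Some of these differences can be seen with existing stdlib functions. The sketch below uses a made-up Rect dataclass with a calculated init=False field and compares copy.copy() with dataclasses.replace().

```python
import copy
import dataclasses


@dataclasses.dataclass
class Rect:  # hypothetical example class
    width: int
    height: int
    area: int = dataclasses.field(init=False)  # calculated, not user-specified

    def __post_init__(self):
        self.area = self.width * self.height


r = Rect(2, 3)

# copy.copy() restores the attributes without calling __init__(), so changing
# width afterwards leaves the calculated attribute stale.
c = copy.copy(r)
c.width = 10
print(c.area)  # 6

# dataclasses.replace() goes through __init__() and __post_init__().
n = dataclasses.replace(r, width=10)
print(n.area)  # 30


def f():
    pass


# Functions and classes are treated as atomic: the "copy" is the original.
print(copy.copy(f) is f)        # True
print(copy.copy(Rect) is Rect)  # True
```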

The behavior of copy(obj) and replace(obj) will be too different to merge them into one function, and perhaps too different to have them in the same module. It is possible to add support for more general objects in replace(), but it will either be limited to a very narrow class of objects, or work incorrectly in many cases.

stoneleaf · June 27, 2023, 11:16pm

I think replace() should be its own function and not merged into copy(), but the copy module seems like a fine place for it – you are, essentially, copying an object, and then making a change to it.

jamestwebber · June 27, 2023, 11:30pm

I also think a replace that worked on immutable objects via __init__ could be useful when trying to write in somewhat functional style. Imagine I had a list of namedtuples and I wanted to blank out a field. I could write [replace(t, big_secret="") for t in my_list] to map them.

edit: of course, I didn’t use the proposed form
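With the named tuple API that exists today, the same mapping can be written with _replace(). The Account type and its data are made up for illustration.

```python
from collections import namedtuple

Account = namedtuple("Account", ["user", "big_secret"])  # hypothetical type

my_list = [Account("ann", "hunter2"), Account("bob", "swordfish")]

# Named tuples are immutable, so _replace() builds a new tuple each time.
redacted = [t._replace(big_secret="") for t in my_list]

print(redacted[0])            # Account(user='ann', big_secret='')
print(my_list[0].big_secret)  # hunter2
```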

thejcannon · June 28, 2023, 2:26am

Maybe a pro, maybe a con, but right now I don’t see mypy or pyright typechecking dataclasses.replace semantically. (opened an issue for pyright)

Perhaps __replace__ would make that easier? Hopefully not harder

+1 overall though, and +1 to copy.replace or similar.

Tinche · June 28, 2023, 8:43am

I can add this to attrs if we decide to do it.

NeilGirdhar · June 28, 2023, 9:53am

It might be more convincing if there were a type checking operator to convert the dataclass fields to a TypedDict. Then, the annotation for replace would be:

def replace[T: DataclassInstance](x: T,
                                  /,
                                  **kwargs: Unpack[DataclassFields[T]]) -> T:
    ...

I’m not sure about the more general replace requested here. Are all non-method attributes replaceable? If so, maybe replace DataclassFields with NonMethodAttributes in the above?

antonagestam · July 2, 2023, 6:45pm

I guess that would need some special operator to make all fields optional?

storchaka · September 1, 2023, 10:22am

Thank you all. As a result of the discussion, the new function will be added to the copy module.

Issue and PR:

github.com/python/cpython

Add generalized replace() function

opened 10:10AM - 01 Sep 23 UTC

serhiy-storchaka

type-feature stdlib

# Feature or enhancement

### Has this already been discussed elsewhere?

I have… already discussed this feature proposal on Discourse

### Links to previous discussion of this feature:

https://discuss.python.org/t/generalize-replace-function/28511

### Proposal:

Some classes have the `replace()` method, which creates a modified copy of the object (modified values are provided as keyword arguments). Named tuples have the `_replace()` for this (to avoid conflict with attribute `replace`). Dataclasses provide a global function for this.

I proposed to generalize `dataclasses.replace()` to support all classes which need this feature. By the result of the discussion on discuss.python.org, the new function will be added in the `copy` module as `copy.replace()`. `dataclasses.replace()` will continue to support only dataclasses. Dataclasses, named tuples and all classes which currently have the `replace()` method with suitable semantic will get also the `__replace__()` method. Now you can add such feature in new classes without conflicting with the `replace` attribute, and use this feature in general code without conflicting with `str.replace()` and like.

For now, `copy.replace()` is more limited than `copy.copy()` and does not fall back to use the powerful pickle protocol.

### Linked PRs

* gh-108752

github.com/python/cpython

gh-108751: Add copy.replace() function

python:main ← serhiy-storchaka:copy-replace

opened 10:19AM - 01 Sep 23 UTC

serhiy-storchaka

+297 -71

It creates a modified copy of an object by calling the object's `__replace__()` …method. It is a generalization of dataclasses.replace(), named tuple's _replace() method and replace() methods in various classes, and supports all these stdlib classes. * Issue: gh-108751 ---- :books: Documentation preview :books:: https://cpython-previews--108752.org.readthedocs.build/

NeilGirdhar · September 1, 2023, 5:43pm

That’s awesome!!

Will the typeshed have an appropriate base class? Maybe something like:

class SupportsReplace[**P]:
    def __replace__(self, **kwargs: P.kwargs) -> Self:
        ...

# copy.replace:
def replace[T: SupportsReplace[P], **P](obj: T, /, **changes: P.kwargs) -> T:
    ...

I’m not sure if the latter annotation is allowed?
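As a point of comparison, a loosely typed variant can be written with typing.Protocol in a form that runs today. This is only a sketch under stated assumptions: it does not claim to match what typeshed uses, it gives up per-field checking of the keyword arguments, and the Money class is a made-up example.

```python
from typing import Any, Protocol, TypeVar


class SupportsReplace(Protocol):
    # Assumption: keyword arguments are left unchecked (Any).
    def __replace__(self, **changes: Any) -> Any: ...


T = TypeVar("T", bound=SupportsReplace)


def replace(obj: T, /, **changes: Any) -> T:
    """Hypothetical stand-in for a generic replace(); returns the same type."""
    return type(obj).__replace__(obj, **changes)


class Money:  # hypothetical class implementing the protocol
    def __init__(self, amount: int, currency: str) -> None:
        self.amount = amount
        self.currency = currency

    def __replace__(self, **changes: Any) -> "Money":
        values = {"amount": self.amount, "currency": self.currency}
        values.update(changes)
        return Money(**values)


m = replace(Money(5, "EUR"), amount=7)
print(m.amount, m.currency)  # 7 EUR
```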