Make __replace__ stop interfering with variance inference

Would this be a patch for 3.13, or would it need to wait for 3.15? It seems ignoring __replace__ would just be an update to the typing spec?

I think the current situation also makes the subtyping section of PEP 767 (Annotating Read-Only Attributes) moot.

What is blocking us from special-casing __replace__ and allowing linters to warn about unsafe uses? Most uses may be unsafe, but I’d prefer this over a replace=False keyword or a __replace__ = None field, as either would end up being cryptic line noise for most users.

Ignoring replace would be less sound. Stating in the typing spec that type checkers should ignore it then means that no type checker can even optionally warn about this, as doing so wouldn’t be specification compliant. The other way around is actually better: leave the behavior as-is, and let type checkers offer a setting to ignore this. Flagging it by default keeps the type system sound and does not put soundness at odds with specification compliance.

I think that would be a bad solution. It would mean that, according to the spec, we can’t get covariant behavior of these types, and consequently the pattern should not be taught in the docs either. If special-casing __replace__ in the spec is rejected, we should go with the keyword option to remove the method.

I don’t think it’s a good idea to have the covariant behavior of these types promoted while knowing that to be unsound. I’m fine with a spec update that handles __replace__ being checked based on its presence as a method with covariant concerns, and thus still being behavior people can opt out of, but I’m currently of the opinion that the way that already works is preferable to one that won’t work until at least Python 3.15 at this point. __replace__ = None is a reasonable option. This matches other data model methods such as writing __hash__ = None and works without any further changes.
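
A minimal sketch of that opt-out, for illustration only. It assumes the runtime leaves an explicitly assigned __replace__ alone (dataclasses generally does not overwrite attributes already set in the class body), and it assumes type checkers would treat None as "method absent" for variance inference, which the typing spec does not currently state. The class names Unhashable and Box are made up.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T")


class Unhashable:
    # Established data model idiom: None marks the method as unsupported.
    __hash__ = None


@dataclass(frozen=True)
class Box(Generic[T]):
    value: T
    # Same idiom applied to __replace__. There is no annotation, so this is
    # a plain class attribute and not a dataclass field.
    __replace__ = None


try:
    hash(Unhashable())
    hashable = True
except TypeError:
    hashable = False

print(hashable)         # False
print(Box.__replace__)  # None
```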

Is there any discussion happening in the background regarding this? Personally, I would like to know whether I can expect to keep Python 3.12 frozen covariant dataclasses when upgrading to newer versions. If we need to tell type checkers to be aware of __replace__ = None as a stop-gap, so be it, but I think some escape hatch is needed.

__hash__ = None being the solution for hash makes sense as that’s a feature of plain classes, it’s not something being added later. __replace__ on the other hand is a feature that is being added (in this case) by @dataclass so to me it makes sense to (eventually) have this be configurable as an argument to @dataclass.

For dataclasses I would suggest:

  1. dataclass_transform adds a replace_default argument and tools recognise a replace argument on objects decorated by it.
    • This matters because dataclasses adding this means that anything decorated by @dataclass_transform is now assumed to have __replace__, even if it doesn’t implement it, and there is no current way to disable this behaviour
    • This could also be used to support the argument in older versions of dataclasses by wrapping @dataclass on those versions
  2. dataclasses gains a replace option in a later version of Python

Tools could also respect __replace__ = None for current Python, but I don’t think this is satisfactory as the intended way to configure dataclass construction. Other methods added by dataclasses are configured by arguments to @dataclass so it seems out of place that __replace__ isn’t.
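
A rough sketch of how the wrapping idea in (1) could look on current versions. Everything here is hypothetical: frozen_dataclass and its replace parameter are invented names, no replace or replace_default argument exists in dataclasses or the typing spec today, and type checkers would not yet understand the flag. Masking the method with None is also only one possible way to "remove" it.

```python
from dataclasses import dataclass

try:
    from typing import dataclass_transform
except ImportError:  # Python < 3.11: fall back to a no-op stand-in
    def dataclass_transform(**_kwargs):
        return lambda decorator: decorator


# Hypothetical wrapper: `replace=` is not a real dataclasses parameter.
@dataclass_transform(frozen_default=True)
def frozen_dataclass(cls=None, /, *, replace=True):
    def wrap(klass):
        klass = dataclass(frozen=True)(klass)
        if not replace:
            # Mask any synthesized or inherited __replace__.
            klass.__replace__ = None
        return klass

    return wrap if cls is None else wrap(cls)


@frozen_dataclass(replace=False)
class Point:
    x: int
    y: int


p = Point(1, 2)
print(p)                  # Point(x=1, y=2)
print(Point.__replace__)  # None
```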

I agree that this is the best path forward, and just to get more concrete about next steps: I don’t believe this requires a PEP. I think (1) would require someone to make a PR to the typing spec, which the typing council would need to approve, and (2) would require a PR to CPython.

(I also wouldn’t be opposed to having the typing spec at least suggest that type checkers respect __replace__ = None as well, so we have something that can work for already-existing Python versions with __replace__. But I don’t think that’s the preferable long-term supported solution.)

For (2), I’ll note that @tmk already made an issue for dataclasses: Add a new parameter to `@dataclass` to optionally turn off the synthesizing of a `__replace__` method (Issue #140457, python/cpython).

I’m trying to wrap my head around this problem for the n-th time. Has making a class @final ever been proposed as a way of restoring covariance while still keeping __replace__? If the main issue is potential subclasses narrowing a field’s type, making sure there are no subclasses would probably close that gap?

What is the use of covariance if you can’t subclass?

I think it’s a good idea to have the option to remove replace for dataclasses.

Still, users of such classes are going to want an idiomatic way of replacing an attribute. Since replace(obj, **kwargs) is gone, I propose that we add a function in copy:

def static_replace[T: SupportsStaticReplace](cls: type[T], obj: T, /, **kwargs: Any) -> T: ...

This form of replace does not suffer from the covariant attribute problem since you know which type you’re creating. It also can’t break LSP since it’s not inherited. If the type checker sees cls statically, then it can verify kwargs.

Covariance fundamentally concerns whether C[A] is a subtype of C[B] given the relationship of A to B, so whether C is subclassable or not doesn’t necessarily enter into it.

Most of the unsoundness mentioned in this thread is related to subclasses of C narrowing A to a subtype and then this getting violated via replace. This problem completely disappears if C is final, yeah? I’m not saying A or B should be final.

Let me re-summarize the issues as I understand them.

First of all we have some unsoundness, which is exposed even without generics

from dataclasses import dataclass

@dataclass(frozen=True)
class A:
    value: str | int

@dataclass(frozen=True)
class B(A):
    value: int


def process(elem: A) -> None:
    new_elem = elem.__replace__(value='hello')
    print(type(new_elem), new_elem.value)

process(B(value=1)) # <class '__main__.B'> hello  <-  False negative introduced when adding __replace__ in 3.13

Then we have some regressions due to frozen generic containers becoming invariant in 3.13.

from dataclasses import dataclass

@dataclass(frozen=True)
class A[T]:
    value: T


@dataclass(frozen=True)
class B(A[int]):
    value: int



def process(elem: A[int | str]) -> None:  ...


a = A(value='hello')
process(a) # 3.12 OK , 3.13 Fail
process(A(value='hello')) # 3.12 OK, 3.13 OK (due to bidirectional inference, I assume)
process(B(value=1))     # 3.12 OK , 3.13 Fail

# Additional example with subclasses
class H: ...
class H2(H): ...

def process2(elem: A[H]) -> None: ...

a2 = A(value=H2())
process2(a2) # 3.12 OK , 3.13 Fail
process2(A(value=H2())) # 3.12 OK, 3.13 OK

This shows that one can be exposed to this both by subclassing the container and by instantiating the container as I think @Tinche is suggesting.

Luckily mypy and pyright behave the same at least!
Mypy Playground
Pyright Playground

Worth noting that dataclasses.replace does not use __replace__. It just checks if the object is a dataclass and then directly calls dataclasses._replace, which is the same function that gets attached as the __replace__ method. Only copy.replace requires the method be present.
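
A small sketch of that difference. It relies on the assumption that an explicit __replace__ = None in the class body is left in place by @dataclass; the Point class is made up for the example, and the copy.replace part only runs on Python 3.13+.

```python
import copy
import dataclasses


@dataclasses.dataclass(frozen=True)
class Point:
    x: int
    y: int
    __replace__ = None  # mask the protocol method (assumed opt-out idiom)


p = Point(1, 2)

# dataclasses.replace only checks that `p` is a dataclass instance.
moved = dataclasses.replace(p, x=10)
print(moved)  # Point(x=10, y=2)

# copy.replace exists only on Python 3.13+ and looks up __replace__.
if hasattr(copy, "replace"):
    try:
        copy.replace(p, x=10)
        copy_replace_ok = True
    except TypeError:
        copy_replace_ok = False
    print(copy_replace_ok)
```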

I was actually somewhat surprised __replace__ isn’t generated per-class like the other methods.

I understand. You folks in here are mostly fixated on this scenario where the container subclass narrows the type of a field, generic or not, and introduces unsoundness via replace. The proposed solution is a way to opt out of replace.

The OP, however, just mentions that frozen generic dataclasses (say, C[T]) become invariant with respect to T when the 3.12+ auto variance algorithm is used, and replace is the reason.

Here’s my situation. I like using frozen generic dataclasses (and as one of the authors of attrs, the progenitor of dataclasses, I’ve probably used them more than most), and I want them to be covariant. I don’t want to give up replace since replace is very useful. I would gladly give up subclassing with type-narrowing (since I’ve literally never used it), but that’s not really possible. Then, I would rather give up subclassing in general (via @final) than replace. I’m not sure if making the container class @final would actually solve the issue, hence me posting here to get feedback from typing experts.

Just a note that it is not limited to new style generics.

from dataclasses import dataclass
from typing import TypeVar, Generic

T = TypeVar('T', covariant=True)

@dataclass(frozen=True)
class K(Generic[T]):
    value: T

3.12: No errors for mypy nor pyright
3.13: Mypy: error: Cannot use a covariant type variable as a parameter

Making the container class final doesn’t affect whether replace is safe or not. Consider this code:

from dataclasses import dataclass, replace
from typing import final

@final
@dataclass(frozen=True)
class C[T]:
    attr: T

class A:
    pass

class B(A):
    pass

def func(val: C[A]) -> C[A]:
    return replace(val, attr=A())

val: C[B] = C(B())
func(val)

The call to func cannot be allowed since it would change the attribute to an A object even though it is defined to be B. The only way to rule that call out is if C[B] is not assignable to C[A], i.e. if T is not covariant.

Whenever a class’s type variable (directly) appears in the arguments of a method, that type variable cannot be covariant. Unfortunately, there’s no real way around that since that behaviour doesn’t depend on some other interaction with the type system. So if you want to keep covariance in a type variable that occurs in an attribute’s type, you have to give up __replace__.
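
To make the rule concrete, here is a hand-written class that spells out roughly what a frozen generic dataclass looks like to a type checker once __replace__ is synthesized. It is an illustrative sketch (Box is a made-up name), and it only runs the code; the variance conclusion in the comments is what a checker would infer, not something the runtime enforces.

```python
from typing import Generic, TypeVar

T = TypeVar("T")  # imagine a checker inferring the variance of T


class Box(Generic[T]):
    def __init__(self, attr: T) -> None:
        self._attr = attr

    @property
    def attr(self) -> T:
        # T in a return position: compatible with covariance.
        return self._attr

    def __replace__(self, *, attr: T) -> "Box[T]":
        # T in a parameter position: this is what rules covariance out.
        return Box(attr)


b = Box(1).__replace__(attr=2)
print(b.attr)  # 2
```

Declaring T with covariant=True instead would be expected to trigger the mypy error quoted earlier in the thread, "Cannot use a covariant type variable as a parameter".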

Hey, thanks for the reply but I can’t figure out the problem in your snippet.

C is frozen so nothing is being changed. func takes an instance of C[A] (in this case, a C[B] in particular but it’s fine because of covariance) and produces a fresh instance of C[A]. The return type is also C[A] so it matches. Where’s the soundness hole?

Even if giving up subclassing were an option for generic containers, you’d also have to forbid this to close soundness holes with replace.

from dataclasses import dataclass

@dataclass(frozen=True)
class A:
    value: int | str 

class B(A):
    value: int

I agree that replace is convenient but as designed it simply does not play well with the covariant types and the true solution is to have a replace like