Thanks for the clear exposition—you’re right that one way to restore soundness would be to block covariant inheritance of dataclass fields.
However, I think that’s too high a price to pay just for the convenience of replace
. I’d like to propose an alternative.
First, let’s observe that the problem arises with class factories written like this:
class C:
@classmethod
def create(cls, ...) -> Self:
return cls(...) # You can't actually know the parameters of cls!
You could have create
return C
, which would be more accurate, but that raises the question: why are we inheriting this class factory in subclasses at all?
The core issue is that class factories are inherited, even though they probably shouldn’t be. The intended use of a class factory like create
is via an explicit call like C.create
. You don’t typically need something like:
def f(t: type[C]) -> C:
return t.create(...)
f(C)
If you do, a better approach might be:
def f(factory: Callable[..., C]) -> C:
return factory(...)
f(C.create)
In short, create
should only be callable on a specific class, not on a variable whose type merely is a subclass of that class.
A similar situation arises with __replace__
. Consider:
def f(x: D) -> D:
return replace(x, some_member=1)
As others have noted, we can’t guarantee that some_member
hasn’t been narrowed in a way that makes assigning it an int
invalid.
A straightforward solution would be to treat replace
more like a class factory. If class factories shouldn’t be inherited, then replace
shouldn’t be either. Instead, you’d write:
def f(x: D) -> D:
return D.__replace__(x, some_member=1)
By explicitly calling the “replace factory” of D
, we sidestep the covariance issue—we’re constructing an instance of D
and using x
only as an input.
This also justifies an optional type warning for the following:
replace(x, some_member=1) # What type are you trying to construct? We can't be sure some_member is valid here.
—unless x
has a final type.
However, if you write:
D.__replace__(x, some_member=1)
then there’s no ambiguity, and no warning would be necessary.
I agree that specifying the class explicitly isn’t ideal—but the current situation, where inheritance introduces unsoundness, is much worse.
This is consistent with Randolph’s hypothesis:
It follows Doug’s intuition that __replace__
is a constructor:
I think it’s a variation of the Joren’s idea that:
in that __replace__
's parameter is treated specially. It’s neither mutated, nor used to polymorphically choose between class factory return types.