Expanding `ReadOnly` to normal classes & protocols

Eneg · October 15, 2024, 7:03pm

A way to remain inclusive of any future changes on the dataclass_transform side would be to talk of initialization as initialization and not specifically __new__/__init__.
This way some abstract mechanism can expand the gamut of methods that take part in the initialization and they’d be automatically included by the definition.

beauxq · October 15, 2024, 7:07pm

I strongly disagree that this should be inferred to be a ClassVar, for the same reason that x: int = 2 in the class scope shouldn’t be inferred to be a ClassVar.

It’s a default value for a read-only variable, where some instances of the class might have a different read-only value.

Being forced to define it in __init__ if you want it to be different for some instances could require a lot of extra memory if many instances are created and most will have the same value.
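A minimal sketch of the memory argument, using plain Python so that it runs today. The class `Config` and its `retries` attribute are hypothetical, and the `ReadOnly` spelling in the comment is the proposed syntax, not something the snippet relies on.

```python
class Config:
    # Class-level default; under the proposal this might be written
    # `retries: ReadOnly[int] = 3` (hypothetical syntax, not used here).
    retries: int = 3

    def __init__(self, retries=None):
        # Only instances that differ from the default store their own value.
        if retries is not None:
            self.retries = retries


default = Config()
custom = Config(retries=10)

print(vars(default))    # {}  (nothing stored on the instance)
print(vars(custom))     # {'retries': 10}
print(default.retries)  # 3, read from the class
```

Instances that keep the default carry no per-instance entry for `retries`, which is the saving being described.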

Eneg · October 15, 2024, 7:26pm

A Final attribute with an initializing value is implicitly considered a ClassVar by the aforementioned spec. I don’t see why ReadOnly would deviate from that.

If memory is a concern, the class should likely define __slots__ - in which case it’s not possible to have both a class and instance variable of the same name.
You can achieve a similar interface by creating a class-level dict[int, T] where keys are id(self), and read from it via .get(id(self), default) in a property.
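A rough sketch of both points, assuming CPython behavior. The names `Broken`, `Slotted`, `_default_level` and `_overrides` are made up for illustration, and the override table is only one possible reading of the suggestion.

```python
from typing import ClassVar, Dict

# A slot and a class-level default of the same name cannot coexist:
try:
    class Broken:
        __slots__ = ("level",)
        level = 1
except ValueError as exc:
    conflict_message = str(exc)
    print(conflict_message)


class Slotted:
    """Hypothetical class using an id-keyed override table."""

    __slots__ = ()
    _default_level: ClassVar[int] = 1
    _overrides: ClassVar[Dict[int, int]] = {}

    def __init__(self, level=None):
        if level is not None:
            # Caveat: id() values can be reused once an object is collected,
            # so real code would need to remove stale entries.
            Slotted._overrides[id(self)] = level

    @property
    def level(self) -> int:
        # Fall back to the class-level default when no override is recorded.
        return self._overrides.get(id(self), self._default_level)


a = Slotted()
b = Slotted(level=5)
print(a.level, b.level)
```

The table trades per-instance storage for a shared dictionary, at the cost of having to manage entry lifetimes by hand.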

Tinche · October 15, 2024, 7:45pm

I think this reasoning predates dataclasses/attrs/et al. A Final attribute with an initializing value makes perfect sense as an instance attribute in this context - the initializing value is just the default. For it to unambiguously be a classvar it’d have to be init=False I think.

Eneg · October 15, 2024, 7:56pm

This is specific to dataclasses, but it is already what’s happening. Dataclasses make Final attributes normal instance attributes, and type checkers treat them as such since typing#1669.
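The runtime side of this can be checked with a small sketch (the `Point` class is hypothetical): a `Final` annotation with a value becomes an ordinary dataclass field whose value is the per-instance default.

```python
from dataclasses import dataclass, fields
from typing import Final


@dataclass
class Point:
    # In a dataclass, the assigned value is a per-instance default.
    x: Final[int] = 0


print([f.name for f in fields(Point)])  # ['x']
print(Point().x, Point(7).x)            # 0 7
```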

beauxq · October 15, 2024, 8:37pm

If one of the problems being addressed is that Final is overspecifying what we want, it doesn’t seem like a good idea to inherit a bunch of spec from Final.
It has already been seen as a problem that Final is implicitly ClassVar. Now we have this weird situation where sometimes it is and sometimes it isn’t. Why would you want to inherit that when we could keep things simpler with ReadOnly?

You’re suggesting a lot of extra complexity when it could be something really simple.

carljm · October 15, 2024, 10:37pm

I am willing to sponsor this PEP. I read over the existing draft sections and they look good! I think the motivation is strong.

I would prefer to not infer ClassVar for an initialized ReadOnly class attribute, so as to allow read-only instance attributes with class-level default value. This behavior seems simpler and easier to explain, and doesn’t require a special carve-out for dataclasses.

I don’t think there is much benefit to hewing too closely to Final semantics. The constructs have different meanings and will behave differently, at the very least in terms of whether the value can be overridden by a subclass. And I’m not sure that Final would have been specified in the way that it is, with the benefit of hindsight.

I prefer allowing multiple initialization sites, as long as they are in the class body or in __init__, and nowhere else. That also seems easy to explain. ReadOnly is intended to specify the behavior of the attribute once the class, or an instance of it, has been constructed.
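A sketch of what the two permitted initialization sites might look like. It assumes a future `ReadOnly` that is accepted on class attributes; the annotations are written as strings and the import is guarded so the snippet runs without it. The `Account` class is hypothetical.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Assumption: a future typing_extensions release accepts ReadOnly on
    # class attributes; today it is only specified for TypedDict items.
    from typing_extensions import ReadOnly


class Account:
    # Initialization site 1: class-level default.
    currency: "ReadOnly[str]" = "USD"

    def __init__(self, owner: str, currency: str = "") -> None:
        # Initialization site 2: assignments inside __init__.
        self.owner: "ReadOnly[str]" = owner
        if currency:
            self.currency = currency


acct = Account("ann")
print(acct.owner, acct.currency)
# acct.owner = "bob"  # outside both sites: expected to be rejected by a
#                     # type checker following the rule described above
```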

Pyright does allow calling __init__ directly on an instance, so doesn’t provide a guarantee that __init__ will only be called once for a given instance. Mypy already errors on this. This ReadOnly specification might provide an additional reason to prohibit this (in addition to the existing reason that __init__ isn’t subject to LSP, so calling it on an instance is unsafe.)
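A small runtime illustration of why a direct `__init__` call matters here (the `User` class is hypothetical): the attribute is rebound without any assignment appearing outside `__init__`.

```python
class User:
    def __init__(self, name: str) -> None:
        self.name = name


user = User("ada")
# Re-running the initializer on an existing instance rebinds the attribute,
# even though no assignment to `user.name` appears outside __init__.
user.__init__("grace")
print(user.name)  # grace
```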

Eneg · October 16, 2024, 7:28pm

Alright, I hear you all.

So far I thought of Final as already defining a good portion of what a read-only attribute should look like, thus it only felt natural to define ReadOnly as a subset of Final.
And under the assumption of it being a subset it does not sound good to deviate from the prior art.

I get now that much more can be done in this PEP.

Some considerations:

Under my original assumption ReadOnly[Final[...]] would be merely redundant and I was going to assume the same wording as for ClassVars ^[1].
Now, since the qualifiers will deviate in some contexts, the combination of ReadOnly and Final should be treated as an error, since it’s ambiguous.
I think the __post_init__ situation could be resolved by specifying something along these lines:

__init__, __new__ and class-level defaults are the default set of contexts where assignment to a read-only attribute is permitted. However, type checkers may permit additional special methods to facilitate mechanisms like dataclasses’ __post_init__.

Should __init_subclass__ be a part of this set?
Should a class implementing __getattr__ match a protocol with a read-only attribute?
I will not include the implication of ClassVar for an initialized class-level ReadOnly. Keep in mind that having class-level and instance-level variables of the same name is not possible with __slots__ ^[2].

^[1] combining ClassVar and Final is redundant, and type checkers may choose to warn or error on the redundancy.
^[2] I still believe using __slots__ would be a better solution to beauxq’s problem, but I agree that I may have overengineered the rest.
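For context on the __post_init__ consideration, a minimal sketch of the dataclass pattern in question. The `Rectangle` class is hypothetical, and a plain annotation stands in for the proposed `ReadOnly[float]` so the snippet runs today.

```python
from dataclasses import dataclass, field


@dataclass
class Rectangle:
    width: float
    height: float
    # Under the proposal this might be annotated ReadOnly[float]
    # (hypothetical; a plain annotation is used so the sketch runs today).
    area: float = field(init=False)

    def __post_init__(self) -> None:
        # The generated __init__ never assigns `area`; its first
        # assignment happens here.
        self.area = self.width * self.height


print(Rectangle(2.0, 3.0).area)  # 6.0
```

If only `__init__`, `__new__` and the class body counted as initialization contexts, the assignment in `__post_init__` would be flagged, which is why the wording leaves room for additional methods.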

Eneg · October 17, 2024, 9:26pm

I have pushed new changes, filling out Rationale, partial Specification, and some polishing touches. source

Eneg · October 17, 2024, 9:28pm

Does that mean I can fill you in as the sponsor?

carljm · October 17, 2024, 10:07pm

Yes, you can, and now I have to reach a minimum post length

beauxq · October 18, 2024, 1:02am

The combination of ReadOnly and ClassVar imposes the attribute must be initialized in the class scope, as there are no other valid initialization scopes.

I think this is not right. It can be defined in a metaclass, or a class decorator, or something similar, so it does not need to be defined in the class scope.

If a declaration looks like this, without an assignment:

class Foo(Something):
    foo: ClassVar[ReadOnly[str]]

I expect that foo is probably assigned a value in a metaclass.

This is a pattern used a lot in one of my projects.

beauxq · October 18, 2024, 3:37am

Another probably more common case of this is with abstract classes, where the subclass is expected to define it.

And further, the type arg of ReadOnly might also be abstract with no concrete classes in existence in the library, so it wouldn’t even be possible to initialize it.

Eneg · October 18, 2024, 5:52am

This would clearly violate the laid out rules.

from typing import ClassVar, Protocol, ReadOnly  # ReadOnly: Python 3.13+


class HasName(Protocol):
    name: ClassVar[str]


def give_name[T: HasName](cls: type[T]) -> type[T]:
    cls.name = "..."
    return cls


# error: ReadOnly[str] is not assignable to str
@give_name
class Foo:
    name: ClassVar[ReadOnly[str]]

# what'd prevent this?
Foo = give_name(give_name(Foo))

Change the definition of the protocol to use ReadOnly and now you get the error in give_name.

I don’t think this should be allowed; at least not the way you describe.

If a metaclass wishes to initialize a class variable of a class ^[1], then the metaclass should be the body where that variable is declared (omitting ClassVar).

So, instead of:

class Foo(metaclass=MyMeta):
    foo: ClassVar[ReadOnly[str]]

You’d do:

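import random
from typing import ReadOnly, reveal_type  # ReadOnly: Python 3.13+

class MyMeta(type):
    foo: ReadOnly[str]

    def __init__(cls, *args):
        # type.__init__ receives (name, bases, namespace)
        super().__init__(*args)
        cls.foo = random.choice("abcdef")
        # ... rest of machinery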

class Foo(metaclass=MyMeta): ...
reveal_type(Foo.foo)
print(Foo.foo)

I believe this is fully compatible with the rules I’ve laid out in the PEP so far.

I’m not sure about this. My immediate reaction is that an ABC should use ... to indicate a missing value.
ABCs lie in a weird middle ground between classes and protocols, and I haven’t yet described that section in great detail.

Do you mean that the ABC would declare it as ReadOnly[ClassVar], without specifying the exact type?
If the type isn’t known ahead of time, it should use object, or be generic over it.
Either way, it should be the same case as above.

^[1] which is de facto an instance variable of its instance

mikeshardmind · October 18, 2024, 5:52am

Do you have a use case for read only abstract classvars? This seems like a stretch beyond what is useful, and since this in no way matches runtime and is only static checking of intent, I’m not sure scoping it for this is reasonable.

beauxq · October 18, 2024, 11:37am

I think that should be what that means, but at least one type-checker maintainer disagrees with you that a metaclass instance member means a class ClassVar.
But that doesn’t solve the problem anyway, because the type might not be known at metaclass definition time. (If it were, then Final would probably be sufficient.) The metaclass method can get the type from the annotations and instantiate it.
This is a pattern currently used in a project I work in, and it is one of the reasons I want ReadOnly.
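A rough sketch of the kind of metaclass described above, not the actual project code. The names `Widget` and `AutoInit` are hypothetical, and a plain `ClassVar[Widget]` stands in for the proposed `ClassVar[ReadOnly[Widget]]` so the snippet runs on current Python.

```python
from typing import ClassVar, get_args, get_origin, get_type_hints


class Widget:
    """Hypothetical value type that the metaclass instantiates."""


class AutoInit(type):
    """Hypothetical metaclass: fills in declared-but-unassigned ClassVars."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        for attr, hint in get_type_hints(cls).items():
            # Only handle ClassVar[...] declarations that have no value yet.
            if get_origin(hint) is ClassVar and not hasattr(cls, attr):
                (value_type,) = get_args(hint)
                setattr(cls, attr, value_type())


class Foo(metaclass=AutoInit):
    # Under the proposal this would read ClassVar[ReadOnly[Widget]].
    foo: ClassVar[Widget]


print(type(Foo.foo).__name__)  # Widget
```

The assignment happens in the metaclass, outside the class body of `Foo`, which is the case the quoted wording would rule out.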

No, the type is specified, but it is abstract.

import abc
from typing import ClassVar, ReadOnly  # ReadOnly: Python 3.13+


class AbcA(abc.ABC):
    @abc.abstractmethod
    def foo(self):
        ...


class AbcB(abc.ABC):
    a: ClassVar[ReadOnly[AbcA]]  # not possible to initialize this

And then the library user is expected to use it like this:

class ConcA(AbcA):
    def foo(self):
        ...


class ConcB(AbcB):
    a = ConcA()

This is a pattern used in a project I work in currently, and I would be really surprised if there isn’t a lot more of this pattern.

mikeshardmind · October 18, 2024, 5:06pm

I’d be surprised if there was more of that. Assigning instances of mutable objects as classvars creates several easily avoidable problems, because a single instance of ConcA is shared by all instances of ConcB, and that is a problem whenever someone isn’t aware of it. I’ve always preferred other options here, like a method get_shared_state(self), since it is impossible to confuse with something that’s instance-scoped.

Here, you’re indicating that you have exactly this avoidable problem because you want to now mark it as read-only too.
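The sharing being described can be seen in a short sketch (the classes `Registry` and `Plugin` are hypothetical stand-ins for ConcA and ConcB):

```python
class Registry:
    def __init__(self):
        self.items = []


class Plugin:
    registry = Registry()  # one object, shared by every Plugin instance


first, second = Plugin(), Plugin()
first.registry.items.append("x")
print(second.registry.items)              # ['x']
print(first.registry is second.registry)  # True
```

Marking the attribute read-only would prevent rebinding `registry`, but it would not stop mutation of the shared object itself.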

From a typing perspective, this also appears to be something that should be written with a typevar, because as it is, you can’t ever rely on anything user-added existing, and your type information is going to be lossy.

mikeshardmind · October 18, 2024, 5:14pm

I realized after writing that that it sounded more dismissive than intended.

I see your use case clearly: you have a problem and want to communicate intent better here. I just don’t see it as likely to be a common problem. If we can work out how the rules would need to read to support that, it may be worth doing, but right now I think we could leave it as “unsupported for now, open to revisiting” if we don’t yet have a consistent way to specify the rules.

beauxq · October 18, 2024, 5:21pm

JukkaL also pointed out this being common.

(That example didn’t involve the abstract part. But you didn’t really talk about the abstract part, so it seems you’re talking about the same thing.)

Eneg · October 18, 2024, 6:14pm

I reckon the core of your problem is this part:

The combination of ReadOnly and ClassVar imposes the attribute must be initialized in the class scope, as there are no other valid initialization scopes.

I’ve already planned to write that initialization isn’t required for protocols (would be rather nonsensical).
Imo demanding that an attribute is eventually initialized is desirable, though now that I think of it, type checkers rarely enforce it.

I can rephrase the problematic part so as not to imply that the attribute must be initialized (at all), but only that it can only be initialized in class scope.