Disallow access to class variables with Self

While trying to pass the conformance tests for Zuban, I have encountered a few fundamental issues; this is #2 of 4.

I have simplified and slightly adapted generics_self_advanced.py to show the problem:

from typing import Self, assert_type

class ParentB:
    a: list[Self]
    @classmethod
    def method4(cls) -> None:
        cls.a = [cls()]

class ChildB(ParentB):
    @classmethod
    def method3(cls) -> None:
        assert_type(cls, type[Self])
        assert_type(cls.a, list[Self]) # This should be a type error, but is not in conformance tests
        assert_type(cls.a[0], Self)  # Same
        print(cls.a)

ParentB.method4()
ChildB.method3()  # Prints [<__main__.ParentB object at 0x...>]

The problem here is the statements that access cls.a. Accessing a variable that contains Self should cause a type error. Zuban currently catches that. There are two directions we can take from here:

I would personally force an error there and disallow access to cls.a. It would of course also be possible to allow both ways, but this is clearly wrong in an inheritance context (though it could be fine with @final).

I feel like this was just an oversight and people did not realize that this was a potential problem.

If people agree with this proposal, I will add a PR to the typing repository and change the conformance tests.

This proposal sounds reasonable to me, but getting the wording correct in the spec could be difficult. Have you thought about how to word this and where in the spec to add it? I recommend starting with a concrete proposal for the spec modification. This is a change that the TC will need to review and approve. The spec change should precede any modifications to the conformance tests.

I’m open to working on this.

I would probably add a note at the end of the “Use in Attribute Annotations” section. Something like this:


Note that accessing a variable that contains Self via a class is not allowed, because Self can be different depending on the class in the inheritance hierarchy.

from typing import Self

class C:
    others: list[Self]

class D(C): ...

C.others = [C()]
print(D.others)  # This is not list[Self], but list[C]

How does this sound? Details can of course be reviewed in a PR, I’m just trying to get a feel for the right direction.

I think the wording will need to be much more precise than that. The phrase “a variable that contains Self via a class” doesn’t really make sense. I think you mean something like “a variable that has a type annotation where the annotation contains Self” or something like that. Even then, the notion of “contains” in the context of an annotation is not well defined in the spec, so this will require additional explanation. For example, what if the annotation is T but the type variable is specialized with Self? Is that covered by this rule? What if the annotation itself references a class-scoped type alias whose definition references Self? You’ll need to think through all cases and make sure that the wording anticipates and covers all of them.

This is sort of related to something that was brought up a little differently before (sorry, I don’t have a reference on hand, I can probably find one later if needed).

It’s generally not type safe to have class-scoped invariant or contravariant generics on a non-final type.

Forbidding that would remove all of the issues with accessing a class variable annotated using Self, but it comes with other issues. That may give you a direction for specifying when it is forbidden, even if you are trying to come up with more narrowly tailored restrictions.

That was still less precise than what you need for this.

It’s the responsibility of the subtype to not violate the supertype. The subtype can’t be defined in such a way that a generic in the supertype no longer has compatible type bounds because of the bounds imposed by the subtype.

I agree. However, I’m not quite sure what wording to use instead of “contains”. For example, the spec currently does not talk about assigning to attributes that “contain” type variables; it simply generalizes the rule and prohibits attribute access for generic classes. I’m sure the spec doesn’t want to prohibit writing to an int attribute within a generic class:

Using generic classes (parameterized or not) to access attributes will result in type check failure. Outside the class definition body, a class attribute cannot be assigned, and can only be looked up by accessing it through a class instance that does not have an instance attribute with the same name

This just feels very imprecise (like my initial proposal for Self). I’m afraid that without guidance I won’t be able to find a good wording for what I call “contains”. Maybe somebody has an idea here how to word this? And should we change the generic part as well? Because they feel very connected.
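For reference, the quoted rule can be illustrated roughly as follows. This is only a sketch: the class Node and its count attribute are hypothetical names chosen for illustration, and the comments describe how the rule reads literally, not how any particular type checker behaves.

```python
from typing import ClassVar, Generic, TypeVar

T = TypeVar("T")


class Node(Generic[T]):
    x: T                      # depends on the type parameter
    count: ClassVar[int] = 0  # hypothetical attribute that does not mention T

    def __init__(self, x: T) -> None:
        self.x = x


p = Node[int](1)
value = p.x  # lookup through an instance: allowed by the quoted rule

# The quoted rule targets accesses like these (left as comments here):
#   Node[int].x = 1
#   Node.x = 1
#   type(p).x

# Read literally, the rule would also cover this write, even though the
# attribute's type has nothing to do with T:
Node.count += 1
```

The last line is the kind of int attribute write mentioned above, which is why a rule phrased around what an annotation “contains” would need to be more selective than the current generic wording.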

I think this is not a relevant case, because writing to annotations with type variables is already prohibited by the spec (see above).

I think it’s clear that this should also be disallowed; however, I’m unsure how to word this. Are there no similar cases in the spec where aliases are treated as part of the type and some check has to be made for all type vars/Self usages within a type?

How special is class access here? It’s unsafe to ever inherit a mutable attribute containing Self, even if access only ever happens on instances:

from typing import Self

class Parent:
    x: list[Self]

class Child(Parent):
    pass

def f(p: Parent):
    p.x.append(Parent())

c = Child()
f(c)
# no type errors above, but now c.x contains a Parent instance

I think this is what @mikeshardmind was referring to above.

I think it is hard to phrase this restriction partly because the fully coherent restriction would be phrased in terms of Liskov restrictions on inheritance in general and variance of Self types, not in terms of limitations on what you can access via a class.

It follows from existing, agreed-upon theory and doesn’t require new rules. The problem isn’t accessing the class variable typed with Self; it’s creating a subtype that the type system should reject. As far as I know, though, no type checker currently rejects the definition using Self. It’s one of several places where I’ve previously brought up deficiencies in type checker behavior.

In that case, the error should be at the declaration of Child. When discussing this with others, a couple of people mentioned it was easier for them to see the issue when Self was replaced, and inherited types were shown as if they were declared anew.

If we ignore the parts of Self that imply behavior for other possible subtypes not defined here, and just look at this as a closed system, we already have a reason to error. You’ve effectively given type checkers this information:

class Parent:
    x: list["Parent"]

class Child(Parent):
    x: list["Child"]

def f(p: Parent):
    p.x.append(Parent())

c = Child()
f(c)

In this form, mypy and pyright reject it. ty does not.

Just to be clear, that’s not a policy choice, just a known gap in our Liskov checking that we’ll fix before stable release; we definitely intend to error on this.

I think we’d like to error on the equivalent Self case, too, but we’ll have to see how bad that looks in the ecosystem. It might need to be a distinct rule, possibly off-by-default, but at least we’d like to provide it as an option.

The more general statement I gave before was probably too abstract, though it covers more of the cases that are currently incorrect with ecosystem checking of subtype validity.

In terms of detecting this specific issue with Self: if, when looking at the declaration of a class, Self appears in any position that isn’t covariant, there are no valid subtypes of that class via inheritance.
