Rules for subclassing of Any

Discussion continued from Another issue with subclassing of Any · Issue #7395 · microsoft/pyright · GitHub

Subclassing Any has been allowed in stubs for a while, and subclassing Any at runtime was added in 3.11; see Allow subclassing Any at runtime · Issue #91154 · python/cpython · GitHub

When considering unknown types (due to gradual typing) Any is the type in question.

Given all of this, and that the specification states

Any can also be used as a base class. This can be useful for avoiding type checker errors with classes that can duck type anywhere or are highly dynamic.

I had expected different behavior in the cases below. (Note that mypy and pyright currently agree on these cases.)

from typing import Any, reveal_type

class X:
    def foo(self) -> int:
        return 1

class XFirst(X, Any):
    pass

reveal_type(XFirst().foo())  # int
XFirst().foo("errors and should")
reveal_type(XFirst().bar())  # Any, but errors at runtime, intended.

class AnyFirst(Any, X):
    pass

reveal_type(AnyFirst().foo())  # int, should be int if and only if pyright is assuming Any must be compatible with X
reveal_type(AnyFirst().foo)  # should be Callable[..., Any], is () -> int 
AnyFirst().foo("shouldn't be broken, is")
reveal_type(AnyFirst().bar())  # Any, but errors at runtime, intended
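
For reference, here is a minimal sketch of what the runtime itself does with the two orderings above. It assumes Python 3.11+ for subclassing Any and falls back to a plain stand-in class on older versions; AnyBase is a name made up for this sketch. It only shows runtime lookup and says nothing about what a type checker should infer.

```python
import sys
from typing import Any


class X:
    def foo(self) -> int:
        return 1


# Subclassing Any at runtime needs Python 3.11+. On older versions this
# sketch substitutes an empty stand-in class (an assumption, not typing API).
if sys.version_info >= (3, 11):
    AnyBase = Any
else:
    class AnyBase:
        pass


class XFirst(X, AnyBase):
    pass


class AnyFirst(AnyBase, X):
    pass


# The MRO follows the order of the bases in each class statement.
print([cls.__name__ for cls in XFirst.__mro__])
print([cls.__name__ for cls in AnyFirst.__mro__])

# The runtime Any class defines no foo, so lookup continues on to X.foo
# in both orderings.
print(XFirst().foo(), AnyFirst().foo())
```

At runtime the position of Any in the MRO is real, but Any contributes no attributes of its own, so any difference between the two orderings exists only in how a type checker chooses to model the unknown base.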

There are multiple questions this raises:

1. If Any is in the MRO, do we have to assume Any cannot have violated LSP under the compatibility guarantee of Any?
2. If Any is in the MRO, should attributes and methods be dynamic from the point in the MRO Any is in?

I think the answer to number 2 must be yes (even though this disagrees with two current type checkers), as this interacts not just with subclassing Any, but also with untyped code and gradual typing.

Number 1 however could be controversial. I’d certainly personally prefer the answer to that be “yes”, and the fact that type checkers already currently are acting as if this is the case would indicate that this isn’t creating too many false positives for users of gradual typing.

I think the behavior should be identical to what type checkers currently do if you subclass a class imported from an untyped module. I have always found that to already do the thing I expected it to do.

So yes to 1, but I’m not sure what exactly you mean by 2. I don’t think we should ever be going from a specific type to Any, even if Any being first in the MRO could mean that the type could be either less or more specific, depending on whether we’re talking about an argument or a return type.

If Any in the above example was actually from a module that a type checker couldn’t see, but had the interface:

class CompatExampleOne:
    def foo(self, *args: Any, **kwargs: Any) -> int: ...

Then for the case of AnyFirst, AnyFirst.foo("something not in X") shouldn’t error. The specification says Any as a base class is there for dynamic use, and being in front of X in the MRO means that even if LSP can’t be violated, X no longer specifies an interface, it specifies a minimum interface.
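
As a rough runtime sketch of that scenario, here the hidden base is written out explicitly. The method body of CompatExampleOne (returning 2) is invented for illustration; in the real case the type checker would see only Any in that position.

```python
from typing import Any


class X:
    def foo(self) -> int:
        return 1


class CompatExampleOne:
    # Stand-in for the class the type checker cannot see. The signature is
    # the one from the post; the body is hypothetical.
    def foo(self, *args: Any, **kwargs: Any) -> int:
        return 2


class AnyFirst(CompatExampleOne, X):
    pass


# Accepted at runtime, because the base ahead of X widened the signature.
result = AnyFirst().foo("something not in X")
print(result)

# Every call that X.foo accepts is still accepted, so LSP is respected.
print(AnyFirst().foo())
```

The override accepts everything X.foo accepts and still returns an int, which is the sense in which X describes only a minimum interface here.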

There are multiple options here.

Right now, for a given class

class Example(UnknownType, KnownType):
    ...

Even though gradual typing is meant to provide compatibility and reduce false negatives, type checkers prefer the type of KnownType and ignore that UnknownType could augment it at all.

If we assume it can’t violate LSP, but must otherwise follow the rules of the type system, I would expect:

“If it exists on KnownType, usage that could be compatible with a valid subtype of KnownType is fine.”
Instead, what type checkers are doing is: “Prefer the definitions provided by KnownType, fall back to Any only when KnownType says nothing about it.”

Yes, but generally it is more likely that the methods will not overlap, so dropping all this extra information, just so we can avoid a potentially incorrect signature, does not seem like a good trade.

My preference is that people provide the correct signature inside an if TYPE_CHECKING block, or ignore the type errors, in those rare cases. I don’t expect this to come up very often, and when it does, due to how a certain API is structured, it might be worth just running stubgen on that API once, without further specifying the types.
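
A minimal sketch of the if TYPE_CHECKING workaround, assuming a hypothetical UntypedBase that stands in for a class from a module the checker cannot see (all names other than TYPE_CHECKING and Any are made up here):

```python
from typing import TYPE_CHECKING, Any


class X:
    def foo(self) -> int:
        return 1


class UntypedBase:
    # Hypothetical stand-in for a class imported from an untyped module.
    def foo(self, *args, **kwargs):
        return 2


class Example(UntypedBase, X):
    if TYPE_CHECKING:
        # Seen only by type checkers: spells out the wider signature that
        # the untyped base provides. Never executed at runtime.
        def foo(self, *args: Any, **kwargs: Any) -> int: ...


# At runtime TYPE_CHECKING is False, so UntypedBase.foo is what runs.
print(Example().foo("extra"))
```

The declaration costs a few lines per affected method, which is the trade being proposed: keep the precise types from the known base by default and opt in to the wider signature only where it is actually needed.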

For the record, the answer to this will have long-reaching consequences as Intersections are currently on hold for answers to this to remain consistent with the rest of the type system in their design.

I don’t think subclassing Any should be allowed at all if the goal is for typing to avoid this, as it provides no benefit. However, it is specifically documented as allowed, and allowed in order to avoid errors from dynamic use. The current behavior appears to be out of specification.

Ed: link updated to point to Any

I don’t think it really makes sense to disallow subclassing Any, since you are already doing that when you import a class from an untyped module. So you might as well allow it explicitly instead of forcing people to add __getattr__(self, name: str) -> Any: ... and __setattr__(self, name: str, value: Any) -> None: ... to the class, which is essentially equivalent.

Which is exactly how you can resolve this issue. Rather than think of Any as being LSP compatible with the other classes, you can think of it as being equivalent to having a __getattr__ and __setattr__ that always work, even if the attribute isn’t explicitly specified in any of the classes, and even if you get some false positives and negatives that way.

This is not what is specified. Beyond this, I’m not looking for a resolution for any specific case, I’m looking for a general resolution for the specification that will help with long term consistency with both current and future typing features, so I don’t want something that says the type system should throw away the MRO.

My personal view of it:

For issue 1:

  • I don’t think Any should be allowed in base classes if it’s allowed to violate LSP
  • I don’t think we can ban Any as it exists for compatibility

Therefore, Any should be considered compatible, but compatibility must include LSP, or Any breaks substitutability, breaking compatibility again.

For issue 2:

The MRO, an actual detail of the language, should not be ignored. Ignoring it will cause significant complications in the future.

Ignoring MRO seems bad here. This could be fixed by introducing an Unknown type that is less powerful than Any for unknown imports, but as-documented, I have to agree with you that type checkers are currently wrong, especially since the reasoning multiple typing council members gave around intersection options falls apart if we make intersections consistent with current type checking behavior.

That’s fair, my suggestion would be to specify Any as being equivalent to:

class Any:
    def __getattr__(self, name: str, /) -> Any: ...
    def __setattr__(self, name: str, value: Any, /) -> None: ...

In that case you can answer both your questions with yes and remain consistent with the current behavior, save for the case where one of the classes implements its own __getattr__/__setattr__.
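
To illustrate the runtime rule this analogy relies on: __getattr__ is only consulted after normal attribute lookup along the MRO has failed, so a class providing it acts as a fallback even when it comes first among the bases. A small sketch, with Dynamic and DynamicFirst as made-up names:

```python
class Dynamic:
    def __getattr__(self, name: str):
        # Only reached when normal lookup along the MRO finds nothing.
        return f"dynamic:{name}"


class X:
    def foo(self) -> int:
        return 1


class DynamicFirst(Dynamic, X):
    pass


obj = DynamicFirst()
print(obj.foo())  # 1, found on X by normal lookup
print(obj.bar)    # dynamic:bar, produced by the __getattr__ fallback
```

This is a statement about runtime lookup only; whether Any as a base class should be specified to behave the same way is the open question here.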

Currently pyright seems to always prioritize Any with this interpretation, rather than respect the MRO there either, so the behavior strangely enough is inverted:
https://pyright-play.net/?pythonVersion=3.12&code=GYJw9gtgBALgngBwJYDsDmUkQWEMoCCKcAsAFDkDGANgIYDO9hxAYgFzlRdQAmApsCgB9IWj4xaMGCBEAKen2rAANFBS0IfNlHrTVAegCUUALQA%2BZnG0A6W5279BIhRKkyh8xSrUatOvVAAbrTUAK5%2BRHAGxuZQAHJgKH621uRUdIxQABocZNxQwGBg2qgw9lyOwqLiktJyCkqq6prauiAxFqU2dhRkNAxMAJqyWaqRht2pfRlMAFqykapZE1Ap6QNQgywjY6wra9Mbs9uRLEv7PeQgfIF8IULwCHyyw4bWhWCGVzd31A%2BIz3mbw%2BXzI11u90ezy2smBRVB4N%2B-yesmOsPe8O%2BEL%2BUJe6IARrR2likbigdZCcSwT9IQCXts3pSETScXS0YyiaCgA

I think I would be happy if Any were consistent with my interpretation; it is consistent with what most people would expect to happen, without special-casing too heavily.

Specifying Any using constructs that require typing in python itself to express doesn’t work. Any is special in the type system. I’m not interested in trying to half fake the behavior of Any, I’m interested in a consistent specification.

I know people don’t like this, but Any is very special in the type system. Being consistent with it requires thinking at a level beyond what is expressible in Python itself, by the nature of its purpose.

x: Any
y: int = x.__getattr__

This is valid, and should be.

You make a good point, although this should work:

class Any:
    __getattr__: Any
    __setattr__: Any

And this definition is purely to resolve conflicts between the classes in a way that prioritizes types we know the structure of, not a complete specification of Any.

This is also a good argument that either interpretation is valid, because Any is equivalent to the type where each attribute is Any. So that could either work through __getattr__ or an actual attribute.

I hope others chime in on this, but I think we should be prioritizing the type system matching the language, and therefore using the MRO, not arbitrarily picking that the top type doesn’t behave like the top type in one specific context.

I mean, the thing is, you aren’t really doing that; it just looks like it. Because Any is special, both possibilities are equally valid: that the method exists with a more general signature, and that it doesn’t exist or exists with the same signature. You don’t have to reverse the MRO for that to happen. You could also argue that Y(Any, X): ... is equivalent to Any, because Any could override all the attributes, but that doesn’t seem very helpful.

I think the current behavior is what people generally expect to happen, they don’t expect to lose type information by mixing their class with Any, regardless of the order and regardless of the possibility that the class potentially could widen some method signatures if you assume LSP compatibility. I think it’s fine if subclassing Any always looks like Any is last in the MRO, even when it is not.

Regardless of any of that, your code sample would be more helpful if it illustrated what would happen with a method that takes at least one argument. Every argument that was specified would now need to be turned into Any as well: argument types are contravariant, and you can’t know whether the subclass took advantage of that, so you don’t know whether you should reject less specific types or not.

I think this is another argument against doing it this way. While it is arguably the most consistent, when assuming LSP compatibility, it is only marginally more consistent in a few cases at the cost of safety in most cases.

I’d want to hear from @guido and @erictraut on this. Both of them made arguments that cannot be consistent with this interpretation when rejecting a specific direction on intersections here and here respectively.

I don’t want to hold either of them to those views if they’ve changed since then, but the opinions of multiple typing council members in an intrinsically related case was that Any could not have that behavior in an intersection specifically because Any could precede the type we know.

I’m not bringing this up as a gotcha or some smoking gun that it has to be a specific way, but more to show that something in the process has ended up inconsistent. Either the earlier reasoning rested on wrong assumptions and Any is no longer the top type when used as a base class, in which case all we have to do is update the documentation and then re-reason about intersections; or type checkers are doing the wrong thing, the documentation is right, and what was said then about the intent and behavior of Any remains correct; or it is somewhere in between, where only parts of it are wrong.

Just a slight side point - I did actually encounter a few months ago a use case for inheriting from Any, I think similar to the __getattr__ example shown above. I wanted to overwrite __getattribute__ with some custom behaviour, but found the type checker complained on attempt to access a member of the class (as it couldn’t inspect the custom logic). I found by making the class inherit from Any, it effectively allowed all attributes to be accessed.
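
A rough sketch of that kind of use case. It uses __getattr__ rather than __getattribute__ to stay short, and Settings, DynamicBase and the stored values are all invented for illustration. Subclassing Any at runtime is assumed to need Python 3.11+, with a plain stand-in class used on older versions.

```python
import sys
from typing import Any

# Any is subclassable at runtime from Python 3.11; older versions get an
# empty stand-in so the sketch still runs (an assumption, not typing API).
if sys.version_info >= (3, 11):
    DynamicBase = Any
else:
    class DynamicBase:
        pass


class Settings(DynamicBase):
    """Hypothetical class whose attributes come from custom lookup logic."""

    def __init__(self) -> None:
        self._values = {"debug": True, "port": 8080}

    def __getattr__(self, name: str):
        # Read the backing dict through __dict__ to avoid recursing into
        # __getattr__ if _values has not been set yet.
        values = self.__dict__.get("_values", {})
        if name in values:
            return values[name]
        raise AttributeError(name)


settings = Settings()
# A checker cannot see "port" in the class body; with Any as a base it
# accepts the access instead of reporting an unknown attribute.
print(settings.port)
```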

Yeah, there are definitely reasons to allow subclassing from Any. The problem I have is that type checkers are currently doing something that ignores where it is in the MRO, so I’d like a clearer definition of what we should consider the behavior of Any as a base class. This doesn’t require changing what type checkers have reached in terms of agreed-upon behavior, but either that behavior should change or the documentation should reflect the actual behavior.

A way to document the existing behavior could be to add to this part of the specification

The current language says:

Any can also be used as a base class. This can be useful for avoiding type checker errors with classes that can duck type anywhere or are highly dynamic.

Amending this to say

Any can also be used as a base class. As a base class, it no longer indicates being consistent with all types, but must be consistent with the other bases as part of MRO, and instead indicates that the base is partially unknown or otherwise too dynamic to type accurately and that the unknown and dynamic parts should be considered not to overlap or conflict with the known bases.
This can be useful for avoiding type checker errors with classes that can duck type anywhere or are highly dynamic.

Would capture the current behavior.

I have no strong preferences on which route we go, but I would like to have a clear behavior here where the specification is clear so that I can get back to working on designing intersections to remain consistent with the type system and not have them conflict and have layers of nested special casing to reconcile.

I still feel the current behavior of Any does not actually violate or circumvent the MRO in any way. The only thing it violates is absolute LSP consistency, because merging with Any disregards the possibility of a more specific override, regardless of whether Any occurs first in the MRO.

This sounds like a design trade-off to me: the willingness to accept a couple of false positives in order to avoid a much larger possibility of false negatives. I think this is fine and doesn’t hurt the usefulness of Any in intersections. All it really changes is the ratio of false positives to false negatives.

The __getattr__ analogy was used to illustrate that this is not a behavior unique to Any, we can have attribute access that prioritizes attributes that are lower in the MRO with any class.

That essentially is violating and circumventing the MRO:

class X:
    def foo(self) -> int: ...

class WithAny(Any, X):
    ...

WithAny().foo

MRO says look at the type of Any.foo, so that should be Any, not int. Any isn’t documented as only a fallback, but as consistent with all possible types (it is the top type effectively)

I’d be open to changing behavior as you suggest here. It apparently only matters in cases where a class double-inherits from an Any and non-Any base, which is a situation where users should already expect very imprecise typing.

I would be interested in a PR to mypy showing the effect of this change on real-world code.