Take 2: Rules for subclassing Any

Ouch. In this case my own (often naive!) intuition thinks of Any as a “flag” that switches on some extra behavior for the class (and its subclasses) rather than as an actual class to be inserted in the MRO. So I guess that the proposed addition to the spec, moving Any to the end of the MRO, just isn’t a good enough explanation for the intuitively expected behavior.

There’s another problem with just using what I suggested over on the PR itself for __getattr__: it breaks the fundamental rule of when you can replace a fully typed class with Any and not create a new type error.

This is extremely reasonable. The attempt to solve it by reordering is an approximation of the more theory-driven lower-bound approach that was clarified here but not chosen. A few of us in the thread reasoned it would have the same behavior. It seems we all missed the interaction, and that it isn’t actually equivalent.

@mikeshardmind @carljm how bad of a delay would it be on intersections to make reconciling the lower-bound problem part of that PEP instead, and clarify the behavior of Any as part of that? It seems to me that intersections using ordering rather than the lower bound will have the same problems.

I don’t know that we can or should adopt that definition as part of intersections. That particular definition is much more closely related to ideas in set-theoretic typing than we currently have language for in Python’s typing specification; doing this would be a much larger set of changes to the specification and would more significantly codify the behavior of gradual types as a class (possibly a good thing, but a larger change).

One example of something else that would be affected is Callable[..., Any]. In this, ... would be an expression of potential args and kwargs with bounds Lower: (), Upper: (*args: Any, **kwargs: Any), and something that was glossed over before, but relevant here: parameters being contravariant, the upper bound would be what is safe (edit: safe to assign with, the lower bound safe to call with, sorry, expressing variance-related stuff where we don’t have language for why it’s the case…), not the lower. (Having worked this out, I need to go add detail to another current issue as well.)
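As a rough illustration of what those two bounds mean in practice (the function names here are made up purely for illustration):

from typing import Any, Callable

def no_params() -> int:
    return 0

def any_params(*args: Any, **kwargs: Any) -> int:
    return len(args) + len(kwargs)

f: Callable[..., int]
f = no_params          # accepted: ... lets any concrete parameter list be assigned
f = any_params         # accepted as well
print(f(1, 2, key=3))  # and call sites with ... accept any arguments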

Adopting the definition has extremely predictable outcomes on intersections, but this may have surfaced as a large enough necessary change to be worth an actual PEP itself, and I think that those definitions also present a larger changeset for type checkers, but (as hinted at above) might solve multiple issues at once.

If the typing council would be comfortable with these changes being bundled initially, then split apart later if necessary, I could do it that way, but I don’t feel personally comfortable advocating for this being bundled with intersections rather than intersections relying on it.

Edit: I’m not going to pursue this approach right now. Even if the typing council were comfortable with it, I’m not, to a degree that I wouldn’t do it.

Just to confirm, the MRO in current Python here is:

[<class '__main__.Class1'>, typing.Any, <class 'object'>]

But in the proposed version it’s:

[<class '__main__.Class1'>, <class 'object'>, typing.Any]

That seems quite counter-intuitive to me, although I understand Any is a special case. For any other class it follows that:

class Class1(Class2):
    pass

[<class '__main__.Class1'>, <class '__main__.Class2'>, <class 'object'>]

Wouldn’t this be a breaking change for anything that relies on the MRO order of Any (e.g. anything that uses super)? Apologies if I’m missing something from the other thread. I also thought it was a rule that all classes have object as the final item in the MRO (everything inherits from object, including the Any class) - although this part I suppose is not a necessity.
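For reference, a small runtime check of the current behavior (assuming Python 3.11+, where typing.Any is subclassable at runtime):

from typing import Any

class Class1(Any):
    pass

print(Class1.__mro__)
# compare the output with the two orderings quoted above: today Any sits
# between Class1 and object; the proposal would move it after object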

Disclaimer: I have not yet read any of the discussion before this thread.

It seems like another option has not been explored yet. If we see Any as a placeholder, like an unbounded type variable, in the inheritance hierarchy, we cannot know whether a certain member will be defined or not. In that case, isn’t the only viable result a Union of the resulting types of both cases (member defined or not defined)? This would be consistent with a similar case where the type system is missing information:

from typing import Any
obj: Any = object()

def test(val: int):  # return type: Any | Literal[0]
    if val > 0:
        return obj
    return 0

This would make the type annotation correct for the information that we currently have - it does not hide any defined contracts, but it also does not ignore the possibility of non-LSP-conformant overrides. I’m still not sure about it, but doesn’t this prevent false negatives entirely?

Type checkers could add a configuration option that changes the behaviour to “Any honours LSP” for suggestions in IDEs and for users who know that all their class hierarchies are LSP-conformant, to prevent false positives where they are too annoying. This flag should also make things backwards-compatible with the current state, right?

Maybe it might even make sense to introduce a separate typing statement LSPCompatible[T1, ..., TN], which tells the type system that a set of types has no LSP conflicts (and if it has no LSP conflicts, inheritance is basically unordered, right?). This would allow users to easily circumvent the pervasive Union annotation of members that could be overridden by Any, by including Any in the set of types T1...TN.
However, I do not think such a statement should always be global, as Any might be a replacement for different untyped base classes in different classes. Maybe this should be a class decorator to better control the scope of the annotation? It could be a no-op at runtime, similar to typing.cast. When a class is annotated with this decorator, the meaning of Any in an inheritance hierarchy would, for the type system, change from “really any type” to “any LSP-compatible type”. For fully specified classes, type checkers could raise a warning to help with (the transition away from) gradual typing. The decorator would also get us around introducing additional types for an LSP-conformant Any (if inheritance is the only issue where such a type would be needed).
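A very rough sketch of what the decorator variant could look like; the name lsp_compatible and its meaning are purely hypothetical (they exist only in this proposal), and a type checker would have to special-case it for it to do anything:

from typing import TypeVar

_ClsT = TypeVar("_ClsT", bound=type)

def lsp_compatible(cls: _ClsT) -> _ClsT:
    """No-op at runtime (similar in spirit to typing.cast): marks a class whose
    Any base should be read as 'any LSP-compatible base' rather than 'really any type'."""
    return cls

# hypothetical usage:
# @lsp_compatible
# class MyFacade(SomeTypedBase, Any): ...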

I know there’s a lot of history of discussion for this one that spans multiple locations (and I saw you nudged the intersection repo on the way here).

Check this post for treating Any as a permissive unknown that still has to be compatible; it should get you up to speed on the best theoretical definition we have, even if that isn’t the currently adopted definition.

@mikeshardmind made a few attempts at approximating that within the context of intersections and ordering, as well as separately from, but as a prerequisite for, intersections, in an attempt to minimize impact based on feedback. But the approximations fell apart due to how dynamic Python is. The definition is sound and compatible with Python’s existing type system, but the lack of consensus on it and the current inability to specify arbitrary bounds on arbitrary types make it not expressible in the type system as it stands, and as far as I know, nobody has put much into it since.

I don’t think that LSP applies to Any at all, because Any isn’t, and cannot be, a type.


To see why, start by assuming that Any actually is a type, which requires it to follow the same rules that all other types follow, and consider the first part of its definition:

Every type is assignable to Any […]

Then, because everything is assignable to Any, we must conclude that Any is a supremum type – the supertype of everything. Or in other words: there exists no type that isn’t a subtype of Any.
This isn’t something new, because this also applies to object, which, in turn, shows that Any and object must then be equivalent types.

However, Any doesn’t stop there – the definition continues:

[…], and Any is assignable to every type.

But for this to be possible, all types must be supertypes of Any.
So we can conclude that Any is, just like Never and NoReturn, an infimum, or “bottom” type – the subtype of all types. Or in other words: there exists no type that isn’t a supertype of Any.
This also means that Any must be equivalent to NoReturn and Never.

Because Any is equivalent to object, Never, and NoReturn, object must also be equivalent to Never and NoReturn.
By combining the two lemmas from before (in italic), we conclude that
there exists no type that isn’t both a supertype and a subtype of Any.

A direct consequence of this is that all types must be equivalent.
This is because if you have some type T1 that is both a subtype and a supertype of another type T2, then T1 and T2 are equivalent types.

Luckily, this isn’t the case.
And that leaves us with only one possible explanation for this: Any is not a type.


So I think we should first start by defining what Any actually is.
And before it could be “subclassed”, the current definition would probably have been sufficient.

Anyway, the only way I’m able to make sense of a subclass of Any is by considering it equivalent (but not equal) to Any itself. This means that such subclasses also aren’t types, and shouldn’t be treated as such. So all methods and attributes should be ignored by type checkers, even though they’re still allowed to exist at runtime.

I realize that this isn’t very practical, but I think that that’s actually a good thing:
From experience I know that Any can already lead to very difficult problems. And that’s because, as I’ve just shown, Any is itself a violation of the LSP. And as it can be used freely, this becomes contagious:
Because if you have class A[T: B] with attribute a: T, then type X = A[Any] is allowed, but X.a: Any is unsafe.
So as far as I’m concerned, using Any should be discouraged in general, and subclassing it should be a last-resort.
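A minimal sketch of that contagion (Python 3.12+ generic syntax; the __init__ is added only so the example runs):

from typing import Any

class B:
    def size(self) -> int:
        return 0

class A[T: B]:
    def __init__(self, a: T) -> None:
        self.a = a

X = A[Any]        # allowed: Any satisfies the bound T: B
x = X(object())   # also accepted, since the argument is checked against Any
# x.a.size()      # a type checker accepts any use of x.a (it is Any),
#                 # even though this particular call would fail at runtime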

1 Like

I like the “reductio ad absurdum” argument, it’s a clear way to make the point that Any is not a (fully static) type. I don’t know if you followed this link from the definition of Any in the spec that you linked:

See Type system concepts for more discussion of Any.

I think you’ll find that the spec already agrees with you about Any. The difference is just one of terminology: you use the term “type” to mean what the spec would call a “fully static type.” Non-fully-static types do behave oddly (and do not participate in subtyping) so it’s reasonable to not consider them a “type” in the classic sense, but the terminology we use in the spec instead considers them to be “non-fully-static types”, part of a broader category of “gradual types.”

And I think you’re right that this means a class inheriting from Any is also not a fully static type.

I don’t agree that this means we have to throw up our hands and ignore everything about it. At the very least, methods and attributes present directly on the subclass can be trusted to exist as they appear. (Of course it is possible that the unknown base class represented by Any has a metaclass or __init_subclass__ which does arbitrary things to the type, but in general this kind of behavior is not modeled or understood by the type system and can easily break it even without inheriting Any, so I think we can disregard that possibility here.)

Beyond this, we do get into tricky territory, but I think considering Any to be a stand-in for an unknown (but LSP-compatible) base type leads to useful and consistent behavior.
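For instance, here’s a rough sketch of that behavior (assuming Python 3.11+, where typing.Any can be subclassed at runtime; MockModel is a made-up name):

from typing import Any

class MockModel(Any):
    name: str = "default"

    def describe(self) -> str:
        return f"model: {self.name}"

m = MockModel()
print(m.describe())   # declared directly on the subclass: checked as usual
# m.anything_else()   # not declared anywhere: falls back to Any, so a type
#                     # checker does not flag it (though it fails at runtime here)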

3 Likes

Sorry for the late response, and thank you for the summary and entry points Liz! I realize that I should not have made the previous post without reading further into the discussion and the type system.

What would be the issue with specifying the lower bound as

  • the type of the member (including uncalled methods) potentially overridden by the unknown type behind Any
  • the return type of the method potentially overridden by the unknown type behind Any
  • the return type of the __getattr__ method defined on a type located before Any in the MRO for unknown members

via the T | Any union type? This is, according to the typing documentation, a way to specify a lower bound (Thank you carljm for the link, it clarified a lot of terminology for me!).

This would (only partially, see [1]) consider the case that unknown classes are LSP-incompatible, and so far I would expect language server features like autocompletion to continue to work as expected. I did not test that assumption extensively though; I only checked suggestions for a value T | Any and they still seem to show as if the value was only of type T. It would also prevent a range of false negatives, as for example assignment of an int | Any to a str is not possible.
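A small illustration of that last claim, as a type checker would evaluate it (nothing here is enforced at runtime):

from typing import Any

def takes_union(value: int | Any) -> None:
    s: str = value   # rejected: the int member of the union is not assignable to str
    n: int = value   # accepted: both members of the union are assignable to int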

The only way I see where replacing an existing type with Any would then generate errors is if we replace a class in an MRO that overrides a member in an LSP-incompatible way. However, in that case, there would already have been a type error before.

[1] I just realized that “for example assignment of an int | Any to a str is not possible” is also the reason why this does not support any arbitrary LSP-incompatible types, but it should support e.g. contravariant overrides of attributes or method return types (probably as a direct consequence of specifying the lower bound).

The only, but major, issue with that is that Any is for gradual typing, often of things that may not be expressible in Python’s type system, or at least haven’t been expressed by the author of some code yet.

T | Any creates a situation where only the values on T are considered safe to access (it’s only safe to access what exists for all members of a union), but the original purpose of Any in a subclass (I still think this is a mistake) was not that; it was originally to enable things like typing of Mock objects. What people wanted was “For things that T defines, give me T’s definition and error if I use it in another way, but for things T doesn’t define, don’t error”.
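A quick sketch of that union limitation (the names are illustrative):

from typing import Any

class T:
    def declared(self) -> int:
        return 1

def use(x: T | Any) -> None:
    x.declared()     # fine: present on every member of the union
    x.undeclared()   # flagged: not present on the T member, even though the
                     # Any member alone would allow it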

The term “lower bound” here is actually being used differently as well.

What we want is a way to place bounds on a set of types based on knowledge.

For instance,

from typing import TypeVar

class A:
    def foo(self, x: int, /) -> int:  ...

class B:
    def foo(self, x: str, /) -> str:  ...


class CError(A, B):
    """CError, if allowed,
    breaks LSP and provides a way to break things
    that expect B and use B.foo"""


T = TypeVar("T", int, str)

class C(A, B):
    def foo(self, x: T, /) -> T:
        # dispatch to the base implementation matching the argument type
        if isinstance(x, int):
            return A.foo(self, x)
        if isinstance(x, str):
            return B.foo(self, x)
        raise TypeError(x)  # unreachable for the declared constraints


x: A | B
x.foo(1)  # this errors, and that's good.
y: C
y.foo(1)
y.foo("a")

That demonstrates the issue with using a union for this, but it’s important to say that it’s also not an intersection. An intersection would only define the lower bound appropriately in this case, not the upper; for the upper bound the type system must remain permissive of anything compatible so as not to break the gradual purpose of Any.

To see one of the ways this falls apart when the type checker has unknowns, imagine that the definition of A is not visible to a type checker.

Is there a subclass of A and B that can be safely used as a substitute of A and of B without an LSP violation? Yes, we actually have it with C, but that’s not the only one, and a type checker can’t synthesize that or enforce that C actually implements it without knowledge of A.foo here; it can only keep enforcing consistency with the definition of B, assuming that the unknown is still required to be compatible.

I personally hate the current use and understanding people have of Any. It makes sense in a compiled language where this can be refuted by a compiler that at some point will have all the relevant info with the whole program. But it doesn’t make sense in python, where rules for inference don’t exist and where Any is used to just ignore things that weren’t ever written with type safety in mind, and where types can be so dynamic and even mock themselves. It’s the difference between not expressing a consistent type and potentially not having a consistent type that makes Any such a problem. Theory only accounts for the former.

1 Like

The proof (by contradiction) that I sketched indeed uses the terminology that is conventionally used for fully static types. However, it applies to types in general, which includes gradual types.

The referenced typing docs section explains that the (sub)typing relations that define fully-static types also apply to gradual types in the same way. The different terms for these relations (e.g. “subtype of” vs “assignable to”) are only used to reflect whether the relevant types are fully static or gradual – the relations themselves apply to both in the same way.

Anyway, the proof could easily be modified by prefixing “subtype” and “supertype” with "consistent ", and by replacing “equivalent to” with “consistent with”.
In other words, both fully static types and gradual types are types; they just describe a different abstraction level.
So that in fact means that Any cannot be a gradual type, a fully static type, or a type in general.

Subclasses of Any are therefore also something that isn’t a type. All we know is that such subclasses must also follow the definition of Any, which makes them “equivalent to” and “consistent with” Any. So as I’ve said before, that means that static type checkers should consider subclasses of Any in the same way as Any itself.

No, that doesn’t work. Specifically, this section doesn’t translate:

To quote the reference docs you just linked to:

The consistency relation is not transitive.

Just because Any is consistent with object and Any is consistent with Never doesn’t mean object and Never are consistent.

The consistency relation is not transitive. tuple[int, int] is consistent with tuple[Any, int], and tuple[Any, int] is consistent with tuple[str, int], but tuple[int, int] is not consistent with tuple[str, int].
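Written out as code, each single step in the quoted example is accepted while the combined step is not, which is exactly what the missing transitivity means (nothing here is checked at runtime):

from typing import Any

a: tuple[int, int] = (1, 2)
b: tuple[Any, int] = a    # consistent: tuple[int, int] with tuple[Any, int]
c: tuple[str, int] = b    # consistent: tuple[Any, int] with tuple[str, int]
d: tuple[str, int] = a    # error: tuple[int, int] is not consistent with tuple[str, int]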

So yes, if you put Any into your type system, then it isn’t transitive anymore.
And that’s actually a big problem, because if there’s no transitivity, then one of the consequences is that there will be situations where type inference won’t be possible anymore.

Luckily, we now know that Any isn’t a type after all. And apart from Any, as far as I could tell, there were no other reasons for removing transitivity. With transitivity back in place it’ll be a lot easier to reason about gradual types :slight_smile:

Ok, but now you are just redefining what type means, clearly in a way that conflicts with the way the typing spec uses it. If you think your terminology is clearly better, then you should strive to change the spec. Otherwise it’s just going to be confusing.

1 Like

I’m not redefining the meaning of type at all. I was trying to illustrate that the currently missing transitivity property of gradual types can be restored if you drop the assumption that Any is a gradual type.

And the reason why I’m going on about how we should view Any, is because that’s literally what this entire discussion is about. I don’t see why that is confusing.

You are IMO still wrong and appear to not have fully read the page you linked, but I don’t really want to continue this discussion here. If others are ok with your usage of these terms, it’s not a hill I am going to die on.

I am confused by this statement. If by your definition Any is not a gradual type because it violates transitivity, then tuple[Any, int] necessarily also isn’t a gradual type, because it also violates transitivity. If you continue this game you end up excluding all gradual types, and you’re left with a non-gradual type system. The same goes if you replace Any with any other gradual type with arbitrary constraints such as upper and lower bounds[1].

If you look up the textbook definition of gradual typing, you’ll notice that the inclusion of a dynamic type with a consistency relationship which is reflexive and symmetric but not transitive is at the very core of the idea. So you can’t simply restore transitivity by only removing the most gradual type; only fully static types have a transitive consistency relationship.

So @MegaIng is correct. By your definition, “type” no longer means what the spec says it does and would now only include fully static types. I don’t see, however, how that is helpful, since it’s just shuffling nomenclature; it doesn’t make it easier to reason about gradual types.


  1. as long as the upper and lower bound aren’t the same, at which point it would be a fully static type ↩︎

7 Likes