Take 2: Rules for subclassing Any

Current state of specification

  • Subclassing of Any at runtime was accepted in 3.11. The motivating example at the time was unittest.Mock, and only showed what happens with single inheritance. It was always possible in stubs.

  • Type checkers treat untyped library code as if it were Any. This behavior predates runtime subclassing of Any and exists as part of gradual typing.

  • Subclassing of Any is specifically documented in the specification as existing for when things are more dynamic than the type system can express. I cannot find it documented or specified as the behavior for untyped imports, but I don’t think treating untyped imports as Any is currently a point of controversy.

A problem of clarity

Any is documented as being consistent with all possible types. It is not specified what that actually means when Any is present as a base class, whether due to untyped imports or due to the 3.11+ behavior of directly subclassing Any. There are multiple reasonable interpretations of the possible behavior.

Illustrating the problem

The bodies of the methods and functions used here are left unimplemented when an implementation is unnecessary to the illustration.

from untyped import Unknown # standing for an untyped library

class ExampleA(Unknown):
    def foo(self) -> int:
        ...

def only_option(a: ExampleA):
    a.foo()  # int
    a.foo("something")  # error
    a.bar()  # Any

This case is relatively simple. While not explicitly specified, it directly matches the motivating case for runtime subclassing of typing.Any, and there is only one possible interpretation.

from untyped import Unknown  # standing for an untyped library

class SerialMixin:
    def serialize(self) -> bytes:
        ...

class EasyExample(SerialMixin, Unknown):
    def foo(self) -> int:
        ...

class AmbiguousExample(Unknown, SerialMixin):
    def foo(self) -> int:
        ...

def easy(a: EasyExample):
    a.foo()  # int
    a.serialize() # bytes
    a.serialize(byte_order="le")  # error
    a.serialize(web_safe_str=True)  # error
    a.bar()  # Any

def unspecified(a: AmbiguousExample):
    a.foo()  # int
    a.serialize()  # interpretation dependent
    a.serialize(byte_order="le")  # interpretation dependent
    a.serialize(web_safe_str=True)  # interpretation dependent
    a.bar()  # Any

The body of unspecified shows the problem cases and how they differ due to ambiguity in one potential MRO layout.

For the first case of interpretation-dependent behavior: is a type documented as consistent with all possible types allowed to be one that is inconsistent with the other bases, or do we assume that all uses of Any in this pattern must not violate LSP?

If Any must not violate LSP, then a.serialize() there is bytes.
If Any may violate LSP, then a.serialize() is Any.

The current behavior of both mypy and pyright for this example is to determine this as bytes. I believe assuming consistency here is correct, but nothing prevents this from having a different definition at runtime that, for instance, serializes to a string.

If we decide it can violate LSP, we don’t need to look at the latter example; it should also be Any.

If we decide it must be consistent with the other bases, the second and third usages labeled interpretation dependent need consideration.

There are possible types that could be substituted in place of Unknown / Any here which remain consistent with all typed code, and for which these calls could not error.

If the untyped code has this “shape”:

from typing import Any, Literal, NoReturn, overload

class Unknown:
    @overload
    def serialize(self) -> bytes:  ...
    @overload
    def serialize(self, *,
        web_safe_str: Literal[True], byte_order: Literal["le", "be"]) -> NoReturn:  ...
    @overload
    def serialize(self, *, web_safe_str: Literal[True]) -> str:  ...
    @overload
    def serialize(self, *, byte_order: Literal["le", "be"]) -> bytes:  ...
    def serialize(self, *,
        byte_order: Literal["le", "be"] | None = None, web_safe_str: Literal[True, False] = False,
    ) -> Any:  ...

None of the uses within the function unspecified above are inconsistent with this hypothetical type, and more is allowed.

There are two possible interpretations remaining here if we assume consistency for LSP:

  1. Type checkers prefer a known definition over a partially unknown one
  2. Type checkers use the known definition to create a minimum bound

Under interpretation 1, the two cases we still need to consider are each an error.
Under interpretation 2, those same cases are each Any.
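
Restated inline with the outcome under each interpretation (my annotations, not current type checker output; remaining_cases is just an illustrative name):

def remaining_cases(a: AmbiguousExample) -> None:
    a.serialize(byte_order="le")    # interpretation 1: error; interpretation 2: Any
    a.serialize(web_safe_str=True)  # interpretation 1: error; interpretation 2: Any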

I believe that the current specification calls for interpretation 2, especially when considering the intent that gradual typing not introduce errors for things that are not currently, and may never be, typable. However, I do not think the specification is clear enough to say so definitively, and we should examine this for potential impact and pick a set of consistent rules for the behavior we intend to support within the type system.

Current type checker state

Of the type checkers I’m aware of that support Any as part of inheritance at this point in time, mypy (online playground) and pyright (online playground) both currently pick the known definition.

pyre (online playground) currently errors when it doesn’t know whether something is a valid base, even when attempting to work around this.

Possible resolutions

Assume LSP compatibility, prefer typed code over untyped code

This would keep the behavior that pyright and mypy are using for their users.

It would entail documenting that when something is a subtype of both a known and an unknown type, type checkers should prefer the known definitions when available. This has a low incidence of potential false positives, which occur only when untyped code is in a diamond pattern with typed code.

This can be accomplished in the following ways:
- specifying that Any is not all possible types in some contexts, and defining alternative behavior for a list of contexts.
- treating untyped imports as something that isn’t Any, perhaps Unknown, and defining the alternative behavior of that.

Assume LSP compatibility, use ordering within MRO to determine a minimum bound

This would entail specifying that Any as a base class is consistent with all possible types based on where it exists in the MRO, more closely matching runtime behavior. This has a high potential false negative rate, limited to cases where people have multiple inheritance involving untyped code.

Edit: As detailed below, this method can also be used to prefer the known minimum bound and not allow the upper bound when more is known.

Do not assume LSP compatibility

When Any is ordered in the MRO with higher precedence than typed code, it erases the typed code’s known types, since the untyped code could have alternative definitions and makes no guarantee about LSP.
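
Restated against the earlier example (my annotations of what this interpretation implies, not current type checker output):

def without_lsp_assumption(a: AmbiguousExample) -> None:
    a.foo()        # int: defined directly on AmbiguousExample, above Any in the MRO
    a.serialize()  # Any: Unknown precedes SerialMixin, so the known bytes is erased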

Impact on current code and features

The first of these options, with either method of going about it, would have no impact on existing code using mypy and pyright. I have not exhaustively checked all type checkers, so there may be impact if type checkers beyond those listed above support subclassing of Any but make a different decision here.

The second of the above options may narrowly cause some code to have false negatives and some other code to no longer have false positives, limited to multiple inheritance involving gradual typing, where the gradually typed code is listed by the affected code as having priority in MRO.

The third option would be the most disruptive and have a higher potential incidence of false negatives. It creates issues with the use of isinstance for narrowing. It could be ruled out if we determine that Any’s definition of compatibility includes that it cannot be used to violate LSP. I do not view this as a viable interpretation due to the impact it would have on existing code, even if it may appear to some as a reasonable interpretation in a vacuum.

Impact on future features

I intend to take whatever resolution we find for the currently existing case here, and then reexamine the ongoing work on Intersection types to ensure the future feature remains consistent with pre-existing parts of the type system. It is worth noting that multiple typing council members gave persuasive reasoning in that discussion which would only be consistent with the second option. While the reasoning was persuasive, I do not think they should be held to it here, as it was only persuasive within the assumptions we were working under at the time, which included assumptions about subtyping of Any.

Edit notes:

  • A note about LSP-compatible Any and a link to a relevant post below were added
  • The phrase “top type”, which was used lazily on my part, was removed and replaced with more appropriate phrasing.
8 Likes

Thank you for the great summary! I thought it would be interesting to run a small poll, since not everyone may feel confident enough to put their stance into words. This should hopefully also help us avoid falling into the same trap as with the previous discussion.

I’ll try to briefly re-summarize the three main options for how to interpret subclassing Any:

  1. Pure Any. With this we can make no assumption about the type at all, since Any has to be compatible with all other types. As such, class A(Any, B): ... is almost no different from class A(Any): ...; the only piece of extra information we get is that A must be a subclass of B, so we know which attributes it has for sure, but we don’t know their types.
  2. LSP-compatible Any. With this version we assume that Any cannot violate LSP when it is subclassed. As such, we lose slightly less information about B when subclassing class A(Any, B): ...: we know a lower bound for all the attributes, which in some cases will not look that different from pure Any, but in the case of methods we at least still know which parameters are required, so the potential for false negatives is a little lower.
  3. Any that always prioritizes known attribute types: This is how subclassing Any currently works in both mypy and pyright. It always returns the first known type of the attribute in the MRO. This has a potential for false positives, but it also avoids many false negatives, since overlapping attributes are less common in a multiple inheritance scenario and it’s even less common that those overlapping attributes differ in their type/signature.
  • Pure Any
  • LSP-compatible Any
  • Any that always prioritizes known attribute types
  • I have no preference/I don’t understand the difference
0 voters

If the top[1] option were chosen, would you want to split off the behavior of the third option into a new Unknown type form?

  • Yes
  • No
0 voters

  1. or possibly even second from top ↩︎

1 Like

Thank you for the excellent summary! I found this much clearer than the presentation in the previous thread, and very easy to understand.

I would like to suggest a slightly different theoretical foundation for the understanding of Any [1], which I think can help inform this decision. In particular, I think it helps frame options 2 (and to a slightly lesser extent, 1) as less of a special case; rather, they are consistent with our usual understanding of Any.

I don’t think that Any is ever “the top type” in the Python type system. The top type is object. In a set-theoretic treatment of types, the top type is “the set of all possible values.” In Python, this is object. In a static type system, the top type is not “compatible with any type.” In fact, it is compatible with no type at all, besides itself.

Any is, in contrast, not “the set of all possible values”, but some particular (and likely limited), but statically not known, set of values. This is the interpretation that allows Any to be compatible with any type, and to have any attribute or method. When a type-checker sees an attribute access on a value typed as Any, it should say “oh, Any here represents some set of values which all have that attribute,” and not throw a static type error.
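
For instance (a standard illustration, not from this post; frobnicate is a made-up attribute):

from typing import Any

def access(a: Any, o: object) -> None:
    a.frobnicate()  # accepted: Any stands for some set of values that all have this
    o.frobnicate()  # error: object has no attribute named frobnicate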

The purpose of Any in gradual typing is to allow interoperability with statically-untyped code, where the code author takes responsibility for correctness, without having to deal with false positives from a static checker’s lack of knowledge. Thus, type checkers should generally aim for the most “favorable” (i.e. not-false-positive-causing) interpretation of Any.

IMO the option that best meets this criterion is “LSP-compatible Any,” and the second-best is “Any that always prioritizes known attribute types.” I think “pure Any” is not actually a pure interpretation at all, but a misunderstanding of Any. EDIT: this conclusion isn’t quite right. All three proposals are consistent with the above understanding of Any, and “pure Any” goes the furthest in making the “most favorable” interpretation. But the above understanding of Any also gives us freedom to choose a slightly more restrictive interpretation of what sets of values Any can represent in a particular situation, if we judge that some additional assumptions greatly reduce false negatives with a small cost in false positives. And in this case I think assuming LSP compatibility fits that bill.

So I don’t think that this is how we should approach option 1 or 2; rather, we should give a clearer definition of Any (as described above), which is already (and always has been) different from “the top type,” and is already consistent with these options.

[1] Full credit to @kmillikin for this definition of Any in his earlier work on a from-scratch specification for Python typing.

5 Likes

Thank you for this addition; it was enough for me to change my mind on a small detail. I think calling it the top type is not incorrect under many type theories, but we haven’t actually picked a formal theory, so we should probably avoid using the phrase “top type”.

I like the definition you borrowed from the prior work on a from-scratch specification, but using it could create a situation where some values of Any are incompatible with other values of Any, requiring more type checker work to determine whether the totality of use is consistent. This is something @mikeshardmind advocated for in the past to improve inference, and it is closer to the behavior of pytype as a type checker, but as I remember it, many people thought that wasn’t viable.

A type checker under that definition could reject the following, which would change people’s current expectations of Any:

from typing import Any, overload

x: Any
y: Any

@overload
def foo(a: str, b: str) -> str:  ...
@overload
def foo(a: int, b: int) -> int:  ...
def foo(a, b):  ...

foo(x, 1)
foo(y, "str")
foo(x, y)  # What consistent value could there be for each x and y?

I don’t have a problem with that per se, and that interpretation of Any is consistently applicable within all formulations of type theory allowing gradual typing that I’m aware of. But I have questions about what it means if we specify it this way, and whether we would be better off spending however long is necessary to adopt a formal theory and match all existing constructs to how they should be defined within that theory.

2 Likes

Thanks for starting this thread @mikeshardmind. I appreciate your willingness to give this alternative approach a try.

I agree that this case is under-specified in the typing spec. I’m tracking a list of other under-specified areas that are currently causing pain for users, and that list has grown to almost 90 in length. (We have a lot of work to do! At our current pace, it will take over two years to get through this list.) From my perspective, this issue isn’t currently causing problems for users of mypy and pyright, so my inclination (purely pragmatic) would be to put this on the back burner for now and focus on other areas that are more pressing. However, if this issue is blocking you from making progress on the Intersection proposal, then it raises the priority in my mind.

A few thoughts that might help inform the discussion:

  • We have already established a precedent with type[Any] that seems applicable here. In this thread, we reached consensus that type checkers should treat type[Any] as though all known methods and attributes of type and object are present, and all other (not-present) methods and attributes are treated as type Any. Theoretically, a metaclass could override one of the methods in type, but we decided to ignore that possibility. I think this is reasonable. The current behavior of mypy and pyright (which prioritize known attribute types) is consistent with the type[Any] precedent; see the sketch after this list.

  • Type information is used for more than just static type checking. It’s also used for runtime type checking and for edit-time features like completion suggestions and signature help. For every Python developer who uses pyright for static type checking, there are 40x as many developers who do not use static type checking but rely on static type evaluation for language server features. These developers are more likely to be using untyped or partially-typed code, so they will more often hit the case where a class has an unknown base class. If we were to change the spec to treat all methods and attributes of such a class as Any, it would significantly harm their user experience — enough so that I don’t think I could justify making such a change in pyright regardless of what the typing spec says.

  • I don’t think this issue is important enough to justify adding another Unknown type form and all the complexity and confusion that would entail. I’d prefer if we took that off the table.
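
A minimal illustration of the type[Any] precedent (my example, not from the referenced thread; magic is a made-up attribute):

from typing import Any

def inspect_class(cls: type[Any]) -> None:
    cls.mro()   # known: mro() is defined on type, so its known signature is used
    cls.magic   # not defined on type or object, so it is treated as Any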

This thinking leads me to favor the first option in Mike’s list. However, we may be able to specify the behavior in a more surgical manner so as to limit the impact on the rest of the type system. We could simply state that when a type checker computes the MRO of a class, it should generate the MRO as if Any were not present. If Any is present, it should append an Any to the end of the resulting MRO list (after object). In cases where Any isn’t present, the MRO always ends with object. By specifying it this way, I think the intended behavior is clear and is consistent with the type[Any] precedent. This approach allows us to sidestep a lengthier debate about the nuanced meaning of Any, and it hopefully provides the clarity needed to unblock work on the Intersection proposal.
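
A minimal sketch of that linearization rule (illustrative only; proposed_mro and the probe class are hypothetical, and the runtime’s own C3 linearization stands in for the type checker’s):

from typing import Any

def proposed_mro(bases: tuple) -> list:
    # Linearize as if Any were not present...
    concrete = tuple(b for b in bases if b is not Any)
    probe = type("_Probe", concrete, {})  # lean on the runtime's C3 linearization
    mro = list(probe.__mro__[1:])         # drop the probe itself; ends with object
    # ...then append Any after object.
    if Any in bases:
        mro.append(Any)
    return mro

class A: ...
class B: ...
print(proposed_mro((A, Any, B)))  # [A, B, object, Any]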

5 Likes

I won’t be responding to every point right now as I’d like to take the time to formulate a response that accurately captures all discussion so far before doing so.

I can create a proposal for intersections consistent with any choice we come up with, and I have already taken the time to think out the consequences of each of these options for intersections. The problem is that the way these are intertwined means that accepting a proposal on intersections would create a bit of a lock-in (though perhaps not a permanent and unresolvable one) on the behavior of Any in multiple-subtyping relationships. I don’t think it would be responsible to present intersections if it locks in something simpler to a specific way without being sure we have a clear definition, but I can probably defensively word the Intersection proposal and specification to allow it if that’s a more pragmatic way forward.

Without quoting your suggested handling in its entirety, behaviorally this would be equivalent for all current cases to what we currently have, and there is a reasonable way to interpret this in intersections. I have no issue with this. It would not be my first choice, but that’s due to the primary concern about the consistency of a more detailed feature causing lock-in (or even reluctance to further clarify) if accepted.

Feel free to reach out in messages on Discourse or as you have previously on Discord if you’d like to take a tangent on why this may be of higher priority than you have initially assessed it further without continuing that line of discussion here.

I’ll circle back on the main points of discussion either on Sunday or Monday to respond to what people have presented more comprehensively.

I’m going to cut to the chase first: I think this is the best option in the thread so far, but I have concerns with it, and I wonder whether there’s a better way that is reasonable to work on now, or whether this is good enough for now and we can revisit it if it becomes a problem later or once we make enough progress on the bigger issues.


My concerns with MRO reordering for this

I think there’s some danger in introducing another pattern where typing intentionally deviates from runtime, so I gave it some more thought. I was especially concerned that we needed to be careful to avoid wording that would preclude fixing this in the future if it does become a problem.

I think the wording you provided works in this regard, at least for all of the additions to the type system I’m aware of people currently working on. I’m still a bit concerned, because I found your argument against this persuasive enough in the past to take the time to avoid specifying something which would cause type errors when substituting in Any. I understand the pragmatism you’re aiming at here, as it’s similar to what I aimed for before as well, but I’m not without concerns. Part of the reason the middle ground between treating this as “Any” and treating it as “only the known types” was “LSP-compatible Any” is that it is the most type information we could retain while that argument would not apply.

Argument made, for context

Here’s an example that shows how option 5, as it’s currently formulated, violates a core principle of the type system.

In a gradual type system, Any is treated as a stand-in for “any type that could conceivably satisfy type compatibility requirements”. In other words, when you see Any, you should think “is there any type that I could substitute here that would make this work without a type violation?”. If your code is fully typed (i.e. no use of Any) and no type violations are reported by a type checker, you should be able to replace any type used in your code with Any, and it should still type check. This is a core principle of any gradual type system, and it’s a principle that the Python type system (and mypy and pyright) try to honor in all cases. If you change an existing type to Any and new type errors are reported, then there’s something very wrong.

https://github.com/CarliJoy/intersection_examples/issues/31#issuecomment-1866869547
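
As a concrete restatement of that principle (my example, not from the quoted post):

from typing import Any

def double(x: int) -> int:
    return x * 2

def fully_typed(v: int) -> int:
    return double(v)  # type checks

def gradual(v: Any) -> int:
    return double(v)  # replacing int with Any must not introduce new errors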

The case of type[SomeType] vs type[Any] is an existing place where this argument is ignored, so it isn’t universal in Python right now, but I think that decision too could be changed in the future to use a lower and upper bound rather than a single type; see below about “LSP-compatible Any”.

As for impact, this is the status quo, and I haven’t seen evidence that people expect more than this. It came to my attention while constructing “how to teach intersections”, not through any real code. Since my prior message, I have made attempts to construct an example where it might matter, and for each one there were multiple solutions available that seemed better than “change the behavior of existing type checkers for this example in particular”.

After examining both the immediate and foreseeable future impact, I believe that this would be an acceptable and low-impact decision to go with for now, until someone has a motivating case beyond theory, and that it can be changed later if this particular decision causes problems.

Further detailing LSP-compatible Any, and type bounds

Maybe this should have been expressed better, but based on a couple of responses, I think a couple of people realized this was related to the set-theoretic ideas.

I don’t think an “LSP-compatible Any” (either by the methods I suggested, or by improving the definition of Any as @carljm provided) would be “just Any”, but we would need a better way to express lower and upper bounds on a type within the type system for this to be a user-expressible concept, and likely for it to be usable by language servers as well. A language server could choose to serve completions for only the known lower bound, and have a separate configurable warning for using something above the minimum known bound but within what the upper bound allows.

If Any is constrainable, then having a way to spell upper and lower bounds comes into play.

This could re-resolve the prior decision on type[Any] such that type checkers see

x: type[Any] = ...

And surmise that the known lower bound for x is type (the runtime type, type), and the upper bound is Any. This particular example might be a little too magical with the overloaded meaning of type, so let’s look at one of the examples from the original post:

from untyped import Unknown  # standing for an untyped library

class SerialMixin:
    def serialize(self) -> bytes:
        ...

class AmbiguousExample(Unknown, SerialMixin):
    def foo(self) -> int:
        ...
x = AmbiguousExample()
reveal_type(x.foo)  # Lower bound: () -> int, Upper bound:  () -> int
reveal_type(x.serialize)  # Lower bound: () -> bytes, Upper bound: Callable[..., Any]
reveal_type(x.bar)  # Lower bound: Any, Upper bound: Any

Interestingly, this also gives us another way to arrive at preferring the known definitions, saying that it’s only currently considered safe to use the known lower bound.

While I think most of Python falls under set-theoretic typing in all but the official adoption of the model, the issues this presents for existing indirect users of typing, if done without other supporting type system features, suggest a strong reason to defer any change toward a definition closer to the set-theoretic model until we have stronger motivation, or until there is nothing more pressing. Doing this would be a larger undertaking than I can personally justify at this point in time, even seeing the theoretical issues at play as well as the theoretical future benefits.

5 Likes

I mostly agree with what you said, except for this. Could we not adopt preferring what is known under this rationale and set of definitions, and then leave the door open to implementing ways to use the known potential upper bound under the already existing framework? This would only be documentation changes, but it would unify type[Any], Any as a base class, and Any in an intersection (future), all around current type checker behavior.

1 Like

For my own curiosity, I’m wondering if you have links handy to papers or other presentations of such theories? I think there’s some risk of terminology confusion here, since many type formalizations (often not even of gradual type systems) do use the term ANY for the top type, but despite the shared name, this is not Python’s Any, and we should be careful to avoid assuming equivalence. Python’s Any (a type that both accepts all values and permits ~all uses of those values) is specific to gradual typing; in the gradual typing literature it’s more often named “dyn” or “dynamic”. Sometimes it is informally described as “both the top and bottom type,” but I’m not sure how it could be considered simply “the top type” without straying pretty far from the usual definitions.

I don’t think defining Any as representing a limited but statically-unknown set of values necessarily places a requirement on type checkers to perform type inference or occurrence typing on values typed as Any. (Defining something as “statically unknown” is not an implicit requirement to place statically-known bounds on it; if anything it suggests the opposite.) I think it’s valid for a type-checker to do this, but it’s not required to be consistent with the definition.

I suspect the applicability would be rather narrow, though. In your example the valid type for x would be “a type that inherits both int and str.” (At runtime this particular example is not possible due to an instance layout conflict, but I wouldn’t expect a type checker to model that; at least as long as int and str are not final types, I’d expect that code to be accepted.)

1 Like

I could definitely find some later, but I said this to avoid using the term at all without an agreed-upon framework first. Some would say Any is the top type, some would say it is both the top and bottom type, and some would say it is neither. I prefer a definition where it is neither; I believe this models use in Python better. But it isn’t correct to say anything about a top type absent a working framework with much more rigor than exists in Python right now.

I could construct some examples with invariant generics that more definitively disallow it, but the looseness of that example was important to it. You can’t place a tighter bound on either x or y from only that information, because they could overlap if just one of them were a type that inherits from both; it doesn’t require both to be. Without being careful about saying when a type checker must use known context to place a bound on gradual types, for instance to allow only using the lower bound, complexity would increase significantly.

1 Like

@Liz @carljm

I appreciate the concern about clarity; it was my fault in the first place for using “top type” without a specific definition, accepted or provided, especially when I’m bringing forward an issue of insufficient clarity. Thank you for bringing attention to it. I did not need to use the term, and I will amend the earlier post to reduce confusion in only that part. I don’t think this otherwise changes our options for finding common ground going forward.

It seems to me that the only two currently viable options (in that they would not be too disruptive) would each keep type-checking behavior the same for current users, those being 1. a more formal definition of Any while using the lower bound, and 2. MRO reordering. Can we take definitions closer to a set-theoretic definition, without having to adopt the whole framework, so as not to kick as much of this down the road for future contributors to figure out and untangle, and without introducing a full theory-driven typing model right now? If we can, I would think it worth it, but I did not think we had an unambiguous way to do that, which should answer “why not…” with regard to

If you believe it can be phrased in a way to do this, completely unambiguously, and scoped appropriately, I’d be behind such a way forward.

1 Like

Perhaps another way forward, given the apparent popularity of that interpretation, would be to use your LSP-compatible definition, but leave it up to type checkers how to deal with the extra information in the absence of an explicit type construct that expresses a gradual type with a lower bound[1]. That would leave the door open for type checkers to use either the lower bound or Any as a stand-in in the meantime, although I’m personally not convinced it’s worth the extra complexity. While it may be easier for type theorists to reason about, regular users almost certainly won’t understand what it means for a method to have a lower bound or how to interpret that, even when visualized somewhat sanely.


But if we want to stick with the current behavior: another possible way to frame it, as something other than MRO reordering, would be to specify how method resolution is supposed to work with Any (or perhaps in some cases even additional gradual types, like an unsolved TypeVar) independently of the object model’s MRO. That would also leave the door open to pick pyright’s current behavior over mypy’s current behavior:

In pyright the resolution steps are[2]:

  1. follow the MRO and look for the first explicit definition of that attribute
  2. if Any is part of the MRO then return Any
  3. follow the MRO again and look for the first explicit definition of __getattr__ and use its return type for the given attribute name
  4. attribute doesn’t exist

In mypy the steps are the same, but 2 and 3 are reversed.
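
A small example of where these orderings diverge (my sketch of the steps above; Fallback and C are hypothetical):

from typing import Any

class Fallback:
    def __getattr__(self, name: str) -> int: ...

class C(Fallback, Any): ...

c = C()
c.missing  # pyright's ordering: Any (step 2 fires first); mypy's: int (via __getattr__)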


In order to avoid another useless back and forth:

Ramblings about the meaning of MRO

Yes, I know what MRO stands for, but there’s a difference between the literal meaning and what the object model actually specifies and implements, and yet another difference in how type checkers implement that model with gradual types. So it makes sense to specify it separately from the object model, since the rules are clearly ambiguous; otherwise we wouldn’t have ended up with different behavior between type checkers. The object model’s rules need to be implemented anyway, so whether we add an extra step in the resolution sequence or reorder the MRO, the result is indeed the same. But manipulating the MRO has other potential side effects, whereas adding an additional step would leave the MRO itself intact. In terms of complexity, I really don’t buy the argument that reordering the MRO is simpler.


  1. Since I apparently didn’t express myself clearly enough: what I mean by this is that we currently have no way to export that information to external tooling or even visualize it, so it should be up to type checkers how to deal with that inadequacy in the meantime, lest this proposal get rejected on the basis of ruining user experience ↩︎

  2. if any of the steps are successful the following steps are skipped. I’m also leaving out some details here such as instance vs. class level attribute access for the sake of simplicity. ↩︎

This should not be an option here. Exported types and stubs meant to be shared between type checkers should not end up meaning different things under different compliant type checkers.

I’m not sure you’re following the underlying issue with this being specification. I believe everyone else prefers this because it has stronger ties to formal theory, has fewer special cases, and allows us to specify that type checkers use the lower bound. Leaving the specification open to doing more in the future is not an invitation for type checkers to do more here before the specification describes what that behavior should be; it’s writing in a way that does not block future progress.

MRO = method resolution order. Your idea of special-casing the way the MRO works when Any is seen, and having that special-casing take place in multiple steps, is more work to achieve essentially the same effect as just reordering Any in the MRO in the first place.

1 Like

We’re discussing the specification here, and I think it’s reasonable to require more knowledge at this level of interaction. In the user-facing docs, if this is a definition we can use (and this is still a big if as far as I’m concerned), we can and should spell out what it means in practical effects, with something like “When you have an unknown base class, the type system will prefer definitions from known base classes before falling back to Any”, to ensure that the deeper theory is not a barrier to understanding the effects of the type system for end users. However, having a documented set of rules and a consistently applicable rationale is important to ensure that other features in the type system remain consistently applied.

Having stronger and more consistent definitions does not preclude also having accessible explanations, and I’d prefer we not use that as a reason against any path forward.

2 Likes

I generally agree with you, but this falls apart as soon as the end user is faced with these types, which don’t even have a spelling yet in the current type system, or worse, with a magically synthesized type that satisfies both the upper and lower bound through the use of Any. This is, as far as I understand, what is at the core of Eric’s objection to this interpretation, since regular users who don’t even care about typing will come into contact with these types through pylance.

So unless we also specify a way to spell a gradual type with a lower bound, so that this type can be exported, inherited, and composed with other types, this doesn’t seem like a solution that is worth the cost to the end-user experience, unless we specify that type checkers can choose what type they export in these cases until we have an official spelling.


Additional motivating example of why avoiding the use of a synthesized type for the combined upper/lower bound is probably a good idea

In the case that we synthesize a method signature satisfying both the upper and lower bound when subclassing Any, we run into issues when checking for LSP violations, because Any drops the information about our lower bound, so the following would now yield a false negative:


from typing import Any

class A:
    def foo(self, x: float) -> None: ...

class B(Any, A):
    # synthesized foo looks like this:
    # def foo(self, x: Any, *args: Any, **kwargs: Any) -> None: ...
    ...

class C(B):
     # should violate LSP, but doesn't unless code that checks for
     # LSP violations specifically recalculates the method type
     # without taking into account `Any`, which seems inefficient.
    def foo(self, x: int) -> None: ...

Of course you still have the opposite problem with a false negative when calling the method on B with parameters that could have been added by the unknown class, but that’s no different from what we get currently, so it shouldn’t lead to new issues at least.

Okay, I think you just cleared up where we weren’t on the same page anymore.

The idea is to specify that type checkers do the same thing they currently do for now, and that LSP-compatible Any would pick the lower bound (not both the lower bound and the upper bound at this point in time). This leaves the upper bound open to exploration in the future without un-special-casing 3 related constructs: 2 that exist and 1 that is actively being drafted.

2 Likes

Okay, it’s been a week with no further followup. I think we have a compelling reason to keep the current behavior of type checkers, and that behavior should be documented. If nobody has objections, then in the interest of ensuring continuing progress on what this is currently blocking, here are the options I see as viable going forward:

  1. Just document the special case as a special case matching current behavior. We can retroactively insert a consistent rationale (as seen above) should it later be needed.
  2. Pick and document a rationale that preserves current type checker behavior, and apply it to both type[Any] and class ...(Any, ...)

I want to be clear here: I expect zero code changes for mypy and pyright as a result of this, and zero behavior change for users of typing; this is merely a “get something consistent enough specified”.

Edit for future readers: further testing surfaced other inconsistencies. The changes may not be major, but code changes would be required.

I can open a PR against the spec with either of these options and work from there, but I don’t know if there’s a strong enough reason to prefer one of these over the other right now, and those maintaining type checkers may have stronger opinions about the language placed in the specification about this.

I’ve gone ahead and opened a minimal PR for consideration to update the spec to describe existing behavior only at this time, leaving the choice of a consistent rationale for later.

2 Likes

When implementing this proposed change in pyright, I ran into a problem that I hadn’t anticipated. The problem is with object.__new__ and object.__init__. Both of these methods are defined in the typeshed stubs to take zero parameters.

Consider the following code. This should not generate any type errors, and it doesn’t today in pyright. But with the new MRO linearization algorithm, whereby Any is placed in the MRO after object, this now generates an error.

from typing import Any

class Class1(Any):
    def __init__(self, x: int):
        # This should not generate an error.
        super(Class1, self).__init__(x, 1, 2, 3)

I’m not sure what to do about this. It may require us to rethink the proposed spec change — or make it more complex and specify special-case behavior for __init__ and __new__.

Another option is to put this entire topic on hold for now. As I’ve said above, I don’t have any evidence that this issue is causing pain for mypy or pyright users. We could just leave it unspecified for now and leave the existing behaviors in place.

I’d ask that we not. Without this, I’d need to mark intersections as blocked on this; there’s danger in the two diverging in behavior for subtyping from Any.

Constructors are already special-cased significantly and allowed to violate LSP. I’m not sure whether this is a reasonable further exception, or whether we can find a way to place Any right before the last instance of object/type.
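
As a sketch of that alternative placement (reusing the hypothetical probe approach from the earlier sketch), putting Any immediately before the trailing object would let the super().__init__ call in Eric’s Class1 example resolve to Any:

from typing import Any

def alternative_mro(bases: tuple) -> list:
    # Linearize without Any, then insert Any immediately before object.
    concrete = tuple(b for b in bases if b is not Any)
    probe = type("_Probe", concrete, {})   # runtime C3 linearization stand-in
    mro = list(probe.__mro__[1:])          # drop the probe itself; ends with object
    if Any in bases:
        mro.insert(len(mro) - 1, Any)      # just before the final object
    return mro

print(alternative_mro((Any,)))  # [Any, object]: super().__init__ would hit Any first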

It’s likely that we also need to have a decision about __getattr__ based on what’s come up in review.

@Liz suggested

What about going the other way on this? You could specify that a class that inherits from Any must only have a __getattr__ definition if the definition is typed to return Any

which seems to reasonably interact with expectations of gradual typing, while minimizing how much is special.
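
A minimal illustration of how I read that rule (my example, not proposed spec text):

from typing import Any

class Allowed(Any):
    def __getattr__(self, name: str) -> Any: ...  # fine: declared to return Any

class Disallowed(Any):
    def __getattr__(self, name: str) -> int: ...  # would be rejected under this rule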