As part of the continuing effort to fill in missing chapters in the typing spec, I’ve written a draft chapter for constructors.
This is an area where we see significant divergence in behavior between type checkers, so it would be good to agree on a standard set of behaviors.
For small typos or wording improvement suggestions, feel free to comment directly in the PR. For larger discussion topics, let’s use this thread for visibility.
Did you already write a set of conformance tests for this chapter? It would be good to see the results so we know which aspects diverge from current type checker behavior and therefore need a closer look.
One thing I was surprised by: Your draft states that if __new__ returns Any, then the return type should be ignored and the method should be assumed to return an instance of the class. That seems wrong to me; if the user explicitly said Any, shouldn’t we believe them? I checked (pyright, mypy) and both use this behavior even when the return type is an implicit Any (from an unimportable module). However, mypy generally isn’t great at believing __new__ return types and this is something we’re already actively considering changing.
The “Converting a Constructor to Callable” section says that to infer the callable type, type checkers should look at the __init__ method first, then __new__. But isn’t that backwards? At runtime, __call__ is used first, then __new__, then __init__. If the metaclass’s __call__ has a very different signature, then the callable type of the class object should reflect that. Here’s a test case where pyright behaves oddly due to this rule: A(1) is inferred as being of type int, but A is not compatible with Callable[[int], int].
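A minimal sketch of the kind of situation described above. It is a hypothetical reconstruction, not the linked test case: the names `Meta` and `A` and the method bodies are assumptions chosen for illustration. The metaclass `__call__` never delegates to `type.__call__`, so the `__init__` signature says nothing about how the class object can actually be called.

```python
class Meta(type):
    # Hypothetical metaclass: bypasses __new__ and __init__ entirely.
    def __call__(cls, x: int) -> int:
        return x + 1


class A(metaclass=Meta):
    # Never runs for A(...), because Meta.__call__ does not delegate.
    def __init__(self, x: str) -> None:
        self.x = x


result = A(1)
print(type(result).__name__, result)  # int 2
```

At runtime the class object behaves like a callable taking an int and returning an int, which is the argument for letting the metaclass `__call__` drive the inferred callable type in this case.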
Re: “Constructor Calls for type[T]”, I’ve come to realize that when people use type[T], what they usually actually want is Callable[[...], T] (where ideally the argument types are specified explicitly). The latter is actually type-safe (no subclassing problem), more flexible (you can also pass in functions in addition to classes), and generally more robust and explicit.
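As a rough illustration of that point, here is a sketch with hypothetical names (`build`, `Box`, `make_label`). It assumes a factory parameter typed as `Callable[[int], T]` rather than `type[T]`, so both a class and a plain function are acceptable arguments.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def build(factory: Callable[[int], T], value: int) -> T:
    # Accepts anything callable with a single int: a class or a function.
    return factory(value)


class Box:
    def __init__(self, value: int) -> None:
        self.value = value


def make_label(value: int) -> str:
    return f"item-{value}"


box = build(Box, 3)            # a class used as the factory
label = build(make_label, 3)   # a plain function used as the factory
```

Because the argument list is spelled out in the `Callable` type, a subclass whose constructor takes different arguments would simply not match, instead of being accepted and failing at runtime.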
The only situation I can think of where you genuinely need type[T] is when you want to call static or class methods on the class object. But you’d probably be better off with defining a Protocol for that case:
from typing import Protocol

class C:
    @staticmethod
    def f(x: int) -> str: ...

class P(Protocol):
    def f(self, x: int) -> str: ...

def g(cls: P):
    print(cls.f(0))

g(C)  # passes in mypy and pyright
So, I wonder whether it shouldn’t just be left to type checkers whether they want to support constructors on type[T] or not?
I haven’t done this yet. I was hoping to get some feedback on the draft before investing the time in a full conformance test suite. Rather than writing a full test suite at this time, I can write a subset to get a sense for current type checker behaviors. I’ll post back here when I have those results.
I don’t feel strongly about this point. However, I think it’s important that if the __new__ method is unannotated and its return type is inferred as Any (or Unknown, using pyright’s terminology), then a type checker should assume a return type of Self. If a __new__ is explicitly annotated to return Any, then I think there’s a reasonable argument that the constructor call should evaluate to Any. I’m not sure what that means for the __init__ evaluation though. Should a type checker assume that the __init__ method is not called in this case? Or that it is called? The former will mask errors (false negatives), and the latter could potentially lead to false positives.
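The two cases being distinguished can be sketched as follows. The class names are hypothetical, and the comments describe the behavior under discussion rather than what any particular type checker does today.

```python
from typing import Any


class Unannotated:
    # No annotations: the return type is an implicit Any
    # ("Unknown" in pyright's terminology).
    def __new__(cls, *args, **kwargs):
        return super().__new__(cls)


class ExplicitAny:
    # The author explicitly declares Any as the return type.
    def __new__(cls, *args: Any, **kwargs: Any) -> Any:
        return super().__new__(cls)

    # Open question from the thread: should a type checker validate
    # the call against this signature when __new__ returns Any?
    def __init__(self, value: int) -> None:
        self.value = value


u = Unannotated()
e = ExplicitAny(1)
```

At runtime both `__new__` methods return an instance of the class, so `__init__` does run. The open question is only about what a type checker should assume from the annotations.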
The reason that pyright looks at __init__ methods first when converting to a callable is that the __init__ method is typically richer in type information. __new__ method signatures often consist of (cls, *args, **kwargs), which is not very useful from a typing perspective. It appears that mypy does the same, presumably for the same reason.
Perspective on why some of the current behavior and user types are the way they are might help here.
Please don’t read the below as “we need to enforce LSP on __init__ / __new__”. That’s one way forward, but it’s not the only one, and as prior discussions got into, there are reasons why this wasn’t done.
This is only the case because type checkers don’t enforce LSP for __init__. It might indicate that prior proposals to copy a signature for use elsewhere, paired with intersections, would help here (i.e. T & Callable[T.__init__, T]). This would ensure that a type could be constructed as expected.
While it might be better to define a complex protocol for some of those cases, I don’t see many people being willing to do it when type[T] appears to work until someone makes a subclass that violates LSP on __init__.
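A minimal sketch of the failure mode described here, using hypothetical classes `Base` and `Child` and a hypothetical helper `make`. The subclass overrides `__init__` with an incompatible signature, which the thread notes is not enforced as an LSP violation, so the `type[T]` call only fails at runtime.

```python
from typing import TypeVar


class Base:
    def __init__(self, value: int) -> None:
        self.value = value


class Child(Base):
    # Incompatible override: requires an extra positional argument.
    def __init__(self, value: int, name: str) -> None:
        super().__init__(value)
        self.name = name


T = TypeVar("T", bound=Base)


def make(cls: type[T]) -> T:
    # Assumes every subclass of Base can be constructed from one int.
    return cls(1)


base = make(Base)  # works

try:
    make(Child)    # Child is a valid type[Base], but construction fails
    failed = False
except TypeError:
    failed = True

print(failed)  # True
```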
I don’t think it’s a good idea to have a situation where annotating with the type that would be inferred without an annotation (Any) changes the behavior. This also feels partially necessary because __init__ and __new__ are not required to be LSP compatible.
There might be a better inferrable type than Any using the method that was mentioned in the discussion of subtyping involving Any, but this isn’t something that would be user-denotable currently or for a while.
The ability to partially copy signatures might lead more people to type __new__ accurately. Speaking only for myself, the inability to do so is the primary reason I don’t give __new__ more detailed typing. The secondary reason is that type checkers largely ignore it anyway.
Regarding the Callable signature, I feel the most principled solution may be that the callable type of the class should be (informally) the union of the __call__, __new__, and __init__ signatures.
For instance, if class A has a metaclass __call__(*args, **kwargs) -> Self, __new__(*args, **kwargs) -> Self, and __init__ of (a: int) -> None, then this should “sum” to (a: int) -> Self. But if __new__ is (a: int) -> Self and __init__ is (*args: int) -> None, then this should sum to (a: int) -> None.
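The first case can be checked at runtime with a small sketch. The names `Meta`, `A`, and `accepts` are hypothetical, and the metaclass `__call__` is annotated with a TypeVar in place of Self as an assumption. The pass-through `__call__` and `__new__` accept anything, so the set of calls that actually succeed is determined by `__init__`.

```python
from typing import Any, TypeVar

T = TypeVar("T")


class Meta(type):
    # Pass-through __call__: accepts anything and delegates to type.__call__.
    def __call__(cls: type[T], *args: Any, **kwargs: Any) -> T:
        return super().__call__(*args, **kwargs)


class A(metaclass=Meta):
    # Pass-through __new__: ignores its arguments.
    def __new__(cls, *args: Any, **kwargs: Any) -> "A":
        return super().__new__(cls)

    # The only method that constrains the arguments.
    def __init__(self, a: int) -> None:
        self.a = a


def accepts(*args: Any, **kwargs: Any) -> bool:
    """Return True if A(*args, **kwargs) succeeds at runtime."""
    try:
        A(*args, **kwargs)
    except TypeError:
        return False
    return True


print(accepts(1), accepts(a=1))   # True True
print(accepts(), accepts(1, 2))   # False False
```

The calls that succeed are exactly those matching `(a: int)`, which is the combined signature suggested for this case.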
I’m looking forward to this being properly specified, since this is one of the areas where I’m jealous of pyright users, because it much better reflects the dynamic object model of Python. That being said I think there is still one case that could use improvement, and that is when __new__ is overloaded with different return types.
I think pyright’s heuristic of preferring __init__ over __new__ when converting to Callable works well in all cases except this one. I think it would be better to retain the signature of __new__ in this case for all the overloads except the ones returning Self. But Jelle’s approach of creating a union type might work as well.
I have here a little toy example inspired by WTForms’ API[1] that I think illustrates where the current rule can be inadequate:
Here are some quick-and-dirty conformance test results indicating where each type checker is consistent with the latest draft specification and where it’s not.
I’m also noticing that the behavior of __init_subclass__ is not covered in this section right now. Is that intentional, with the plan to cover it later or elsewhere?
After a quick reading, one question pops to mind. This section states the following:
If any class-scoped type variables are not solved when evaluating the __new__
method call using the supplied arguments, these type variables should be left
unsolved, allowing the __init__ method to be used to solve them.
What if, in the example given, __new__ is defined as __new__(cls) -> Self?
I guess it’s up to the user to keep the signatures of __new__ and __init__ compatible? (According to this spec addition, this only applies when __new__ doesn’t return Any or something other than Self.)
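A sketch of what incompatible signatures look like in practice. The generic class `Holder` is hypothetical and is not the example from the draft. A string annotation stands in for Self so the snippet also runs on Python versions without `typing.Self`.

```python
from typing import Any, Generic, TypeVar

T = TypeVar("T")


class Holder(Generic[T]):
    # __new__ takes no arguments, so it cannot solve T.
    def __new__(cls) -> "Holder[T]":
        return super().__new__(cls)

    # __init__ is where T would be solved from the argument.
    def __init__(self, value: T) -> None:
        self.value = value


def try_construct(*args: Any) -> bool:
    """Return True if Holder(*args) succeeds at runtime."""
    try:
        Holder(*args)
    except TypeError:
        return False
    return True


print(try_construct(1))  # False: __new__ rejects the extra argument
print(try_construct())   # False: __init__ is missing `value`
```

With these two signatures no call can succeed, since the same arguments are passed to both methods. That supports the reading that keeping them compatible is the class author's responsibility.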
I was planning to include __init_subclass__ in a different chapter on metaclasses. It’s not related to the evaluation of constructor calls, so I don’t think it belongs in this chapter. In any case, this draft is already pretty long, so I don’t want to add more at this point.
@Jelle, I like your suggestion about combining the __call__, __new__ and __init__ signatures into a union when converting a constructor to a callable, but I was concerned about the compatibility implications.
As an experiment, I quickly implemented your suggestion. To my surprise, it generated no mypy_primer changes!
Based on this, I’m going to rewrite that section of the draft spec to incorporate your proposal.
Thanks to everyone who has reviewed and provided feedback on this chapter.
I’ve incorporated the feedback in the latest draft. I’ll leave it open for review and comments for another 24 hours. If I don’t receive any additional feedback, I plan to submit it to the TC for consideration.
Thanks. One area where I’m not convinced yet is the behavior when __new__ returns Any, as I mentioned before. I see the justification for the behavior, but it’s a special case and I’m not convinced user-defined __new__ methods are common enough to justify the special case. I’d be interested in hearing more opinions from the community on this point:
If the __new__ method of a class C is unannotated, or annotated as returning Any, should type checkers treat the return type as Self, so that constructing C() is inferred as returning C, or should type checkers assume that the return type is not known and infer C() as returning Any?
Yeah, I thought more about your feedback and adjusted the spec accordingly. Please review the latest draft and let me know what you think. I left you a comment in the PR to point out the updated section.
Oh great, thanks! I realized I was actually talking about __call__, not __new__, where the proposed spec still special-cases Any:
If the evaluated return type of the ``__call__`` method is something other than
``Any`` or an instance of the class being constructed, a type checker should
assume that the metaclass ``__call__`` method is overriding ``type.__call__``
in some special manner, and it should not attempt to evaluate the ``__new__``
or ``__init__`` methods on the class.
Sorry for mentioning the wrong method above. I think if we’re considering __call__, what I said above applies even more: overriding __call__ on a metaclass is not a common enough occurrence to warrant such a special case. Therefore, I would suggest removing the “Any or” phrase in the paragraph I quoted above.
Given this test:
from typing import Any, TypeVar

class Meta1(type):
    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        return super().__call__(*args, **kwargs)

class CallAny(metaclass=Meta1):
    pass

class Meta2(type):
    def __call__(self, *args: Any, **kwargs: Any) -> int:
        return 42

class CallInt(metaclass=Meta2):
    pass

T = TypeVar("T")

class Meta3(type):
    def __call__(self: type[T], *args: Any, **kwargs: Any) -> T:
        return super().__call__(*args, **kwargs)

class CallSelf(metaclass=Meta3):
    pass

reveal_type(CallAny())
reveal_type(CallInt())
reveal_type(CallSelf())
I don’t agree with removing the Any special case for __call__. As you said, it’s rare that a metaclass overrides __call__ in a way that would change the normal behavior of type.__call__. Typically this is only done in cases where NoReturn is the intended return type. If we treat an Any return type as an indication that the metaclass is overriding the normal behavior, that creates big problems in the case where there is a metaclass __call__ method that happens to be unannotated. We’d need to differentiate between explicit Any and implicit Any (what pyright calls Unknown). That’s not a concept that’s found anywhere in the typing spec right now, and I don’t think this is the time to introduce it. I think the current proposal is preferable.
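To illustrate the unannotated case, here is a sketch of a hypothetical library metaclass (`SingletonMeta`) that overrides `__call__` without annotations but still ends up returning an instance of the class. It is an assumed example of the pattern, not code from any specific library.

```python
class SingletonMeta(type):
    # Hypothetical library code: no annotations, so the return type
    # of __call__ is an implicit Any.
    _instances = {}

    def __call__(cls, *args, **kwargs):
        # Delegate to type.__call__ once, then reuse the cached instance.
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class Config(metaclass=SingletonMeta):
    def __init__(self, name: str = "default") -> None:
        self.name = name


first = Config("a")
second = Config("b")  # returns the cached instance; __init__ is not re-run
print(first is second, first.name)  # True a
```

If an Any return type were taken to mean that the metaclass overrides normal construction, `Config(...)` would evaluate to Any and the `__init__` signature would go unchecked, even though the runtime result is a `Config` instance.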
I see what you mean, but given that overriding __call__ in a metaclass is rare, I don’t think it is a big enough problem to warrant a special case. An unannotated __call__ would indeed make type checking more difficult, but there is a simple solution for users: add a type annotation.
That’s not an option if the class is in a library, which is most often where developers will run into this issue. Developers have no recourse in this case other than to file a bug with the library author and wait an indefinite time for the issue to be fixed and a new version of the library to be released.
The current behavior of all type checkers is friendly to Python developers in this case. The change that you’re suggesting here would be hostile to developers. I don’t think that’s the right answer.
If the issue is in a library, users can provide stub files. That is no different from the situation with any other unannotated function.
My concern here is with keeping the type system easy to understand. I would find it confusing if I explicitly write that my __call__ method returns Any, and type checkers then interpret it as something else. Special cases like the one you propose can be helpful for usability, but they also hurt usability by making the entire system less predictable. In this case, I don’t think the use case is common enough to warrant a special case.