Specs clarification: `type[A | B]` is equivalent/the same as `type[A] | type[B]`

It has never been stated in the spec that `type[]` is distributive over unions, even though type checkers already treat it as such:

(Mypy, Pyright, Pyre, Pytype, PyCharm)

class A: ...
class B: ...

a: type[A | B] = A
b: type[A] | type[B] = B

a = b  # fine
b = a  # fine

I propose that the following line be added:

type[] distributes over unions: type[A | B] is the same as type[A] | type[B].

PR: @python/typing#1841.

Thanks @InSync, I agree this is currently underspecified, and it would be a good addition to the typing spec. Most covariant types cannot be expanded like this (e.g. Sequence[int | str] is not consistent with Sequence[int] | Sequence[str]), so it’s important to call out cases that are. We recently accepted a similar statement in the tuples chapter.

My only feedback is about the proposed wording. The phrase “same as” is pretty informal. We recently adopted more precise terms, and we’re trying to use those consistently throughout the spec. I think the applicable term here is “consistent with”.

2 Likes

Thanks for raising this issue @InSync. It’s one of the points I raised in Clarifications to the typing spec regarding `type[]`. I have a draft PR locally addressing it and some other issues discussed in that earlier thread, but haven’t yet got round to polishing the PR and submitting it — so thanks for pushing this forward.

I don’t agree, based on my current understanding of what type[...] means.

I’ve come to think that type[C] generally means “the type object for class C” and is only really meaningful for classes, which I think is consistent with what Alex suggested was the current state in his Clarifications.

In other words, not only does the current spec say type[int | str] “is a type error”, it actually says that this symbol is undefined, it has no meaning whatsoever.

If we begin allowing forms such as type[int | str], my read is that this is actually a definition of a form that previously had no meaning. There is no class object for int | str, so the existing definition is meaningless here; I think we are effectively defining syntactic sugar.

As a result, I don’t currently agree that “is consistent with” would be better. To say it is consistent is to state a typing rule (i.e. axiom) as if both type[int | str] and type[int] | type[str] were well-defined a priori; such an axiom is nonsense if the LHS is an undefined symbol.

As a result, I would favor “the same as” over “consistent with” but I actually think “is defined to mean” is better.

1 Like

I’m also a little bit skeptical about whether it’s a good idea to provide this syntactic sugar; I think if we allow forms like type[int | str] it might actually be worse for end users, because it moves in the direction of treating type[...] as if it represents an arbitrary static type, which is incorrect.

My instinct is that the existing restrictions may actually be better, since they map more clearly to the underlying definition of type as being the type of a type object.

the current spec say[s] type[int | str] “is a type error”

I don’t think that’s the case. It explicitly allows unions to be used in a type subscript and even provides an example of this. Here’s the quote:

Note that it is legal to use a union of classes as the parameter for type[], as in:

def new_non_team_user(user_class: type[BasicUser | ProUser]):
    user = new_user(user_class)
    ...
However the actual argument passed in at runtime must still be a concrete class object, e.g. in the above example:

I’m also a little bit skeptical about whether it’s a good idea to provide this syntactic sugar

It would be really disruptive to disallow this now. It has been allowed since PEP 484, and all four major type checkers support it.

I think the intended purpose of this proposed change is to simply clarify the equivalence between type[int | str] and type[int] | type[str], which was previously unspecified but implied.

2 Likes

Yeah I was thinking some more and realized the syntactic sugar is probably necessary and pre-existing, I think you’re right here.

I was chatting with Jia and realized also type over union has some reasonable uses, e.g.

def f(x: X) -> type[X]:
    return type(x)

will work well expanded across unions given this definition.

So I think my major point is really that this is a definition (I guess you called it an equivalence, which is maybe the same thing practically, although it still suggests that type[int | str] has some clear meaning absent this definition, which I’m not sure it does).

But at any rate if it’s an equivalence I think that “consistent with” is still not better terminology - I believe that we are saying these two types are exactly the same, not that they have some subtyping relationship which is what the term “consistency” is usually used for.

I suggested on GitHub that the right term to use is “equivalent” (Glossary — typing documentation), not “consistent” (Glossary — typing documentation). We could also directly say, as Steven suggests that type[A | B] is syntactic sugar for type[A] | type[B].

I agree that this syntax has long been supported and we should continue to support it.

@erictraut some of your post above is garbled, I think you may have forgotten to paste a link you tried to add.

2 Likes

I don’t think they should be equivalent, but it takes something like a type with a metaclass __call__ implemented (though that’s not necessarily the only way) to show this.

You can implement a type such that, when it is initialized, it returns one of a few types. (A real-world example of this is pathlib.Path returning concrete implementations based on the system.)
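The pathlib.Path case mentioned above can be observed directly at runtime (Path uses __new__ rather than a metaclass __call__, but the effect is the same: calling the class yields an instance of a subclass chosen at runtime):

```python
from pathlib import Path

# Calling Path() never returns a direct Path instance; CPython's
# Path.__new__ picks PosixPath or WindowsPath based on the OS.
p = Path(".")
print(type(p).__name__)
assert type(p) is not Path
assert isinstance(p, Path)
```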

Separately, you could have one of a few types that always initializes to themselves.

It’s not clear to me what definition or invariants of type[] you are assuming that gives you this result.

The definition of type[C] in the type spec is simply “subclasses of C” – that is, the type expression type[C] represents the set of all class objects that are subclasses of C, including C itself. This definition has nothing at all to say about the result of calling a value of type[C], so metaclasses and custom __call__ methods don’t seem relevant to the definition of type[] as currently specified.
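A small runtime sketch of that definition (class names are illustrative): every inhabitant of type[C] is a class object that issubclass-checks against C, and nothing more is promised.

```python
class C: ...


class D(C): ...


def takes(t: type[C]) -> None:
    # The only guarantee type[C] gives: t is a class object
    # that is a subclass of C (including C itself).
    assert issubclass(t, C)


takes(C)
takes(D)
```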

It seems like reasonable syntactic sugar (already strongly implied by the current spec, and supported by type checkers) to specify that type[B | C] means simply “the set of all class objects that are subclasses of either B or C”, which is equivalent to type[B] | type[C].

The type spec currently doesn’t specify the result of calling something typed as type[C] at all. Perhaps it should, but that’s a separate discussion. If we were to specify that, we would want to discuss several potential unsoundnesses, including non-enforcement of LSP on constructor signatures, and metaclasses with custom __call__ that return a different type.

2 Likes

Right, sorry. That wasn’t a fully explained thought.

The way most people actually use typ: type[X] is with the intent that typ(...) becomes an instance of X. The type system does not currently reflect this, and I’d prefer we fix that. I think codifying this as an equivalence harms the path to better expressiveness of intent for some cases that aren’t currently supported.

Can you spell this out a bit more? How would you like to see type[] defined differently, and how does specifying the union expansion prevent that change (more than it would already be constrained by backward compatibility)?

I think we could redefine type in a zero negative user impact way here that could also help catch issues where there’s unsound subclassing involving constructors.

Changing type[X] from just “an instance of the type of X” to “a type object that, when called with the signature derived from X’s appropriate chain of __init__, __call__, and __new__ dunders on both the type and any metaclass present, creates an instance of X”.

This would also require that for type[X | Y] there is a means to construct both X and Y with the same calling convention (very similar to how a union of callables does not need to error when they share the same signature).

This change wouldn’t forbid an LSP violation on its own since that’s currently accepted, but it would allow:

class X:
    ...

class Y(X):
    def __init__(self, y: int):
        ...


def foo(typ: type[X]) -> X:
    return typ()


foo(Y)  # detectable error here, type[Y] is not a safe substitute for type[X]

This change would be incompatible with this specification clarification, as there’s a difference between a callable type which conditionally returns instances of a union of types and a union of callable types that each return instances of a specific type. It would allow the type system to actually express more of python.

1 Like

It seems like you want to redefine type[X] as syntactic sugar for expressing a callable type, where it automatically infers the signature of the callable from the constructor signature of X.

But you’re also suggesting that type[X] should always be a callable type that returns an instance of X. I’m still not clear how you intend this definition to work in cases where a class C has a metaclass __call__ that returns a type other than C. Is type[C] then simply invalid? Or it’s a callable type that returns something other than a C (this would violate the way you described the type above)? Or type[C] represents a callable type which returns a C, and therefore the class object C does not actually inhabit the type type[C]?

I don’t disagree that many people use type[] this way, but it’s worth noting that what you are describing can already be expressed (more verbosely/explicitly) with Callable, and the change would leave us with no way to express “the set of class objects that are subclasses of C”, which can also be a useful type (e.g. if you are calling classmethods or staticmethods.)
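The more explicit Callable spelling mentioned here might look like this (a hedged sketch; the names are illustrative): it states the "zero-argument constructor" intent directly, and accepts any conforming factory, not just the class object.

```python
from typing import Callable


class C: ...


def make(factory: Callable[[], C]) -> C:
    # The intent "calling this produces a C" is explicit in the
    # annotation; both the class object and a lambda qualify.
    return factory()


assert isinstance(make(C), C)
assert isinstance(make(lambda: C()), C)
```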

I also can’t quite get on board with your description of this change as “zero negative user impact.” Making type[Y] sometimes not be a subtype of type[X], when Y is a subtype of X, is definitely a change that will break existing code. (In some cases that existing code was already unsound, but not in all cases.)

5 Likes

I think this is also not accurate, because that wouldn’t also specify that it needs to be a subclass of C.

What people may want is the intersection Callable[[], C] & type[C].
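Since intersections don't exist in the type system today, the closest approximation is probably a callback Protocol, which captures only the Callable[[], C] half; it cannot also require "is a subclass of C". A hypothetical sketch (ReturnsC is an invented name):

```python
from typing import Protocol


class C: ...


class ReturnsC(Protocol):
    # Captures "callable with no args, returns a C" -- but any
    # factory qualifies, not only class objects subclassing C.
    def __call__(self) -> C: ...


def make(factory: ReturnsC) -> C:
    return factory()


assert isinstance(make(C), C)
```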

1 Like

No, definitely not just syntactic sugar, and type[X] should still be an instance of a type that is consistent with the type of X. (As an aside, we would benefit from better terminology; some of the sentences I’ve written here feel contrived to work around our current definitions and terminology.)

If I were put on the spot to refine this further now: type[C] is only valid when constructing from the type results in an instance of C or one of its subtypes. I don’t really see another option here, and this is sort of the problem with Python being as dynamic as it is. There will always be a limit to what we can express within the type system, and a case like this is far outside it with such a simple construct as type[X], where the person assigning and the person using may have different ideas.

I can’t think of a case where such code wasn’t already unsound, but I’d be happy to have a counter example. I think this only occurs where there was an LSP violation on construction + using the LSP violating class in place of the original.

class X:
    @staticmethod
    def do_fun(): pass

class Y(X):
    def __init__(self, y):
        pass

def f(t: type[X]):
    t.do_fun()

# nothing unsound about this call
f(Y)

Fine, there will be some level of “this may surface issues”. I would argue that this case is trivially resolvable by accepting any object with the expected static method as a protocol, so no real use cases are lost, while a much more common case becomes properly supported.
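The protocol-based resolution suggested here might look like the following sketch (HasDoFun is an invented name; static checkers vary in how they match class objects against protocols, but the runtime behavior is straightforward):

```python
from typing import Protocol


class HasDoFun(Protocol):
    @staticmethod
    def do_fun() -> None: ...


class X:
    @staticmethod
    def do_fun() -> None:
        pass


class Y(X):
    def __init__(self, y: int) -> None:
        pass


def f(t: HasDoFun) -> None:
    # Only the static method is required; no constructor
    # compatibility is demanded of t.
    t.do_fun()


f(X)
f(Y)
```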

While I think @mikeshardmind brings up a good point, there’s a better way here that incorporates this that was missed.

type[C] is now syntactic sugar for type[C] & Callable[C.__call__ signature, C.__call__'s return], making it so that it doesn’t matter what C’s signature or actual returned type are, only that things are consistent with whatever it is.

I’d have said just wait for intersections and a way to copy the signature and return type of something, but even intersections are stuck in limbo and considered by at least a couple people who gave input on them as “low ROI”, but construction from a type is much more common than the other things people are waiting for intersections for.

2 Likes

You probably mean __new__/__init__. On this same topic, should we expect anything from the metaclass’s __call__?