Why does `typing.Union` (`types.UnionType`) not implement the `collections.abc.Collection` (or `Set`) Protocol?

I didn’t say that. I am well aware that there are uses of runtime annotation inspection, I just wrote a decorator that did as much. But literally calling func(int | bar) is contrived and incomplete (it will never capture the deferred case).

The deferred case is somewhat irrelevant, since this whole thread is about the runtime behavior of instances of the class types.UnionType. The question is: why does this class not implement collections.abc.Set, or at least collections.abc.Collection, instead of forcing you to go through typing.get_args?

If you don’t have an instance of types.UnionType to begin with, the runtime behavior of instances of the class types.UnionType is irrelevant.

I agree that this conversation has got pretty off-topic with the arguments about why you’d want to do introspection of type hints and the debate over whether it’s even feasible in a world where PEP-561 usage is widespread. The typing module has long allowed for introspection of typing objects, and there’s a lot of code out there that performs introspection of typing objects. I also think it’s reasonable to think of a Union as a set-like container of types, like Randolph is suggesting.

So, why not add these dunders? One reason is that we prefer to keep the typing special forms pretty opaque where possible. If we added __contains__ and __iter__, and documented that a Union can be iterated over and used with in, that could make it difficult to refactor the class in the future, should the need ever arise. Having a public “getter” function that returns a tuple of args, however, doesn’t tie our hands in nearly the same way – it’s much more future-proof. Are we likely to refactor Union now? Probably not; it’s been stable for a while now. But the typing module was only in the stdlib on a provisional basis in Python 3.5 and 3.6, and was refactored thoroughly several times in the first few years of its life.

Secondly, keeping these forms opaque makes life much easier for type checkers. While you’re right that PEP-604 introduced a class (types.UnionType) that can be used both in type annotations and in runtime contexts (e.g. in isinstance() calls), that created all kinds of headaches for type checkers. There were lots of situations where they couldn’t figure out whether they should be treating str | int as a super-special typing primitive or as an instance of a class that could be passed to isinstance(), etc… Most of those bugs have now been fixed, but there are still a few lingering ones in the mypy issue tracker (e.g. python/mypy issue #14242, “Errors when using union type aliases with `isinstance`”). Overall I’m a massive fan of PEP-604, but blurring the lines between typeforms and objects usable in runtime contexts shouldn’t be done lightly, in my opinion.

Personally, while I agree that adding __iter__, __contains__, etc. to types.UnionType would make life slightly more ergonomic for people doing introspection of type hints, I don’t really see that there’s a massive problem here. I don’t find it particularly ugly to have to call get_args(union_obj) before doing introspection on the arguments, but maybe that’s just me.

I appreciate the viewpoint, but could it be that some of these headaches stem from the fact that these objects are not treated as regular classes? For instance, another lingering issue I discovered was that while isinstance(x, int | float) works, the same is not true for match-case: see python/cpython issue #106246, “[match-case] Allow matching Union types”.

One would expect that, since PEP 604, this should just work automatically, because types.UnionType implements __isinstancecheck__ and __issubclasscheck__, but it doesn’t.

Like, personally, I’d love if all the constructs had real types associated with them that also expose their corresponding subtyping rules, like for instance a CallableType whose __issubclasscheck__ tests for contra-variance of arguments and covariance of return type, as well as signature.

You have the names of the dunders wrong there (it’s __instancecheck__ and __subclasscheck__), but FYI, you’re also incorrect in thinking that either method is implemented on types.UnionType:

>>> import types
>>> types.UnionType.__dict__['__instancecheck__']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: '__instancecheck__'
>>> types.UnionType.__dict__['__subclasscheck__']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: '__subclasscheck__'

Support for calling isinstance() with types.UnionType objects as the second argument is just hardcoded into the logic for isinstance() itself: https://github.com/python/cpython/blob/f9f085c326cdaa34ebb3ca018228a63825b12122/Objects/abstract.c#L2681-L2683

When it comes to pattern-matching I also wouldn’t really expect that class patterns would work with types.UnionType instances, since class patterns don’t work with any other objects that aren’t classes:

>>> class Foo:
...     def __instancecheck__(self, other): return True
...
>>> f = Foo()
>>> isinstance(42, f)
True
>>> match 42:
...     case f():
...         print('yes')
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
TypeError: called match pattern must be a type

Oh, ok. I thought this was the code responsible:

What is this doing then?

That’s typing.Union, a separate class to types.UnionType. The snippet you’re linking to enables you to do this:

>>> from typing import Union
>>> isinstance(42, Union[int, str])
True

That’s a different thing to:

>>> isinstance(42, int | str)
True

You can see that at runtime they are instances of different classes:

>>> type(int | str)
<class 'types.UnionType'>
>>> type(Union[int, str])
<class 'typing._UnionGenericAlias'>

Oh wow, I never realized, that’s crazy! No wonder there needs to be special casing everywhere.