# Syntactic sugar for union cases in match statements

Given these definitions:

``````
from typing import NamedTuple, TypeAlias

class A(NamedTuple):
    x: int
    y: str

class B(NamedTuple):
    x: int
    y: str
``````

it would be nice if I could write the following:

``````
AB: TypeAlias = A | B

match ab:
    case AB(x, y):
        print(f"{x=}, {y=}")
``````

as syntactic sugar for

``````
match ab:
    case A(x, y) | B(x, y):
        print(f"{x=}, {y=}")
``````

That is, it would be great if the union type was expanded into an or-pattern in match-case.

Motivation: the above example is a bit silly. Where this would in practice be helpful is if you have a union over subclasses from a single base class:

``````
from abc import abstractmethod
from dataclasses import dataclass
from typing import TypeAlias

@dataclass(frozen=True)
class Base:
    x: bool
    y: int | str

    @abstractmethod
    def f(self) -> str:
        raise NotImplementedError()

@dataclass(frozen=True)
class A(Base):
    y: int  # narrowed type

    def f(self) -> str:
        return "I'm class A"

@dataclass(frozen=True)
class B(Base):
    def f(self) -> str:
        return "I'm class B"

# This gives us a type which can be used for exhaustiveness checking.
# We can't use `Base` for that because it might have other subclasses.
AB: TypeAlias = A | B
``````

When we want to distinguish `A` and `B`, we can exhaustively check over `AB`:

``````
def f(ab: AB):
    match ab:
        case A(x, y):
            print(f"an int: {y}")
        case B(x, y):
            print(f"either str or int: {y}")
``````

But we can also match on the whole thing at once:

``````
def f(ab: AB):
    match ab:  # the subject is the parameter `ab`
        case AB(x, y):
            print(f"either str or int: {y}")
``````

Specification: There would be a runtime error if the `__match_args__` of the union elements aren't all the same.

Other justification: unions already work with `isinstance`:

``````isinstance(ab, AB)
``````

so why not `case` as well?


Naming a TypeAlias makes it reusable, unlike or-patterns. It's logical, given `isinstance`. Unions give a bit more control than searching down the MRO.

Isn't this extremely niche? Is it very often the case that two child classes have matching members?

Anyway, isn't this also an unnecessary restriction if you use keyword arguments in the case statement? If you're going to go this route, then the keywords passed to the case statement just need to be in the intersection of the members of each element of the union.

I think it would also help if there were motivating examples that match against `AB`. And for each example, to compare the case where `type AB = A | B` with having `AB` as explicit base class.

Right, it probably makes sense to make this less strict. The following seems pretty safe:

``````
from typing import NamedTuple, TypeAlias

class A(NamedTuple):
    x: int  # only `x` this time

class B(NamedTuple):
    x: int
    y: str

AB: TypeAlias = A | B

match ab:
    case AB(x):  # only `x`
        print(f"{x=}")
``````

as syntactic sugar for

``````
match ab:
    case A(x) | B(x, _):  # note the wildcard usage
        print(f"{x=}")
``````

which would work as long as all the `__match_args__` start with an identical sequence.

And then you could loosen it even further by allowing different names in the `__match_args__`:

``````
class A(NamedTuple):
    x: int

class B(NamedTuple):
    x: int

AB: TypeAlias = A | B

match ab:
    case AB(x):
        print(f"{x=}")
``````

That's probably also fine, though it could lead to hard-to-notice bugs.

I'm not sure I'm understanding correctly, but to reiterate, the main motivation is as an alternative to the sealed class proposal: an alternative way of making static exhaustiveness checks work for subclasses.

We can of course define it like this:

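``````
from dataclasses import dataclass

@dataclass
class AB:
    x: int  # declared on the base so that `AB.__match_args__` contains "x"

@dataclass
class A(AB):
    x: int

@dataclass
class B(AB):
    x: int

ab: AB = A(3)

match ab:
    case AB(x):
        print(f"{x=}")
``````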

This works, but then when we want to distinguish `A` and `B`, static type checkers will complain:

``````
match ab:  # match is not exhaustive
    case A(x):
        print("this is A")
    case B(x):
        print("this is B")
``````

So, we would like to use the type `A | B`, so that exhaustiveness checking works, but when we do that and use a type alias of `A | B` everywhere, we encounter the problem that you can't use that type alias as a `case` pattern, which is something you might sometimes want to do.

``````
class ExpressionBase:
    def shared_functionality(self): ...

@dataclass
class Name(ExpressionBase):
    name: str

@dataclass
class Operation(ExpressionBase):
    left: "Expression"
    op: str
    right: "Expression"

Expression: TypeAlias = Name | Operation

def f(node: Expression | Statement):
    match node:
        case Expression():  # we want to match on any `Expression` here
            print("it's an expression")
        case Statement():
            print("it's a statement")

def g(expr: Expression):
    match expr:
        case Name(name):
            print(f"{name=}")
        case Operation(left, op, right):
            print("it's an operation")
``````

Hmm, typing out this example, it seems that in practice it's probably most useful to use the union type alias in a `case` that does not bind any variables.

So, we should definitely do the thing where it's fine if the `__match_args__` aren't all perfectly identical.

Right, I wasn't sure if you had another motivation in mind or not. I understand now why you want this.

I think it's a lot of effort to fix a minor inconvenience. But I do think case statements that support unions are a cool idea. Maybe it's worth keeping these in mind in case other use cases pop up over the years?

Assuming you meant to have different names in the two classes, then no, this can't really be done with the current model of pattern matching since `__match_args__` is statically looked up on `AB` without being able to know the type of `ab`. Everything else can already be implemented:

``````
class UnionType(type):
    def __init__(self, *_):
        pass

    def __new__(cls, *classes):
        self = type.__new__(cls, 'Union', (), {})
        self.classes = classes
        return self

    def __str__(self):
        return ' | '.join(cls.__name__ for cls in self.classes)

    @property
    def __match_args__(self):
        common_prefix = None
        for cls in self.classes:
            cls_ma = getattr(cls, '__match_args__', ())
            if not cls_ma:
                return ()
            if common_prefix is None:
                common_prefix = cls_ma
                continue
            if len(cls_ma) < len(common_prefix):
                common_prefix, cls_ma = cls_ma, common_prefix
            if common_prefix == cls_ma[:len(common_prefix)]:
                continue
            common_prefix = common_prefix[:next(i for i, (a, b) in enumerate(zip(common_prefix, cls_ma)) if a != b)]
        return common_prefix

    def __instancecheck__(self, instance):
        return isinstance(instance, self.classes)

AB: TypeAlias = UnionType(A, B)
``````

Behaves correctly, setting `__match_args__` to the common prefix of all arguments.

Of course, a small change to pattern matching so that it isn't necessary to subclass and misuse `type` here would be nice, but otherwise this feature is "pure Python", just requiring implementation in `typing` and telling type checkers about it.


Ah yes, that makes sense. Ignore that idea then.

Thank you for working this out! Nice to see that this works.

I think adding support for this would be a mistake. The original thread wanted ADTs, and while this solves a thing they wanted while avoiding the issues the sealed decorator proposed, adding an ad-hoc way to solve part of ADTs by special casing unions would create another special case related to typing without fully helping with adding ADTs.

It's going to be better if those who want ADTs work on full syntax-level support for them that avoids the issues of the sealed decorator and of special-casing unions here.


ADTs are already "solved" with unions. They are called "algebraic" precisely because they let you add arbitrary types together! We shouldn't be trying to introduce a new way to spell a union that has a different feature set from the existing way to spell unions.

While I would disagree that Unions are ADTs as people are familiar with the term from other languages, I would agree that the problems ADTs help solve are already solvable with things python has.

Unions are a type-system-exclusive construct. Yes, you can do a limited number of things with them at runtime, but that's not their primary purpose and very little exists around that. This is partially intentional, as having runtime behavior on typing features that isn't clearly separated has led to problems of inconsistency like…

and

Even `isinstance`'s second parameter accepting a Union has been considered a mistake by a few people in hindsight. This recently came up in another discussion about adding `isinstance` support to other type-system parts (quoted below).

After some consideration, I'm negative on adding match support to Union but positive on an ADT construct that works with or without the type system and Unions (though it should be compatible for typing users, it should work for non-typing uses given the motivations).


I'm essentially in agreement with everything you said, but I'd add a further proviso here. It feels to me that the way discussions around ADT proposals are framed often takes the form of "here is a useful construct from another language (often a strongly typed one that typically borrows ideas from pure functional programming[1]), let's add it to Python". While this isn't necessarily bad, I think that such proposals would benefit from a much more balanced consideration of how Python currently solves the problems the new proposal is targeting and what new advantages the proposal brings. At this point, Python is perfectly capable of solving most problems, and we should be less interested in what problems a proposal targets, and more interested in why we need a new way of solving those problems. Otherwise we risk adding things just because they are currently trendy.


I believe the only distinction is whether the union is tagged, and you can trivially construct a tagged union from named tuples and untagged unions.

This is a pretty significant difference. You can't use class methods or constructors on a union, and that's a good thing. The ADTs of other languages offer construction and the ability to implement methods/traits/etc. for the ADT.

Not in a type-safe manner in Python, and not ergonomically. You're better off inverting the problem and switching on the type in code that uses the value, rather than in the container holding the value. But that's actually possible in Python, and we don't have a compiler doing fancy things with ADTs. We don't need ADTs to solve problems in Python.

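``````
import enum
from typing import Literal, NamedTuple

class _(enum.Enum):
    EMPTY = enum.auto()

EMPTY = _.EMPTY

# The enclosing class line is assumed here; the field declarations below need one.
class ADT(NamedTuple):
    tag: type[A] | type[B]
    a_value: A | Literal[EMPTY] = EMPTY
    b_value: B | Literal[EMPTY] = EMPTY
``````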

Even with placing EMPTY there, there's no way to know statically from the tag whether a_value or b_value is safe, and you also can't change that to just be:

``````
class ADT(NamedTuple):
    tag: type[A] | type[B]
    value: A | B
``````

Either one would need dependent types.

But you can just do:

``````
value: A | B = some_called_thingamajig()
match value:
    case A():  # class pattern; a bare `case A:` would be a capture pattern
        ...
    case B():
        ...
``````

``````
class TagA(NamedTuple):
    a: TypeA

class TagB(NamedTuple):
    b: TypeB

TaggedUnion = TagA | TagB
``````

This is a fully type-safe tagged union, and the pattern is extremely common. It tends to crop up also with TypedDict, the other common building block of ADTs.

That's not a tagged union, and it's also completely unnecessary. In that case, you need to switch on the type of the named tuple. Switch on the type of the value instead; it's available to you.


This doesn't look like a tagged union to me either. I don't see any benefit to this construction over just

``````SomeUnion = TypeA | TypeB
``````

This isn't the same as an ADT as provided by many compiled languages. You've created two structures that are disjoint here and said you have one or the other, not one unified structure with disjoint fields that are checked for and that the language knows about. With this difference, there's no benefit, and I'm inclined to think there's no need to support ADTs if there isn't a stronger motivation than something currently solved like this.

The only reason I could see using this would be if you have some library modeling a web API you have no say over, and the library can't handle transforming data into a more useful representation, but that's not a language limitation, that would be a library design choice.

I can't help that that's what a tagged union is: a union with a unique tag attached to each option so you can distinguish when two overlapping types get tagged. For instance, in my example, if you add a type that's a subclass of both TypeA and TypeB, you can't tell which branch of `TypeA | TypeB` it is supposed to be. With the tagged union, you can, because the associated tag is either a or b. (Usually it's more obvious: two of the types just match for two different tags.)

Now often tagged unions are represented in memory as an explicit type int plus a pointer. That's not what a tagged union is, though. It's just a way of storing it.

This is exactly my point, though. We don't need the tagged union that, say, OCaml provides, because it's almost never necessary versus a simple union (which OCaml does not have). We already have ADTs fully available in Python.

Right, but it's not valid to discriminate unions in Python based on the presence of attributes, except in the case of runtime-checkable protocols. There could be a hypothetical subclass of `TypeA` that has an attribute `b`. That's why this isn't a tagged union in Python.


Ah, I see, you may need to make the tuple classes final to fix that. I forgot about it because pyright stopped needing final for TypedDicts when used in this manner, and that's how I normally write it. (Commit where pyright changed this behaviour, which I always struggle to track down.)

``````
from typing import NamedTuple, final

@final
class TagA(NamedTuple):
    a: TypeA

@final
class TagB(NamedTuple):
    b: TypeB

TaggedUnion = TagA | TagB
``````

I don't know if this counts as a tagged union or how you're meant to operate on it, but both this proposal and `sealed` look rather unergonomic compared to the simplicity of e.g. enums in Rust. That's what I personally imagine a tagged union to look like in Python: an enum-like construct whose members are instantiable.
