Another attempt at clean sum types/ADTs in Python

(skip to the heading The Proposal if you’re already familiar with the problem)

The Problem

Consider this code in Swift:

enum BgColor {
    case transparent
    case name(String)
    case rgb(Int, Int, Int)
    case hsv(Int, Int, Int)
}

var backgroundColor = BgColor.rgb(39, 127, 168)

switch backgroundColor {
    case BgColor.transparent:
        print("no color")
    case let BgColor.name(colorName):
        print("color name: \(colorName)")
    case let BgColor.rgb(red, green, blue):
        print("RGB: \(red), \(green), \(blue).")
    case let BgColor.hsv(hue, saturation, value):
        print("HSV: \(hue), \(saturation), \(value).")
}

It lets you express precisely that there are four possibilities and that different values are associated with each one.

In Python, you might express this like this:

background_color = {"type": "rgb", "val": (39, 127, 168)}

match background_color:
    case {"type": "transparent"}:
        print("no color")
    case {"type": "name", "val": color_name}:
        print(f"color name: {color_name}")
    case {"type": "rgb", "val": (red, green, blue)}:
        print(f"RGB: {red}, {green}, {blue}")
    case {"type": "hsv", "val": (hue, saturation, value)}:
        print(f"HSV: {hue}, {saturation}, {value}")

This works if you’re not interested in static type checking (which is fine). If you are, you’ll find that the current ways to make this type-safe are less convenient than the Swift code.

You could type the above code with TypedDict but I think most would probably do it with NamedTuples:

from typing import NamedTuple, TypeAlias

class Transparent: ...
class Name(NamedTuple):
    color_name: str
class Rgb(NamedTuple):
    red: int
    green: int
    blue: int
class Hsv(NamedTuple):
    hue: int
    saturation: int
    value: int

BgColor: TypeAlias = Transparent | Name | Rgb | Hsv

background_color: BgColor = Rgb(39, 127, 168)
assert isinstance(background_color, BgColor)

match background_color:
    case Transparent():
        print("no color")
    case Name(color_name):
        print(f"color name: {color_name}")
    case Rgb(red, green, blue):
        print(f"RGB: {red}, {green}, {blue}")
    case Hsv(hue, saturation, value):
        print(f"HSV: {hue}, {saturation}, {value}")

As you can see, this has become very verbose.

You can do it a bit shorter, but it’s still not that elegant:

class Transparent: ...
Name = NamedTuple("Name", [("value", str)])
Rgb = NamedTuple("Rgb", [("value", tuple[int, int, int])])
Hsv = NamedTuple("Hsv", [("value", tuple[int, int, int])])

BgColor: TypeAlias = Transparent | Name | Rgb | Hsv

The Proposal

To solve this problem, I would like to propose a new construct: TypeEnum. It’s like an Enum but instead of its elements being values, they’re types.

(Alternative names for this concept: NamedUnion, TaggedUnion, NamedTupleUnion.)

It works like this:

from typing import TypeEnum

class BgColor(TypeEnum):
    transparent = ()
    name = (str,)
    rgb = (int, int, int)
    hsv = (int, int, int)

background_color = BgColor.rgb(39, 127, 168)
assert isinstance(background_color, BgColor)
assert not isinstance(BgColor.rgb, BgColor)  # different from Enum

match background_color:
    case BgColor.transparent:
        print("no color")
    case BgColor.name(color_name):
        print(f"color name: {color_name}")
    case BgColor.rgb(red, green, blue):
        print(f"RGB: {red}, {green}, {blue}")
    case BgColor.hsv(hue, saturation, value):
        print(f"HSV: {hue}, {saturation}, {value}")

Under the hood, TypeEnum does something like this:

class BgColor:
    transparent = 0
    name = NamedTuple("name", [("item0", str)])
    rgb = NamedTuple("rgb", [("item0", int), ("item1", int), ("item2", int)])
    hsv = NamedTuple("hsv", [("item0", int), ("item1", int), ("item2", int)])

However, this doesn’t include the magic necessary to make isinstance(BgColor.name("cerulean"), BgColor) work.

I’m using NamedTuple here because I need something that will populate __match_args__ for the pattern matching, and NamedTuple is an easy way to get that. The actual implementation probably wouldn’t really use NamedTuple, but it needs to be something that you can pattern-match on.

It would also be nice if there was an option for named fields, but it’s not a must from my side:

from typing import TypeEnum

class BgColor(TypeEnum):
    name = (str,)
    rgb = {"red": int, "green": int, "blue": int}  # named args

background_color = BgColor.rgb(red=39, green=127, blue=168)

This syntax with named fields was actually previously suggested in this mypy issue:

So, what do you think?

Whenever you find yourself switching on related types, you should always consider using polymorphism. In my opinion, your different colors should all inherit from some Color base class, which would have a display method that does what you want it to do here.

Consider what would happen if you added an RGBWithAlpha class. Then all of your match statements would have to add a case. With polymorphism, you can have RGBWithAlpha decide which behaviour it should inherit from its parent RGB.

Another benefit of polymorphism is that you can define an interface with its appropriate contracts, which can be checked for all subclasses.
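
A minimal sketch of that suggestion, assuming a display method that returns the text instead of printing it. The class bodies and the alpha field are hypothetical; only the names Color, display, RGB and RGBWithAlpha come from the paragraphs above:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class Color(ABC):
    @abstractmethod
    def display(self) -> str:
        """Return the text that the match statement would have printed."""


class Transparent(Color):
    def display(self) -> str:
        return "no color"


@dataclass
class RGB(Color):
    red: int
    green: int
    blue: int

    def display(self) -> str:
        return f"RGB: {self.red}, {self.green}, {self.blue}"


@dataclass
class RGBWithAlpha(RGB):
    alpha: float = 1.0  # hypothetical extra field

    # No display() override: the behaviour is inherited from RGB.


for color in (Transparent(), RGB(39, 127, 168), RGBWithAlpha(39, 127, 168, 0.5)):
    print(color.display())
```

Adding RGBWithAlpha required no change to the calling loop, and Color() itself cannot be instantiated because display is abstract.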

Unless you want iterability, I think you should replace NamedTuple with dataclass. At the very least, Name should not be a NamedTuple. Keep APIs as small as possible without sacrificing utility.

Your motivation seems to be that you want to switch on types. Switching on related types is generally a code smell. I think you should use polymorphism instead. I think the match statement is more appropriate for switching on unrelated types.

I think this is a cool idea. :wink: (Sorry I don’t have more valuable input here as I don’t have a use case for this just yet.)

There are third-party libraries, such as match-variant (GitHub: dusty-phillips/match-variant, "Python variant types that work with match"), that implement this today, albeit without syntax support (disclaimer: I am a co-creator of that project). To get something like this further, either the community needs to pick it up and then convince the typing community to support it the way it supports enums and dataclasses, or you have to get it into Python itself, and that’s going to take a PEP.