Amend PEP 586 to make `enum` values subtypes of `Literal`

Inspired by this highly upvoted issue: Literal of enum values · Issue #781 · python/typing · GitHub, I propose that enum values that

  • are defined literally
  • and are instances of the type of the literal
    (so this would apply mostly to IntEnum and StrEnum, but not to regular Enum)

should be subtypes of the corresponding literal type. For example, Number.ONE should be considered a subtype of Literal[1], but Weekday.MONDAY should not be considered a subtype of Literal[1], because Weekday doesn’t subclass int. Essentially, I propose the following amended inference rule for enums/literals:

enum.CASE <: Literal[<val>] if and only if type(enum.CASE) <: type(<val>) and enum.CASE.value = <val> literally. Moreover, in this case, equality comparisons should evaluate to True.
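
A minimal runtime sketch of the two cases the rule distinguishes. Weekday is not defined anywhere in this post; it is assumed here to be a plain Enum whose MONDAY member has the value 1.

```python
from enum import Enum, IntEnum

class Number(IntEnum):
    ONE = 1
    TWO = 2

class Weekday(Enum):  # assumed definition: a plain Enum, not an int subclass
    MONDAY = 1

# Number.ONE is an int and compares equal to its literal value,
# so the proposed rule would make it a subtype of Literal[1].
print(isinstance(Number.ONE, int), Number.ONE == 1)

# Weekday.MONDAY is not an int and does not compare equal to 1,
# so the rule would not apply even though Weekday.MONDAY.value == 1.
print(isinstance(Weekday.MONDAY, int), Weekday.MONDAY == 1)
```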

The rationale is that:

  1. It is type safe, if the enum subclasses the corresponding type.

  2. It is consistent with runtime behavior.

    In fact, neither pyright nor mypy is consistent with runtime behavior in some cases:

    Simple comparison [pyright playground], [mypy-playground]:

    from enum import IntEnum
    
    class Number(IntEnum):
        ONE = 1
        TWO = 2
    
    if 1 == Number.ONE:  # <- incorrect no-overlap
        print("equal")  # gets called at runtime
    

    match-case example [pyright playground], [mypy-playground]:

    from enum import StrEnum
    from typing import Literal, assert_never
    
    class Options(StrEnum):
        A = "foo"
        B = "bar"
    
    def show(x: Literal["foo", "bar"]) -> None:
        match x:
            case Options.A:  # <-- incorrectly marked as unreachable
                return print("It's a foo!")
            case Options.B:  # <-- incorrectly marked as unreachable
                return print("It's a bar!")
        assert_never(x)    # <-- false positive
    
    show("foo")  # prints "It's a foo!"
    show("bar")  # prints "It's a bar!"
    

    And if we annotate show(x: Options) instead, then we get flagged at the call site if we pass literal strings. So, if both behaviors are to be allowed, one has to annotate with Options | Literal["foo", "bar"], which introduces a lot of redundancy. To keep the checkers happy, something like this is needed.

  3. It is consistent with the “core behavior” of Literal types:

    • Literal types indicate that a variable has a specific and concrete value.

      Exactly what enums do

    • Given some value v that is a member of type T, the type Literal[v] shall be treated as a subtype of T

      Exactly how enums work, for example isinstance(Number.ONE, int) is True.

  4. It simplifies type hinting dramatically in some cases (as per Literal of enum values · Issue #781 · python/typing · GitHub)

9 Likes

A small update since this is still getting views/likes: pyright added support for this feature in 1.1.375. ty seems to support it out-of-the-box.

from enum import StrEnum
from typing import Literal

class Options(StrEnum):
    X = "X"
    Y = "Y"

x: Literal["X"] = Options.X

checker   version          status
pyright   >=1.1.375        passes
mypy      1.16.0           fails
ty        0.0.1-alpha.8    passes
pyrefly   0.18.0           fails
1 Like

Correction: ty doesn’t support enums yet.

2 Likes

I don’t think this should be allowed. We should instead say clearly that Literal[X] means that the object is exactly that value, i.e. Literal[1] is inhabited only by an instance of int itself, not by an instance of a subclass.

Allowing subclasses of ints to inhabit Literal[1] would break a lot of assumptions that type checkers use and that make Literals convenient to use. For example, type checkers may use x == 1 to narrow a type to Literal[1]; or if x to narrow Literal[0] out of a type; or they may support mathematical operations on literals. All of those become unsafe if the object may be a subclass of int.

Here’s a sample of unsound behaviors in pyright due to this feature:

from enum import IntEnum
from typing import Literal, reveal_type

class X(IntEnum):
    a = 1

    def __add__(self, value: int, /) -> int:
        return self.value + value + 42

    def __eq__(self, other: object):
        return False

    def __bool__(self) -> bool:
        return False

def f(x: Literal[1]):
    print(reveal_type(x + 1))  # 44 at runtime, 2 according to pyright

def g(x: Literal[1, 2]):
    if x == 1:  # false for X.a
        reveal_type(x)  # Literal[1]
    else:
        reveal_type(x)  # Literal[2]

def h(x: Literal[0, 1]):
    if x:  # false for X.a
        reveal_type(x)  # Literal[1]
    else:
        reveal_type(x)  # Literal[0]

f(X.a)
g(X.a)
h(X.a)
12 Likes

And to provide a more constructive solution to the linked issue, I would suggest adding new type constructors EnumValues[Enum] and EnumNames[Enum]. These would accept a single argument, which must be an Enum class, and be equivalent to a union of the Literals of the names or values of the enum. For example, given the enum:

class E(enum.Enum):
    a = 1
    b = 2

EnumNames[E] would be equivalent to Literal["a", "b"], and EnumValues[E] would be Literal[1, 2]. It would be an error to use EnumValues on an enum with values that are not compatible with Literal.
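
As a rough illustration of what these constructors would replace: EnumNames and EnumValues do not exist in typing today, so the aliases below (ENames and EValues, names made up here) are written by hand, and a small runtime check guards against them drifting from the enum.

```python
import enum
from typing import Literal, get_args

class E(enum.Enum):
    a = 1
    b = 2

# Hand-written aliases that the proposed constructors would make unnecessary.
ENames = Literal["a", "b"]   # what EnumNames[E] would stand for
EValues = Literal[1, 2]      # what EnumValues[E] would stand for

# Runtime check that the aliases still match the enum definition.
assert set(get_args(ENames)) == {member.name for member in E}
assert set(get_args(EValues)) == {member.value for member in E}
```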

8 Likes

I agree with Jelle’s argument here. I’ve created a bug report in the pyright issue tracker and plan to revert that change.

7 Likes

I’m disappointed by this resolution. The __add__ example is pretty convoluted, is that something anybody actually does? I don’t know of any IntEnums that redefine __add__, __bool__ or __eq__. What people actually do pretty often is the following case:

  1. You have a C library that you’re wrapping using a C extension. Let’s call it _foo. It exposes a function called, say, set_mode(flag) that accepts a bunch of integer flags as arguments.
  2. The flags are exposed as int constants in the C extension like FOO_MODE_QUIET, FOO_MODE_VERBOSE, FOO_MODE_DEBUG, etc.
  3. It’s a C module so you need a .pyi file to present it to mypy. The .pyi file correctly specifies that _foo.set_mode(flag) accepts a union of int literals.
  4. However, in the high-level Python foo library, you don’t want users to have to use the low-level _foo.FOO_MODE_QUIET names, so you make an elegant IntEnum with values like Mode.verbose, Mode.quiet, Mode.debug, etc. You assign the real int values from _foo.FOO_MODE_QUIET and friends to the respective fields on the enum.
  5. You expect users to be able to pass those int enum values to your foo.set_mode() function, but mypy complains. You can’t use the enum in the _foo.pyi file because you would have to import foo, which would create a cycle and is invalid in general.

This is not a theoretical example; it is, for instance, what pygit2 is doing today. They currently end up shadowing a good part of their Python code in the .pyi file because it’s impossible for int enum values to be passed to functions typed with a union of literals.

We would see the same problem with the socket standard module if it were directly typed, because it wraps the _socket C extension that doesn’t accept IntEnum values but straight integers. The only reason we don’t see this problem is because we’re cheating with typeshed.
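
A self-contained sketch of the wrapping scenario described in the numbered steps. The real _foo would be a C extension with a .pyi stub; here it is simulated in pure Python, and the numeric flag values and the _foo_set_mode name are invented for illustration.

```python
from enum import IntEnum
from typing import Literal

# Stand-ins for the C extension `_foo`; the numeric values are made up.
FOO_MODE_QUIET = 0
FOO_MODE_VERBOSE = 1
FOO_MODE_DEBUG = 2

def _foo_set_mode(flag: Literal[0, 1, 2]) -> int:
    """Plays the role of _foo.set_mode as described by its .pyi stub."""
    return int(flag)

class Mode(IntEnum):
    quiet = FOO_MODE_QUIET
    verbose = FOO_MODE_VERBOSE
    debug = FOO_MODE_DEBUG

def set_mode(mode: Mode) -> int:
    # Works at runtime because Mode members are ints, but a checker that
    # does not treat Mode.verbose as Literal[1] reports an argument error here.
    return _foo_set_mode(mode)

print(set_mode(Mode.verbose))
```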

3 Likes

I’d argue that this kind of override

…makes the class look like a duck, while it doesn’t quack like one.
The behavior of X seems unsound as it breaks common assumptions (if not the LSP itself) that can be made about an integer.

3 Likes

There’s actually a problem I came across where all the type checkers fail to agree with runtime: match-case with a value pattern.

from enum import StrEnum
from typing import Literal, assert_never, reveal_type

class Color(StrEnum):
    RED = "r"
    GREEN = "g"
    BLUE = "b"

def test_literal_as_enum(x: Literal["g"] = "g") -> None:
    match x:
        case Color.RED:
            assert_never(x)
        case Color.GREEN:
            reveal_type(x)
        case Color.BLUE:
            assert_never(x)
        case _:
            assert_never(x)

def test_enum_as_literal(y: Literal[Color.BLUE] = Color.BLUE) -> None:
    match y:
        case "r":
            assert_never(y)
        case "g":
            assert_never(y)
        case "b":
            reveal_type(y)
        case _:
            assert_never(y)

test_literal_as_enum()
test_enum_as_literal()

At runtime, this will select the green branch for the first match-case, and the blue branch for the second match-case. But the type checkers all fail:

  • mypy thinks the runtime branches are unreachable
  • pyright thinks the runtime branches are unreachable
  • pyrefly thinks the runtime branches are unreachable
  • ty thinks the runtime branches are unreachable
  • zuban accepts the green branch in the first match-case, but thinks the blue branch is unreachable in the second match-case

Thanks, I opened Incorrect narrowing of `StrEnum`s in `match` statements · Issue #1454 · astral-sh/ty · GitHub for the ty bug

I still think it’s a shame that this is not possible. What if we just made it a convention that IntEnum and StrEnum behave as expected, by:

  • annotating relevant methods like __eq__, __add__ and __bool__ with @final in typeshed / cpython itself.
  • This shouldn’t break any existing code, as the decorator does nothing at runtime.
  • Users that do want to override these methods, and not get typing errors, still can manually subclass enum.Enum and int/str instead of subclassing IntEnum/StrEnum.
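
A rough sketch of the idea, using a hypothetical SealedIntEnum base class in place of changes to IntEnum itself (nothing like it exists in typeshed or CPython today). Only __bool__ and __add__ are shown; the decorator has no effect on runtime behavior.

```python
from enum import IntEnum
from typing import final

class SealedIntEnum(IntEnum):
    """Hypothetical stand-in for an IntEnum whose int behavior is final."""

    @final
    def __bool__(self) -> bool:
        return int.__bool__(self)

    @final
    def __add__(self, other: int, /) -> int:
        return int.__add__(self, other)

class Number(SealedIntEnum):
    ONE = 1

    # A type checker would reject this override if it were uncommented,
    # because SealedIntEnum.__add__ is marked @final:
    # def __add__(self, other: int, /) -> int:
    #     return self.value + other + 42

print(Number.ONE + 1, bool(Number.ONE))
```

With the overrides ruled out statically, a checker could treat Number.ONE like the plain integer 1 for addition and truthiness.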

The docs even explicitly state

IntEnum, StrEnum, and IntFlag

These three enum types are designed to be drop-in replacements for existing integer- and string-based values

So the purpose of these classes is, as @ambv says, to be used exactly this way. I don’t think the examples @Jelle gave really satisfy the idea of a “drop-in replacement” for a regular integer value, because, as @Eneg points out, they fail the duck test.

8 Likes