Draft of typing spec chapter for enums

I’ve written a first draft of a new chapter for the typing spec that focuses on type checker behaviors for enumerations.

The Enum class has many atypical behaviors, and type checkers need to include special-case logic to handle these behaviors. Not surprisingly, there are a number of differences between type checkers in how they handle enums. This is an attempt to define and standardize the desired behaviors.

If you are interested in reviewing the draft, you can find it in this PR. Minor feedback (typos, wording suggestions, etc.) can be added directly to the PR. For significant questions or areas of potential disagreement, please post to this thread so community members get better visibility and can engage in the discussion.

This looks like a good first draft to me and covers everything I could think of. I only have one minor comment:

In the section about Member/Non-Member attributes I wonder if type checkers should be encouraged to be slightly more strict to catch likely bugs at the definition site, rather than only potentially catch them once you try to use non-members as members. This would only be a “may”, like some of the other optional features in the spec, but I think it’s at least worth recommending.

Specifically, assigning a callable to a name inside an enum, rather than using an explicit method definition, seems like it would almost always be a bug. The callable should have been wrapped in enum.member or enum.nonmember, or, in earlier Python versions, explicitly marked as ignored to say “yes, I know this won’t be an enum member”.

I’ve made this mistake in the past and I never used the enum in a way where my error became apparent at runtime or using a type checker until much later in development, so catching this kind of error early seems valuable to me.
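
As a minimal sketch of the mistake described above (the names `double`, `Transform`, and `FixedTransform` are hypothetical, chosen only for illustration): a plain function assigned inside an enum body is a descriptor, so the runtime quietly leaves it out of the members. `enum.member` only exists on Python 3.11 and later, so that part of the sketch is guarded.

```python
import enum


def double(x: int) -> int:
    return x * 2


class Transform(enum.Enum):
    IDENTITY = 1
    # A plain function is a descriptor, so the enum machinery does NOT
    # turn this into a member; it stays an ordinary class attribute.
    DOUBLE = double


member_names = [m.name for m in Transform]  # DOUBLE is missing here

# enum.member (Python 3.11+) makes the intent explicit.
if hasattr(enum, "member"):

    class FixedTransform(enum.Enum):
        IDENTITY = 1
        DOUBLE = enum.member(double)

    fixed_names = [m.name for m in FixedTransform]
else:
    fixed_names = None
```

Nothing fails at the definition site in `Transform`, which is why a type checker diagnostic there could catch the problem earlier.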

Within a type stub, members can be defined using the actual runtime values,
or a placeholder of ... can be used::

Currently, in typeshed, many enums are defined like this:

class Foo(Enum):
    ABC: str
    DEF: str

And while we should probably change many instances to use explicit values, sometimes explicitly annotating the type is necessary. For example, suppose we can’t use the actual value for whatever reason:

class Foo(Enum):
    ABC = ...
    DEF = ...

The type of Foo.ABC.value and Foo.DEF.value is undefined, as it depends on the actual values used.

Currently, in typeshed, many enums are defined like this

Yes, I think that’s problematic because type checkers currently cannot differentiate between members and non-member attributes for these stub definitions; they’re ambiguous. I’m hoping that we can get those fixed in typeshed.
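
The ambiguity is easier to see against the runtime behavior of an ordinary `.py` file. This is only a sketch with hypothetical class names (`AnnotatedOnly`, `Assigned`); it says nothing about how any particular type checker treats stubs.

```python
from enum import Enum


class AnnotatedOnly(Enum):
    # Annotations without values: no members are created at runtime.
    ABC: str
    DEF: str


class Assigned(Enum):
    # Assignments create members.
    ABC = "abc"
    DEF = "def"


annotated_members = list(AnnotatedOnly)        # empty list
assigned_members = [m.name for m in Assigned]  # both names
```

In a source file the annotation-only form declares non-member attributes, so a stub that uses the same form for members cannot be told apart from one that really means non-members.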

There are only 13 such enums in all of typeshed, and all are in third-party stubs. They should be easy to fix.

The type of Foo.ABC.value and Foo.DEF.value is undefined, as it depends on the actual values used.

The type of value in this case would be Any (which is how it is defined in the Enum class) unless the stub provides an explicit type annotation for _value_, like this:

class Foo(Enum):
    _value_: str
    ABC = ...
    DEF = ...
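
For comparison, here is a runnable source-file version of the same pattern. It is a sketch only: the string values are assumptions standing in for the `...` placeholders, which are meant for stubs.

```python
from enum import Enum


class Foo(Enum):
    # Declares the type of each member's value; this is not a member itself.
    _value_: str
    ABC = "abc"  # stands in for the `...` placeholder used in a stub
    DEF = "def"


names = [m.name for m in Foo]
abc_value = Foo.ABC.value  # a checker following the draft would see `str` here
```

The `_value_` annotation does not create a member, so only `ABC` and `DEF` appear when iterating over `Foo`.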

Are we looking at the same thing? I see a few stdlib instances, for example this or this rare example that uses non-“primitive” objects.

This is a good solution and we should be able to use this for the remaining cases where we can’t use the = syntax in typeshed.

Are we looking at the same thing?

Ah, I was searching for (Enum), but some of the stdlib classes use (enum.Enum), so I missed those.

In any case, it should be straightforward to convert these to conform with the proposed spec language.

Typeshed itself is easy to change, but there is likely a lot of third-party code that also relied on this pattern and is currently accepted by type checkers.

I think your draft is still the right choice for the spec, as the alternative is ambiguous, but type checkers that currently support other patterns will have to go through a careful deprecation procedure.

Yeah, I have concerns about that too.

Here’s a quick experiment that attempts to quantify the extent of the problem. I modified pyright to remove its special-case handling of stubs and ran the change through mypy_primer. Here are the results. This generated a number of compatibility issues, but it appears they are mostly related to typeshed stubs.

As a follow-on experiment, I temporarily modified pyright’s copy of typeshed so it conforms with the proposed spec. Here are the results. Most of the issues from the previous experiment are now gone. The remaining issues are because someone copied and vendored the inspect.pyi typeshed stub.

I’m not sure what to take away from that experiment other than that there will be some compatibility issues as you’ve predicted.

I find the “non-member attributes” phrase very confusing. On the one hand, enum classes are made of attributes, some of which are members, and the others are not; on the other hand, the attributes that are not members are typically attributes of the members:

Planet.SATURN.mass

or (from the new chapter)

class Pet(Enum):
    genus: str
    species: str

I would say that genus and species are actually member-attributes, as they won’t be available directly from Pet (at least from what we see in the snippet).

I don’t know if there’s a good solution to this confusion.
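
To make the distinction concrete, here is a sketch that fills out the Pet snippet. The member values and the `__init__` method are assumptions added for illustration; only the two annotations come from the snippet above.

```python
from enum import Enum


class Pet(Enum):
    # Declared without values: attributes of each member, not members.
    genus: str
    species: str

    CAT = ("felis", "catus")
    DOG = ("canis", "lupus")

    def __init__(self, genus: str, species: str) -> None:
        # Each member's tuple value is unpacked into these arguments.
        self.genus = genus
        self.species = species


members = [m.name for m in Pet]   # only CAT and DOG
cat_genus = Pet.CAT.genus         # available on a member
on_class = hasattr(Pet, "genus")  # not available on the class itself
```

Under these assumptions `genus` is reachable only through a member such as `Pet.CAT`, which is the sense in which it behaves like a member attribute.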

The draft currently mandates that this is an error:

WrongName = Enum('Color', 'RED GREEN BLUE')  # Type checker error

But mypy currently allows this, with reasonable behavior.

I agree that this code is confusing and poorly written, but it works at runtime, and I don’t see a strong case for why mypy should be changed to emit an error here. Can we make this error optional?
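
A short sketch of what “works at runtime” means here: the enum keeps the name passed to `Enum()`, regardless of the variable it is bound to.

```python
from enum import Enum

# The variable name and the name passed to Enum() differ.
WrongName = Enum("Color", "RED GREEN BLUE")

class_name = WrongName.__name__             # the declared name, "Color"
member_text = str(WrongName.RED)            # also uses the declared name
member_values = [m.value for m in WrongName]
```

The members are fully usable through `WrongName`; only the displayed name differs from the variable name.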

Another case that may need some explicit language is enums of unusual types. Usually enum values are ints or strs, but they can of course be of any type. Sebastian brought up the ssl.Purpose enum as an example. Do I understand correctly that in a stub, it should be specified as:

class Purpose(_ASN1Object, enum.Enum):
    _value_: _ASN1Object
    SERVER_AUTH = ...
    CLIENT_AUTH = ...

Mypy does enforce this for other functional forms that create classes or objects: TypeVars, ParamSpecs, TypeVarTuples, NewTypes, and NamedTuples. I suspect the fact that mypy doesn’t enforce it in the Enum case is just an oversight. For consistency, I think this should be mandated in the spec, as it is in those other cases.

Or it should be removed from the spec in those other cases… But I would lean toward leaving it in.

Yes, that’s correct, assuming the type of each member’s value attribute is _ASN1Object.

I opened a PR to change mypy to error if the name for a functional Enum does not match the variable name: python/mypy pull request #16805, “Error for assignment of functional Enum to variable of different name”, by hauntsaninja. Subsequent type analysis proceeds as before. mypy_primer suggests that mismatches are not uncommon, but it didn’t surface additional compelling reasons to allow them.

I’m in agreement with Jelle here. I don’t think we should lightly start mandating that type checkers emit errors where they didn’t before and where the warnings have limited usefulness for the users of type checkers. I don’t see any particular reason to change the spec with regard to TypeVar-likes, NewTypes and NamedTuples; we’ve always had those rules, and the rules there promote cleaner code. But demanding that mypy start rejecting code it previously accepted feels unnecessarily disruptive here, and I’m not sure what practical benefit it would have.

I’m fine with not mandating such things, but I want to be consistent. Is there a general rule we can establish that explains (with solid justification) why name consistency is enforced in some cases but not others?

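from collections import namedtuple
from enum import Enum
from typing import NamedTuple, NewType, ParamSpec, TypedDict, TypeVar, TypeVarTuple

# Mypy Error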
# Mandated in typing spec:
# > The argument to TypeVar() must be a string equal to the variable name
# > to which it is assigned.
T_Bad = TypeVar("T")

# Mypy Error
# Not explicitly mandated in typing spec, but implied
P_Bad = ParamSpec("P")

# Mypy Error
# Not explicitly mandated in typing spec, but implied
Ts_Bad = TypeVarTuple("Ts")

# Mypy Error
# Not currently mandated in typing spec
NT_Bad = NewType("NT", int)

# Mypy Error
# Not currently mandated in typing spec
NTp_Bad = NamedTuple("NTp", [("a", int)])

# Mypy Error
# Not currently mandated in typing spec
NTp2_Bad = namedtuple("NTp2", "a b c")

# No Mypy Error
# Not currently mandated in typing spec
E_Bad = Enum("E", "a b c")

# Mypy Error
# Not currently mandated in typing spec
TD_Bad = TypedDict("TD", {"a": int})

My preference would be to not mandate an error for any of them. These examples are weird and bad style, but their meaning is unambiguous.

So the errors are allowed (since they could cause really confusing error messages) but the spec shouldn’t mandate them? I’d be okay with that.
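
As a rough illustration of where the confusion can come from, the sketch below shows that runtime names and reprs follow the string argument rather than the variable name. It covers runtime behavior only; the wording of any type checker’s messages is not shown and may differ.

```python
from collections import namedtuple
from typing import NamedTuple, TypeVar

T_Bad = TypeVar("T")
NTp_Bad = NamedTuple("NTp", [("a", int)])
NTp2_Bad = namedtuple("NTp2", "a b c")

typevar_name = T_Bad.__name__         # "T", not "T_Bad"
typed_repr = repr(NTp_Bad(a=1))       # uses "NTp"
plain_repr = repr(NTp2_Bad(1, 2, 3))  # uses "NTp2"
```

Anything that reports the declared name will mention an identifier that never appears as a variable in the source.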

One clear pattern is that all the types you showed for which mypy produces an error are part of the typing module in the stdlib, whereas Enum is not, so legacy code might use a different name, and mandating an error would have produced errors for code that was previously perfectly fine. However, collections.namedtuple also produces an error, so that isn’t a hard rule. (And I can’t actually think of any other cases in the stdlib where this would be relevant.)

IMO, allowing the errors but not mandating them for non-TypeVar cases is a good idea.

This feels like a linter issue, not a typing issue.