Do we need a ValueAlias in addition to TypeAlias?

Tinche · October 11, 2024, 9:45pm

For context, I’m looking at a typing issue from the attrs issue tracker. This is something that (I’m somewhat sure) used to work on both pyright and mypy, but no longer does. The issue reporter also asked pyright for their opinion.

The enum standard library module is very useful for creating singletons - a common pattern is (was?) to create a single member enum, and alias that member at module scope to something friendly. Users almost wouldn’t even need to know an enum was involved, they would just import the member.

Here’s a short snippet directly from attrs:

import enum

class _Nothing(enum.Enum):
    NOTHING = enum.auto()

NOTHING = _Nothing.NOTHING

Then, this could be used in a function signature like this:

from typing import Literal

from attrs import NOTHING

def f(a: str | Literal[NOTHING] = NOTHING):
    ...

Like I mentioned, at a certain point this stopped working. Both mypy and pyright claim that NOTHING is a variable, and so unsuitable for use with Literal. Which makes sense. So I tried changing the code to:

NOTHING: Final = _Nothing.NOTHING

Now NOTHING is definitely not a variable - it’s supposed to just be an alias to _Nothing.NOTHING. I’d expect, since Literal[_Nothing.NOTHING] is valid, Literal[NOTHING] would be valid too (since it’s equivalent to _Nothing.NOTHING). To my surprise, this doesn’t seem to be the case.

From what I can tell, it’s not really possible to “alias”/“mirror” enum members directly to module scope any more and keep their full functionality (like use in literals). This is unfortunate with regards to API ergonomics. It’s not just attrs, combining enums (and literals of enum members) in unions with other types is a nifty feature for advanced data modeling.

Can we get Final to make this possible again?

erictraut · October 11, 2024, 10:11pm

Even if a variable has a Final qualifier, it’s still a variable. The expression _Nothing.NOTHING is not a valid type expression, and NOTHING is not a valid type alias.

Here is how I recommend handling this pattern such that it is compatible with the type system.

import enum
from typing import Final, Literal

class _Nothing(enum.Enum):
    NOTHING = enum.auto()

type NothingType = Literal[_Nothing.NOTHING]
NOTHING: Final = _Nothing.NOTHING

def f(a: str | NothingType = NOTHING): ...

You might also be interested in this thread, which discusses the idea of adding a sentinel facility to the language.

mikeshardmind · October 11, 2024, 10:12pm

I think we should be able to do this. The current behavior looks like overly strict adherence to wording in the specification that may itself have been too strict: it has caused a pattern that has been in use for years to stop working, without there necessarily being a type safety issue. Final should not even be needed here, as nothing else can be assigned to that value.

Tinche · October 13, 2024, 2:05am

Thanks, and I’ll make this change in attrs.

However, it’s unfortunate this now requires two imports, and I don’t mean just because of this particular attrs issue but because of elegant data modeling in general.

Right, ok. How would you feel if Literal was made to accept Final variables whose value was statically known (so, themselves of type Literal)? Like this:

a: Final = 1

b: Literal[a]  # b: Literal[1]

Feels like an easy and safe change.

erictraut · October 14, 2024, 6:28am

Your proposed solution would still require two imports — one for Literal and one for NOTHING, so I don’t find that argument compelling.

If you’re concerned about import counts and don’t want to export a type alias symbol, you could export the enum class (or an alias thereof) from attrs, which would result in this usage:

from attrs import Sentinel as S
from typing import Literal

def f(a: str | Literal[S.NOTHING] = S.NOTHING):
    ...

I think that’s problematic. The phrase “statically known” isn’t well defined.

A type expression should “spell” a type in a way that is unambiguous and consistent across all static and runtime type checkers. We’ve made great strides over the past year clarifying the rules about type expressions, and this has improved consistency and clarity. What you’re proposing here would take us a step backward because a Final variable can be assigned the result of any value expression. Such expressions are not constrained by type expression rules. They can be arbitrarily complex, and type checkers can (and do) evaluate such expressions differently. The type of a Final variable is therefore dictated by type-checker-specific type inference behaviors. Consider the following three examples. Mypy and pyright interpret these differently despite both conforming with the typing spec.

from typing import Final

t = (1, 2, 3)
x: Final = t[0]
reveal_type(x)  # Pyright reveals Literal[1], mypy reveals int

def add(a, b): return a + b
y: Final = add(-1, 2)
reveal_type(y)  # Pyright reveals Literal[1], mypy reveals Any

z: Final[int] = 1
reveal_type(z)  # Pyright reveals Literal[1], mypy reveals int

We could potentially carve out special cases in the typing spec to guarantee consistency for this particular use case. For example, we could say that only expression forms that are already supported in a Literal subscript are supported (e.g. MyLiteralValue: Final = 1). Or we could say that Final must be used with an explicit Literal type declaration (e.g. NOTHING: Final[Literal[_Nothing.NOTHING]] = _Nothing.NOTHING) for the resulting symbol to be usable within a Literal subscript. This would result in type checkers needing to distinguish between Final variables that can be used in Literal subscripts and those that cannot. Special-case rules like this have costs, and they tend to pile up over time and create problems when extending the type system. We’d have to ask ourselves whether this use case is sufficiently compelling to justify such special casing in the typing spec. I’m not convinced it is.

I can understand why one might want to use an enum to define a sentinel value, but a class is probably a better choice if you’re concerned about typing ergonomics.

from typing import final

@final
class NOTHING: ...

def f(a: str | type[NOTHING] = NOTHING):
    if a is NOTHING:
        ...
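To make the class-as-sentinel idea a little more concrete, here is a small runnable sketch. The `lookup` function and its `table` contents are hypothetical, chosen only to show the sentinel distinguishing "no default given" from a real string default.

```python
from typing import Type, Union, final


@final
class NOTHING:
    """Sentinel: the class object itself is the marker and is never instantiated."""


def lookup(key: str, default: Union[str, Type[NOTHING]] = NOTHING) -> str:
    table = {"host": "localhost"}  # hypothetical data for the example
    if key in table:
        return table[key]
    # No default supplied: the parameter still holds the sentinel class.
    if default is NOTHING:
        raise KeyError(key)
    return default
```

Here a single import gives callers both the default value and the name used in the annotation, at the cost of spelling the annotation as `type[NOTHING]` rather than a Literal.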

An even better solution is the one proposed in PEP 661, which was just submitted to the SC for consideration. I’d prefer to rally behind this PEP and avoid burdening the type system with additional special cases.

mikeshardmind · October 14, 2024, 9:02am

I think we can do something better defined to support the existing idiom, which has only recently begun to be reported as an error even though it isn’t unsafe.

When a variable may only have a single Literal value, it may be used as an alias in future Literal definitions.

This limits it neatly, and works with both Final and Literal without requiring both. None of the variables below can ever have a different value, even though assignment is technically allowed for two of them:

a: Final = 1
b: Literal[1] = 1
MISSING: Final = SomeEnum.MISSING
NOTHING: Literal[SomeEnum.NOTHING] = SomeEnum.NOTHING

We could also go a step further and say that, for an unannotated assignment of an enum member at module scope, the inferred type must be Literal[that-enum-member].

If this hadn’t been a common idiom already, I’d be advocating for this less strongly. But it has been in use across the ecosystem for years, and there is no reason why it should be unsafe. It only interacts poorly with the current wording as it is now being enforced, not with any of the underlying type theory.