Using the type system to statically detect typos in Literal arguments

randolf-scholz · July 25, 2024, 3:27pm

Consider a function like scipy.integrate.solve_ivp. The parameter method has 6 known values: “RK45”, “RK23”, “DOP853”, “Radau”, “BDF” and “LSODA”.

Annotating method: Literal["RK45", "RK23", "DOP853", "Radau", "BDF", "LSODA"] is likely out of the question, since this would cause type errors whenever the argument is given dynamically.

On the other hand, it would be very desirable to flag usage when we know statically that the argument given is incorrect, such as method="RK44".
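
To make the trade-off concrete, here is a minimal sketch of the strict `Literal` annotation. The reduced `solve_ivp` signature, the `Method` alias, and the `IVP_METHOD` environment variable are all hypothetical stand-ins, not scipy's actual API:

```python
import os
from typing import Literal

# Hypothetical alias listing the six known method names.
Method = Literal["RK45", "RK23", "DOP853", "Radau", "BDF", "LSODA"]


def solve_ivp(method: Method = "RK45") -> str:
    """Hypothetical stand-in for scipy.integrate.solve_ivp; echoes the method."""
    return method


solve_ivp("RK23")  # accepted by a type checker
solve_ivp("RK44")  # type: ignore[arg-type]  # typo: rejected statically (desired)

# A value only known at runtime has type `str`, not `Method`.
dynamic = os.environ.get("IVP_METHOD", "RK45")
solve_ivp(dynamic)  # type: ignore[arg-type]  # also rejected (the unwanted part)
```

Both flagged calls run fine at runtime; the difference is only in what a static checker reports.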

One can get close to the desired behavior with multiple overloads (code sample in pyright playground):

from typing import overload, Literal, LiteralString, Never

@overload
def solve_ivp(method: Literal["RK45", "RK23"]) -> None: ...
@overload
def solve_ivp(method: LiteralString) -> Never: ...
@overload
def solve_ivp(method: str) -> None: ...
def solve_ivp(method) -> None:
    pass

arg = input()
solve_ivp(arg)  # OK
solve_ivp("RK23")  # OK
solve_ivp("RK44")  # not OK
solve_ivp("RK45")  # OK  <- marked unreachable due to prior Never

This could be improved further if something like the previously proposed typing.Error were adopted. However, it would still be very verbose. There should be an easier way to express the type “one of these specific literal strings or any non-literal string”, ideally one that does not require the use of overloads. With a full algebra of types, this could be expressed as (str & Not[LiteralString]) | Literal["RK45", "RK23"], but this seems far away.

An alternative that does not introduce additional types would be to specify that unions of literal strings with str should be interpreted this way, for instance letting Literal["RK45", "RK23"] | str mean “the literal strings ‘RK45’ or ‘RK23’ or any non-literal string”. Of course, it would be very understandable if there is little appetite for such additional special casing.

srittau · July 25, 2024, 3:44pm

I’d argue that an error when supplying a dynamic string is a feature. I recommend that libraries export a type alias for “literal enums” like the one in this case, which users of the library can then use instead of str. TypeIs or TypeGuard can be used at the edge of the system to verify that a given str is actually one of the enum values.

erictraut · July 25, 2024, 3:59pm

It sounds like what you want is something like @deprecated. In fact, @deprecated might already provide a reasonable solution here. Here’s how it looks.

It might make sense for us to consider adding an @error decorator that works the same as @deprecated except that it generates a type checker error.

I agree with @srittau that it’s better in these cases to limit the call to a specified set of allowed literals rather than supporting a dynamic value, but I appreciate this isn’t always feasible when adding type annotations to existing libraries.

Tinche · July 25, 2024, 4:40pm

Echoing other folks in the thread, I’d just do this. Autocomplete and IDE popups will also be improved.

randolf-scholz · July 26, 2024, 12:47pm

Thanks for the good suggestion, Eric.

It still feels like substantial overhead due to the burden of having to add overloads. This quickly becomes problematic in several cases, such as data classes (since we do not want to manually write __init__ there), or already overloaded functions (having too many overloads quickly starts to feel unmaintainable).