Right now, there are a few type forms in the type system that act as refinement types. This is not a term that has been formally introduced to the python type system.
Refinement types are types that are constrained by their value in some manner. The examples I’m aware of of these in the type system at this point are Literal
and LiteralStr
. Neither of these actually specify a different type, but type checkers currently attempt to use solely subtyping to handle these. Both of these specify properties of the value (that it is a literal, or a specific literal, or in the case of LiteralStr
that it was constructed only of other string literals) This causes problems when interacting with variance in situations where only the type is what is cared about (See: Definition of `str.split`, `str.rsplit` and `str.splitlines` results in false positive errors · Issue #10887 · python/typeshed · GitHub)
I think it’s worth adding some language to better support this (And other potential future cases, like a potential type u8 = int v where v >= 0 and v <= 255
, which it appears certain libraries like pydantic are trying to force Annotated to do the work of)
While the potential future case definitely requires entirely separate consideration (and is probably only reasonable to allow basic comparisons to statically knowable values), there is the case of what is already currently in the type system which I’d like to take a moment to improve how type checkers handle.
The relatively simple language here would be “When something only cares about the type, not the value constraints, type checkers may allow the use of something with the correct type that has its value constrained”
The use of may
there is intentional.
It’s obvious that
x: list[str] = []
...
x.extend("This that".split())
should be fine even though “This that” here is a LiteralStr
and not a str
, all LiteralStr
are exactly of a type string, but have constraints on how the string was sourced, similarly, it should be fine to drop the constriant implicitly at sole assignment:
s: list[str] = "this that".split() # no longer list[LiteralStr]
However, the use of may allows denying some more dubious cases that have type safety issues, for example:
def x(vals: list[list[str]]):
for l in vals:
l.append(input())
def y(val: list[LiteralStr]):
x([val]) # this should error, type safety issue above, the caller of y got a non-literal added to their list
This can be done slightly more intelligently by a type checker that maintains a directed graph in memory to track type var use, or which understands which type forms involve mutability, but with the current allowed things in Literal and LiteralStr, it is trivially safe to drop value based constraints when the substitution only effects the inner most type. The dubious case above involves not only substitution of LiteralStr for str, but of list[LiteralStr] for list[str] in a way that there could still be references expecting list[LiteralStr]