Hello all! First-time commenter on a PEP here. Apologies if I’m missing context in my suggestion here. Please take this comment as eager customer feedback.
I’ve been playing around with the new TypeForm and found myself very surprised by this:
s: str = "5"
t: type[str] = type(s) # ok
tf1: TypeForm[str] = str # ok
tf2: TypeForm[str] = type(s) # error
tf3: TypeForm[str] = t # error
I think the spec as currently written does not agree with this behavior. In particular, the spec says:
When a type expression is evaluated at runtime, the resulting value is a type form object.
str is, by this definition, a “type form object” because it is the result of evaluating a type expression at run time. But the result of evaluating type(s) at run time is also str, and since the spec does not say anywhere that a “type form object” is defined by its provenance, the result of type(s) must also be a “type form object.” Similarly, the definition of TypeForm itself is
TypeForm is a special form that, when used in a type expression, describes a set of type form objects.
Since type(s) produces a type form object, it is in the same set of type form objects as str and therefore should belong to TypeForm[str].
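To make the runtime half of this argument concrete, here is a minimal sketch using only builtins. It says nothing about what a type checker reports; it only shows that the two expressions evaluate to the same object. The variable names are mine, chosen for illustration.

```python
s: str = "5"

# One value comes from evaluating the type expression `str`,
# the other from calling type() on an instance.
from_expression = str
from_call = type(s)

# Both names are bound to the very same class object.
same_object = from_call is from_expression
print(same_object)  # True
```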
There is precedent for making provenance part of a type hint: LiteralStrings are the same thing as strs at runtime, but are defined specifically by their provenance.
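A small sketch of that precedent, as I understand it. The helpers run_query and build_fragment are hypothetical names I made up for illustration; the import is guarded so the snippet also runs on Python versions older than 3.11, where typing.LiteralString does not exist.

```python
from __future__ import annotations  # annotations are not evaluated at runtime

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only the type checker needs this import (typing.LiteralString, Python 3.11+).
    from typing import LiteralString


def run_query(sql: LiteralString) -> str:
    # Hypothetical helper: a real one would send `sql` to a database.
    return f"executing: {sql}"


def build_fragment(user_input: str) -> str:
    # The result depends on a non-literal argument, so statically it is just `str`.
    return "SELECT * FROM users WHERE name = " + user_input


literal = "SELECT 1"
dynamic = build_fragment("x")

run_query("SELECT 1")  # a literal argument satisfies LiteralString
run_query(dynamic)     # a type checker is expected to flag this; it still runs

# At runtime both values are plain str instances.
same_runtime_type = type(literal) is type(dynamic)
print(same_runtime_type)  # True
```

The distinction lives entirely in the type checker: nothing at runtime tells the two strings apart.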
I have two comments:
- I think the spec needs to be clarified here. Probably the easiest way to do this is to provide at least one example of the behavior I have above and compare it to LiteralString. At the risk of opening a further can of worms, I would also suggest that LiteralType is perhaps a better name than TypeForm. It makes clearer that the object in question is a “type” (not a syntactic construct), and it clarifies that static provenance, and not just runtime value, is part of the definition of the type.
- I would suggest that the definition of “type form object” in this spec as it stands (that is, defined only by the set of runtime values that can result from evaluating type expressions) is actually more useful. What do we lose by making type, UnionType, etc. assignable to TypeForm? Unless I’m missing some important details (and I probably am!), any Python program that will run on a TypeForm under the narrower literal definition will also run under the expanded definition, since at runtime, type(s) and str are indistinguishable.
As a concrete example of the second point, for some time we have had in our codebase an alias AnyType = type | UnionType | NewType | GenericAlias | TypeVar | ... to stand in for TypeForm. It’s hard to make this list authoritative because there are a number of runtime objects like typing._LiteralGenericAlias that are private to the standard library and that type checkers are not fond of. I was very excited to see TypeForm appear, only to realize that it does not in fact replace this construct at all.
The motivation for limiting TypeForm to runtime type objects with a specific provenance is, as far as I understand it, borderline circular: since type(s) is not a legal type expression, its result cannot be a legal type form by definition. It’s not clear to me that the (very good) motivation behind LiteralString, namely catching bugs that can arise from allowing non-programmer-specified strings, is present here. Similarly, if TypeForm objects were to be used in code that runs only during type-checking, then of course limiting them to static type expressions would be well-motivated.
I would greatly appreciate a little more motivation in the spec: what bad programs are ruled out by forbidding tf: TypeForm[str] = type("5")?