Hi everyone,
I’m working on a proposal and a prototype to bring type-safe reflection to schema-defined types in Python. This aims to solve the “magic string” problem prevalent in libraries like SQLAlchemy, Pydantic, and various data frame tools, without incurring the implementation complexity caused by Python’s dynamic nature.
I would love to get the community’s feedback on the core concept before finalizing the PEP.
The Problem
In our project, we frequently write utility classes or functions that rely on string literals to reference object attributes. Without static reflection, we cannot validate these strings or infer the types they point to.
Consider a UserFieldGetter class designed to extract specific fields from a User. Today, we are forced to type this using str and Any, losing all type safety:
class UserFieldGetter:
    def __init__(self, field_name: str):
        # We cannot validate that field_name actually exists on User
        self._field_name = field_name

    def __call__(self, user: User) -> Any:
        # The return type is unknown (Any), leading to bugs downstream
        return getattr(user, self._field_name)

# Accepted without complaint; the AttributeError only appears later, when the getter is called
getter = UserFieldGetter("invalid_field")
Currently, we rely on a Mypy plugin to handle this.
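For comparison, the plugin-free option available today is to spell out every field by hand with Literal and overload. The sketch below is only an illustration of that boilerplate, not code from our project; the User class and the get_field helper are assumed names.

```python
from dataclasses import dataclass
from typing import Literal, overload


@dataclass
class User:
    name: str
    age: int


# One overload per field: each name/type pair repeats the class definition
# and has to be kept in sync with it by hand.
@overload
def get_field(user: User, field_name: Literal["name"]) -> str: ...
@overload
def get_field(user: User, field_name: Literal["age"]) -> int: ...
def get_field(user: User, field_name: str) -> object:
    return getattr(user, field_name)


user = User(name="Hello", age=1)
print(get_field(user, "name"))
print(get_field(user, "age"))
```

This type-checks, but it scales linearly with the number of fields and has to be repeated for every schema class.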
The Proposal
I am proposing two new special forms for the typing module: FieldKey and FieldType.
These are restricted to “schema-defined” types:
- Standard @dataclasses.dataclass, TypedDict, and NamedTuple
- Classes using @typing.dataclass_transform (e.g., Pydantic models, SQLAlchemy declarative bases)
By limiting the scope to these types, we can validly apply a “Closed World Assumption” (treating the set of fields as finite and known at analysis time), which is impossible for standard dynamic Python classes.
How it works
1. FieldKey[T]
Evaluates to a Literal union of all valid field names in T.
- If T is a Union (A | B), FieldKey returns the intersection of fields (only keys present in all members).
- It respects inheritance (MRO).
2. FieldType[T, K]
Evaluates to the type of the field named K on class T.
- K must be a subtype of FieldKey[T].
- If K is a union of literals, the result is the union of the field types.
- It correctly handles Generics (e.g., extracting the type from Box[int] vs Box[str]).
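To make the two rules concrete, here is a rough runtime analogue built on dataclasses.fields() and typing.get_type_hints(). It is a sketch of the intended semantics only, not how a type checker would implement them; field_keys, field_type, and the Base/User/Admin classes are hypothetical names, and generics are not covered.

```python
import dataclasses
from dataclasses import dataclass
from typing import get_type_hints


@dataclass
class Base:
    id: int


@dataclass
class User(Base):
    name: str
    age: int


@dataclass
class Admin(Base):
    name: str
    level: int


def field_keys(*classes: type) -> set[str]:
    """Runtime analogue of FieldKey: fields present on every class given."""
    key_sets = [{f.name for f in dataclasses.fields(c)} for c in classes]
    return set.intersection(*key_sets)


def field_type(cls: type, *keys: str) -> set[type]:
    """Runtime analogue of FieldType: declared types of the named fields."""
    allowed = field_keys(cls)
    for key in keys:
        if key not in allowed:
            raise KeyError(f"{key!r} is not a field of {cls.__name__}")
    hints = get_type_hints(cls)  # includes annotations inherited via the MRO
    return {hints[key] for key in keys}


print(field_keys(User))               # inherited "id" is included
print(field_keys(User, Admin))        # union -> only the shared keys
print(field_type(User, "name", "age"))  # several keys -> several types
```

With these definitions, field_keys(User, Admin) keeps only id and name, which mirrors the intersection rule for FieldKey[A | B].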
Example Usage
Here is how the UserFieldGetter example looks with the proposal. The type checker now validates the field name and correctly infers the return type.
from dataclasses import dataclass
from typing import FieldKey, FieldType

@dataclass
class User:
    name: str
    age: int

class UserFieldGetter[K: FieldKey[User]]:
    def __init__(self, field_name: K):
        self._field_name = field_name

    def __call__(self, user: User) -> FieldType[User, K]:
        return getattr(user, self._field_name)

# Usage
name_getter = UserFieldGetter("name")
age_getter = UserFieldGetter("age")
user = User(name="Hello", age=1)
val1 = name_getter(user)  # Inferred type: str
val2 = age_getter(user)  # Inferred type: int

# Static Error: "invalid" is not assignable to Literal['name', 'age']
invalid_getter = UserFieldGetter("invalid")
Why this approach?
Standard Python classes are open: attributes can be added at runtime, so a general-purpose key operator such as KeyOf[object] would effectively degrade to str.
By restricting this to Dataclasses and Schemas, we leverage the existing static guarantees of these structures. This allows libraries to define type-safe APIs for filtering, ordering, and serialization without requiring custom plugins for every tool.
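The difference between an open class and a declared schema can be seen at runtime with the standard library alone. This is a small sketch for illustration; Plain and Point are made-up classes.

```python
import dataclasses
from dataclasses import dataclass


class Plain:
    def __init__(self) -> None:
        self.x = 1


@dataclass
class Point:
    x: int
    y: int


plain = Plain()
plain.anything = "added at runtime"  # legal: the attribute set is open
print(sorted(vars(plain)))           # ['anything', 'x']

point = Point(1, 2)
point.z = 3  # also legal at runtime, but z is not a declared field
declared = [f.name for f in dataclasses.fields(point)]
print(declared)                      # ['x', 'y']
```

The declared field list stays fixed no matter what is attached to an instance later, and that fixed list is what FieldKey would enumerate.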
Prototype
To demonstrate the feasibility of this proposal, I have created a working prototype based on a fork of Pyright (disclaimer: it’s mainly generated by LLMs).
While this is not intended to be a reference implementation, it is now functional for experimentation. You can check out the branch to see FieldKey and FieldType inference in action:
Here is an example usage with the Pyright output:
from typing import TypedDict, TypeVar, FieldKey, FieldType, NamedTuple, Literal
from dataclasses import dataclass

class Foo(TypedDict):
    x: int
    y: str

def get_container_value[K: FieldKey[Foo]](container: Foo, key: K) -> FieldType[Foo, K]:
    return container[key]

reveal_type(get_container_value(Foo(x=1, y="hello"), "x"))
# information: Type of "get_container_value(Foo(x=1, y="hello"), "x")" is "int"
reveal_type(get_container_value(Foo(x=1, y="hello"), "y"))
# information: Type of "get_container_value(Foo(x=1, y="hello"), "y")" is "str"

@dataclass
class User:
    name: str
    age: int

class UserFieldGetter[K: FieldKey[User]]:
    def __init__(self, field_name: K):
        self._field_name = field_name

    def __call__(self, user: User) -> FieldType[User, K]:
        return getattr(user, self._field_name)

name_getter = UserFieldGetter("name")
age_getter = UserFieldGetter("age")
user = User(name="Hello", age=1)
reveal_type(name_getter(user))
# information: Type of "name_getter(user)" is "str"
reveal_type(age_getter(user))
# information: Type of "age_getter(user)" is "int"
invalid_getter = UserFieldGetter("invalid")
# error: Argument of type "Literal['invalid']" cannot be assigned to parameter "field_name" of type "K@UserFieldGetter" in function "__init__"
#   Type "Literal['invalid']" is not assignable to type "Literal['name', 'age']"
#     Type "Literal['invalid']" is not assignable to type "Literal['name', 'age']"
#       "Literal['invalid']" is not assignable to type "Literal['name']"
#       "Literal['invalid']" is not assignable to type "Literal['age']" (reportArgumentType)

class Result(NamedTuple):
    returncode: int
    reason: str | None

def get_reason(result: Result) -> FieldType[Result, Literal["reason"]]:
    return result.reason

reveal_type(get_reason(Result(0, "")))
# information: Type of "get_reason(Result(0, ""))" is "str | None"

T1 = TypeVar("T1")
T2 = TypeVar("T2")

@dataclass
class Box[T1, T2]:
    value1: T1
    value2: T2

def get_value1[T1, T2](box: Box[T1, T2]) -> FieldType[Box[T1, T2], Literal["value1"]]:
    return box.value1

reveal_type(get_value1(Box(value1=1, value2="hello")))
# information: Type of "get_value1(Box(value1=1, value2="hello"))" is "int"
reveal_type(get_value1(Box(value1="hello", value2=1)))
# information: Type of "get_value1(Box(value1="hello", value2=1))" is "str"
Feedback Requested
I am finalizing the PEP text now, but I wanted to gauge sentiment on:
- Naming: Are FieldKey and FieldType clear? Is FieldName better than FieldKey?
- Semantics: Does the intersection rule for Unions (FieldKey[A | B] only returns shared keys) align with your expectations for safety?
- Annotated Types: Should FieldType preserve Annotated metadata (e.g., FieldType[T, "age"] returns Annotated[int, Gt(0)]) or strip it down to the base type (int)?
- Properties: Should FieldKey include @property definitions? Currently, I have excluded them to align strictly with dataclasses.fields() and declared schema fields, but I know some serializers (like Pydantic’s computed_field) treat them as fields.
- Performance: Computing FieldKey for large Unions or complex hierarchies could be expensive. Are there specific constraints or lazy-evaluation strategies type checker maintainers would recommend?
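For the Annotated and Properties questions, the existing runtime behavior may be a useful reference point. The sketch below uses only the standard library; the User class, the display_name property, and the string metadata (a stand-in for something like Gt(0)) are assumptions made for illustration.

```python
import dataclasses
from dataclasses import dataclass
from typing import Annotated, get_type_hints


@dataclass
class User:
    name: str
    age: Annotated[int, "must be positive"]  # stand-in for Gt(0)-style metadata

    @property
    def display_name(self) -> str:
        return self.name.title()


# Default: Annotated metadata is stripped down to the base type.
stripped = get_type_hints(User)
# Opt-in: include_extras=True preserves the Annotated wrapper.
preserved = get_type_hints(User, include_extras=True)

print(stripped["age"])
print(preserved["age"])

# Properties are not dataclass fields, matching the current FieldKey choice.
field_names = [f.name for f in dataclasses.fields(User)]
print(field_names)
```

typing.get_type_hints() strips metadata by default and preserves it only when asked, which is one possible precedent for FieldType; dataclasses.fields() leaves display_name out.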
Thank you!