Consider the following example:

    from typing import TypedDict, Literal, TypeAlias

    class Config(TypedDict):
        int_field: int
        str_field: str

    Environment: TypeAlias = Literal['production', 'development']

    def _get_configuration(environment: Environment) -> Config:
        # Get the configuration fitting for the given environment
        pass

    def _infer_environment() -> Environment:
        # Read environment variables to infer if we are running in development or production
        pass

    def get_field(field_name: Literal['int_field', 'str_field']):
        configuration: Config = _get_configuration(_infer_environment())
        return configuration[field_name]

There are certain things that are a bit of a hassle to do typing-wise:
- There is no easy way for me to annotate the return type of the `get_field` function:

  a. I could annotate it as `int | str`, but that would suggest that the result of `get_field('int_field')` can be treated as a `str`, which is wrong.

  b. The user could explicitly annotate their own usage:

         int_field: int = get_field('int_field')

     but once again, if I make a mistake and accidentally annotate `int_field` as `str`, I would not be warned. Also, this is less convenient for whoever uses my function.

  c. I could use `typing.overload` to specify each use case like so:

         @overload
         def get_field(field_name: Literal['int_field']) -> int: pass
         @overload
         def get_field(field_name: Literal['str_field']) -> str: pass

     But this is quite a lot of work to achieve something that is already specified by the `Config` class. It would also force me to repeat the same thing if I had multiple functions returning inner fields of the `Config` class, which causes code duplication.

- Whenever I add a new key to `Config`, I need to add that key to the argument annotation of `get_field`.
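
For reference, here is a minimal runnable sketch of the overload workaround from option (c), as it can be written today. The `_CONFIG` dictionary and its values are made up for illustration and stand in for whatever `_get_configuration` would return.

```python
from typing import Literal, TypedDict, Union, overload


class Config(TypedDict):
    int_field: int
    str_field: str


# Hypothetical stand-in for the configuration that would normally be loaded.
_CONFIG: Config = {"int_field": 8080, "str_field": "localhost"}


# One overload per key: this is exactly the duplication of Config described above.
@overload
def get_field(field_name: Literal["int_field"]) -> int: ...
@overload
def get_field(field_name: Literal["str_field"]) -> str: ...
def get_field(field_name: Literal["int_field", "str_field"]) -> Union[int, str]:
    # The implementation keeps the loose signature; callers only see the overloads.
    return _CONFIG[field_name]


port = get_field("int_field")  # a type checker should infer int here
host = get_field("str_field")  # a type checker should infer str here
```

Every new key in `Config` needs one more overload plus an updated `Literal` in the implementation signature, which is the maintenance cost the two points above describe.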
In TypeScript, both of these problems are pretty much solved. The following example shows how a few of TypeScript’s features make this possible:

    interface Config {
        intField: number;
        strField: string;
    }

    // Stand-in for wherever the real configuration comes from
    declare const config: Config;

    const getField = <FieldName extends keyof Config>(fieldName: FieldName): Config[FieldName] => {
        // logic
        return config[fieldName];
    };

    // Inferred as number
    const intField = getField('intField');
    // Inferred as string
    const strField = getField('strField');

- The `keyof` operator allows me to describe a union of the literal keys of an object type:

      // Equivalent to a union of the literals 'strField' and 'intField'
      keyof Config

- When I have an object type consisting of multiple fields, I can access the type of one of its fields:

      // Equivalent to the type 'string'
      Config['strField']

- You can bound the generic `FieldName` to a union of literals, so we keep the connection between each field name and the type of its value.
If Python had an equivalent, it would enable better typing and prevent the code duplication and potential bugs of the current workarounds.
Now, I have some idea of how I’d want it to look in Python. If (hopefully) you like the idea, there are definitely certain problems with this solution that require discussion, which I will describe later.
Right now the solution looks as follows:
- Add a way to describe the type of a TypedDict’s value or a dataclass-like object’s attribute to the typing library.

      from typing import FieldValue, Literal, TypedDict
      from dataclasses import dataclass

      class Config(TypedDict):
          url: str
          timeout: float

      @dataclass
      class Config2:
          url: str
          timeout: float

      # Return type inferred as str
      def get_url() -> FieldValue[Config, 'url']:
          pass

      # Works for dataclasses too.
      def get_timeout() -> FieldValue[Config2, 'timeout']:
          pass

      # Literal can be used as well
      def get_timeout() -> FieldValue[Config2, Literal['timeout']]:
          pass

- Multiple literal strings/literal types can be used in order to return a union of their corresponding field types.

      # In both cases, return type inferred as str | float
      def get_config_value() -> FieldValue[Config, 'url', 'timeout']:
          pass

      def get_config_value() -> FieldValue[Config, Literal['url', 'timeout']]:
          pass

- Add `FieldName` to `typing`, which receives a TypedDict/dataclass-like object and is equivalent to a union of its literal field names:

      from typing import FieldName, TypedDict
      from dataclasses import dataclass

      class Config(TypedDict):
          url: str
          timeout: float

      @dataclass
      class Config2:
          url: str
          timeout: float

      # In both cases, field_name's type is treated like Literal['url', 'timeout']
      def validate_config_key(field_name: FieldName[Config]) -> None:
          pass

      def validate_config_key(field_name: FieldName[Config2]) -> None:
          pass
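
To make the intended meaning concrete, here is a rough runtime sketch of what the two proposed constructs would evaluate to, built on the existing `typing.get_type_hints`. The helpers `field_name` and `field_value` are hypothetical names made up for this illustration. They are not part of `typing`, and a static type checker would not understand them the way it would understand the proposed `FieldName` and `FieldValue`.

```python
from dataclasses import dataclass
from typing import Any, Literal, TypedDict, get_type_hints


class Config(TypedDict):
    url: str
    timeout: float


@dataclass
class Config2:
    url: str
    timeout: float


def field_name(tp: type) -> Any:
    """Hypothetical runtime analogue of FieldName[tp]: a Literal of all field names."""
    # Subscripting Literal with a tuple is the same as listing the items one by one.
    return Literal[tuple(get_type_hints(tp))]


def field_value(tp: type, name: str) -> Any:
    """Hypothetical runtime analogue of FieldValue[tp, name]: the annotated type of one field."""
    return get_type_hints(tp)[name]


# The same helpers work for the TypedDict and for the dataclass.
config_keys = field_name(Config)
config2_keys = field_name(Config2)
url_type = field_value(Config, "url")
timeout_type = field_value(Config2, "timeout")
```

Both classes expose their fields through class annotations, which is why a single pair of names could plausibly cover TypedDicts and dataclasses alike.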
With the potential implementation of PEP 695, our original script can look like the following:

    # FieldName and FieldValue are the proposed additions; they do not exist in typing today
    from typing import TypedDict, Literal, TypeAlias, FieldName, FieldValue

    class Config(TypedDict):
        field1: int
        field2: str

    Environment: TypeAlias = Literal['production', 'development']

    def _get_configuration(environment: Environment) -> Config:
        pass

    def _infer_environment() -> Environment:
        pass

    def get_config_field[ConfigName: FieldName[Config]](
        config_field: ConfigName
    ) -> FieldValue[Config, ConfigName]:
        configuration: Config = _get_configuration(_infer_environment())
        return configuration[config_field]

There are a few problems with my approach that I’d like to discuss with you (in case you even like the idea in the first place :P):
- TypeScript’s syntax of using brackets, `Config['url']`, is much simpler and cleaner in my opinion. I thought of having a similar thing in this solution, but I don’t know how easy it would be to implement. It would also change how the subscript syntax has been used so far: `dict[str, int]` is still a dict, `list[str]` is still a list, and `Callable[[str], int]` is still a callable. But `Config['url']` is not a `Config`. It can also break certain existing classes, since `TypedDict` would need to override `dict`’s `__class_getitem__`.

  More importantly, it can cause conflicts with stringified annotations.

- I don’t really like the idea of `FieldName` and `FieldValue` having a different meaning depending on the specific type (when used on TypedDicts, they refer to keys and values, but for dataclasses they refer to attribute names and values). But I figured this feature would be useful for both, and having separate types (`AttrName`, `AttrValue`, `ItemName`, `ItemValue`) might cause confusion and over-complicate the feature.

- Relating to the previous point, I’m not sure about the names `FieldName` and `FieldValue`. Since I want the names to work for both dataclasses and typed dicts, I had to avoid names containing the words “Key”, “Attribute” and “Item”. It would be less of a problem if we had separate types for dataclasses and typed dicts.

  Also, I avoided the plural forms `FieldNames` and `FieldValues`, since they describe the type of a single field name/value.
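
One way to read the stringified-annotations concern from the first point: a string inside a subscript is already treated as a forward reference by the existing typing machinery, so `Config['url']` would look like `Config` parameterised by a type named `url`. The sketch below only demonstrates that existing behaviour; the class `url` and the function `handler` are made up for the illustration.

```python
from typing import List, get_type_hints


class url:  # deliberately named like the field to make the clash visible
    pass


def handler(items: List["url"]) -> None:
    pass


# The string inside the brackets is resolved as a forward reference to the class above,
# not kept as the literal string "url".
resolved = get_type_hints(handler, localns={"url": url})
items_type = resolved["items"]
```

A bracket syntax on `TypedDict` would need a rule for telling a field name apart from a forward reference like this one, which the `FieldValue[Config, 'url']` spelling sidesteps.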
I’m sure there are plenty of other things to consider here. But I think this feature can help reduce code duplication and allow more precise typing for libraries and such.
I’d love to hear your thoughts.