Annotations using inner field names and values of TypedDicts/dataclass-like objects

Consider the following example:

from typing import TypedDict, Literal, TypeAlias


class Config(TypedDict):
    int_field: int
    str_field: str


Environment: TypeAlias = Literal['production', 'development']


def _get_configuration(environment: Environment) -> Config:
    # Get the configuration fitting for the given environment
    pass


def _infer_environment() -> Environment:
    # Read environment variables to infer if we are running in development or production
    pass


def get_field(field_name: Literal['int_field', 'str_field']):
    configuration: Config = _get_configuration(_infer_environment())
    return configuration[field_name]

There are certain things that are a bit of a hassle to do typing-wise:

  1. There is no easy way for me to annotate the return type of the get_field function:
    a. I could annotate it as int | str, but that would suggest that get_field('int_field') can return a str, which it never does.

    b. The user could explicitly annotate their own usage: int_field: int = get_field('int_field'), but once again, if I make a mistake and accidentally annotate int_field as str, I would not be warned. This is also less convenient for whoever uses my function.

    c. I could use typing.overload in order to specify each use case like so:

    from typing import Literal, overload


    @overload
    def get_field(field_name: Literal['int_field']) -> int: ...


    @overload
    def get_field(field_name: Literal['str_field']) -> str: ...

    # The undecorated implementation must follow the overloads.
    

    But this is quite a lot of work to achieve something that is already specified by the Config class. It would also force me to repeat the same overloads for every function that returns an inner field of Config, which duplicates code.

  2. Whenever I add a new key to Config, I need to add that key to the argument annotation of get_field.
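
Until something like keyof exists, one stopgap for the second problem is a runtime check that the hand-written Literal still matches Config. This is only a minimal sketch: ConfigKey and keys_in_sync are hypothetical names made up for illustration, and the check runs at import or test time, so a type checker still sees nothing.

```python
from typing import Literal, TypedDict, get_args


class Config(TypedDict):
    int_field: int
    str_field: str


# Hand-written key union, like the one in the get_field annotation (hypothetical alias name)
ConfigKey = Literal['int_field', 'str_field']


def keys_in_sync(typed_dict: type, key_literal: object) -> bool:
    """Return True if the Literal lists exactly the keys declared on the TypedDict."""
    return set(get_args(key_literal)) == set(typed_dict.__annotations__)


# Fails as soon as a key is added to Config but not to ConfigKey (or the other way around)
assert keys_in_sync(Config, ConfigKey)
```

This catches drift between the two definitions, but it does nothing for the return type problem in point 1.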

In TypeScript, both of these problems are pretty much solved. The following example shows how, using some of TypeScript’s features:

interface Config {
    intField: number;
    strField: string;
}

const getField = <FieldName extends keyof Config>(fieldName: FieldName): Config[FieldName] => 
    {
     // logic
    }

// Inferred as number
const intField = getField('intField')

// Inferred as string
const strField = getField('strField')

  1. The keyof functionality allows me to describe a union of the literal keys of the object
    // Equivalent of a union of the literals 'strField' and 'intField'
    keyof Config
    
  2. When I have an object consisting of multiple fields, I can access the type of one of its fields.
    // Equivalent of the type 'string'
    Config['strField']
    
  3. You can bound the generic parameter (FieldName above) to a union of literals, so we keep the connection between each field name and its value

If Python had an equivalent, it would help with implementing better typings, and would prevent the code duplication and potential bugs of the current workarounds.
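
For comparison, the closest thing available in Python today is runtime introspection. The sketch below approximates keyof and the indexed access Config['strField'] with typing.get_type_hints. The helper names field_names and field_type, and the ConfigData dataclass, are assumptions made up for this example. None of this is visible to a type checker, which is exactly the gap described above.

```python
from dataclasses import dataclass
from typing import Any, TypedDict, get_type_hints


class Config(TypedDict):
    int_field: int
    str_field: str


@dataclass
class ConfigData:  # hypothetical dataclass twin of Config
    int_field: int
    str_field: str


def field_names(cls: type) -> frozenset[str]:
    """Runtime stand-in for keyof: the declared field names of cls."""
    return frozenset(get_type_hints(cls))


def field_type(cls: type, name: str) -> Any:
    """Runtime stand-in for Config['strField']: the declared type of one field."""
    return get_type_hints(cls)[name]


# The same helpers work for TypedDicts and dataclasses alike
assert field_names(Config) == field_names(ConfigData)
assert field_type(Config, 'int_field') is int
```

The return annotation of field_type has to be Any, because there is currently no way to tell the type checker that the result depends on the name argument.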

Now, I have some idea of how I’d want it to look in Python. If (hopefully) you like the idea, there are definitely certain problems with this solution that require discussion, which I will describe later.

Right now the solution looks as follows:

  1. Add a way to describe the type of a TypedDict’s value or a dataclass-like object’s attribute to the typing library.

    from typing import FieldValue, Literal, TypedDict  # FieldValue is the proposed addition
    from dataclasses import dataclass
    
    class Config(TypedDict):
        url: str
        timeout: float
    
    @dataclass
    class Config2:
        url: str
        timeout: float
    
    # Return type inferred as str
    def get_url() -> FieldValue[Config, 'url']:
        pass
    
    # Works for dataclasses too.
    def get_timeout() -> FieldValue[Config2, 'timeout']:
        pass
    
    # Literal can be used as well
    def get_timeout() -> FieldValue[Config2, Literal['timeout']]:
        pass
    
  2. Multiple literal strings/literal types can be used in order to return a union of their corresponding field types

    # In both cases, return type inferred as str | float
    def get_config_value() -> FieldValue[Config, 'url', 'timeout']:
        pass
    
    def get_config_value() -> FieldValue[Config, Literal['url', 'timeout']]:
        pass
    
  3. Add FieldName to typing, which receives a TypedDict/dataclass-like object and is equivalent to a union of its literal field names:

    from typing import FieldName, TypedDict
    from dataclasses import dataclass
    
    class Config(TypedDict):
        url: str
        timeout: float
    
    @dataclass
    class Config2:
        url: str
        timeout: float
    
    
    # In both cases, field_name's type is treated like Literal['url', 'timeout']
    def validate_config_key(field_name: FieldName[Config]) -> None:
        pass
    
    def validate_config_key(field_name: FieldName[Config2]) -> None:
        pass
    

With the potential implementation of PEP 695, our original script could look like the following:

    from typing import TypedDict, Literal, TypeAlias, FieldName, FieldValue


    class Config(TypedDict):
        field1: int
        field2: str


    Environment: TypeAlias = Literal['production', 'development']


    def _get_configuration(environment: Environment) -> Config:
        pass


    def _infer_environment() -> Environment:
        pass


    # ConfigName is bound to the union of Config's literal field names
    def get_config_field[ConfigName: FieldName[Config]](
        config_field: ConfigName
    ) -> FieldValue[Config, ConfigName]:
        configuration: Config = _get_configuration(_infer_environment())
        return configuration[config_field]

There are a few problems I have with my approach I wish to discuss with you (In case you even like the idea in the first place :P)

  1. TypeScript’s syntax of using brackets, Config['url'], is much simpler and cleaner in my opinion. I thought of having a similar thing in this solution, but I don’t know how easy it would be to implement. It would also be a change in how the syntax has been used so far:

    dict[str, int] is still a dict. list[str] is still a list. Callable[[str], int] is still a callable. But Config['url'] is not a Config. It can also break certain existing classes, since TypedDict would need to override dict’s __class_getitem__.

    More importantly, it can cause conflicts with stringified annotations.

  2. I don’t really like the idea of FieldName and FieldValue having a different meaning based on the specific type (when used on TypedDicts, they refer to their keys and values, but for dataclasses, they refer to attribute names and values). But I figured this feature would be useful for both, and having separate types (AttrName, AttrValue, ItemName, ItemValue) might cause confusion and over-complicate the feature.

  3. Relating to the previous point, I’m not sure about the names FieldName and FieldValue. Since I want the names to work for both dataclasses and typed dicts, I had to avoid names containing the words “Key”, “Attribute” and “Item”. It would be less of a problem if we had separate types for dataclasses and typed dicts.

    Also, I avoided the usage of plural FieldNames and FieldValues since they are describing the type of a single field name/value.
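
On the first point, the existing convention (a subscripted type is still "a kind of" the type being subscripted) can be observed at runtime with typing.get_origin. This is just a small sketch of current behavior and not part of the proposal. The variable names are made up for the example.

```python
from collections.abc import Callable as AbcCallable
from typing import Callable, get_origin

# Each subscripted form still points back at the thing that was subscripted
aliases = [dict[str, int], list[str], Callable[[str], int]]
origins = [get_origin(alias) for alias in aliases]

assert origins == [dict, list, AbcCallable]
```

A bracket syntax where Config['url'] means str would be the first case where the result has nothing to do with the origin, which is why it feels like a change in how the syntax has been used.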

I’m sure there are plenty of other things to consider here. But I think this feature can help reduce code duplication and allow more precise typing for libraries and such.

I’d love to hear your thoughts :slight_smile:

I think there is a small mistake in the TypeScript code: it should be FieldName where it says ConfigField.

I wonder how common it is that someone wants to refer to a single field. The fact that typescript has this feature seems to imply it is common, but I never felt the need to do this.

On the other hand, I have faced your original problem where a function returns different types depending on the given string.

With the recently proposed TypeVarDict feature, you could solve that original problem also like this:

from typing import TypeVarDict

TD = TypeVarDict("TD", default=Config)

def get_config_field(field_name: TD.key) -> TD.value:
    configuration: TD = _get_configuration(_infer_environment())
    return configuration[field_name]

Though this function would be generic then, which isn’t really what was intended. Maybe there is a better way to make this work with fixed TypedDicts.

The other downside is that this doesn’t work with dataclasses.

Thanks for pointing out the mistake, fixed it :slight_smile:

As for how common this feature is: I found myself in need of it, which is why I’m proposing it. But I think in general this would be useful for library developers to reduce code duplication and provide more accurate typing.

In fact, I have plenty of ideas I wish to bring over from TypeScript to Python (so Python can have the best of all worlds :slight_smile: ). A lot of these features could be considered quite niche for the average Python user; however, they could be really useful for library developers, who can provide better typing for their users even if the users do not know that these advanced features are being used behind the scenes.

As for the TypeVarDict feature: it seems it can provide an overall sweeter syntax compared to my solution, but only for the TypeVar case of connecting keys to values.
I will say though, that for the problem of supporting both TypedDicts and dataclass-like types, the TypeVarDict feature could take a similar approach to my solution (where the keys->values connection can also be used the same way for attributes if the class being operated on is a dataclass-like type).

I’ve since come to the conclusion that your proposal is better than mine. I wrote a detailed proposal here and a core dev offered to sponsor the PEP, but I think your proposal is more flexible. Do you want to write a PEP for it? I’m pretty sure Jelle would sponsor it.

Before going ahead with a PEP… @Jelle (and others), thoughts?

The idea indeed overlaps @tmk’s proposal on the python/typing forum: your FieldName is equivalent to that proposal’s Key. However, both proposals include additional elements.

Your proposed syntax for FieldValue is problematic, because in general strings can also be types. If a type checker sees FieldValue[SomeType, "T"], does that mean the key "T" or a reference to the type parameter T?

Also, I feel FieldType would be a more appropriate name than FieldValue.

I find myself missing keyof pretty much any time that I switch from TypeScript to Python, so I would love to see a proposal in this style. I think that this would improve the usability of TypedDict in particular, especially alongside the proposal floating around for defining TypedDicts inline. (I know that keyof wouldn’t be useful for inline TypedDicts, but they are both meaningful usability improvements.)

Thank you so much for your kind words :smiley:
I would definitely like to write a PEP for it, but note that English is not my native language, and since I’ve never written a PEP before, I’m unfamiliar with the process. Still, if it’s possible, I would like to try.

The ambiguity of having a direct string as a parameter to FieldValue is something that has bothered me as well. The reason I suggested it was for the sake of ease, with the hope that we can resolve the ambiguity in a similar manner to how the Literal type does.

T = TypeVar("T")

# Would always refer to the literal string "T"
FieldType[SomeType, "T"]

# Would refer to the type var T
"FieldType[SomeType, T]"

And yes, I took your suggestion; FieldType is way more intuitive than FieldValue. I don’t know how I didn’t think of that :slight_smile:
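
The Literal behavior mentioned above can be checked with today's typing module. In this minimal sketch, a quoted annotation is resolved as a forward reference, while a string inside Literal stays a plain string. FieldType itself is only proposed, so the sketch uses list and Literal as stand-ins. The function and variable names are made up for the example.

```python
from typing import Literal, TypeVar, get_args, get_type_hints

T = TypeVar("T")


def example(as_forward_ref: "list[T]", as_literal: Literal["T"]) -> None:
    """Only the annotations matter here; the body is intentionally empty."""


# Pass the namespace explicitly so the lookup of T does not depend on where this runs
hints = get_type_hints(example, localns={"T": T})

forward_ref_args = get_args(hints["as_forward_ref"])  # the TypeVar T itself
literal_args = get_args(hints["as_literal"])          # the plain string "T"

assert forward_ref_args == (T,)
assert literal_args == ("T",)
```

If FieldType followed the same rule as Literal for its second argument, a type checker could treat a string there as a key name and never as a forward reference.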

I’m not sure if my comment will be helpful, but I have a use case for this idea on Starlette (FastAPI dependency).

I’ve explained it here: How to convert `TypedDict` to a `dataclass_transform` behavior? · python/typing · Discussion #1457 · GitHub.