Inlined typed dicts and typed dict comprehensions

As a suggested alternative to Partial, I continued exploring the idea of defining an inlined/anonymous typed dictionary syntax, and defining the concept of a typed dictionary comprehension on top of it (as inspired by this comment).

An inlined typed dictionary would be defined following the same already existing functional syntax:

# Functional, cannot be used in a type annotation:
A = TypedDict('A', {'a': int})
# Inlined, cannot use `total`/`closed`/etc arguments:
def fn() -> TypedDict[{'a': int}]: ...

Because a name cannot be specified for inlined typed dictionaries, the result of a TypedDict[...] subscription would have to be an instance of some _InlinedTypedDict class, which is a bit unfortunate.
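To make this concrete, here is a minimal (hypothetical) sketch of what such a subscription could produce at runtime: an anonymous typed dictionary built through the existing functional syntax. The `inlined_typed_dict` helper is illustrative only, not an actual API:

```python
from typing import TypedDict

def inlined_typed_dict(fields):
    # Hypothetical stand-in for what TypedDict.__class_getitem__ could do:
    # build an anonymous typed dictionary via the functional syntax. A real
    # implementation would return an _InlinedTypedDict instance instead.
    return TypedDict('<inlined TypedDict>', fields)

A = inlined_typed_dict({'a': int})
assert A.__required_keys__ == frozenset({'a'})
```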

I haven’t encountered any issues when spec’ing this syntax (it was already done by multiple people before).


Now the idea would be to define a typed dictionary comprehension syntax, as per this discussion. I encountered some issues (from both a static and a runtime perspective) that I wanted to discuss.

We introduce a KeyOf special form, that would be evaluated to a Literal by type checkers (and at runtime):

A = TypedDict('A', {'a': int, 'b': str})
type KeysOfA = KeyOf[A]
reveal_type(KeysOfA)  # Literal['a', 'b']
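At runtime, such a KeyOf operator could plausibly be built from a TypedDict's `__annotations__` (a hypothetical sketch, not a proposed implementation):

```python
from typing import Literal, TypedDict, get_args

class KeyOf:
    # Hypothetical: build a Literal out of the typed dictionary's keys
    # (annotations are insertion-ordered, so key order is preserved).
    def __class_getitem__(cls, td):
        return Literal[tuple(td.__annotations__)]

A = TypedDict('A', {'a': int, 'b': str})
assert get_args(KeyOf[A]) == ('a', 'b')
```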

With this KeyOf operator and by allowing TypedDict classes to be indexed [1], we would be able to mimic TS’s mapped types:

ImmutableA = TypedDict[{K: ReadOnly[A[K]] for K in KeyOf[A]}]

Taking some simpler examples:

T1 = TypedDict[{K: int for K in Literal['a', 'b']}]
T2 = TypedDict[{K: dict[str, A[K]] for K in KeyOf[A]}]

This raises two questions:

  1. At runtime, Literal[...] would need to be iterable. Is this going to cause any issues?
  2. The semantics of indexable TypedDict classes are unclear: if A has total=False set, what does dict[str, A[K]] represent? With K = 'a', is it dict[str, NotRequired[int]]? Or NotRequired[dict[str, int]] (and if so, how does this behave at runtime to “move” the qualifier outside the dict)? What about this comment?

The idea would also be to extend this to be used with type variables, meaning our inlined typed dictionary implementation should support further parametrization (similarly to Annotated[T, ...]):

type Partial[TD: TypedDict] = TypedDict[{K: NotRequired[TD[K]] for K in KeyOf[TD]}]

Which raises one other question:

  1. Here, TD is a type variable instance, which means type variables would need to be indexable and usable as an argument to KeyOf (what should KeyOf[TD] return?). If that’s not possible, should we create a new kind of type parameter with proper semantics defined? And if using the PEP 695 syntax, how could it be differentiated from other type parameters?

If we want to support something similar to TS’ Omit<T, K>:

# Option 1, not clear how this behaves at runtime:
type Omit[TD: TypedDict, Keys: Literal] = TypedDict[{K: TD[K] for K in KeyOf[TD] if K not in Keys}]

# Option 2, by allowing type level operations on literals:
type Omit[TD: TypedDict, Keys: Literal] = TypedDict[{K: TD[K] for K in KeyOf[TD] - Keys}]
  1. For option 2, does this mean we should allow type expressions like Literal['a', 'b'] - Literal['a']? What’s the type of Literal['a'] - Literal['a'] then?
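Whatever the spelling, the runtime effect Omit is meant to have can likewise be approximated with a plain dict comprehension over `__annotations__` (a sketch; `omit_td` is hypothetical, and it ignores totality and extra-items information):

```python
from typing import TypedDict

def omit_td(td, *keys):
    # Drop the given keys, keeping the rest of the fields unchanged.
    return TypedDict(
        f'Omit{td.__name__}',
        {k: v for k, v in td.__annotations__.items() if k not in keys},
    )

Movie = TypedDict('Movie', {'title': str, 'year': int})
TitleOnly = omit_td(Movie, 'year')
assert set(TitleOnly.__annotations__) == {'title'}
```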

Other misc. questions:

  1. When defining a typed dictionary comprehension by iterating over the keys of another typed dictionary, should we preserve the extra items/closed specification?

I’m not trying to get an answer to all of these questions; I primarily wanted to ask whether this implementation is worth pursuing, considering all the challenges described. Maybe some of them could be solved by giving up backwards compatibility and introducing a new syntax instead? Maybe we shouldn’t worry too much about runtime support (although runtime type checkers would then have trouble supporting such types)?


  1. Eric proposed using ValueOf[TD, K], which I simplified to TD[K] for clarity. ↩︎

Overall, this feels like an interesting idea and worth pursuing, but it seems difficult to get it to work elegantly at runtime.

Probably fine. Some typing objects are already iterable for Unpack reasons, but I don’t think that should cause confusion.

I think the least bad option is dict[str, int]. It loses some information but every other suggestion in the linked discussion seems worse.

I don’t know how to solve the KeyOf[TD] issue; that seems fundamentally difficult to do at runtime. To avoid making type variables subscriptable, we could use some X[A, B] syntax, maybe even using KeyOf[TD, K].

I guess we could add a restricted version of difference types, yes.

Never

Ideally yes, but that feels like another case where the dictcomp syntax won’t be a good fit.

I think Literal["a", "b"] is equivalent to Literal["a"] | Literal["b"].
If this is the case, it seems like maybe it should be generalized to all Unions being iterable.

Thanks for the answers. I’ll try to continue working on this syntax.

I would rather avoid changing the runtime behavior of typing constructs if it has no defined semantics yet.


To continue working on this syntax, I wrote a draft PEP defining the syntax of inlined typed dictionaries, along with a test runtime implementation. I’ll try to build future work on top of this PEP (if the comprehension syntax ends up being too hard to deal with, we may want to keep just this first PEP).

There isn’t a way for PEP 649 to “stringize” a comprehension, so I’d be a little wary of that / want to call it out in any potential PEP.


FYI: it is already possible to iterate through the literal values at runtime, and there’s at least one idiot depending on the current behaviour (me), but I’m happy to change my code. I posted it on Stack Overflow a while ago. It’s pretty ugly:

from collections.abc import Sequence, Iterator
from typing import Literal, get_args, TypeAliasType, cast

def get_literal_vals(alias: TypeAliasType) -> frozenset:
    def val(alias: TypeAliasType):
        return alias.__value__
    def args(alias: TypeAliasType):
        return get_args(val(alias))
    def resolve[T](alias: TypeAliasType | tuple[T, ...] | T) -> Iterator[T]:
        # Recursively unwrap aliases (and unions of aliases) down to the
        # underlying literal values.
        if isinstance(alias, TypeAliasType):
            for arg in resolve(args(alias)):
                yield from resolve(arg)
            return
        if isinstance(alias, tuple):
            t_seq = cast(Sequence[T], alias)
            for element in t_seq:
                yield from resolve(element)
            return
        yield alias
    return frozenset(resolve(alias))

type Doubles = Literal["ab", "de", "gh"]
type Triples = Literal["abc", "def", "ghi"]
type DT = Doubles | Triples
dt_set: frozenset[DT] = get_literal_vals(DT)

There may be some redundancy in there, but it works…

Unless I’m mistaken, your code should be fine. What I propose is having Literal[...] iterable:

assert [el for el in Literal['a', 'b']] == ['a', 'b']

This is different from iterating over get_args(Literal['a', 'b']).


I’m not entirely familiar with the semantics of PEP 649. Is it going to be an issue because the PEP 649 implementation can’t keep the reference to Inner here?

def func():
    Inner = int

    class A:
        a: TypedDict[{K: Inner for K in Literal[...]}]

    return A

A = func()
A.__annotations__['a']

I think any comprehension-like syntax is generally going to have too many shortcomings at runtime, unless you can come up with a proper new syntax that is currently invalid and could be used within arbitrary subscript expressions. That way, you can completely encapsulate the state the comprehension is trying to encode into a new object, rather than relying on the comprehension somehow magically returning an object that captures that state.

For backwards compatibility you could then manually construct that same object, just like you can with TypeAliasType. It might even be more useful to first come up with a backwards compatible representation that doesn’t involve anything fancy like a comprehension and then think about if and how we could make this easier to use in the future with a proper syntax extension.

I feel similarly about TypedDict literals. I think they could be a lot more powerful if they didn’t try to rely on the current generics syntax in order to be backwards compatible, but rather used completely new syntax: ideally powerful enough to cover any corner cases the class and functional syntaxes currently have, such as keys that aren’t valid identifier names, including good support for PEP 728. I would also try to design the syntax in such a way that adding support for different kinds of keys, e.g. int or bytes literals, would be possible in the future.

If you use completely new syntax it’s also easier to get away with allowing comprehension-like syntax within a typed dict literal, to support some of these use-cases, although I think it would be wiser to build something more generally useful and not limited to typed dict literals. Although to be fair, a lot of these use cases only make sense for structural types, so perhaps all we need is an equivalent syntax for Protocol literals. There is however still the broader use-case of mapping variadic type parameters like TypeVarTuple and ParamSpec where a comprehension-like syntax extension could help with readability.

No, that part works fine (it’s all implemented on the CPython main branch, so you can try it out).

The problem is that PEP 649 attempts to use some clever tricks to reconstruct the original source code. That doesn’t seem to be possible with comprehensions. Similarly, if a comprehension means something different to a static type checker from the expression that it evaluates to, runtime type checkers won’t be able to tell the difference.

I’m also starting to think the same (this is what I also stated in the last paragraph of the first post).

I’m assuming you are referring to PEP 764? If so, this can further be discussed when I open a new discussion.

Ah yes, that must be the “stringifier” logic based on AST nodes. I remember it now.


Seems like we face a lot of limitations with this comprehension syntax. We could either focus on a Map special form (see this and posts below) or try figuring out a new syntax.