PEP 705 – TypedMapping

Gobot1234 · May 17, 2023, 8:00pm

Most of my reasoning for wanting this to be a separate class comes down to wanting to safely type files like https://github.com/Gobot1234/steam-ext-tf2/blob/main/steam/ext/tf2/types/schema.py, but I do think a separate Mapping class has its own benefits, because not every mapping is a dict that you just want to be read-only.

Although I will admit that, without a type guard or some form of typed_mapping_transform decorator, I don’t entirely know how useful this proposal would be, as safe upcasting would be a large challenge. (I think a piece of code might help.)

class ItemSet(TypedMapping, MultiDict):
    name: str
    items: MultiDict[Literal["1"]]
    store_bundle: NotRequired[str]

x: ItemSet = MultiDict(name="item", items=MultiDict(foo="1"))  # how does a type checker know this is safe?

But I do think if this were possible it would be a vast improvement to code that has to interact with mappings without special casing everything as dict.

alicederyn · May 17, 2023, 9:12pm

I’m happy to simplify the PEP back to adding readonly keys to TypedDict. It covers all the use-cases I’ve encountered. I can focus more on why having both readonly and mutable keys in the same type is valuable. (I won’t be updating the PEP for a few weeks, so anyone wishing to push back, please do.)

@erictraut I would greatly value any other feedback you have based on implementing the ReadOnly special form, if you have time to do that.

That is probably because I made a small but important mistake. I meant to write:

class A(TypedDict):
  foo: int

class B(TypedDict, A):
  bar: str

class C(TypedDict, A):
  bar: int

b: B = { "foo": 1, "bar": "baz" }
c: C = { "foo": 2, "bar": 3 }
a1: A = b
a2: A = c
a1.update(a2)  # Is this allowed? b is no longer a B.

I believe pyright allows this? I’m using it via VSCode so it’s possibly a mistake by the editor or an old version, but it seems as if update is defined as taking an instance of the same type. I couldn’t find docs on the website to check this against though.

If the update method were instead defined as accepting something like:

@final
class A_UpdateArg(TypedDict, total=False):
  foo: int

then I think the issue would be fixed, and it would also be easy to extend the rule to ReadOnly keys (drop them from the type).
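
To make the hazard concrete, here is a minimal runtime sketch of the example above. It uses the conventional `class B(A)` spelling for inheritance, and the helper `update_declared` is hypothetical: it only illustrates the idea of restricting an update to the keys the static type declares, and is not an API from the PEP or from any type checker.

```python
from typing import TypedDict


class A(TypedDict):
    foo: int


class B(A):
    bar: str


class C(A):
    bar: int


def update_declared(target: A, source: A) -> None:
    """Hypothetical helper: copy only the keys that A itself declares."""
    # A type checker may flag the non-literal key; this is a runtime illustration.
    for key in A.__required_keys__ | A.__optional_keys__:
        if key in source:
            target[key] = source[key]


b: B = {"foo": 1, "bar": "baz"}
c: C = {"foo": 2, "bar": 3}
a1: A = b
a2: A = c
a1.update(a2)  # at runtime this is dict.update, which copies "bar" too
bar_after_update = b["bar"]  # now the int 3, although B declares bar: str

b2: B = {"foo": 1, "bar": "baz"}
update_declared(b2, c)  # only "foo" is copied, so b2 is still a valid B
```

Whether a given type checker accepts the `a1.update(a2)` call is a separate question from what happens at runtime; the sketch only shows why the call is unsafe if it is accepted.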

erictraut · May 18, 2023, 12:48am

I believe pyright allows this?

Pyright generates an error in this case.

I would greatly value any other feedback you have based on implementing the ReadOnly special form, if you have time to do that.

Sure, happy to do so. I implemented the ReadOnly proposal in pyright. Here’s the PR if you’re interested. I could merge this and release it with pyright 1.1.310 (which will be released next week Tuesday) if folks are interested in playing with it in VS Code.

I didn’t discover any new issues with the proposed design. However, I realized that one of my earlier statements was incorrect. I previously indicated that the type compatibility behavior should depend on whether the TypedDict class was marked @final. Upon further thinking, that doesn’t make sense because TypedDict is a structural type, and @final has no bearing on type compatibility when it comes to structural types.
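
As a minimal sketch of what "structural" means here (the class and function names are made up for illustration): two TypedDict classes with no inheritance relationship are still compatible when their keys and value types line up, so compatibility is decided by shape rather than by the class hierarchy.

```python
from typing import TypedDict


class Movie(TypedDict):
    title: str


class Film(TypedDict):  # no inheritance relationship with Movie
    title: str


def title_of(movie: Movie) -> str:
    return movie["title"]


film: Film = {"title": "Alien"}
# Accepted by type checkers: Film has the keys and value types Movie requires.
result = title_of(film)

# At runtime both are plain dicts; the class names exist only for the checker.
runtime_type = type(film)
```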

I spent some time playing with both the TypedMapping and the ReadOnly variants. IMO, TypedMapping feels quite natural until you start to combine read-only and non-read-only fields. Then it feels really cumbersome. The ReadOnly variant feels pretty good to me, and I think it would be easy to teach anyone who is familiar with TypedDict. I encourage others to play with both variants and provide their thoughts too.

Oh, I adopted the name readonly for the new keyword parameter. I had previously suggested read_only, but I noticed that you mentioned readonly. I don’t have a strong opinion here. Perhaps there’s an existing precedent in stdlib that we can follow?

alicederyn · May 19, 2023, 8:18am

I just noticed this. I don’t think TypedDict would ever be compatible with Mapping because it has a hard requirement that the runtime type be a dict. It would be confusing for that to be conditionally dropped if all fields are readonly, and it would still not be possible to combine with other Protocols, which would limit utility. My preference would be for the use of the TypedMapping type if this feature is desired.

Me either! I just followed the convention of Django without thinking too hard about it. I think it’s fairly common for hyphenated words to be separately uppercased when turned into CamelCase but not to have an underscore when turned into snake_case, but I don’t know if there’s a core library precedent.

tmk · May 19, 2023, 12:55pm

Searching through CPython on GitHub, I found one readonly in the imaplib module (see “imaplib — IMAP4 protocol client” in the Python 3.11.3 documentation), but no read_only.

erictraut · May 19, 2023, 3:55pm

All TypedDict instances are already compatible with the type Mapping[str, object].

from typing import Mapping, TypedDict

class TD1(TypedDict):
    x: int

td1: TD1 = {"x": 0}
m1: Mapping[str, object] = td1
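
A short sketch of why the read-only Mapping view is the safe target here, with hypothetical function names. A function that only reads through Mapping[str, object] cannot invalidate a TD1, whereas one taking dict[str, object] could store a non-int under "x", which is why PEP 589 does not treat a TypedDict as compatible with any dict type.

```python
from typing import Mapping, TypedDict


class TD1(TypedDict):
    x: int


def describe(m: Mapping[str, object]) -> str:
    # Mapping exposes no mutating methods, so the caller's TD1 stays valid.
    return ", ".join(f"{k}={v!r}" for k, v in m.items())


def clobber(d: dict[str, object]) -> None:
    # Legal for dict[str, object], but it would break a TD1 passed in.
    d["x"] = "not an int"


td1: TD1 = {"x": 0}
summary = describe(td1)  # OK for a type checker
# clobber(td1)  # expected to be rejected: TD1 is not a dict[str, object]
```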

alicederyn · May 19, 2023, 9:32pm

Ah, I see what you mean now. “Compatible with” sounded like some kind of bijective relationship.

a-reich · May 20, 2023, 4:29am

I’d like to add one perspective that may not have been brought up on this topic. In numerical computing and other “data” fields, a lot of work uses “dataframe” or “table” structures that associate a set of keys with arrays of different data types (the most common framework being pandas, but also dask, pyarrow, polars, somewhat xarray…). In many workflows one uses particular literal keys, and a frequent error is using the wrong key name for that dataframe, or confusing which key has which dtype.
IMO it would be valuable if the type system could aim to address this problem by supporting a way to annotate the keys and dtypes of a dataframe, and automatically track that information through operations. Unless I’m missing something, a natural way to do this could be annotations with a TypedMapping protocol, since although a DF is not at runtime a dict, it already behaves very close to the interface of Mapping.
My point being this is a feature TypedMapping could maybe in the future provide that marking read-only fields of a TypedDict cannot.

tmk · May 20, 2023, 10:31am

I think this might be better done with a TypedDict as the bound of a TypeVar. See the proposal in python/typing Discussion #1387 on GitHub, “`TypeVarDict` for DataFrames and other TypedDict-like containers (also called “key types”)”.

a-reich · May 20, 2023, 1:46pm

Isn’t the issue with that idea that a DF, as I said, is not a dict?

tmk · May 20, 2023, 2:08pm

It would work like this:

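# Note: Key and Value are the operators proposed in the linked discussion;
# they do not exist in the typing module today.
from typing import Generic, Key, TypedDict, TypeVar, Value
import numpy as np

DTypes = TypeVar("DTypes", bound=TypedDict)

class DataFrame(Generic[DTypes]):
    # Series stands for the library's generic column type.
    def __getitem__(self, key: Key[DTypes]) -> Series[Value[DTypes]]: ...

class MyColumns(TypedDict):
    a: np.int64
    b: np.float32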

df: DataFrame[MyColumns] = ...

reveal_type(df["a"])  # Series[np.int64]
df["c"]  # type error
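
Since Key and Value in the snippet above are proposed operators rather than existing typing features, here is a rough approximation of the same idea that runs today, spelling out each column by hand with Literal overloads. The Frame class is a hypothetical stand-in for a dataframe and uses plain lists instead of a real Series type.

```python
from typing import Literal, overload


class Frame:
    """Toy two-column table; a stand-in for a real dataframe."""

    def __init__(self, a: list[int], b: list[float]) -> None:
        self._columns: dict[str, list] = {"a": a, "b": b}

    @overload
    def __getitem__(self, key: Literal["a"]) -> list[int]: ...
    @overload
    def __getitem__(self, key: Literal["b"]) -> list[float]: ...

    def __getitem__(self, key: str) -> list:
        # Runtime lookup; the overloads above carry the per-key types.
        return self._columns[key]


df = Frame(a=[1, 2], b=[0.5, 1.5])
col_a = df["a"]  # a type checker infers list[int]
col_b = df["b"]  # a type checker infers list[float]
# df["c"]  # no matching overload for a type checker; KeyError at runtime
```

The hand-written overloads have to be repeated for every schema, which is the boilerplate the proposal aims to derive from a single TypedDict.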

a-reich · May 20, 2023, 2:37pm

Ah, I see - yeah that could be a good solution! (of course the type checkers would have to implement the semantics of the new operators)

davidfstr · May 21, 2023, 2:29pm

Agreed.

Makes sense. I was thinking along these lines as well, although the spelling was Final in my original idea rather than ReadOnly, since that seems to match the effect of Final as currently used outside of TypedDicts.

Specifically, it appears that both the ReadOnly[] type qualifier and the readonly=True parameter are implemented. Neat.

I’ll see if I can find some time soon to carefully look over the test cases for the implementation. I find test cases often reveal corner cases that would be useful to know.

This was also true in TypedDict’s original implementation in mypy. I speculate it’s also true in the latest implementation but haven’t checked recently.

TypedMapping (if defined) to assume unknown keys are ReadOnly?

This is an interesting line of thought:

Edit: …but I’ll have to think further to determine the actual implications, since my first reaction, blurred below, makes an incorrect assumption.

The rules around TypedDicts currently assume that any unknown keys are writable (i.e. not readonly).

If we were to introduce a “TypedMapping” that is separate from a TypedDict - I’m still not sure if this makes sense - I imagine the key difference would be that a TypedMapping would assume that any unknown keys would be necessarily readonly. TypedDict would continue to assume that any unknown keys are writable.

For example:

from typing import TypedDict

class Point2D(TypedDict):
    x: int
    y: int

class Point2DWithFixedZ(Point2D):
    z: ReadOnly[int]

p = Point2DWithFixedZ(x=1, y=2, z=3)
p2: Point2D = p   # ERROR: Point2D requires all keys other than {"x", "y"} to not be ReadOnly, but Point2DWithFixedZ defines key "z" to be ReadOnly.
p['z'] = 4  # alters an unknown key that was intended to be ReadOnly!

from typing import TypedMapping

class FixedPoint2D(TypedMapping):
    x: int  # implicitly ReadOnly
    y: int  # implicitly ReadOnly

class FixedPoint3D(FixedPoint2D):
    z: int  # implicitly ReadOnly

p = FixedPoint2D(x=1, y=2)
p2: FixedPoint3D = p  # OK: All added keys are ReadOnly
p2['z'] = 4  # ERROR: Cannot assign to unknown key "z" of FixedPoint2D which is ReadOnly

If TypedDict and TypedMapping were defined with the above behaviors for unknown keys, then it would not be possible for any particular TypedDict to inherit from a TypedMapping or vice versa while preserving type safety:

class Point2D(TypedDict):
    x: int
    y: int
    # unknown keys must not be ReadOnly because this is a TypedDict

class FixedZ(TypedMapping):
    z: int   # implicitly ReadOnly
    # unknown keys must be ReadOnly because this is a TypedMapping

class Point2DWithFixedZ(Point2D, FixedZ):  # ERROR: Cannot inherit from both a TypedDict and a TypedMapping because they have incompatible policies for unknown keys
    pass

p = Point2DWithFixedZ(x=1, y=2, z=3)
p['w'] = 0  # OK/ERROR: Ambiguous whether this should be an error or not. TypedDict would be OK. TypedMapping would be ERROR.

from typing import TypedDict, TypedMapping

class AlmostFixedPoint2D(TypedDict):
    x: ReadOnly[int]
    y: ReadOnly[int]
    # unknown keys must not be ReadOnly because this is a TypedDict

class FixedPoint2D(TypedMapping):
    x: int  # implicitly ReadOnly
    y: int  # implicitly ReadOnly
    # unknown keys must be ReadOnly because this is a TypedMapping

p = AlmostFixedPoint2D(x=1, y=2)
p['z'] = 3  # OK; unknown key of TypedDict is not ReadOnly
p2: FixedPoint2D = p  # ERROR: cannot assign a TypedDict to a TypedMapping; incompatible policies for unknown keys
p['z'] = 4  # yikes, this modifies a key of p2 assumed to be ReadOnly

class AlmostFixedPoint2D(TypedDict):
    x: ReadOnly[int]
    y: ReadOnly[int]
    # unknown keys must not be ReadOnly because this is a TypedDict

class AlmostFixedPoint3D(AlmostFixedPoint2D):
    z: ReadOnly[int]  # ERROR: cannot add ReadOnly key to TypedDict subclass

p = AlmostFixedPoint2D(x=1, y=2)
p['z'] = 3  # OK; unknown key of TypedDict is not ReadOnly
p2: AlmostFixedPoint3D = p  # ERROR: AlmostFixedPoint3D requires that key "z" be ReadOnly but AlmostFixedPoint2D requires that any keys other than {"x", "y"} (including "z") must not be ReadOnly
p['z'] = 4  # yikes, this modifies a ReadOnly key of p2

alicederyn · May 21, 2023, 3:13pm

This sounds wrong to me. Unknown keys are of unknown type, therefore there’s no way to know whether a given assignment is safe or violating the type. For example, take the following modification of your example:

class Point2D(TypedDict):
    x: int
    y: int
    # unknown keys must not be ReadOnly because this is a TypedDict

class Example(TypedDict, Point2D):
    w: str

w: Example = { "x": 1, "y": 2, "w": "foo" }
p: Point2D = w
p['w'] = 0  # Violates type declaration for w

The difference is intentional. Final means something cannot be changed at all. Read-only means it cannot be changed via this particular supertype.
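
A loose runtime analogy for that distinction, using the standard library's types.MappingProxyType: the proxy refuses writes, yet the underlying dict can still change through another reference. ReadOnly itself is a static annotation only, so this illustrates the idea rather than the proposed mechanism.

```python
from types import MappingProxyType

point = {"x": 1, "y": 2}
view = MappingProxyType(point)  # read-only view of the same data

try:
    view["x"] = 10  # writes through the view are refused
    write_refused = False
except TypeError:
    write_refused = True

point["x"] = 10  # the owner can still mutate the underlying dict
seen_through_view = view["x"]  # the view reflects the change
```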

ajoino · May 21, 2023, 9:28pm

I’m a typing scrub so this might be obvious, but aren’t you essentially casting an Example to a Point2D there? If that is even legal, haven’t you then “given up” the typed field 'w' and it is an Any type now?

alicederyn · May 21, 2023, 11:17pm

Yes! And this is absolutely legal: Example inherits from Point2D. Anything you can do to a Point2D must be legal if the Point2D is actually holding an Example instance – or any other possible subclass – as this is the defining characteristic of inheritance.

No, because the variable p is only holding a reference to the underlying object, which is still of type Example, and can still be accessed as such by the variable w.
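
The aliasing point can be checked at runtime with a small sketch (using the conventional `class Example(Point2D)` spelling): annotating p as Point2D creates no copy and no conversion, so a write through p is visible through w.

```python
from typing import TypedDict


class Point2D(TypedDict):
    x: int
    y: int


class Example(Point2D):
    w: str


w: Example = {"x": 1, "y": 2, "w": "foo"}
p: Point2D = w  # same object, viewed through the supertype

same_object = p is w
p["x"] = 5  # a declared key: fine for both types
x_seen_via_w = w["x"]  # the write is visible through the other name
```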

ajoino · May 22, 2023, 5:52am

I failed my reading comprehension; I didn’t see that Example inherited from Point2D.

davidfstr · May 22, 2023, 10:01am

Indeed I am remembering incorrectly. According to mypy, you cannot write the value of an unknown key:

# td_setitem_unknown_key.py
from typing import TypedDict

class Point2D(TypedDict):
    x: int
    y: int
    # unknown keys must not be ReadOnly because this is a TypedDict

class Example(TypedDict, Point2D):
    w: str

w: Example = { "x": 1, "y": 2, "w": "foo" }
p: Point2D = w
p['w'] = 0  # error: TypedDict "Point2D" has no key "w"  [typeddict-unknown-key]

…however you can read the value of an unknown key with get, since there might actually be a value (from when previously manipulated as a subclass):

# td_get_unknown_key.py
from typing import TypedDict

class Point2D(TypedDict):
    x: int
    y: int
    # unknown keys must not be ReadOnly because this is a TypedDict

class Example(TypedDict, Point2D):
    w: str

w: Example = { "x": 1, "y": 2, "w": "foo" }
p: Point2D = w
print(p.get('w'))  # OK

I’ll ~~edit away~~ blur my discussion which relies on this misremembering.

spacether · May 22, 2023, 9:08pm

Can the additional equivalent syntactic forms be added for TypedMapping the same way that TypedDict has them? This will allow for keys which are not valid Python variable names. Thanks!

alicederyn · May 22, 2023, 9:28pm

Yep! Already in the PEP, first bullet point of the specification
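
For reference, this is what the existing alternative syntax looks like for TypedDict, which is the form the question asks to mirror. The Headers name and its keys are made up for illustration, and the sketch does not show TypedMapping itself.

```python
from typing import TypedDict

# Functional syntax: keys are given as strings, so they need not be identifiers.
Headers = TypedDict("Headers", {"content-type": str, "x-request-id": str})

h: Headers = {"content-type": "text/plain", "x-request-id": "abc123"}
declared_keys = sorted(Headers.__required_keys__)
```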