I don't know whether my dumb code is readable or idiomatic

capymind · April 12, 2024, 1:01pm

Hi,

Assume there is data from an external API (meaning I cannot change its shape).

# data from 3rd party API call.
data = {
    "AUTHOR": "guido",
    "TITLE": "python",
    "PUBLISHED": 1991,
    "QRCODE": "...",
}

I have a Book dict for internal use (within the application boundary).

from typing import TypedDict

class Book(TypedDict):
    author: str
    title: str

To fit the external data into a Book for internal use, I wrote the following:

# available approach 1.
my_favorite_book: Book
my_favorite_book = {
    k.lower(): v for k, v in data.items() if k.lower() in Book.__required_keys__
}

# available approach 2.
my_favorite_book = Book(
    author=data["AUTHOR"],
    title=data["TITLE"],
)

The Book only has 2 keys (author, title) in this example. However, there are cases where the number of keys is between 10 and 30. I chose approach 1, but I am not sure it’s readable or idiomatic… Is it better to just write out every key mapping manually (approach 2)? Or are there any other suggestions?

My situation is that I have to convert many fields with uppercase keys into a subset of fields with lowercase keys. For example:
{PRICE: ..., RISK: ..., RATING: ...} -> {price: ..., rating: ...}.

Thanks for reading!!

franklinvp · April 12, 2024, 1:35pm

Things like checking that your data structure is given the right keys, rejecting or ignoring extra keys, and lowercasing certain keys are all runtime functionality.

TypedDict, and therefore your Book, is for static type analysis. For example, you declare my_favorite_book: Book, and then some tool other than Python warns you if the code ever assigns to my_favorite_book a dict that does not agree with the type. Pylance, for instance, complains about such an assignment.

But the code runs fine in Python.
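A minimal sketch of that point, assuming the Book TypedDict from the first post. The mismatched dict below is made up for illustration, and the `# type: ignore` comment is only there to silence the checker:

```python
from typing import TypedDict


class Book(TypedDict):
    author: str
    title: str


# A static checker such as mypy or Pylance should flag this assignment,
# because the keys do not match Book. Python itself does not check it.
my_favorite_book: Book = {"AUTHOR": "guido", "QRCODE": "..."}  # type: ignore

# At runtime the value is just an ordinary dict.
print(type(my_favorite_book))  # <class 'dict'>
print(my_favorite_book)
```

The annotation changes nothing about what Python stores, so any validation has to be written as ordinary runtime code.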

I think you would be better off with a dedicated class that encapsulates the functionality you want, for example a dataclass.

from dataclasses import dataclass

@dataclass
class Book:
  author: str
  title: str

This only accepts author and title

>>> my_favorite_book = Book(**data)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Book.__init__() got an unexpected keyword argument 'AUTHOR'

You can add the functionality to accept data and do all the processing that you need, like turning keys to lowercase. For example,

# Method of the dataclass above; needs `import inspect` at module level.
@classmethod
def from_dict(cls, env):
  return cls(**{
    k.lower(): v for k, v in env.items()
    if k.lower() in inspect.signature(cls).parameters
  })
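Put together, the dataclass and the classmethod might look like the sketch below. The type annotations on from_dict are an addition for illustration, and the sample data is the dict from the first post:

```python
import inspect
from dataclasses import dataclass
from typing import Any


@dataclass
class Book:
    author: str
    title: str

    @classmethod
    def from_dict(cls, env: dict[str, Any]) -> "Book":
        # Parameter names of the generated __init__: "author" and "title".
        accepted = inspect.signature(cls).parameters
        return cls(**{k.lower(): v for k, v in env.items() if k.lower() in accepted})


data = {"AUTHOR": "guido", "TITLE": "python", "PUBLISHED": 1991, "QRCODE": "..."}
book = Book.from_dict(data)
print(book)  # Book(author='guido', title='python')
```

Extra keys such as PUBLISHED and QRCODE are dropped silently. If a required key is missing, the generated __init__ raises TypeError, so the check for missing fields comes for free.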

Now, taking your question at face value, the first approach makes an assignment that does not agree with the type that was declared, while the second approach does. At least Pylance complains with the first and not the second.

The error message from Pylance is

Expression of type “dict[str, Unknown]” cannot be assigned to declared type “Book”

kknechtel · April 12, 2024, 2:19pm

If you do a lot of this kind of processing, you might be interested in third-party libraries such as Pydantic. Otherwise I tend to agree with @franklinvp .

capymind · April 12, 2024, 2:23pm

Hi, Franklinvp.
Thanks a lot for your detailed and kind explanations. It helps a lot.

Yeah… that’s the good part of mypy for me.
It’s just a static type checker.

It’s totally my fault if I assign or update arbitrary keys.
But that’s okay as long as I understand correctly what mypy does for me…

(A dataclass constrains the inputs at initialization, but after that something like book.flavor = "sweet" is still allowed.)

To be honest, I am not good at determining which type of container is better for a specific problem:
NamedTuple, TypedDict, dataclass (or attrs, cattrs, msgspec, pydantic and many others).

I gave “just use a dictionary” a shot, and ended up feeling kind of dumb…

from typing import TypedDict, Any

class Book(TypedDict):
    """Book data."""

    author: str
    title: str


def create_book(
    raw: dict[str, Any],
) -> Book:
    """Creates a book given external arbitrary data."""
    required_fields = Book.__required_keys__
    lowercased_fields_in_raw = set(k.lower() for k in raw.keys())

    if not lowercased_fields_in_raw.issuperset(required_fields):
        raise KeyError("missing required keys.")

    return Book(
        **{
            k.lower(): v
            for k, v in raw.items()
            if k.lower() in Book.__required_keys__ | Book.__optional_keys__
        }
    )


external_data1 = {
    "AUTHOR": "guido",
    "QRCODE": "1234",
}

book1 = create_book(external_data1)  # raises KeyError (TITLE is missing)

external_data2 = {
    "AUTHOR": "guido",
    "TITLE": "python",
    "QRCODE": "1234",
}

book2 = create_book(external_data2)  # a Book, which is what I want.

Anyway, I think I have to get more experience with Python…
Thanks again.

capymind · April 12, 2024, 2:29pm

Um… I was wrong… mypy reports:

 error: Unsupported type "dict[str, Any]" for ** expansion in TypedDict  [typeddict-item]

FelixLeg · April 12, 2024, 2:36pm

Would it change anything if you replace this piece:

SoonShin, Kwon (rodi):

return Book(
        **{
            k.lower(): v
            for k, v in raw.items()
            if k.lower() in Book.__required_keys__ | Book.__optional_keys__
        }
    )

with this:

tmp_dict = {
    k.lower(): v
    for k, v in raw.items()
    if k.lower() in Book.__required_keys__ | Book.__optional_keys__
}

return Book(**tmp_dict)

capymind · April 12, 2024, 2:49pm

It emits the same error message.
I think I have to take a different approach (if I want to stick to the “dumb” principle).
Thank you for the reply!

FelixLeg · April 12, 2024, 2:50pm

Quick question: does the error come from Python or from a static type checker?

capymind · April 12, 2024, 2:54pm

It comes from mypy (not Python).

FelixLeg · April 12, 2024, 2:57pm

Well, then there is always the cast(...) function:

import typing

return typing.cast(Book, tmp_dict)
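One caveat is worth seeing in code: cast only affects the type checker and does nothing at runtime. A minimal sketch, where the wrong dict is made up for illustration:

```python
from typing import TypedDict, cast


class Book(TypedDict):
    author: str
    title: str


wrong = {"flavor": "sweet"}  # clearly not a Book
book = cast(Book, wrong)     # a type checker now treats this as a Book

print(book is wrong)      # True: cast returns its argument unchanged
print("author" in book)   # False: nothing was validated
```

So a cast is only as trustworthy as the runtime check that precedes it, such as the issuperset test in create_book.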

capymind · April 12, 2024, 3:11pm

Oh my… it passes mypy.

I didn’t know cast existed at all…

The docs say:

When using a cast, the type checker should blindly believe the programmer.

https://typing.readthedocs.io/en/latest/spec/directives.html#cast

I think I should reconsider the way I write code before passing mypy…

Thank you!
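For comparison, here is a sketch of create_book that should pass a type checker without cast, at the cost of spelling out each key. The `lowered` and `missing` names are illustrative and not from the thread:

```python
from typing import Any, TypedDict


class Book(TypedDict):
    """Book data."""

    author: str
    title: str


def create_book(raw: dict[str, Any]) -> Book:
    """Creates a book from external data with uppercase keys."""
    lowered = {k.lower(): v for k, v in raw.items()}
    missing = Book.__required_keys__ - set(lowered)
    if missing:
        raise KeyError(f"missing required keys: {sorted(missing)}")
    # Explicit keyword arguments let the checker compare each key with Book.
    return Book(author=lowered["author"], title=lowered["title"])


book = create_book({"AUTHOR": "guido", "TITLE": "python", "QRCODE": "1234"})
print(book)  # {'author': 'guido', 'title': 'python'}
```

This is essentially approach 2 from the first post plus a runtime check. With 10 to 30 keys it gets verbose, which is the trade-off against the comprehension plus cast.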
