Typing Unpack for **kwargs with Self or dataclass classes

Currently, Unpack from typing only allows unpacking a TypedDict. How about allowing Unpack on dataclass classes as well, using Self or the dataclass name? For example, the following code. I think it looks good and straightforward. What do you think?

from __future__ import annotations

from dataclasses import dataclass
from typing import Self, Unpack


@dataclass
class Animal:
    name: str

    @classmethod
    def from_animal(cls, animal: Animal, **kwargs: Unpack[Self]) -> Self:
        return cls(animal.name, **kwargs)

@dataclass
class Cat(Animal):
    color: str

animal = Animal("Whity")
cat = Cat.from_animal(animal, color="white")

Does your example include name as part of allowed keyword arguments? Why/Why not?

Interesting.

Why? For simplicity: it's what Unpack means. Just unpack the dataclass and get its attributes.

Why not? In the example, the name attribute can of course be obtained from the animal argument.

Ok, let me rephrase that: Currently, name should not be part of kwargs since it’s invalid at runtime. But it is a member of Cat, so what rule eliminates it from the list of options?

Currently, name should not be part of kwargs since it’s invalid at runtime.

I’m not sure why name would be invalid as part of kwargs, since name is inherited from the superclass Animal.

so what rule eliminates it from the list of options?

I can’t answer that

Try it out! Does passing name at runtime work?

This should be obvious if you know enough python and thought through your example.

Try it out! Does passing name at runtime work?

I tried it, changing the code to something like this:

cat = Cat.from_animal(animal, name="browny", color="white")

It complains with TypeError: Cat.__init__() got multiple values for argument 'name'.
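For anyone who wants to reproduce this, here is a minimal runnable version of the example from the top of the thread. It is only a sketch: `**kwargs` is annotated with `Any` instead of the proposed `Unpack[Self]`, so that it does not depend on the proposal or on a recent Python version.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class Animal:
    name: str

    @classmethod
    def from_animal(cls, animal: Animal, **kwargs: Any):
        # animal.name is passed positionally, so it already fills `name`.
        return cls(animal.name, **kwargs)


@dataclass
class Cat(Animal):
    color: str


animal = Animal("Whity")
cat = Cat.from_animal(animal, color="white")  # works

try:
    # `name` now arrives twice: once positionally, once through kwargs.
    Cat.from_animal(animal, name="browny", color="white")
except TypeError as exc:
    error_message = str(exc)
```

So a plain "all fields of the dataclass" reading of Unpack would accept a call that fails at runtime.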

so what rule eliminates it from the list of options?

Now I understand why you said that.

I’m not an expert. I’m learning by writing a REST API wrapper. I use dataclasses for the schema and TypedDicts in types.py to support the wrapper. When I created both the dataclass and the TypedDict, I noticed that they look very similar. It feels WET and error-prone. Let me explain the similarities and differences between dataclass and TypedDict.

Similarity

Created using class

dataclass

from dataclasses import dataclass

@dataclass
class AuthorData:
    name: str
    address: str
    age: int

@dataclass
class BookData:
    title: str
    numberOfPage: int
    author: AuthorData

TypedDict

from typing import TypedDict

class AuthorDict(TypedDict):
    name: str
    address: str
    age: int

class BookDict(TypedDict):
    title: str
    numberOfPage: int
    author: AuthorDict

Allowing inheritance

dataclass

from dataclasses import dataclass

@dataclass
class AnimalData:
    name: str

@dataclass
class CatData(AnimalData):
    color: str

TypedDict

from typing import TypedDict

class AnimalDict(TypedDict):
    name: str

class CatDict(AnimalDict):
    color: str

Difference

Using attribute vs key

dataclass

book_data.author.name

TypedDict

book_dict['author']['name']

Pass with argument vs pass with dict

dataclass

cat = CatData("Whity", "white")
# CatData(name='Whity', color='white')

TypedDict

cat = CatDict({"name":"Whity", "color":"white"})
# {'name': 'Whity', 'color': 'white'}

There is even an asdict function in dataclasses that converts a dataclass instance to a dict.

from dataclasses import asdict

@dataclass
class ShopData:
    cats: list[CatData]

class ShopDict(TypedDict):
    cats: list[CatDict]

shop_data = ShopData([CatData("Browny", "brown"), CatData("Whity", "white")])
# ShopData(cats=[CatData(name='Browny', color='brown'), CatData(name='Whity', color='white')])

shop_dict = asdict(shop_data)
# {'cats': [{'name': 'Browny', 'color': 'brown'}, {'name': 'Whity', 'color': 'white'}]}

Example when using with res.json()

data: ShopDict = res.json()
shop = ShopData(**data)

I think the code would be more DRY if we somehow had a way to make a TypedDict from a dataclass, which would also make it easier to map JSON into a class. I’ve read that the new Unpack in typing can unpack a TypedDict, which makes it possible to statically type check a dict of keyword arguments. How about dataclasses?

data: Unpack[ShopData] = res.json()
shop = ShopData(**data)

Looks good and straightforward, right?
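As a side note on "making a TypedDict from a dataclass": at runtime this is already possible with the functional TypedDict syntax and dataclasses.fields, as in the sketch below. The name `CatDictRuntime` is made up for illustration. The catch is that static type checkers generally cannot follow it, because the field mapping is computed rather than written out as a literal, so it does not remove the duplication where it matters.

```python
from dataclasses import dataclass, fields
from typing import TypedDict


@dataclass
class AnimalData:
    name: str


@dataclass
class CatData(AnimalData):
    color: str


# Build a TypedDict at runtime from the dataclass fields (functional syntax).
# Type checkers will typically reject or ignore this, since the mapping is
# not a literal dict they can read.
CatDictRuntime = TypedDict(
    "CatDictRuntime", {f.name: f.type for f in fields(CatData)}
)

required = sorted(CatDictRuntime.__required_keys__)
```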

Another example is using **kwargs to pass arguments. This is explained at the top.

so what rule eliminates it from the list of options?

I’m still not sure how to solve this
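To make the question concrete, here is one candidate rule written out as a runtime helper: "allow only the fields that the subclass adds on top of Animal". This is purely an illustration of what such a rule could look like. The helper `extra_field_names` is hypothetical and no type checker implements anything like it.

```python
from dataclasses import dataclass, fields


@dataclass
class Animal:
    name: str


@dataclass
class Cat(Animal):
    color: str


def extra_field_names(cls: type, base: type) -> list[str]:
    """Return the dataclass fields of `cls` that `base` does not define."""
    base_names = {f.name for f in fields(base)}
    return [f.name for f in fields(cls) if f.name not in base_names]


allowed = extra_field_names(Cat, Animal)
```

Even this simple rule depends on knowing which arguments the method body already fills in, which is information the signature alone does not carry.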

Your example leaves me to wonder: why define typed dicts at all?

From my understanding, TypedDict is useful for code that is working with dicts rather than custom class instances but the devs still want to benefit from type checking. In your example, the data from the outside source is converted to instances of dataclasses, at which point typing annotations can help avoid errors. What benefit is added by the typed dicts step? Without them, the data would be converted and typed on the next line.

[edit to avoid adding a message repeating the same things]

if i could type hint the data with Unpack[ShopData] instead of ShopDict

The point was: do not type hint data, that does not add anything to the program shown.

I usually create a TypedDict because ShopData is not built with **data; instead, I pass the values from the dict one by one. I shortened the code just to keep the example simple. Here is the full code I usually write.

data: ShopDict = res.json()
cats_data = []
cats_dict = data['cats']
for cat_dict in cats_dict:
    cats_data.append(CatData(cat_dict['name'], cat_dict['color']))

shop = ShopData(cats_data)

If I could type hint the data with Unpack[ShopData] instead of ShopDict, I would no longer need to create a TypedDict that looks very similar to my dataclass.

I don’t really see the point in doing either of those things, since you didn’t actually validate that the JSON contained the data you expected it to, so you might as well keep it at dict[str, Any] and pass in the values like that. You get the same level of safety that way. Doing the implicit step, where you essentially just tell the type checker “yes, this is fine, I know I receive data with the expected schema” earlier is not better, especially if you have to lie and don’t actually guarantee correct data.
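A small sketch of what that means in practice. The JSON payload below is made up and stands in for res.json(); it has the wrong type for name and no color at all, yet the annotated assignment goes through without any error.

```python
import json
from typing import TypedDict


class CatDict(TypedDict):
    name: str
    color: str


class ShopDict(TypedDict):
    cats: list[CatDict]


# Hypothetical response body: `name` is a number and `color` is missing.
payload = '{"cats": [{"name": 123}]}'

# No error here: the annotation is a promise to the type checker,
# nothing is checked at runtime.
data: ShopDict = json.loads(payload)

first_cat = data["cats"][0]
name_type = type(first_cat["name"]).__name__
has_color = "color" in first_cat
```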

That’s where something like pydantic is a lot more useful: it validates the JSON and converts it directly to your dataclass structure, with dedicated methods for parsing from JSON. It also generates more helpful error messages than a KeyError/ValueError/TypeError raised at some arbitrarily deep point down the stack, which just happened to coincide with when you first needed that specific attribute to be of a given type.


Yes, it looks useless to type hint it.

I know pydantic and it’s awesome; I think I will use it instead of dataclass to parse JSON into a schema. But I like dataclass over pydantic mainly because it is already in the standard library, so there is no need to install a third-party library. I believe I am not the first person to use dataclass as the schema and TypedDict as the type hint to handle JSON in a REST API wrapper.
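For the standard-library-only route, a minimal sketch of validating while converting could look like this. The functions `parse_cat` and `parse_shop` are hypothetical helpers written for this example, not part of dataclasses, and they only cover the two fields used in this thread.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class CatData:
    name: str
    color: str


@dataclass
class ShopData:
    cats: list[CatData]


def parse_cat(raw: Any, path: str) -> CatData:
    """Check one decoded JSON object and convert it to CatData."""
    if not isinstance(raw, dict):
        raise ValueError(f"{path}: expected an object, got {type(raw).__name__}")
    for key in ("name", "color"):
        if not isinstance(raw.get(key), str):
            raise ValueError(f"{path}.{key}: expected a string")
    return CatData(raw["name"], raw["color"])


def parse_shop(raw: Any) -> ShopData:
    """Check the top-level object and convert every cat in it."""
    if not isinstance(raw, dict) or not isinstance(raw.get("cats"), list):
        raise ValueError("cats: expected a list")
    return ShopData(
        [parse_cat(item, f"cats[{i}]") for i, item in enumerate(raw["cats"])]
    )


shop = parse_shop({"cats": [{"name": "Whity", "color": "white"}]})
```

With this shape no TypedDict is needed at all: the raw data stays untyped until it has been checked, and errors point at the offending key.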

When I was learning programming, I kept hearing about the principle of “WET code and DRY code”. At first I ignored it, until I got tired of repetitive copy-pasting and of having to change many lines in order to change one thing. Since then I have believed that DRY is better, and I started refactoring my code. The same thing happens with dataclass and TypedDict, copy and paste, so I believe there is some way to make it DRY.

Honestly, for me it’s fine: I just use the Find and Replace tool, although sometimes it doesn’t work as expected. From my own experience I can only bring two examples, **kwargs and res.json(). I couldn’t find any other common cases. Maybe most people working on API wrappers use pydantic rather than dataclass?

Thank you for this discussion. Have a nice day!