Typing issue with Generic class

gkb · April 18, 2025, 9:19am

I have the following script:

from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T")

@dataclass
class Field(Generic[T]):
    name: str
    default: T
    field_type: type[T]

f = Field(name="", default=1, field_type=str)

reveal_type(f.default)

I tried adding type hints to the class Field to detect when the type of the default argument does not match the field_type (as in the example above).
However, for the code above mypy resolves T to object and does not report an error.
Is there a way to enforce this?

Thank you for your help!

ImogenBits · April 18, 2025, 11:03am

One way to think about this is that mypy is completely right from a certain point of view. What the definition of Field is saying is that for some type T the default attribute is an instance of T and the field_type attribute is a class object that’s a subtype of T. If we replace T with object, then of course an int also is an instance of object and similarly str is a subclass of it.
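To make that concrete, here is a minimal sketch that spells out the specialization mypy ends up inferring. Writing `Field[object]` explicitly is only for illustration, not something you would do in practice.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T")


@dataclass
class Field(Generic[T]):
    name: str
    default: T
    field_type: type[T]


# With T = object spelled out, both arguments satisfy their annotations,
# so a type checker has nothing to complain about.
f = Field[object](name="", default=1, field_type=str)

# The same relationships hold at runtime.
assert isinstance(f.default, object)
assert issubclass(f.field_type, object)
```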

The problem is that this isn't what you actually want. What you want is for the field_type attribute to specify some class and for the default attribute to be an instance of that class. To get that behaviour you need to either not rely on type inference, or make inference resolve T to exactly the class passed as field_type. The first is easy: explicitly specialize the constructor, as in Field[str](name="", default=1, field_type=str), and mypy will then flag default=1. The second isn't directly possible right now: the typing spec doesn't strictly specify how inference should be performed, and there is no standardized way of influencing it. However, there is a workaround of splitting the arguments across different functions, like this:

from typing import Callable

def make_field(name: str, field_type: type[T]) -> Callable[[T], Field[T]]:
    def inner(default: T) -> Field[T]:
        return Field(name, default, field_type)
    return inner

This way, type inference first sees that make_field is passed str and resolves T to it; the returned callable then only accepts an instance of str.
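As a rough sketch of how the two options look side by side (the variable names `ok_explicit` and `ok_factory` are made up for this example):

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


@dataclass
class Field(Generic[T]):
    name: str
    default: T
    field_type: type[T]


def make_field(name: str, field_type: type[T]) -> Callable[[T], Field[T]]:
    def inner(default: T) -> Field[T]:
        return Field(name, default, field_type)
    return inner


# Option 1: explicit specialization pins T to str.
ok_explicit = Field[str](name="", default="x", field_type=str)
# Field[str](name="", default=1, field_type=str)  # a type checker should reject default=1

# Option 2: T is resolved from field_type before default is ever seen.
ok_factory = make_field("", str)("x")
# make_field("", str)(1)  # a type checker should reject the 1 here
```

Keep in mind that neither option checks anything at runtime. The commented-out calls would still execute, and only a static type checker flags them.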

gkb · April 18, 2025, 7:25pm

Thanks for your detailed answer! I will probably use your option 1, even though I wish I could avoid having to specify the type once in the specialization and again as argument.

ImogenBits · April 19, 2025, 8:57pm

You can kind of do that. If you use the Field[str] form, you can access the class with a bit of internal typing magic. When you write Field[str](...) the object that is created will have its __orig_class__ set to Field[str], and from that you can access the type argument via get_args. So, for example, you could change your class to something like this:

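from dataclasses import dataclass
from typing import Generic, Never, TypeVar, get_args  # Never needs Python 3.11+

T = TypeVar("T")

@dataclass
class Field(Generic[T]):
    name: str
    default: T
    field_type: type[T] = Never

    def get_field_type(self) -> type[T]:
        # __orig_class__ is assigned only after __init__ (and therefore
        # __post_init__) has returned, so it has to be looked up lazily.
        if self.field_type is Never:
            orig_class = getattr(self, "__orig_class__", None)
            if orig_class is None:
                raise ValueError("pass field_type or use the Field[...] form")
            self.field_type = get_args(orig_class)[0]
        return self.field_type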

(Here Never just serves as a sentinel to identify whether field_type was explicitly passed to the constructor. It works since it is a subtype of every type T, but you could also replace it with something more explicit like None and add that to the type hint of the init argument.)
I don’t think that dunder is explicitly part of the external API, but it is something certain libraries rely on, so I still wouldn’t expect its behaviour to go through major breaking changes or things like that.
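The mechanism can be seen in isolation with a small sketch. `Box` is a hypothetical class used only for this demonstration, and the behaviour shown is how CPython's typing module currently works rather than a documented guarantee. One detail worth knowing is that the attribute is assigned only after the constructor has returned, which matters for any code that looks it up inside `__init__` or `__post_init__`.

```python
from typing import Generic, TypeVar, get_args

T = TypeVar("T")


class Box(Generic[T]):  # hypothetical class, only for this demonstration
    def __init__(self) -> None:
        # Record whether the attribute already exists while __init__ runs.
        self.seen_during_init = hasattr(self, "__orig_class__")


specialized = Box[str]()
plain = Box()

# The specialized alias is stored on the instance and exposes its arguments.
assert get_args(specialized.__orig_class__) == (str,)
# It was not there yet while __init__ was running.
assert not specialized.seen_during_init
# Without the Box[...] form there is no attribute at all.
assert not hasattr(plain, "__orig_class__")
```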

gkb · April 22, 2025, 8:28pm

Sorry for taking so long, but now I finally found the time to explore your suggestions a bit more thoroughly.
I have now changed my mind and actually prefer solution 2!
What I did not like at first was the fact that I lose information about parameter names:

from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


@dataclass
class Field(Generic[T]):
    name: str
    default: T
    type: type[T]

def make_field(field_type: type[T]) -> Callable[[str, T], Field[T]]:
    def inner(name: str, default: T) -> Field[T]:
        return Field(name, default, field_type)

    return inner


FloatField = make_field(float)
ff = FloatField(name="name", default=1)
# Unexpected keyword argument "name"
# Unexpected keyword argument "default"

With the help of Stack Overflow I ended up with the following code:

from dataclasses import dataclass
from typing import Generic, Protocol, TypeVar

T = TypeVar("T")


@dataclass
class Field(Generic[T]):
    name: str
    default: T
    type: type[T]


class PartField(Protocol, Generic[T]):
    def __call__(self, name: str, default: T) -> Field[T]: ...


def make_field(field_type: type[T]) -> PartField[T]:
    def inner(name: str, default: T) -> Field[T]:
        return Field(name, default, field_type)

    return inner


FloatField = make_field(float)
ff = FloatField(name="name", default=1)   # works now
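For completeness, here is a small sketch of the same callback-protocol pattern with a str field, where a mismatched default should be caught. The attribute is called `field_type` here (as in the first post) and `StrField` is a made-up name; both are choices made for this example only.

```python
from dataclasses import dataclass
from typing import Generic, Protocol, TypeVar

T = TypeVar("T")


@dataclass
class Field(Generic[T]):
    name: str
    default: T
    field_type: type[T]


class PartField(Protocol, Generic[T]):
    # A callback protocol keeps the parameter names, unlike Callable[[str, T], ...].
    def __call__(self, name: str, default: T) -> Field[T]: ...


def make_field(field_type: type[T]) -> PartField[T]:
    def inner(name: str, default: T) -> Field[T]:
        return Field(name, default, field_type)

    return inner


StrField = make_field(str)
sf = StrField(name="title", default="untitled")  # keyword arguments are accepted
# StrField(name="title", default=1)  # a type checker should reject default=1
```

Note that default=1 passes for a float field only because type checkers accept an int wherever a float is expected. With str, as above, the mismatch is reported.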