Today the typeshed annotation for dataclasses.field lies about its actual runtime return type, in order to “help type checkers understand the magic that happens at runtime.” Rather than returning an instance of dataclasses.Field (as it does at runtime), the function is annotated as returning T, where T is the type of its default argument, or the return type of its default_factory argument.
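To make the mismatch concrete, here is a minimal sketch of what the call returns at runtime. The variable name `spec` is only illustrative; the point is that the returned object is a `dataclasses.Field`, not the `int` that the annotation would suggest for `default=0`.

```python
from dataclasses import Field, field

# At runtime, field() builds a Field object describing the field;
# it does not return the default value itself.
spec = field(default=0)

print(type(spec).__name__)      # the runtime class of the returned object
print(isinstance(spec, Field))  # True
print(spec.default)             # the default is stored on the Field: 0
```

A type checker following the typeshed annotation would instead infer `int` for `spec` here.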
I think this choice, which may have made sense at one point, does not compose well with the later introduction of dataclass_transform, and specifically its field_specifiers argument.
If a third-party dataclass_transform accidentally fails to list field_specifiers, but then uses dataclasses.field as a field specifier, this is a bug: the call to dataclasses.field won’t actually be treated as a field specifier. But this bug can pass silently in an example like this, due to the incorrect typeshed annotations:
from dataclasses import dataclass, field
from typing import dataclass_transform

@dataclass_transform()
def mydc[T](cls: type[T]) -> type[T]:
    return dataclass(cls)

@mydc
class Base:
    hidden: None = field(init=False)

Base()
In this case, since there are no field_specifiers listed, the call to field(init=False) should not be special-cased at all; it should just be treated like any other field RHS, as a default value.
This should emit a diagnostic like “dataclasses.Field instance is not assignable to None”, which would highlight the fact that it is not being treated as a field specifier.
Instead, because of the lie in typeshed, the wrong assignment passes silently (since there is no default or default_factory in the call, T resolves to Any). And use of Base will also likely not reveal the problem; the call Base() will still succeed, since even though hidden is not being treated as init=False, it is being treated as having a default.
This is not hypothetical: this exact bug exists in the flet library, and has gone unnoticed for exactly this reason.
Type checkers must already special-case all listed field specifiers (not only dataclasses.field), but only in the context of a class body in which they are a listed field specifier. This context-sensitive behavior cannot be accurately represented in typeshed. Given this, do type checkers actually gain any benefit from this lie in typeshed? Or does it only serve to mislead, when dataclasses.field is used outside of a context where it is a valid field specifier?
I’m particularly interested in feedback from type checker authors about whether this typeshed lie is necessary in some way for their type checker to work, and whether it would be difficult to adjust to its removal.