Dataclasses - make use of Annotated

It certainly is doable. And I suspect that if Annotated had existed when dataclasses was being proposed, it might’ve been used instead of what we have today.

However, just because there’s a new way of doing things, I don’t think that justifies having multiple ways of representing the same information.

What’s the value add, that outweighs the cost of additional maintenance, additional type-checker code, documentation, etc…?

Again, I think what you propose is ideal in a vacuum. But it has to be worthwhile to break “there should be one, and preferably only one, obvious way to do it”.

Edit: but this is just my opinion. Please do see what others think. But be prepared with strong arguments for why it needs to exist.


Maybe it can be of interest: beartype, a new runtime type checker, also had to code a special case for dataclasses because the type hints do not match the default values. See [Feature request] Automagical Import Hook · Issue #43 · beartype/beartype · GitHub

I would expect that every new type checker on the market has to make exceptions for dataclasses…
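
A minimal sketch of that mismatch, using only the standard library (the class names are made up for illustration, and this is not how beartype itself detects the case): until the decorator has run, the value bound to an annotated name is a Field object rather than an instance of the annotated type.

```python
from dataclasses import Field, dataclass, field


class Undecorated:
    # Annotated as int, but the class attribute is really a Field object.
    count: int = field(default=0)


@dataclass
class Decorated:
    count: int = field(default=0)
    tags: list = field(default_factory=list)


raw_value = Undecorated.__dict__["count"]
print(type(raw_value).__name__)    # Field
print(Decorated.count)             # 0 (the decorator swapped in the real default)
print(hasattr(Decorated, "tags"))  # False (no class-level default for a factory)
```

So anything that checks class-level assignments against their annotations has to know about field() specifically.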


I think it’s fairly well known in the typing community that dataclasses need to be special-cased. Until now I hadn’t known why. IMO it should be documented somewhere, ideally in both a PEP and longer-term in the formal documentation. I don’t know if it is, though. Does anyone have a pointer?

From a typing perspective, how dataclasses fit in is reasonably well explained by PEP 681. The introduction of the PEP highlights a lot of the core aspects of dataclasses and how they are exceptional typing-wise. It serves as the main way to help other similar libraries be understood by type checkers, mostly by saying that if you are similar enough to dataclasses, you can declare that your library implements a dataclass transform.
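
As a rough illustration of what PEP 681 offers, here is a small sketch. The decorator name record is hypothetical, and the fallback branch is only there so the snippet also runs on interpreters older than 3.11, where typing.dataclass_transform does not exist.

```python
import dataclasses

try:
    from typing import dataclass_transform  # available from Python 3.11
except ImportError:
    def dataclass_transform(**_kwargs):
        # No-op stand-in so the sketch still runs on older interpreters.
        return lambda obj: obj


@dataclass_transform()
def record(cls):
    """Hypothetical decorator that simply delegates to dataclasses.dataclass."""
    return dataclasses.dataclass(cls)


@record
class Point:
    x: int
    y: int = 0


p = Point(1)
print((p.x, p.y))  # (1, 0)
```

The marker does nothing interesting at runtime; its purpose is to tell a type checker to treat record the way it already treats dataclasses.dataclass.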

I’d say the two biggest special aspects of dataclasses from a typing view are that the decorator derives an __init__ based on the fields, and the special way dataclasses.field behaves. Another view is that dataclasses are a decorator that extends a class with additional methods. There’s currently no good way in the type system to describe that.

For example, let’s say you have a decorator that introduces one new method to your class like

def support_metrics(cls):
  def _log_metrics(self, metric: str, value: float) -> None:
    ...  # record the metric somewhere

  cls.log_metrics = _log_metrics
  return cls

@support_metrics
class Foo:
  pass

a = Foo()
a.log_metrics("user_event", 1.0)  # Safe and fine at runtime, but no type checker will understand this.

You can view dataclasses as a more complex, much nicer decorator than that one. But how do you say in the type system that you take a cls as input and return that same type extended with new methods? There’s currently no way to do that, which is a big reason why dataclasses are so special. There are various proposals and discussions about ways to support this kind of thing, but I don’t think any of them are at the PEP stage.


One way would be to simply add type intersections. Then you could return the intersection of the type with the type of a protocol:

from typing import Any, Protocol, TypeVar, cast

class HasLogMetrics(Protocol):
  def log_metrics(self, metric: str, value: float) -> None:
    ...

T = TypeVar('T', bound=type[Any])

# Note: `&` is the proposed intersection syntax; it is not valid for types today.
def support_metrics(cls: T) -> T & type[HasLogMetrics]:
  def _log_metrics(self, metric: str, value: float) -> None:
    ...

  retval = cast(T & type[HasLogMetrics], cls)
  retval.log_metrics = _log_metrics
  return retval

@support_metrics
class Foo:
  pass

a = Foo()  # `a` has type Foo & HasLogMetrics
a.log_metrics('user_event', 1.0)  # Passes.

You can test this today by removing T & and seeing that the code passes (but the original type is lost).
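
For anyone who wants to try that, here is a runnable sketch of the version without the intersection, assuming Python 3.9+ for type[...]. The recorded list is a stand-in I added so the call has something observable, and I use setattr for the assignment only to keep the snippet neutral about how a given checker treats assigning to a method.

```python
from typing import Any, Protocol, cast


class HasLogMetrics(Protocol):
    def log_metrics(self, metric: str, value: float) -> None: ...


# Stand-in sink so the example has something observable (an assumption,
# not part of the snippet above).
recorded: list = []


def support_metrics(cls: type) -> type[HasLogMetrics]:
    def _log_metrics(self: Any, metric: str, value: float) -> None:
        recorded.append((metric, value))

    setattr(cls, "log_metrics", _log_metrics)
    # Without intersections, the best we can say is "some class with log_metrics".
    return cast(type[HasLogMetrics], cls)


@support_metrics
class Foo:
    pass


a = Foo()  # a checker sees HasLogMetrics here; anything specific to Foo is lost
a.log_metrics("user_event", 1.0)
print(recorded)  # [('user_event', 1.0)]
```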

Back in business, the initial set of changes is rather small.

Issue: Dataclasses - Support use of Annotated including `field` as metainfo · Issue #107051 · python/cpython · GitHub
Pull-Request (including documentation update): gh-107051: Dataclasses - Support use of Annotated including field as metainfo by mementum · Pull Request #107052 · python/cpython · GitHub

Best regards

I suspect this’ll require a PEP to be accepted before the behavior can be added.


Yes, this would certainly require a PEP.


For what it’s worth, Typer also supports Annotated:


I don’t think this should go into dataclasses, but I can see why some people think it would be a good idea.

There are essentially two kinds of users of dataclasses, one that uses it to create simple record classes, and one that uses it to remove boilerplate. I am in the first camp and have found that the second way of using dataclasses is not that good.

When I say simple record classes, I mean classes that typically only contain simple types, other record types, and no container types. So I think dataclasses is good for this:

from dataclasses import dataclass

@dataclass
class Foo:
    bar: int
    baz: str

@dataclass
class FooBar:
    index: int
    foo: Foo

But if one of the classes must contain a container or any other complex class I’d prefer to just hammer out the boilerplate, because in my experience that typically means the class contains (or will eventually contain) somewhat complex initialization logic. And if I have to write a __post_init__, I might as well write the __init__ and take normal control of class initialization. Keeping dataclasses simple makes it easier to keep track of the responsibility of the classes in my code: dataclasses is just a clump of data and everything else models more complex behaviour.
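
To illustrate the trade-off described here, a small sketch with made-up class names: the same record once as a dataclass that needs field() and __post_init__, and once written out by hand.

```python
from dataclasses import dataclass, field


@dataclass
class InventoryData:
    # A container field needs field(default_factory=...) ...
    items: list = field(default_factory=list)
    total: int = field(init=False)

    def __post_init__(self) -> None:
        # ... and derived state ends up in __post_init__.
        self.total = len(self.items)


class Inventory:
    # Hand-written equivalent: more lines, but initialization lives in one place.
    def __init__(self, items=None) -> None:
        self.items = list(items) if items is not None else []
        self.total = len(self.items)

    def __repr__(self) -> str:
        return f"Inventory(items={self.items!r}, total={self.total!r})"


print(InventoryData([1, 2]).total)  # 2
print(Inventory([1, 2]).total)      # 2
```

Which of the two reads better is exactly the matter of taste being argued about in this thread.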

I also have no issue with type checkers having to special-case dataclasses. Fitting a static type system onto Python is like fitting a round peg into a square hole. Some things aren’t modeled nicely, and adding complexity to dataclasses seems like the wrong thing to do.


There is something which I may be missing in your post.

  • There are no traces of any field call in your examples.
  • The proposal has nothing to do with __init__ and/or __post_init__ which you mention
  • The proposal is not about whether the type hint is complex or not (a dataclass is a container) but about how per-field specific information is conveyed.
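
To make the last bullet concrete, here is a sketch of how per-field information could travel as Annotated metadata. The spelling of the annotations and the helper field_metadata are assumptions for illustration only; this does not mirror the implementation in the linked pull request, and today’s @dataclass ignores such metadata.

```python
from dataclasses import Field, field
from typing import Annotated, get_args, get_origin, get_type_hints


class Sketch:
    # Hypothetical spelling: the Field rides along as Annotated metadata
    # instead of being assigned as if it were a default value.
    name: Annotated[str, field(repr=False)]
    retries: Annotated[int, field(default=3)]


def field_metadata(cls: type) -> dict:
    """Illustrative helper (not part of dataclasses): field name -> Field."""
    found = {}
    for attr, hint in get_type_hints(cls, include_extras=True).items():
        if get_origin(hint) is Annotated:
            # get_args returns (underlying_type, *metadata).
            for extra in get_args(hint)[1:]:
                if isinstance(extra, Field):
                    found[attr] = extra
    return found


meta = field_metadata(Sketch)
print(sorted(meta))             # ['name', 'retries']
print(meta["retries"].default)  # 3
print(meta["name"].repr)        # False
```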

Thanks for the tip.

Contributing a new PEP seemed like overdoing it at this point in time, but using Python’s EAFP (Easier to Ask Forgiveness Than Permission) a PEP Draft is available at:

My point is that, in my experience, when you add “complex” fields (like containers) to a dataclass, it’s often going to be rewritten as a regular class at some point anyway, so you might as well do it immediately. That’s why you never saw a field call in my example: I never use it!

I mentioned __post_init__ because that’s what you have to use to initialize a dataclass with complex initialization behaviour, and my point was that then you might as well just write the boilerplate and get it over with.

While this proposal itself doesn’t talk about the complexity of the type hint, accepting it is an implicit endorsement of using dataclasses as a boilerplate-removal tool (because it promotes the use of field), even when the class has complex behaviour. I personally don’t like this and would prefer dataclasses to be used only as simple record classes. I think we should leave the more complex behaviour for the more complex tools, like attrs and pydantic.

Fair, but not the point. field and its return value Field do exist, are used and are not going anywhere.

The proposal is about modernizing how they are used, because they have been used, are being used and will keep being used.

Don’t take it personally, but if you don’t like those parts of dataclasses and the endorsements (implicit or explicit), my suggestion is to open a topic in which you advocate for the removal of the functionality, including the removal of __post_init__.

dataclasses were a formalization of many of the things I did for my own projects/classes and I found it fantastic to have them added to the stdlib, hence my proposal to improve them further.


I definitely don’t want to remove existing behaviour, but that doesn’t mean I cannot be against adding new behaviour to “modernize” a feature I don’t particularly care for. That said, I won’t cry if this gets added (though if I were a maintainer I might). It also adds another non-obvious way of doing what is incredibly simple already, which might confuse people. Although that might just be me. I’m also not a fan of binding the (optional) type system more strongly to the language, even stdlib modules (one should not feel forced to use types).

Anyway, I wish you luck finding a sponsor for the PEP-to-be


Best of luck indeed.

If a sponsor wants to show up … (no need to hold your peace forever)

Best regards and thanks to all for contributing to the thread, whether you agree/disagree with the proposal.


FWIW this makes total sense to me.

This not only simplifies the jobs of type checkers, it also clarifies one big problem that I’ve had with dataclasses:

when you use the x: int = field(...) syntax, it looks like that variable has a default value. But it may not.

Since dataclasses don’t let you create fields with default values before fields without one, the presence or absence of an = sign would have been a very obvious indicator of where the required arguments end and the non-required arguments start.

Except that due to field() syntax this is not always true. With the Annotated syntax it will be true again. So a definite +1 from me.
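
A quick way to see the problem with the current syntax (the Account class is made up for the example): the first field has an = sign but is still a required argument.

```python
import inspect
from dataclasses import dataclass, field


@dataclass
class Account:
    owner: str = field(repr=False)  # has "=", yet no default: still required
    balance: int = 0                # genuinely optional


params = inspect.signature(Account).parameters
required = [name for name, p in params.items()
            if p.default is inspect.Parameter.empty]
print(required)  # ['owner']

try:
    Account()  # fails because owner was not supplied
except TypeError as exc:
    print("TypeError:", exc)
```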


  • Makes dataclasses more readable
  • Uses the “appropriate” language feature for the appropriate task (I don’t see how this is different from adding, say, the decorator syntax; readability counts)
  • Simplifies the type checkers

I can understand why people aren’t a massive fan of the current syntax, but as I and Eric Traut said in the other thread, this wouldn’t simplify things for type checkers. Unless and until the current way of doing things is removed – which would be hugely disruptive at this point – type checkers would simply have to add even more special-casing so that they could account for both ways of doing things. This would also be the first time we would be asking type checkers to look at the metadata provided with an Annotated type, an additional complication: currently we promise that type checkers can always treat Annotated[T, <metadata>] identically to how they treat T.
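
That promise is also visible at runtime. A small sketch (Config is a made-up class): typing.get_type_hints drops the metadata unless you explicitly ask for it with include_extras=True.

```python
from typing import Annotated, get_args, get_type_hints


class Config:
    port: Annotated[int, "any metadata at all"]


# By default the Annotated wrapper is stripped and only T remains.
print(get_type_hints(Config))  # {'port': <class 'int'>}

# The metadata is only visible when explicitly requested.
with_extras = get_type_hints(Config, include_extras=True)
print(get_args(with_extras["port"]))  # (<class 'int'>, 'any metadata at all')
```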


this wouldn’t simplify things for type checkers

You’re right, I didn’t consider the field properties.

This now feels like the change doesn’t belong in the current state of the dataclasses package, more like a rethinking of the package as a whole.
