Should we always use dataclasses?

Since the birth of dataclasses in Python, we have been blessed to be able to avoid boilerplate code. Dataclasses are great – but should we always use them? What would you say?

By Boštjan Mejak via Discussions on Python.org at 19Jun2022 18:29:

Since the birth of dataclasses in Python, we have been blessed to be
able to avoid boilerplate code. Dataclasses are great – but should we
always use them? What would you say?

If they are concise and expressive, use them. If they cause trouble, use
something less troublesome.

Disclaimer: haven’t got around to using any dataclasses yet, myself.

Cheers,
Cameron Simpson cs@cskk.id.au

“Dataclasses are great – but should we always use them?”

Of course not.

That’s like asking “Hammers are great – but should we always use a hammer?” Sometimes you need a screwdriver, or a spanner.

If you don’t have to support your application or library on older versions of Python, then dataclasses would seem to make sense where appropriate. If you do, you will either need to live without them or backport the module. (I imagine some enterprising person has already done this, but verifying that is left as an exercise for the reader.)

N.B. I’ve yet to actually use them. YMMV.

What is stopping you from using them? Is it the fact that you must use type annotations?

By Boštjan Mejak via Discussions on Python.org at 20Jun2022 12:51:

What is stopping you from using them? Is it the fact that you must use
type annotations?

Lack of time. And lack of use cases, in that the stuff I’m doing already
works. They may be just great, but I don’t yet need them and haven’t
made the time to learn them (yet - I will at some point). And, of
course, every new thing is a backward compatibility issue - this just
means that I won’t immediately reach for the new shiny without a use
case.

I’ve no inherent objection to using them, and (mostly) like type
annotations.

What’s your motivating use case for dataclasses?

Cheers,
Cameron Simpson cs@cskk.id.au

The main reason is convenience (or, rather, boilerplate reduction). But having thought about this more deeply, I guess one should define a class as a dataclass if it is small, with only a few attributes and two or three methods; otherwise, one should define a regular class. That’s how I envision it.

I use dataclasses when the generated methods are beneficial. In my case these are most often the methods enabled by default:

  • __init__() - i.e. initializing most of the attributes from arguments
  • __repr__()
  • __eq__()

When I do not initialize most of the attributes from arguments or when I need to perform special actions in __init__() I usually do not use a dataclass.
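
For readers who have not used dataclasses yet, here is a minimal sketch of the three default generated methods listed above. The class name `Point` and its fields are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Point:
    """Hypothetical example class; not taken from this thread."""
    x: float
    y: float = 0.0  # fields may have defaults, as function arguments can


p = Point(1.5, 2.5)          # generated __init__ assigns x and y
print(p)                     # generated __repr__ shows the field names
print(p == Point(1.5, 2.5))  # generated __eq__ compares field by field
```

Writing the same class by hand would mean spelling out all three methods and keeping them in sync whenever a field is added or renamed.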

Note: I often use NamedTuple. I think using plain tuples is OK only if the tuple is really simple, you work with it in just a few places in the code, and you document it well.
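
As a rough illustration of the NamedTuple point, the sketch below contrasts a plain tuple with `typing.NamedTuple`. The `Version` type and its fields are assumptions made up for this example.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """Hypothetical record type used only for illustration."""
    major: int
    minor: int
    patch: int = 0


plain = (3, 10, 0)       # the reader must remember what each position means
v = Version(3, 10)       # same data, but every field has a name

print(v.minor)           # access by name instead of plain[1]
print(v == plain)        # it is still an ordinary tuple underneath
major, minor, patch = v  # unpacking works as with any tuple
```

The field names act as documentation at every place the value is used, which is what a plain tuple lacks.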

By Václav Brožík via Discussions on Python.org at 22Jun2022 08:23:

I use dataclasses when the generated methods are beneficial. In my case
these are most often the methods enabled by default:

  • __init__() - i.e. initializing most of the attributes from arguments
  • __repr__()
  • __eq__()

When I do not initialize most of the attributes from arguments or when I need to perform special actions in __init__() I usually do not use a dataclass.

Note: I often use NamedTuple. I think using plain tuples is OK only
if the tuple is really simple, you work with it in just a few
places in the code, and you document it well.

Yah. I often subclass a namedtuple or a SimpleNamespace, depending
on the mutability requirements. Gets me a nice __str__ and the
namedtuple prevents unwanted (==> misspelled or mistaken) attributes -
quite disciplined.

Cheers,
Cameron Simpson cs@cskk.id.au
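
A sketch of the namedtuple / SimpleNamespace split described above, under some assumptions: the names `Host` and `Settings` are hypothetical, and the `__slots__ = ()` line is an addition not mentioned in the thread. Without it, a namedtuple subclass gets a per-instance `__dict__` and would accept arbitrary new attributes.

```python
from collections import namedtuple
from types import SimpleNamespace


class Host(namedtuple("Host", "name port")):
    """Hypothetical immutable record."""
    __slots__ = ()  # assumption: keeps instances from gaining a __dict__


class Settings(SimpleNamespace):
    """Hypothetical mutable bag of attributes."""


h = Host("example.org", 443)
print(h)  # readable default repr, inherited from the namedtuple

try:
    h.prot = 8080  # misspelled attribute name is rejected
except AttributeError as exc:
    print("rejected:", exc)

s = Settings(debug=False)
s.debug = True    # existing attributes can be changed
s.verbose = True  # and new ones can be added freely
```

So the namedtuple variant suits fixed, read-only records, while the SimpleNamespace variant trades that strictness for mutability.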