Introducing Record Types in Python

Very interesting. :slightly_smiling_face:

Are you aware of msgspec Struct?

This is very close to what you are describing, and it’s already implemented in C and is very fast. I started looking at it as I was learning about attribute validation.

Coincidentally, I had just posted about it a couple of days ago on the faster-python issue tracker.

Results (smaller is better):

                  import (μs)  create (μs)  equality (μs)  order (μs)
msgspec                  9.92         0.09           0.02        0.03
standard classes         6.86         0.45           0.13        0.29
dataclasses            489.07         0.47           0.27        0.30

As I read more and more Python library code, I see a tendency to write “dataclass-like” structures as normal classes rather than use dataclasses. I assume this is due to performance concerns? I wonder if they’d opt to use a Struct if it were available in the stdlib, since create/equality/order are all faster than with a normal class.
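For anyone who wants to get a feel for the create and equality columns locally, here is a rough, stdlib-only timing sketch. It is not the benchmark that produced the table above: the class names and the `bench` helper are made up for illustration, msgspec is left out so the snippet has no third-party dependency, and the numbers will vary by machine and Python version.

```python
import timeit
from dataclasses import dataclass


class PlainPoint:
    """Hand-written 'dataclass-like' class (hypothetical example)."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if other.__class__ is not self.__class__:
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


@dataclass
class DataPoint:
    x: float
    y: float


def bench(cls, number=100_000):
    """Return rough per-call timings in microseconds for ``cls``."""
    a, b = cls(1.0, 2.0), cls(1.0, 2.0)
    create = timeit.timeit(lambda: cls(1.0, 2.0), number=number)
    equality = timeit.timeit(lambda: a == b, number=number)
    # timeit returns total seconds; convert to microseconds per call.
    return {
        "create": create / number * 1e6,
        "equality": equality / number * 1e6,
    }


for cls in (PlainPoint, DataPoint):
    print(cls.__name__, bench(cls))
```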

I really liked the solution, but I wanted to get away from using classes or decorators to solve everything. I know that Python is very flexible. After thinking about it a lot, I decided to post the two suggestions here on the forum, covering the two points that I believe would help the language a lot. Now I’m hoping this at least generates discussion. I think it’s the least I can do :slight_smile:

@brettcannon for fun, here’s a decorator wrapper for msgspec.Struct that we’ll call record

I added your replace and asdict functions. This doesn’t generate the TypedDict type hints, and I didn’t take the time to wrap my head around your __repr__ example.

I agree it’d be cool if we can just do struct Point(x: float, y: float)

For now you’ll have to do:

@record
def Point(x: float, y: float): pass

p = Point(1.0, 1.0)
print(p.asdict())  # Output: {'x': 1.0, 'y': 1.0}
new_point = p.replace(y=2.0)
print(new_point.asdict())  # Output: {'x': 1.0, 'y': 2.0}

I actually have a prototype implementation of my idea that just needs an implementation of __repr__, then I plan to make it public (annotations and such were a little tricky).

I now have a pure Python proof-of-concept of my record idea. I will blog about it at some point, but since this is an active discussion I wanted to share as soon as I made the repo public.

When I end up using KW_ONLY with dataclasses, it’s usually to make the fields that have default values keyword-only. So I personally would be okay with a record type syntax that didn’t give me fine-grained control over kw_only args, but that had an option to turn all fields with default values into kw_only args.

I like it! I’d happily use this as a simpler alternative to dataclasses for a lot of my use cases. It would be convenient if it were in the stdlib[1], say as dataclasses.record. I’m not particularly convinced it needs to be syntax, but I wouldn’t mind if it was.

I’m not personally bothered by the fact that decorating a def statement defines a class. I can imagine that this might make some people uncomfortable, but it’s not that big a deal to me.


  1. I’ve still not found a way that I’m comfortable with to have a bunch of 3rd party dependencies available “by default”, so being in the stdlib is still a significant plus for me.

My primary concern is it contributing to product type bloat in Python. To illustrate the point, you may want to also compare it to TypedDicts, attrs classes, Pydantic classes and msgspec classes.

At first blush I wondered about introspectability, for example cattrs needs to know if attributes have defaults or not. But I think that information is available by inspecting the __init__?
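
As a sketch of what that kind of introspection could look like, assuming the class has an ordinary Python-level `__init__`: `inspect.signature` reports which parameters carry defaults. The `fields_with_defaults` helper and the `Point` class are hypothetical, not part of cattrs or of the record prototype.

```python
import inspect


def fields_with_defaults(cls):
    """Map each constructor parameter name to whether it has a default."""
    params = inspect.signature(cls).parameters
    return {
        name: param.default is not inspect.Parameter.empty
        for name, param in params.items()
    }


class Point:
    __slots__ = ("x", "y")

    def __init__(self, x: float, y: float = 0.0):
        self.x = x
        self.y = y


print(fields_with_defaults(Point))
```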

My gut feeling is that equality based on shape sounds iffy.

I’d also be curious to see how complicated supporting generic classes would be (maybe not that much?).

Having syntactical support (a soft keyword) might enable greater optimisations or specialisations; personally I’d be interested in this as a built-in workaround form of an immutable dictionary (as Brett noted in his struct article). But pure-Python would certainly be a lower barrier to inclusion!

A

I’m sure it does, but I’m labelling this a proof-of-concept to see if the reaction is positive enough to propose a PEP to make this something syntactic which makes that concern go away.

Yes it is.

Maybe, but I at least want to try it for now. I come from the duck typing age of Python when typing was just about the shape of things, and so I’m not bothered by leaning into it more since __slots__ gives us a form of structural typing that can easily be introspected.

It’s not difficult to switch to an isinstance() check to be faster and rely on nominal typing for equivalence. My guess is that’s what most people are used to.
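
To illustrate the two options being discussed, here is a small sketch contrasting equality based on shape (matching `__slots__`) with nominal equality via `isinstance()`. This is not the code from the prototype repo; `shape_eq`, `nominal_eq`, `make_class` and the class names are all made up for the example.

```python
def _values(obj):
    """Collect slot values in declaration order."""
    return tuple(getattr(obj, name) for name in obj.__slots__)


def shape_eq(self, other):
    """Structural: equal when slot names and values both match."""
    if getattr(other, "__slots__", None) != self.__slots__:
        return NotImplemented
    return _values(self) == _values(other)


def nominal_eq(self, other):
    """Nominal: the other object must be an instance of our class."""
    if not isinstance(other, type(self)):
        return NotImplemented
    return _values(self) == _values(other)


def make_class(name, eq):
    """Create a two-field slotted class using the given __eq__."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    namespace = {"__slots__": ("x", "y"), "__init__": __init__, "__eq__": eq}
    return type(name, (), namespace)


Point = make_class("Point", shape_eq)
Vector = make_class("Vector", shape_eq)
NominalPoint = make_class("NominalPoint", nominal_eq)
NominalVector = make_class("NominalVector", nominal_eq)

print(Point(1, 2) == Vector(1, 2))                # same shape, different class
print(NominalPoint(1, 2) == NominalVector(1, 2))  # same shape, different class
```

With shape-based equality the first comparison is true even though the classes are unrelated, which is the behaviour questioned earlier in the thread. The nominal version treats the two classes as different.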

My guess is I will have to copy something over from the function to the class, much like I do for __annotations__ (plus maybe some transformation like I had to do with typing.Unpack which was honestly the hardest part of making things work).

It definitely would, on top of proper support by type checkers. You could also cache the hash value, not have to generate dynamic Python code, etc.
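
For what caching the hash could mean in practice, here is a pure-Python sketch of an immutable, slotted class that computes its hash once and stores it in an extra slot. It only illustrates the idea and says nothing about how a syntactic implementation would do it; the `_hash` slot name is an assumption for the example.

```python
class Point:
    __slots__ = ("x", "y", "_hash")

    def __init__(self, x, y):
        # Bypass the blocking __setattr__ below during initialisation.
        object.__setattr__(self, "x", x)
        object.__setattr__(self, "y", y)
        object.__setattr__(self, "_hash", None)

    def __setattr__(self, name, value):
        raise AttributeError(f"{type(self).__name__} is immutable")

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Safe to cache because the fields can never change.
        if self._hash is None:
            object.__setattr__(self, "_hash", hash((self.x, self.y)))
        return self._hash
```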

Why not a base class, like TypedDict/namedtuple?

The README in that repo explains why not namedtuple. As for TypedDict, it would be possible if I thought a dict API made sense. But since I wanted __slots__, I don’t think it’s a good fit.

I like Brett’s @record decorator as a really elegant way to construct a frozen and slotted dataclass, but would argue against adding syntax for any record proposal. IMO the costs to users and learners of expanding the language with another way to define custom objects are far larger than the marginal benefits over existing and possible libraries.

I’m wondering how this will be added to CPython. Are you planning to create a built-in module that contains your record type (records.py or something similar)?

Right now there’s no plan for it to be added. I will probably post it to PyPI, write a blog post, and see what the reaction is. If it’s good enough I may pursue a PEP, even if it’s just to get it rejected so the SC can specify what would be needed to bring in an alternative, simpler syntax for defining classes. That would help guide/shut down future discussions on this topic, as it comes up often enough.

I like the general look of this. I would definitely prefer to see PEP-defined syntax vs the decorator, but I’d be tempted to use it even with the decorator syntax.

I particularly like the emphasis on immutability, performance (__slots__), and structural subtyping. Also the use of __annotations__ to allow runtime introspection, which has been proven to offer major ecosystem advantages.

How would tools like mypy or pyright handle what you’ve written here, or would that be a future exercise for them to properly support? This seems like a perennially tricky issue for new data representation systems: you want to get the benefit of looking like a reasonably standard class, but as far as I know there’s still no way to represent the true semantics of these sorts of things to the type checkers. In my experience, for instance, mypy is going to look at something like this as a nominal class even if it’s meant to ‘act’ structurally.

I think it’s a future exercise for the tools, but I could be wrong if there’s some mechanism I’m unaware of that would let you statically define what the function’s new return type is (which I doubt, since even the class’ shape isn’t technically known ahead of time).

I know some people want syntactic support but I quite like it the way it is atm. I also think something like this (if proposed and accepted) would live quite well in the dataclasses module. Not much to add beyond that, good job.

Would the “alternative syntax” be the struct type? Or are the type proposed here and the struct idea different? In other words, does this type (record) represent the main idea behind the struct type?