Introducing Record Types in Python

marcosstefani · September 22, 2023, 1:41pm

Dear Python Community,

I would like to propose the introduction of a new record type in Python, which would serve as an extension of the existing class. This proposal aims to simplify and enhance the way we define and work with data structures, providing a more concise and Pythonic syntax.

Proposal Overview: The proposed record type, let’s call it “Record,” would allow us to define data structures in a manner similar to classes but with several advantages:

1. Simplified Syntax: The definition of a record would be streamlined, allowing us to specify its fields directly within the class definition, eliminating the need for the @dataclass decorator. This would make our code cleaner and more readable.

record MyRecord:
    id: int
    name: str = "Unnamed"

The above code would be equivalent to the following class definition:

from typing import Optional

class MyRecord:
    def __init__(self, id: Optional[int], name: str = "Unnamed"):
        self.id = id
        self.name = name

2. Improved Representation: When we print an instance of a record, it would be displayed as a dictionary, making it more intuitive for developers. For example, instead of seeing “MyRecord(id=0),” we would see { 'id': 0, 'name': 'Unnamed' }. This enhanced readability is particularly beneficial when debugging and interacting with data.

3. Automatic __name__ Property: Records would automatically have a __name__ property that stores the name of the record, making it easier to identify the type of data structure being used.

Why Records? Introducing records into Python would enhance the language’s expressiveness and readability. It would empower developers to define data structures more concisely, allowing us to focus on our problem-solving tasks rather than boilerplate code. This proposal aligns with Python’s commitment to readability and simplicity.

Conclusion: I believe that introducing record types in Python is a natural evolution for the language. It would simplify our code, enhance data structure representation, and maintain Python’s commitment to clean and readable code.

I urge the Python community to consider this proposal and discuss the potential benefits and challenges. Let’s work together to make Python even more powerful and developer-friendly.

Thank you for your attention, and I look forward to your feedback and support.

Sincerely, Marcos Stefani Rosa

Rosuav · September 22, 2023, 1:45pm

Are you familiar with dataclasses? They offer several of the features you’re looking for, while still fundamentally being classes and thus backward compatible.
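For readers comparing the two, here is a minimal sketch of how the MyRecord example from the first post might look with the existing dataclasses module. The dictionary-style output uses dataclasses.asdict rather than a changed repr, which is an assumption about what would satisfy the proposal, not something the proposal specifies.

```python
from dataclasses import asdict, dataclass


@dataclass
class MyRecord:
    id: int                 # no default, so this argument is required
    name: str = "Unnamed"   # default value, as in the proposal


rec = MyRecord(id=0)

# Dictionary view of the fields, close to the representation the proposal asks for.
print(asdict(rec))          # {'id': 0, 'name': 'Unnamed'}

# Every class already carries its own name, reachable from an instance via type().
print(type(rec).__name__)   # MyRecord
```

This covers the field syntax, the default value, and the type name; only the decorator line and the default repr differ from what the proposal describes.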

ajoino · September 22, 2023, 1:52pm

Much like Chris says, this is essentially just syntactic support for dataclasses. I don’t think that’s necessary, and I don’t think the decorator adds or removes readability.

AA-Turner · September 22, 2023, 1:53pm

This also seems similar to Brett’s earlier proposal of a struct syntax, though I believe that his didn’t have defaults.

A

marcosstefani · September 22, 2023, 2:03pm

Yes, for me the dataclass works, but it is common for Python to have to adapt through a decorator, greatly delaying its evolution as a language.

hansgeunsmeyer · September 22, 2023, 2:07pm

It seems to me that the only ‘pro’ of your proposal (assuming for the sake of argument that that is a pro) is eliminating the need for that decorator. But that really doesn’t have any impact on readability imo.
There are huge costs however - both in terms of backwards compatibility, overall type consistency (since this will now have to work together with other classes etc) and maintenance - so, I predict this proposal will not get any traction.

JamesParrott · September 22, 2023, 2:09pm

Why is the type Optional[int] when there’s no default? Shouldn’t it only be Optional if the default is None? If there’s no default, it should be a required parameter. If “id” is a special attr, that field name is already used by ORMs like Django’s.
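The distinction can be sketched with dataclasses as follows. The class names RequiredId and NullableId are made up for illustration; the point is only that a field without a default is required, while Optional makes sense when None is an accepted value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RequiredId:
    id: int                      # no default: callers must pass it
    name: str = "Unnamed"


@dataclass
class NullableId:
    id: Optional[int] = None     # Optional because None is the default
    name: str = "Unnamed"


# Omitting a required field fails at construction time.
try:
    RequiredId()
    missing_id_rejected = False
except TypeError:
    missing_id_rejected = True

print(missing_id_rejected)       # True
print(NullableId().id)           # None
```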

In fact there’s a terminology clash, which could potentially cause more confusion with ORMs - in which a class represents a table, and an instance of that class represents a record in that table.

You can already write custom classes and meta classes that implement all this to your heart’s content.

What you call “enhanc(ing) the language’s expressiveness and readability”, I call bloat.

The proposed record is just a class under the hood. Call a spade a spade.

If attrs doesn’t do it already, I’d be interested if record subclasses could add or even remove default values, something that I’ve been unable to do with dataclasses that limits them for me.

Rosuav · September 22, 2023, 2:13pm

I disagree - Python can evolve WAY faster in this way than if everything had to have syntactic support! The bar for new syntax is VERY high. If we didn’t have dataclasses and someone proposed this completely different type of class, it would probably just not happen. Instead, we have the ability to (a) create the feature entirely outside of the language, (b) add it to the standard library, (c) backport it to older versions of the language, and/or (d) test out different versions of the idea easily and without breakage. Any and all can be done, sequentially or concurrently.

brettcannon · September 22, 2023, 6:59pm

My proposal supports defaults.

I’m slowly working towards a proof-of-concept in pure Python code for my own record type, but other stuff has taken up my time lately.

Melendowski · September 22, 2023, 7:17pm

On Brett’s idea, would there be any special benefits, such as it translating directly to a struct in C, with precise requirements for the types of the declared fields?

Probably not, because it looks like the typing is optional.

ericvsmith · September 22, 2023, 7:18pm

At one point I considered adding syntactic support for dataclasses, primarily for two reasons: the startup speed that has been mentioned, and the ability to add __slots__ without the need to create another class. There would be other advantages to adding methods while the class is being created (none of which I can think of right now, but there have been issues filed where the non-obvious answer is “because the class already exists”). And maybe we could rethink if slots=True should be the default, and other design issues.

The syntax I toyed with was:

dataclass(frozen=True, <etc.>) ClassName:

Mainly because that’s a typo I make all the time.

This of course would unfairly advantage dataclasses over attrs and similar libraries. But maybe we could live with that. And now we’d have two ways to create classes, although I think it’s important that there be no new type of class being created, just another way to create the same class objects we have now.

Anyway, I’m just thinking out loud. Maybe someone has some other thoughts on it.

JamesParrott · September 22, 2023, 7:25pm

That’s more convincing. But how would creating a new keyword struct play with the core library of the same name? struct — Interpret bytes as packed binary data — Python 3.11.5 documentation

I know it’s conceptually similar to C and Rust structs, but is it too late to rename it?

AA-Turner · September 22, 2023, 8:11pm

The (newish) PEG parser that Python uses has support for a concept called “soft-keywords” – this is the same as used by match/case and the new type keyword introduced by PEP 695. It means that keywords are context dependent, so the struct module would be fine.

Note Brett’s article states (emphasis mine):

the new (soft) keyword struct

A

JamesParrott · September 22, 2023, 9:04pm

Thanks - I’d forgotten about that.

brettcannon · September 22, 2023, 9:05pm

No because that wouldn’t really work anyway unless you were doing a Cython-like thing, in which case you can just require that by your tooling.

Correct, and the typing being optional is by design.

All I’ve written is a blog post and started coding up a proof-of-concept via a decorator just to show the ergonomics, so anything can change. I’m currently toying with the name record.

marcosstefani · September 22, 2023, 9:40pm

Honestly, as someone who loves good OO programming and loves Python, I would really like to see the language evolve without alternatives. I think it’s time to take ideas generated using annotations like dataclass and definitely implement them in a disruptive way. I think that if the community keeps focusing on preserving what it has as much as possible and only making adaptations, we won’t have interfaces in Python, for example, any time soon, and consequently we will continue to fall short in terms of design patterns. I respect all opinions and understand the reason for the disagreements; I just think we need to prioritize improving the language over doing what is easier.

marcosstefani · September 22, 2023, 10:11pm

Very interesting.

ssweber · September 22, 2023, 11:10pm

Are you aware of msgspec Struct?

This is very close to what you are describing, and it’s already implemented in C and is very fast. I started looking at it as I was learning about attribute validation.

Coincidentally, I had just posted about it a couple days ago to the faster-python issue tracker

Results (smaller is better):

                   import (μs)   create (μs)   equality (μs)   order (μs)
msgspec                   9.92          0.09            0.02         0.03
standard classes          6.86          0.45            0.13         0.29
dataclasses             489.07          0.47            0.27         0.30

As I’m reading more and more Python library code, I see a tendency to write “dataclass-like” structures via normal classes, rather than use dataclasses. I assume this is due to performance concerns? I wonder if they’d opt to use a Struct if it was available in the std-lib, since the create/eq/order are faster than a normal class.
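For anyone who wants a rough feel for the definition-time cost, here is a stdlib-only sketch. It is not the benchmark that produced the table above: the methodology (timing repeated execution of a class definition with timeit) and the helper name definition_cost_us are assumptions, and absolute numbers will vary by machine and Python version.

```python
import timeit

PLAIN_SOURCE = """
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
"""

DATACLASS_SOURCE = """
from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float
"""


def definition_cost_us(source, number=200):
    """Average time in microseconds to execute a class definition."""
    code = compile(source, "<bench>", "exec")
    # A fresh namespace per run, so the class is really rebuilt each time.
    run = lambda: exec(code, {"__name__": "bench_module"})
    return timeit.timeit(run, number=number) / number * 1e6


plain_us = definition_cost_us(PLAIN_SOURCE)
dataclass_us = definition_cost_us(DATACLASS_SOURCE)
print(f"plain class: {plain_us:.1f} μs, dataclass: {dataclass_us:.1f} μs")
```

The dataclass figure includes the work the decorator does to generate __init__, __repr__ and __eq__ at definition time, which is the kind of cost the import column is pointing at.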

marcosstefani · September 22, 2023, 11:25pm

I really liked the solution. But I really wanted to get away from using classes or decorators to solve everything. I know that Python is very flexible, but after thinking about it a lot I decided to post my two suggestions here on the forum, covering the two points that I believe would help the language a lot. Now I’m hoping this at least generates discussion. I think it’s the least I can do.

ssweber · September 23, 2023, 1:20am

@brettcannon for fun, here’s a decorator wrapper for msgspec.Struct that we’ll call record

I added your replace and asdict functions. This doesn’t generate the TypedDict type hints, and I didn’t take the time to wrap my head around your __repr__ example.

I agree it’d be cool if we can just do struct Point(x: float, y: float)

For now you’ll have to do:

@record
def Point(x: float, y: float): pass

p = Point(1.0, 1.0)
print(p.asdict())  # Output: {'x': 1.0, 'y': 1.0}
new_point = p.replace(y=2.0)
print(new_point.asdict())  # Output: {'x': 1.0, 'y': 2.0}
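The decorator itself is not shown in the post. As a rough stand-in, the sketch below builds a record decorator on the standard library's dataclasses.make_dataclass instead of msgspec.Struct, so it is not the wrapper described above; making the result frozen is also an assumption made for this example.

```python
import dataclasses
import inspect
import typing


def record(func):
    """Hypothetical stand-in: turn a function signature into a frozen dataclass."""
    fields = []
    for name, param in inspect.signature(func).parameters.items():
        # Unannotated parameters fall back to typing.Any.
        annotation = typing.Any if param.annotation is param.empty else param.annotation
        if param.default is param.empty:
            fields.append((name, annotation))
        else:
            fields.append((name, annotation, dataclasses.field(default=param.default)))

    def asdict(self):
        return dataclasses.asdict(self)

    def replace(self, **changes):
        return dataclasses.replace(self, **changes)

    return dataclasses.make_dataclass(
        func.__name__,
        fields,
        namespace={"asdict": asdict, "replace": replace},
        frozen=True,
    )


@record
def Point(x: float, y: float): pass


p = Point(1.0, 1.0)
print(p.asdict())           # {'x': 1.0, 'y': 1.0}
new_point = p.replace(y=2.0)
print(new_point.asdict())   # {'x': 1.0, 'y': 2.0}
```

Reading the fields from a function signature keeps the one-line declaration style, at the cost of the function body being ignored entirely.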
