Introducing Record Types in Python

Dear Python Community,

I would like to propose the introduction of a new record type in Python, which would serve as an extension of the existing class. This proposal aims to simplify and enhance the way we define and work with data structures, providing a more concise and Pythonic syntax.

Proposal Overview: The proposed record type, let’s call it “Record,” would allow us to define data structures in a manner similar to classes but with several advantages:

1. Simplified Syntax: The definition of a record would be streamlined, allowing us to specify its fields directly within the class definition, eliminating the need for the @dataclass decorator. This would make our code cleaner and more readable.


record MyRecord:
    id: int
    name: str = "Unnamed"

The above code would be equivalent to the following class definition:


from typing import Optional

class MyRecord:
    def __init__(self, id: Optional[int], name: str = "Unnamed"):
        self.id = id
        self.name = name

2. Improved Representation: When we print an instance of a record, it would be displayed as a dictionary, making it more intuitive for developers. For example, instead of seeing “MyRecord(id=0),” we would see { 'id': 0, 'name': 'Unnamed' }. This enhanced readability is particularly beneficial when debugging and interacting with data.

3. Automatic __name__ Property: Records would automatically have a __name__ property that stores the name of the record, making it easier to identify the type of data structure being used.

Why Records? Introducing records into Python would enhance the language’s expressiveness and readability. It would empower developers to define data structures more concisely, allowing us to focus on our problem-solving tasks rather than boilerplate code. This proposal aligns with Python’s commitment to readability and simplicity.

Conclusion: I believe that introducing record types in Python is a natural evolution for the language. It would simplify our code, enhance data structure representation, and maintain Python’s commitment to clean and readable code.

I urge the Python community to consider this proposal and discuss the potential benefits and challenges. Let’s work together to make Python even more powerful and developer-friendly.

Thank you for your attention, and I look forward to your feedback and support.

Sincerely, Marcos Stefani Rosa


Are you familiar with dataclasses? They offer several of the features you’re looking for, while still fundamentally being classes and thus backward compatible.
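For reference, here is a minimal sketch of the proposal's MyRecord written with the existing dataclasses module. It assumes id is meant to be a required int field; the dictionary view and the type name are obtained with asdict and type(...).__name__, which exist today.

```python
from dataclasses import dataclass, asdict


@dataclass
class MyRecord:
    id: int                 # required field (assumed from the proposal's example)
    name: str = "Unnamed"   # field with a default value


r = MyRecord(0)
print(r)                  # MyRecord(id=0, name='Unnamed')
print(asdict(r))          # {'id': 0, 'name': 'Unnamed'}
print(type(r).__name__)   # MyRecord
```

The generated __init__, __repr__ and __eq__ come from the decorator, so the only difference from the proposed syntax is the @dataclass line itself.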


Much like Chris says, this is essentially just syntactic support for dataclasses. I don’t think that’s necessary, and I don’t think the decorator adds or removes readability.


This also seems similar to Brett’s earlier proposal of a struct syntax, though I believe that his didn’t have defaults.

A


Yes, for me the dataclass works, but it is common for Python to have to adapt through a decorator, greatly delaying its evolution as a language.


It seems to me that the only ‘pro’ of your proposal (assuming for the sake of argument that that is a pro) is eliminating the need for that decorator. But that really doesn’t have any impact on readability imo.
There are huge costs however - both in terms of backwards compatibility, overall type consistency (since this will now have to work together with other classes etc) and maintenance - so, I predict this proposal will not get any traction.


Why is the type Optional[int] when there’s no default? Shouldn’t it only be Optional if the default is None? If there’s no default, it should be a required parameter. If “id” is a special attr, that field name is already used by ORMs like Django’s.
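To illustrate the point about Optional, here is a small sketch using the class definition from the proposal. Optional[int] only says the value may be None; it does not make the parameter optional, so the argument is still required.

```python
from typing import Optional


class MyRecord:
    def __init__(self, id: Optional[int], name: str = "Unnamed"):
        self.id = id
        self.name = name


# Passing None explicitly is allowed by the annotation.
print(MyRecord(None).id)  # None

# Omitting the argument still fails, because there is no default value.
try:
    MyRecord()
except TypeError as exc:
    print("TypeError:", exc)
```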

In fact there’s a terminology clash, which could potentially cause more confusion with ORMs - in which a class represents a table, and an instance of that class represents a record in that table.

You can already write custom classes and meta classes that implement all this to your heart’s content.
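As a rough sketch of what that could look like, the base class below builds the constructor from the annotations and prints instances as a dictionary. The names Record and _signature are hypothetical, only annotations declared directly on the subclass are considered, and this is one possible approach rather than a reference implementation.

```python
import inspect


class Record:
    """Hypothetical base class: derives __init__ and a dict-style repr from annotations."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        params = []
        for name in getattr(cls, "__annotations__", {}):
            # A class attribute with the same name acts as the default value.
            default = cls.__dict__.get(name, inspect.Parameter.empty)
            params.append(
                inspect.Parameter(
                    name,
                    inspect.Parameter.POSITIONAL_OR_KEYWORD,
                    default=default,
                )
            )
        # Signature() rejects a required field placed after a defaulted one.
        cls._signature = inspect.Signature(params)

    def __init__(self, *args, **kwargs):
        bound = self._signature.bind(*args, **kwargs)
        bound.apply_defaults()
        for name, value in bound.arguments.items():
            setattr(self, name, value)

    def __repr__(self):
        return repr(vars(self))


class MyRecord(Record):
    id: int
    name: str = "Unnamed"


r = MyRecord(0)
print(r)                  # {'id': 0, 'name': 'Unnamed'}
print(type(r).__name__)   # MyRecord
```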

What you call “enhanc(ing) the language’s expressiveness and readability”, I call bloat.

The proposed record is just a class under the hood. Call a spade a spade.

If attrs doesn’t do it already, I’d be interested if record subclasses could add or even remove default values, something that I’ve been unable to do with dataclasses that limits them for me.


I disagree - Python can evolve WAY faster in this way than if everything had to have syntactic support! The bar for new syntax is VERY high. If we didn’t have dataclasses and someone proposed this completely different type of class, it would probably just not happen. Instead, we have the ability to (a) create the feature entirely outside of the language, (b) add it to the standard library, (c) backport it to older versions of the language, and/or (d) test out different versions of the idea easily and without breakage. Any and all can be done, sequentially or concurrently.


My proposal supports defaults. :slightly_smiling_face:

I’m slowly working towards a proof-of-concept in pure Python code for my own record type, but other stuff has taken up my time lately.


On Brett’s idea, would there be any special benefits, such as translating directly to a struct in C with precise requirements for the type of each declared field?

Probably not, because it looks like the typing is optional.


At one point I considered adding syntactic support for dataclasses, primarily for two reasons: the startup speed that has been mentioned, and the ability to add __slots__ without the need to create another class. There would be other advantages to adding methods while the class is being created (none of which I can think of right now, but there have been issues filed where the non-obvious answer is “because the class already exists”). And maybe we could rethink if slots=True should be the default, and other design issues.

The syntax I toyed with was:

dataclass(frozen=True, <etc.>) ClassName:

Mainly because that’s a typo I make all the time.

This of course would unfairly advantage dataclasses over attrs and similar libraries. But maybe we could live with that. And now we’d have two ways to create classes, although I think it’s important that there be no new type of class being created, just another way to create the same class objects we have now.

Anyway, I’m just thinking out loud. Maybe someone has some other thoughts on it.


That’s more convincing. But how would creating a new keyword struct play with the core library of the same name? struct — Interpret bytes as packed binary data — Python 3.11.5 documentation

I know it’s conceptually similar to C and rust structs, but is it too late to rename it?


The (newish) PEG parser that Python uses has support for a concept called “soft-keywords” – this is the same as used by match/case and the new type keyword introduced by PEP 695. It means that keywords are context dependent, so the struct module would be fine.

Note Brett’s article states (emphasis mine):

the new (soft) keyword struct

A


Thanks - I’d forgotten about that.

No, because that wouldn’t really work anyway unless you were doing a Cython-like thing, in which case you can just require that by your tooling.

Correct, and the typing being optional is by design.

All I’ve written is a blog post and started coding up a proof-of-concept via a decorator just to show the ergonomics, so anything can change. :wink: I’m currently toying with the name record.


Honestly, as someone who loves good OO programming and loves Python, I would really like to see the language evolve without alternatives. I think it’s time to take ideas generated using annotations like dataclass and definitely implement them in a disruptive way. I think that if the community keeps focusing on preserving what it has as much as possible and only making adaptations, we won’t have interfaces in Python, for example, any time soon, and consequently we will continue to fall short in terms of design patterns. I respect all opinions and understand the reasons for the disagreements; I just think we need to put improving the language above what is easier.


Very interesting. :slightly_smiling_face:

Are you aware of msgspec Struct?

This is very close to what you are describing, and it’s already implemented in C and is very fast. I started looking at it as I was learning about attribute validation.

Coincidentally, I had just posted about it a couple days ago to the faster-python issue tracker

Results (smaller is better):

                   import (μs)   create (μs)   equality (μs)   order (μs)
msgspec                   9.92          0.09            0.02         0.03
standard classes          6.86          0.45            0.13         0.29
dataclasses             489.07          0.47            0.27         0.30

As I’m reading more and more Python library code, I see a tendency to write “dataclass-like” structures via normal classes, rather than use dataclasses. I assume this is due to performance concerns? I wonder if they’d opt to use a Struct if it was available in the std-lib, since the create/eq/order are faster than a normal class.


I really liked the solution. But I really wanted to get away from using classes or decorators to solve everything. I know that Python is very flexible. After thinking about it a lot, I decided to post the 2 suggestions here on the forum, with the 2 points that I understand would help the language a lot. Now, I’m going to hope this at least generates discussion. I think it’s the least I can do :slight_smile:

@brettcannon for fun, here’s a decorator wrapper for msgspec.Struct that we’ll call record

I added your replace and asdict functions. This doesn’t generate the TypedDict type hints, and I didn’t take the time to wrap my head around your __repr__ example.

I agree it’d be cool if we can just do struct Point(x: float, y: float)

For now you’ll have to do:

@record
def Point(x: float, y: float): pass

p = Point(1.0, 1.0)
print(p.asdict())  # Output: {'x': 1.0, 'y': 1.0}
new_point = p.replace(y=2.0)
print(new_point.asdict())  # Output: {'x': 1.0, 'y': 2.0}
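The decorator itself is not shown above. As a stand-in, here is a minimal sketch of how such a record decorator could be written with the standard library's dataclasses.make_dataclass instead of msgspec.Struct, so it is not the wrapper described in this post and will not have msgspec's performance. It reads the field names, annotations and defaults from the function signature and ignores the function body.

```python
import dataclasses
import inspect


def record(func):
    """Hypothetical decorator: build a frozen dataclass from a function signature."""
    fields = []
    for name, param in inspect.signature(func).parameters.items():
        if param.default is inspect.Parameter.empty:
            fields.append((name, param.annotation))
        else:
            fields.append(
                (name, param.annotation, dataclasses.field(default=param.default))
            )
    # Expose the module-level helpers as methods on the generated class.
    namespace = {"asdict": dataclasses.asdict, "replace": dataclasses.replace}
    return dataclasses.make_dataclass(
        func.__name__, fields, namespace=namespace, frozen=True
    )


@record
def Point(x: float, y: float): pass


p = Point(1.0, 1.0)
print(p.asdict())                  # {'x': 1.0, 'y': 1.0}
print(p.replace(y=2.0).asdict())   # {'x': 1.0, 'y': 2.0}
```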