I actually have a prototype implementation of my idea that just needs an implementation of `__repr__`, then I plan to make it public (annotations and such were a little tricky).
I now have a pure Python proof-of-concept of my record idea. I will blog about it at some point, but since this is an active discussion I wanted to share as soon as I made the repo public.
When I end up using `KW_ONLY` with dataclasses, it's usually to make the fields that have default values keyword-only. So, I personally would be okay with a record type syntax which didn't give me fine-grained control over kw_only args, but which had the option to turn all fields with default values into kw_only args.
I like it! I'd happily use this as a simpler alternative to dataclasses for a lot of my use cases. It would be convenient if it were in the stdlib[1], say as `dataclasses.record`. I'm not particularly convinced it needs to be syntax, but I wouldn't mind if it was.
I'm not personally bothered by the fact that decorating a def statement defines a class. I can imagine that this might make some people uncomfortable, but it's not that big a deal to me.
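To make the "decorating a def defines a class" idea concrete, here is a rough sketch of how such a decorator could work. This is not the code from the repo under discussion: the `record` function below is a hypothetical stand-in written only for illustration, and it skips equality, hashing, and the `typing.Unpack` handling.

```python
import inspect


def record(func):
    """Hypothetical sketch: build a slotted, immutable class from a signature."""
    sig = inspect.signature(func)
    fields = tuple(sig.parameters)

    def __init__(self, *args, **kwargs):
        bound = sig.bind(*args, **kwargs)  # reuse the function's own argument rules
        bound.apply_defaults()
        for name, value in bound.arguments.items():
            # Bypass the blocking __setattr__ below while constructing.
            object.__setattr__(self, name, value)

    def __setattr__(self, name, value):
        raise AttributeError(f"{type(self).__name__} is immutable")

    def __repr__(self):
        args = ", ".join(f"{name}={getattr(self, name)!r}" for name in fields)
        return f"{type(self).__name__}({args})"

    namespace = {
        "__slots__": fields,
        "__init__": __init__,
        "__setattr__": __setattr__,
        "__repr__": __repr__,
        "__doc__": func.__doc__,
    }
    cls = type(func.__name__, (), namespace)
    # Copy the annotations over so they can be introspected at runtime.
    cls.__annotations__ = dict(func.__annotations__)
    return cls


@record
def Point(x: int, y: int, label: str = ""):
    """A 2D point."""


p = Point(1, 2)
print(p)  # Point(x=1, y=2, label='')
```

After decoration the name `Point` is bound to a class rather than a function, which is the part some people may find surprising.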
[1] I've still not found a way that I'm comfortable with to have a bunch of 3rd party dependencies available "by default", so being in the stdlib is still a significant plus for me.
My primary concern is it contributing to product type bloat in Python. To illustrate the point, you may want to also compare it to TypedDicts, attrs classes, Pydantic classes and msgspec classes.
At first blush I wondered about introspectability; for example, cattrs needs to know if attributes have defaults or not. But I think that information is available by inspecting the `__init__`?
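As a sketch of what that inspection could look like, the snippet below reads defaults from `__init__` with `inspect.signature`. The `Point` class and the `split_fields` helper are hypothetical names used only for illustration; a library like cattrs may do this differently.

```python
import inspect


class Point:  # hypothetical stand-in for a generated record class
    __slots__ = ("x", "y", "label")

    def __init__(self, x: int, y: int, label: str = ""):
        self.x = x
        self.y = y
        self.label = label


def split_fields(cls):
    """Return (required_names, defaults) by inspecting cls.__init__."""
    # Skip the first parameter, which is `self`.
    params = list(inspect.signature(cls.__init__).parameters.values())[1:]
    empty = inspect.Parameter.empty
    required = [p.name for p in params if p.default is empty]
    defaults = {p.name: p.default for p in params if p.default is not empty}
    return required, defaults


required, defaults = split_fields(Point)
print(required, defaults)  # ['x', 'y'] {'label': ''}
```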
My gut feeling is that equality based on shape sounds iffy.
I'd also be curious to see how complicated supporting generic classes would be (maybe not that much?).
Having syntactical support (a soft keyword) might enable greater optimisations or specialisations; personally I'd be interested in this as a built-in workaround form of an immutable dictionary (as Brett noted in his struct article). But pure-Python would certainly be a lower barrier to inclusion!
I'm sure it does, but I'm labelling this a proof-of-concept to see if the reaction is positive enough to propose a PEP to make this something syntactic which makes that concern go away.
Yes it is.
Maybe, but I at least want to try it for now. I come from the duck typing age of Python when typing was just about the shape of things, and so I'm not bothered by leaning into it more since `__slots__` gives us a form of structural typing that can easily be introspected.
It's not difficult to switch to an `isinstance()` check to be faster and rely on nominal typing for equivalence. My guess is that's what most people are used to.
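The difference between the two kinds of equality can be sketched as follows. This is only an illustration of the idea, not the comparison logic from the proof-of-concept; `shape_equal`, `nominal_equal`, `Point`, and `Vector` are all hypothetical names.

```python
def shape_equal(a, b):
    """Structural: same slot names and same values, regardless of class."""
    slots_a = getattr(type(a), "__slots__", None)
    slots_b = getattr(type(b), "__slots__", None)
    if slots_a is None or slots_a != slots_b:
        return False
    return all(getattr(a, name) == getattr(b, name) for name in slots_a)


def nominal_equal(a, b):
    """Nominal: b must be an instance of a's class before values are compared."""
    if not isinstance(b, type(a)):
        return False
    return all(getattr(a, name) == getattr(b, name) for name in type(a).__slots__)


class Point:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


class Vector:  # unrelated class that happens to have the same shape
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


print(shape_equal(Point(1, 2), Vector(1, 2)))    # True
print(nominal_equal(Point(1, 2), Vector(1, 2)))  # False
```

The first result is the one that may feel iffy: two unrelated classes compare equal purely because their slots and values line up.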
My guess is I will have to copy something over from the function to the class, much like I do for `__annotations__` (plus maybe some transformation like I had to do with `typing.Unpack`, which was honestly the hardest part of making things work).
It definitely would, on top of proper support by type checkers. You could also cache the hash value, not have to generate dynamic Python code, etc.
Why not a base class like TypedDict/namedtuple?
The README in that repo explains why not `namedtuple`. As for `TypedDict`, it's possible if I thought a dict API made sense. But since I wanted `__slots__`, I don't think it's a good fit.
I like Brett's `@record` decorator as a really elegant way to construct a frozen and slotted dataclass, but would argue against adding syntax for any record proposal. IMO the costs to users and learners of expanding the language with another way to define custom objects are far larger than the marginal benefits over existing and possible libraries.
I'm wondering how this will be added to CPython. Are you planning to create a built-in module which contains your record type, `records.py` or something similar?
Right now there's no plan for it to be added. I will probably post it to PyPI, write a blog post, and see what the reaction is. If it's good enough I may pursue a PEP, even if it's just to get it rejected so the SC can specify what would be needed to bring in alternative syntax for a simplistic class syntax. That would help guide/shut down future discussions on this topic as it comes up often enough.
I like the general look of this. I would definitely prefer to see PEP-defined syntax vs the decorator, but I'd be tempted to use it even with the decorator syntax.
I particularly like the emphasis on immutability, performance (`__slots__`), and structural subtyping. Also the use of `__annotations__` to allow runtime introspection, which has been proven to offer major ecosystem advantages.
How would tools like `mypy` or `pyright` handle what you've written here, or would that be a future exercise for them to properly support? This seems like a perennially tricky issue for new data representation systems: you want to get the benefit of looking like a reasonably standard class, but as far as I know there's still no way to represent the true semantics of these sorts of things to the type checkers. In my experience, for instance, `mypy` is going to look at something like this as a nominal class even if it's meant to "act" structurally.
I think it's a future exercise for the tools, but I could be wrong if there's some mechanism I'm unaware of which would make it so you can statically define what the function's new return type is (which I doubt since even the class's shape isn't technically known ahead of time).
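As a point of reference for the nominal versus structural distinction, the closest existing tool is `typing.Protocol`, which type checkers match by shape rather than by inheritance. The sketch below does not solve the decorator problem described above (the checker still cannot see a class produced at runtime); `HasXY`, `Point`, and `manhattan` are hypothetical names.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class HasXY(Protocol):
    """Anything with integer x and y attributes matches structurally."""

    x: int
    y: int


class Point:  # does not inherit from HasXY
    __slots__ = ("x", "y")

    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y


def manhattan(p: HasXY) -> int:
    # A type checker accepts Point here because its shape matches HasXY.
    return abs(p.x) + abs(p.y)


print(manhattan(Point(3, -4)))          # 7
print(isinstance(Point(1, 2), HasXY))   # True
```

The runtime `isinstance()` check only verifies that the attributes exist, not their types.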
I know some people want syntactic support but I quite like it the way it is atm. I also think something like this (if proposed and accepted) would live quite well in the dataclasses module. Not much to add beyond that, good job.
Would the "alternative syntax" be the struct type? Or is the type proposed here different from the struct idea? In other words, does this type (record) represent the main idea behind the struct type?
It is for me, yes. I still need to write a blog post about the project, but it is now up on PyPI as record-type to make it easier for people to play with.
The `record` class in Java, for example, and the `struct` in C have some similarities, but also significant differences:
Similarities:
- Data Storage: Both records in Java and structs in C are used for storing data. They define a structure that can hold multiple fields or data members.
- Direct Access: Struct members can be accessed directly, and record components are read through auto-generated accessor methods, so neither needs hand-written getters.
Differences:
- Immutability: In Java, records are immutable by default, meaning that once created, their fields cannot be modified. In contrast, structs in C are not automatically immutable and allow direct field modification.
- Methods and Behavior: In Java, records automatically generate methods like `equals()`, `hashCode()`, and `toString()`, as well as accessor methods. In C, structs do not generate methods automatically, and any functionality related to these operations must be implemented manually.
- Object-Oriented Nature: Java is an object-oriented language, and records are part of this paradigm. They can implement interfaces and take part in polymorphism, although a record cannot extend another class (it implicitly extends `java.lang.Record` and is final). Structs in C are not part of an object-oriented paradigm and lack these features.
- Static Typing: Both languages are statically typed, so the types of the fields in a record or a struct are known at compile time. C's type system is weaker, though: casts and `void *` pointers let code reinterpret a struct's memory with more freedom.
Therefore, while records in Java and structs in C share the idea of data structures for storing information, they differ in immutability, behavior, object-oriented nature, and typing, reflecting the differences between the Java and C languages.
My original idea was to bring the concept of Java records to Python. And there is a little difference between a record and a struct.
Wouldn't that be the same as dataclasses?
Yes, but I would like to have this in the language itself and not as a decorator.