Struct in Python

Hey @brettcannon, just wondering how far you got with this: Proposing a struct syntax for Python


I did GitHub - brettcannon/record-type: Proof-of-concept `record` type for Python and that's it. I never got enough support from the community to pursue it any further.


Given the renewed interest in freezing objects for safer sharing in free-threaded environments (including low overhead sharing between sub interpreters), the idea may be worth revisiting.


What form would that support need to take? I think this is a feature that could really work for Python. I could list the ways it would improve certain kinds of Python programs.

Personally, though I like Brett's record syntax for more convenient dataclasses, I would rather see proper ADTs ("Rust enums") in Python. I'm a frequent user of both enums and dataclasses, but they've always been pretty crufty to use, and this would replace both of them with something much more natural and compact.

Perhaps we could even use a soft keyword enum for it, despite the existing module name enum? I think we can use the record-like syntax for the enum members (including its support for *args, kw_only, **kwargs, and presumably positional-only parameters with /, which the record suggestion doesn't mention; I don't show any of this below):

enum Command:
  Stop
  Send(item: SomeClass)

enum Event:
  Stop
  ItemReceived(item: SomeOtherClass)
  InvalidDataReceived(data: bytes)

I currently frequently end up writing patterns just like the above but using dataclasses and inheritance:

@dataclasses.dataclass(frozen=True)
class Command:
  pass

@dataclasses.dataclass(frozen=True)
class StopCommand(Command):
  pass

@dataclasses.dataclass(frozen=True)
class SendCommand(Command):
  item: SomeClass

@dataclasses.dataclass(frozen=True)
class Event:
  pass

@dataclasses.dataclass(frozen=True)
class StopEvent(Event):
  pass

@dataclasses.dataclass(frozen=True)
class ItemReceivedEvent(Event):
  item: SomeOtherClass

@dataclasses.dataclass(frozen=True)
class InvalidDataReceivedEvent(Event):
  data: bytes

As you can see, this is a lot of boilerplate and namespace pollution that would completely disappear if we had native ADT enum support, so it would clean my code up massively (as you can probably guess, the real hierarchies are usually a lot longer than this, too).

I'd also like to see a way to specify a default for an enum, such that calling the enum type object with no arguments returns that value, even if it has to be done by setting a __default__ (or similarly named) attribute like this. (This is a contrived example to show how you'd combine required and optional arguments in the general case. Usually I'd have an atom as the default and it would be the first member in the enum, like Atom here, but this shouldn't always need to be the case.)

enum Foo:
  Atom
  Bar(required_arg: str, optional_arg: int = 4)
Foo.__default__ = Foo.Bar('default_value')

...

assert Foo() == Foo.Bar('default_value', 4)

I'd like to be able to write parameter docstrings, though, and as far as I can see, the function-definition-like syntax from record makes this unworkable:

enum Foo:
  Atom
  """
  docstring for Atom
  """
  HasArg(
    """
    docstring for HasArg
    """  # how would this get associated with `HasArg`?
    arg: str,  # should there be a comma here?
    """
    docstring for arg
    """  # if there's a comma above, then how would this get associated with `arg`, and not, say, `another_arg` below?
    another_arg: int,
    # no docstring here
  )

I know that for functions you have to describe parameters inside the docstring for the function itself, but I've always found this inconvenient, so with dataclasses it's been very useful to be able to document them field by field. It would be a shame to lose this for large, complex enums.
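For readers unfamiliar with the field-by-field convention mentioned above, here is a minimal sketch of it. The class and field names are made up for illustration. Note that this is a convention rather than a language feature: the string literals after each field are evaluated and discarded at runtime, and it is documentation tools (Sphinx autodoc, for instance) and editors that associate them with the preceding field.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SendCommand:
    """Ask the worker to send one item."""

    item: str
    """The payload to send."""

    retries: int = 3
    """How many times to retry before giving up."""


# The per-field strings are not stored anywhere by Python itself;
# only the class docstring survives as SendCommand.__doc__.
field_names = [f.name for f in fields(SendCommand)]
print(field_names)
print(SendCommand.__doc__)
```

Any ADT syntax that wants to keep this habit working would need a spot after each member field where such a string can sit.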

Maybe one possibility is to allow an alternative syntax for multi-line enum members:

enum Foo:
  """
  docstring for Foo
  """
  Atom
  """
  docstring for Atom
  """
  OneLiner(arg: str, another_arg: int = 4, *, kw_only: bool)
  """
  docstring for OneLiner, this could describe parameters in the way currently done for function parameters
  """
  MultiLiner:
    """
    docstring for MultiLiner
    """
    arg: str  # no comma here!
    """
    docstring for arg
    """
    another_arg: int
    # no docstring specified for another_arg, still works fine

If we need to write something like case MultiLiner: rather than just MultiLiner: to make this more practical to parse, that'd be fine too. Presumably, we'd need to allow the likes of /, *, *args: SomeType and **kwargs: SomeType on lines by themselves within MultiLiner: as well. Typing should of course still be optional within MultiLiner: (unlike within a dataclass) just as within the rest of an enum.

You can get pretty close by additionally using (actual) Enums and Literals and combining them with dataclasses (or any classes) in a union, instead of using inheritance with (sometimes empty) classes.

It's still a little verbose and awkward but very usable. We should probably make this approach more ergonomic (maybe with some syntactic sugar) rather than invent a new one.

Edit: more details.

Maybe, but I'm also not sure if there are enough users yet to convince folks it's worth it.

You need to convince the community overall it's worth it. Most people I have talked to about this either go, "neat," or, "why is this that much better than dataclasses and having two ways to do things?"

I also tried that with my friend, Dusty Phillips: GitHub - dusty-phillips/match-variant: Python variant types that work with match. Once again, you need to get a large swell of the community to back the idea.


You can get fairly close with a base class that applies @brettcannon's records.record to all functions in the namespace of a subclass:

from records import record

class Records:
    def __init_subclass__(cls):
        for name, obj in vars(cls).items():
            if callable(obj) and not name.startswith('__'):
                setattr(cls, name, record(obj))

class Command(Records):
    def Stop(): '''Command Stop'''
    def Send(item: str): '''Command Send with an item'''

print(Command.Stop()) # outputs Stop()
print(Command.Send('foo')) # outputs Send(item='foo')

I don't quite see the value in a default, but you can implement it as a decorator like this:

def default_record(*args, **kwargs):
    def decorator(func):
        func._default_args = args, kwargs
        return func
    return decorator

class Records:
    def __new__(cls):
        return cls._default
        
    def __init_subclass__(cls):
        cls._default = None
        for name, obj in vars(cls).items():
            if callable(obj) and not name.startswith('__'):
                setattr(cls, name, new_record := record(obj))
                if default_args := getattr(obj, '_default_args', None):
                    args, kwargs = default_args
                    cls._default = new_record(*args, **kwargs)

Usage:

class Command(Records):
    @default_record(item='default')
    def Send(item: str): '''Command Send with an item'''

print(Command.Send('foo')) # outputs Send(item='foo')
print(Command()) # outputs Send(item='default')

This may be a long shot, but it would be really cool if Python supported a very concise syntax for records/structs like the following:

$ hparams = (opt='sgd' lr=0.001 bsize=32)
$ print(hparams.lr)
0.001
$ print(hparams)
(opt='sgd' lr=0.001 bsize=32)

It could also be

[opt='sgd' lr=0.001 bsize=32]

or

{opt='sgd' lr=0.001 bsize=32}

That syntax without commas looks rather unPythonic and ambiguity-prone.

In most cases you'd want to first define which field names are allowed for each record, and likely also the types of each field, in which case you should be looking at collections.namedtuple and typing.NamedTuple. If you don't need any predefined schema, then types.SimpleNamespace should suffice.
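To make the difference concrete, here is a small sketch that reuses the hparams fields from the post above. The class name HParams and the extra momentum attribute are made up for illustration:

```python
from types import SimpleNamespace
from typing import NamedTuple


class HParams(NamedTuple):
    """Fixed schema: field names, types and defaults are declared up front."""
    opt: str
    lr: float
    bsize: int = 32


fixed = HParams(opt="sgd", lr=0.001)
print(fixed.lr, fixed)

# A NamedTuple instance is immutable, so assigning to a field fails.
try:
    fixed.lr = 0.01
except AttributeError:
    immutable = True

# No schema: any attribute can be set at construction time or later.
loose = SimpleNamespace(opt="sgd", lr=0.001, bsize=32)
loose.momentum = 0.9
print(loose)
```

The NamedTuple version also unpacks and compares like a plain tuple, which may or may not be what you want from a record.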

You can also take a look at @brettcannon's record-type project, which offers several additional features to make it helpful in many more use cases.

How about SimpleNamespace?

#!/usr/bin/env python3.13
from types import SimpleNamespace

a = SimpleNamespace({'opt': 'sgd', 'lr': 0.001, 'bsize': 32})  # positional mapping argument needs Python 3.13+
print(a)
print(a.opt)
print(a.lr)
print(a.bsize)

Output:
namespace(opt='sgd', lr=0.001, bsize=32)
sgd
0.001
32

It would look cleaner and closer to @Pythonista's desired syntax with keyword arguments:

a = SimpleNamespace(opt='sgd', lr=0.001, bsize=32)

argparse.Namespace has better performance (for some reason unknown to me) and a shorter name.
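Anyone who wants to check the performance claim on their own interpreter can do so with timeit. The sketch below only measures construction and a single attribute read; the helper names are made up, and the numbers will differ between Python versions and machines, so no result is asserted here.

```python
import timeit
from argparse import Namespace
from types import SimpleNamespace


def build_simple():
    return SimpleNamespace(opt="sgd", lr=0.001, bsize=32)


def build_argparse():
    return Namespace(opt="sgd", lr=0.001, bsize=32)


def best_of(func, number=50_000, repeat=3):
    """Return the fastest of `repeat` timing runs, in seconds."""
    return min(timeit.repeat(func, number=number, repeat=repeat))


simple_ns, argparse_ns = build_simple(), build_argparse()

results = {
    "SimpleNamespace construct": best_of(build_simple),
    "argparse.Namespace construct": best_of(build_argparse),
    "SimpleNamespace read": best_of(lambda: simple_ns.lr),
    "argparse.Namespace read": best_of(lambda: argparse_ns.lr),
}

for label, seconds in results.items():
    print(f"{label:30s} {seconds:.4f}s")
```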

"Namespaces are one honking great idea -- let's do more of those!"

I, personally, would like to have a performant namespace closer to the metal. I think I would use it much more if it lived in... builtins???
