Struct in python

Hey @brettcannon, just wondering how far you got with this: Proposing a struct syntax for Python

I did brettcannon/record-type on GitHub, a proof-of-concept `record` type for Python, and that’s it. I never got enough support from the community to pursue it any further.

Given the renewed interest in freezing objects for safer sharing in free-threaded environments (including low overhead sharing between sub interpreters), the idea may be worth revisiting.

In what form can this support be shown? I think it’s a feature that can really work in Python’s case. I could list the ways it would improve certain kinds of Python programs.

Personally, though I like Brett’s record syntax for more convenient dataclasses, I would rather see proper ADTs (algebraic data types, i.e. “Rust enums”) in Python. I’m a frequent user of both enums and dataclasses, but they’ve always been pretty crufty to use, and this would replace both of them with something much more natural and compact.

Perhaps we could even use the soft keyword enum for it, despite the existing module name enum? I think we can use the record-like syntax for the enum members (including its support for *args, kw_only, **kwargs, as well as presumably positional-only with / that is not mentioned in the record suggestion, though I don’t show any of this below):

enum Command:
  Stop
  Send(item: SomeClass)

enum Event:
  Stop
  ItemReceived(item: SomeOtherClass)
  InvalidDataReceived(data: bytes)

I currently frequently end up writing patterns just like the above but using dataclasses and inheritance:

import dataclasses

@dataclasses.dataclass(frozen=True)
class Command:
  pass

@dataclasses.dataclass(frozen=True)
class StopCommand(Command):
  pass

@dataclasses.dataclass(frozen=True)
class SendCommand(Command):
  item: SomeClass

@dataclasses.dataclass(frozen=True)
class Event:
  pass

@dataclasses.dataclass(frozen=True)
class StopEvent(Event):
  pass

@dataclasses.dataclass(frozen=True)
class ItemReceivedEvent(Event):
  item: SomeOtherClass

@dataclasses.dataclass(frozen=True)
class InvalidDataReceivedEvent(Event):
  data: bytes

As you can see, this is a lot of boilerplate, and namespace pollution, that’d completely disappear if we had native ADT enum support, so it’d clean my code up massively (as you can probably guess, they’re usually a lot longer than this, too).

I’d also like to see a way to specify a default for an enum, such that calling the enum type object with no arguments returns that value, even if it has to be done by setting a __default__ (or similarly named) attribute. The example below is contrived to show how you’d combine required and optional arguments in the general case; usually I’d have an atom as the default, and it’d be the first member in the enum, like Atom here, but that shouldn’t always have to be the case:

enum Foo:
  Atom
  Bar(required_arg: str, optional_arg: int = 4)
Foo.__default__ = Foo.Bar('default_value')

...

assert Foo() == Foo.Bar('default_value', 4)

I’d like to be able to do parameter docstrings, though, and as far as I can see, the use of the function-definition-like syntax from record makes this not work:

enum Foo:
  Atom
  """
  docstring for Atom
  """
  HasArg(
    """
    docstring for HasArg
    """  # how would this get associated with `HasArg`?
    arg: str,  # should there be a comma here?
    """
    docstring for arg
    """  # if there's a comma above, then how would this get associated with `arg`, and not, say, `another_arg` below?
    another_arg: int,
    # no docstring here
  )

I know that for functions you have to describe parameters inside the docstring for the function itself, but I’ve always found this inconvenient, so with dataclasses it’s been very useful being able to docstring them field-by-field. It would be a shame to lose this for large, complex enums.
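For anyone unfamiliar with the field-by-field style, here is a minimal sketch of what I mean with today's dataclasses. The class and field names are made up for illustration. Note that these string literals are a convention picked up by documentation tools such as Sphinx; as far as the interpreter is concerned they are just expression statements and are not stored on the class at runtime.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class InvalidDataReceived:
    """Event emitted when a payload cannot be decoded."""

    data: bytes
    """The raw bytes that failed to decode."""

    attempts: int = 1
    """How many decode attempts were made."""


# The string literals after each field do not become fields themselves.
field_names = [f.name for f in fields(InvalidDataReceived)]
print(field_names)
```

An enum syntax that only allows one-line, function-like members has no obvious place to put the equivalent of those per-field strings.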

Maybe one possibility is to allow an alternative syntax for multi-line enum members:

enum Foo:
  """
  docstring for Foo
  """
  Atom
  """
  docstring for Atom
  """
  OneLiner(arg: str, another_arg: int = 4, *, kw_only: bool)
  """
  docstring for OneLiner, this could describe parameters in the way currently done for function parameters
  """
  MultiLiner:
    """
    docstring for MultiLiner
    """
    arg: str  # no comma here!
    """
    docstring for arg
    """
    another_arg: int
    # no docstring specified for another_arg, still works fine

If we need to write something like case MultiLiner: rather than just MultiLiner: to make this more practical to parse, that’d be fine too. Presumably, we’d need to allow the likes of /, *, *args: SomeType and **kwargs: SomeType on lines by themselves within MultiLiner: as well. Typing should of course still be optional within MultiLiner: (unlike within a dataclass) just as within the rest of an enum.

You can get pretty close by additionally using (actual) Enums and Literals and combining them with dataclasses (or any classes) in a union, instead of using inheritance with (sometimes empty) classes.

It’s still a little verbose and awkward but very usable. We should probably make this approach more ergonomic rather than inventing a new one (maybe with some syntax sugar).

Maybe, but I’m also not sure if there are enough users yet to convince folks it’s worth it.

You need to convince the community overall it’s worth it. Most people I have talked to about this either go, “neat,” or, “why is this that much better than dataclasses and having two ways to do things?”

I also tried that with my friend, Dusty Phillips: dusty-phillips/match-variant on GitHub, Python variant types that work with `match`. Once again, it would need a large swell of the community to back the idea.

You can get fairly close with a base class that applies @brettcannon’s records.record to all functions in the namespace of a subclass:

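from records import record

class Records:
    def __init_subclass__(cls):
        # Snapshot the namespace so replacing attributes cannot disturb iteration.
        for name, obj in list(vars(cls).items()):
            # Turn every public function in the subclass body into a record type.
            if callable(obj) and not name.startswith('__'):
                setattr(cls, name, record(obj))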

class Command(Records):
    def Stop(): '''Command Stop'''
    def Send(item: str): '''Command Send with an item'''

print(Command.Stop()) # outputs Stop()
print(Command.Send('foo')) # outputs Send(item='foo')
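If you want to try the base-class trick without installing anything, here is a self-contained sketch that swaps `records.record` for a stand-in built on `dataclasses.make_dataclass`. To be clear, `record_stand_in` is a hypothetical helper, not how the records package works, and it only handles plain positional-or-keyword parameters with immutable defaults:

```python
import dataclasses
import inspect


def record_stand_in(func):
    """Hypothetical substitute for records.record (illustration only)."""
    specs = []
    for param in inspect.signature(func).parameters.values():
        # Unannotated parameters fall back to `object` as the field type.
        annotation = (
            object if param.annotation is inspect.Parameter.empty else param.annotation
        )
        if param.default is inspect.Parameter.empty:
            specs.append((param.name, annotation))
        else:
            specs.append(
                (param.name, annotation, dataclasses.field(default=param.default))
            )
    # Build a frozen dataclass named after the function.
    return dataclasses.make_dataclass(func.__name__, specs, frozen=True)


class Records:
    def __init_subclass__(cls):
        # Replace every public function in the subclass body with a generated class.
        for name, obj in list(vars(cls).items()):
            if inspect.isfunction(obj) and not name.startswith("__"):
                setattr(cls, name, record_stand_in(obj))


class Command(Records):
    def Stop(): """Command Stop"""
    def Send(item: str): """Command Send with an item"""


print(Command.Stop())
print(Command.Send("foo"))
```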

I don’t quite see the value in a default but you can implement it as a decorator like this:

def default_record(*args, **kwargs):
    # Remember the arguments used to build the default instance.
    def decorator(func):
        func._default_args = args, kwargs
        return func
    return decorator

class Records:
    def __new__(cls):
        # Calling the subclass itself returns the default record (None if unset).
        return cls._default

    def __init_subclass__(cls):
        cls._default = None
        for name, obj in list(vars(cls).items()):
            if callable(obj) and not name.startswith('__'):
                setattr(cls, name, new_record := record(obj))
                # Build the default from the arguments stored by @default_record.
                if default_args := getattr(obj, '_default_args', None):
                    args, kwargs = default_args
                    cls._default = new_record(*args, **kwargs)

Usage:

class Command(Records):
    @default_record(item='default')
    def Send(item: str): '''Command Send with an item'''

print(Command.Send('foo')) # outputs Send(item='foo')
print(Command()) # outputs Send(item='default')