Concise syntax for attribute access and assignment: e.g., `obj.(a, b, c) = 1, 2, 3`

bombs-kim · February 19, 2024, 9:25am

In this proposal, I introduce a novel syntax aimed at simplifying multiple attribute access and assignment within objects.

TLDR

How about if we allow this?

some_obj.(a, b, c) = 1, 2, 3
some_obj.(a, b, c) = some_obj.(b, c, a)

Rather than requiring this.

some_obj.a, some_obj.b, some_obj.c = 1, 2, 3
some_obj.a, some_obj.b, some_obj.c = (
    some_obj.b, some_obj.c, some_obj.a
)

Motivation

Class definitions often entail verbose patterns for initializing instance attributes, as exemplified below:

class MyClass:
    def __init__(self, foo, bar, baz, qux):
        self.foo = foo
        self.bar = bar
        self.baz = baz
        self.qux = qux

# Or equivalently

class MyClass:
    def __init__(self, foo, bar, baz, qux):
        self.foo, self.bar, self.baz, self.qux = foo, bar, baz, qux

Both versions require typing self. repeatedly, a mundane task. There are several conceivable ways to relieve this verbosity, but it’s not easy to design a feature that does so without introducing worse problems. For example, one could simply allow self. to be omitted in the definition of the __init__ special method, but this would sacrifice explicitness and create confusion.

If we adopt the proposed syntax, which I describe shortly, code would become easier to type, more concise, and more readable. I presume it would not cost much in terms of the simplicity of the language.

Please also note that the proposed syntax is meant not only to improve the situation described above, but also to serve more general cases. The example should be considered merely motivational. The proposed syntax need not be used only in class definitions. Also, I believe that defining both access and assignment, not just assignment, will provide more consistency.

Syntax

Multiple attribute assignment

The left-hand side of an assignment statement can be extended to support multiple attributes with fewer keystrokes, using a proposed syntax as follows:

class MyClass:
    def __init__(self, foo, bar, baz, qux):
        self.(foo, bar, baz, qux) = foo, bar, baz, qux

my = MyClass(1, 2, 3, 4)
# Now, my.foo == 1 and my.bar == 2
# and my.baz == 3 and my.qux == 4

my.(bar, baz) = 5, 6
# Now, my.foo == 1 and my.bar == 5
# and my.baz == 3 and my.qux == 6

Multiple attribute access

Similarly, accessing multiple attributes of an object can be streamlined into a single expression, which evaluates to a tuple of the accessed values:

# continuing from the previous snippet
print(my.(foo, bar, baz))  # Outputs: (1, 5, 3)

a, b = "Apple".(lower(), upper())
print(a, b)  # Outputs: apple APPLE

Nesting

Nesting is supported but need not be encouraged.

The proposed syntax does not prevent nested attribute access and assignment, which allows complex expressions involving objects with deep attribute hierarchies. Note that Python already allows arbitrarily deep LHS variable nesting, e.g. (a, (b, (c, (d, e)))) = (1, (2, (3, (4, 5)))).

Arbitrarily deep LHS variable nesting should not be encouraged, in order to keep code clean, but that does not mean we have to prevent it at the grammar level, and I believe Python allows it because it has some utility. Nesting is supported in this proposal to provide a consistent user experience, but it does not always lead to easier-to-read code.

class Node:
    def __init__(self, value):
        self.value = value

root = Node(4)
root.left = Node(2)
root.(left.(left, right), right) = (Node(1), Node(3)), Node(5)

# Accessing nested attributes
print(root.(left.(left.value, right.value), right.value))
# Outputs: ((1, 3), 5)

Interpretation

In the most usual cases, the interpretation should be straightforward.

my.(bar, baz) = (5, 6)

# Is equivalent to

my.bar, my.baz = 5, 6

Edge case

However, there is one syntactic form I can think of that admits more than one interpretation.

class YourClass:
    ...

YourClass().(foo, bar) = 1, 2

# [Option 1]
YourClass().foo = 1
YourClass().bar = 2

# [Option 2]
tmp = YourClass()
tmp.foo, tmp.bar = 1, 2
del tmp
# NOTE: In actual implementation, creation and deletion
# of the new variable should not be necessary.

The code above is actually quite meaningless, as it does not bind the new instance(s) to any variable, and few people will need to write this pattern in practice. However, it should still be handled for completeness.

This proposal suggests Option 2 as the correct interpretation: YourClass().(foo, bar) = 1, 2 should create only one YourClass instance, not two.
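
For illustration, Option 2 can be emulated in today’s Python with a small helper. This is only a sketch: assign_attrs and the Counted class are hypothetical names made up for this example, not part of the proposal or of any library.

```python
class Counted:
    """Hypothetical class that counts how many instances are created."""
    instances = 0

    def __init__(self):
        Counted.instances += 1


def assign_attrs(obj, names, values):
    """Emulate obj.(a, b) = x, y; obj has already been evaluated once by the caller."""
    values = tuple(values)
    if len(names) != len(values):
        raise ValueError("number of names and values must match")
    for name, value in zip(names, values):
        setattr(obj, name, value)
    return obj


# Option 2: the target expression Counted() is evaluated exactly once.
obj = assign_attrs(Counted(), ("foo", "bar"), (1, 2))
print(Counted.instances, obj.foo, obj.bar)  # 1 1 2
```

Under Option 1, the same statement would have created two separate instances, each receiving only one of the attributes.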

Comment

This proposal seeks to introduce a more succinct and readable syntax for handling multiple attribute access and assignment. I believe it will affect the majority of existing Python programmers and will enhance their productivity in class design and many other tasks. Additionally, learning this new syntax should not require extensive teaching resources as it is not hard to guess its interpretation in most cases.

I understand that such a modification to the language is a significant undertaking that requires careful consideration of its impact on the existing codebase, developer tools, and the broader programming community. Also, there are more details that need to be discussed. It would be much appreciated if you could provide feedback and suggestions.

TomRitchford · February 19, 2024, 9:39am

There’s already a specific proposal here to deal with the named argument duplication that’s your motivating argument.

My unsubstantiated feeling is that self.(foo, bar, baz, qux) = foo, bar, baz, qux would cause too much difficulty in the Python parser, but it’s just a hunch.

I am certainly not enthusiastic about new language changes without a really strong benefit!

If such constructors are a chore for you, you might consider using dataclasses instead. They not only write the constructor for you, avoiding all these self.foo = foo lines, but also dataclasses automatically create other methods, like comparators and hashes.

(collections.namedtuple and typing.NamedTuple could also be used to avoid writing the constructor.)
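
As a minimal sketch of the dataclass suggestion, here is the MyClass example from the first post rewritten with dataclasses. The int annotations are an assumption, since the original post does not say what types the attributes hold.

```python
from dataclasses import dataclass


@dataclass
class MyClass:
    # The annotations are assumed; dataclasses only need some annotation per field.
    foo: int
    bar: int
    baz: int
    qux: int


my = MyClass(1, 2, 3, 4)
print(my)                         # MyClass(foo=1, bar=2, baz=3, qux=4)
print(my == MyClass(1, 2, 3, 4))  # True
```

With the default settings, the decorator generates __init__, __repr__ and __eq__; ordering comparisons need order=True.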

bombs-kim · February 19, 2024, 10:10am

Thank you very much for the prompt feedback. To clarify, the PEP you linked is about shortening function calls, whereas my example aimed to demonstrate the shortening of __init__ definitions. Thus, it appears the PEP mentioned addresses a different issue.

Regarding the benefits, I believe this proposal could significantly reduce the number of lines of code, similar to the impact that star_targets in the grammar had upon its introduction.

# This can be turned
a = 1
b = 2
# into this
a, b = 1, 2

I believe star_targets was added to the language because it was believed to have huge benefits. The two patterns are pretty similar, and I would argue that my proposal would bring similar kinds of benefits. It’s also worth noting that my proposal could actually reduce the number of tokens, unlike the star_targets example above.

Regarding the feasibility of substituting my proposal with named tuples or other types, such replacements would be somewhat tricky to apply in general use cases. In practice, __init__ definitions often include operations other than attribute setting, making it difficult to apply these suggestions. Just to name a few, I suspect the examples below would benefit from my proposal.

Additionally, please note that the proposal aims to enhance assignment statements in general, not just __init__ method definitions. It seems my choice of a motivational example has been misleading to you.

TomRitchford · February 19, 2024, 10:26am

While the two proposals are definitely not the same, I feel they overlap enough that only one is needed, but I’m very open to being wrong, something I am rather a lot.

Regarding the examples, each of them would work very well with dataclasses - simply put the remaining members in __post_init__, which is called right after the constructor goes off.
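
A minimal sketch of that pattern, assuming a made-up Formatter class (the field names and defaults are hypothetical and only loosely modeled on a help formatter): plain attributes become fields, and anything computed goes into __post_init__, which the generated __init__ calls after assigning the fields.

```python
from dataclasses import dataclass, field


@dataclass
class Formatter:
    prog: str
    indent_increment: int = 2
    width: int = 80
    # Derived attributes are excluded from the generated __init__ signature.
    current_indent: int = field(init=False, default=0)
    max_help_position: int = field(init=False, default=0)

    def __post_init__(self):
        # Runs at the end of the generated __init__, once the fields are set.
        self.max_help_position = max(self.width - 20, self.indent_increment * 2)


fmt = Formatter("demo")
print(fmt.max_help_position)  # 60
```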

pf_moore · February 19, 2024, 10:51am

Conciseness is very rarely sufficient justification for a new language feature on its own. Typically, if you want to argue for conciseness, you should be looking at the wider question of expressiveness: does the new feature allow developers to write clearer code that expresses their intent more accurately or understandably? Even then, it’s hard to make the case without other, more concrete benefits. Prior art, in the form of other languages implementing a similar feature, is usually helpful as well.

In the case of this proposal, it seems neat, but of limited value. And I’m not at all sure I find something like foo.(a, b, c) = 1, 2, 3 to be more readable than foo.a, foo.b, foo.c = 1, 2, 3. Which brings up the point that being easy to read is far more important than being easy to write. Saving a bit of time for the writer of the code, at the expense of increasing work for the reader, is almost always a bad trade-off.

The more complex examples you give don’t immediately follow from the basic description you give - your example of "Apple".(lower(), upper()) is not something I’d have expected on an initial reading of the proposal. It’s also hard to understand how it fits with Python’s existing grammar/semantics - why are lower() and upper() not being treated as calls to global functions of those names? I think you need to write up a much more precise technical specification of your proposal if you want to avoid people dismissing it as being nothing more than a typing shortcut. You’d need to do that at some point anyway, if you plan on ever implementing this proposal, and doing it now will help you clarify the details of what you’re suggesting. Of course, writing a more detailed spec doesn’t guarantee people will like the idea any more than they do now…

Rosuav · February 19, 2024, 11:27am

I love the idea, but I’m really not enthused about the syntax - dot-openparens looks like an error. That said, though, I think there’s only one meaningful interpretation of the one you’re ambiguous on:

No, it should definitely be the other option: single evaluation of the object. It’s like how 1 < spam() < 10 will only evaluate spam() once, despite otherwise being equivalent to 1 < spam() and spam() < 10. It sounds like you also had that expectation, so I’d say go ahead and lock that in as defined semantics.
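
The chained-comparison behaviour is easy to check. In this small sketch, spam is a stand-in function that records each call in a list:

```python
calls = []


def spam():
    """Stand-in function that records every call before returning a value."""
    calls.append("called")
    return 5


# Chained comparison: spam() is evaluated once.
chained = 1 < spam() < 10
print(chained, len(calls))  # True 1

calls.clear()

# Expanded form: spam() is evaluated twice.
expanded = 1 < spam() and spam() < 10
print(expanded, len(calls))  # True 2
```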

The biggest use-cases for this syntax do have alternatives, though. As an alternative to the __init__ example, you could use a dataclass and not assign attributes at all. I’m sure there are still plenty of places for this to be useful, though.

Question: Have you considered whether this should be extended to subscripting too? Syntactically this may be more difficult, but also, given that I’m not sold on the existing syntax, having a think about a subscripting variant of the same idea might help you come up with a better syntax for attribute access too. Certainly a “broadcast” syntax would be extremely useful there, too.

Stefan2 · February 19, 2024, 1:13pm

How would you rewrite that one? The way I imagine it, I’d find it very much harder to read.

DerSchinken · February 19, 2024, 1:24pm

I am against this. Although I see its usefulness, it might lead to confusion among beginners and generally less readability. Also, it seems a bit off to me, but that’s just an opinion.

bombs-kim · February 19, 2024, 1:28pm

I imagined applying the new syntax to a part of the code and grouping only up to 3~4 at once, like I usually do for assigning values to multiple variables in a line.

I would re-write this,

        self._prog = prog
        self._indent_increment = indent_increment
        self._max_help_position = min(max_help_position,
                                      max(width - 20, indent_increment * 2))
        self._width = width

        self._current_indent = 0
        self._level = 0
        self._action_max_length = 0

        self._root_section = self._Section(self, None)
        self._current_section = self._root_section

        self._whitespace_matcher = _re.compile(r'\s+', _re.ASCII)
        self._long_break_matcher = _re.compile(r'\n\n\n+')

into this,

        self.(_prog, _indent_increment, _width) = (
            prog, indent_increment, width
        )
        self._max_help_position = min(max_help_position,
                                      max(width - 20, indent_increment * 2))
        self.(_current_indent, _level, _action_max_length) = (0, 0, 0)

        self._root_section = self._Section(self, None)
        self._current_section = self._root_section

        self._whitespace_matcher = _re.compile(r'\s+', _re.ASCII)
        self._long_break_matcher = _re.compile(r'\n\n\n+')

chepner · February 19, 2024, 4:29pm

Beomsoo Kim:

Multiple attribute access

Similarly, accessing multiple attributes of an object can be streamlined into a single expression, which evaluates to a tuple of the accessed values:
# continuing from the previous snippet
print(my.(foo, bar, baz))  # Outputs: (1, 5, 3)

a, b = "Apple".(lower(), upper())
print(a, b)  # Outputs: apple APPLE

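This is already supported by operator.attrgetter, though without including method calls. With a little work, you can use methodcaller.

from operator import attrgetter, methodcaller

# attrgetter with several names returns a tuple of the attribute values
print(attrgetter("foo", "bar", "baz")(my))

# the tuple of callers must be parenthesized inside a generator expression
a, b = (f("Apple") for f in (methodcaller("lower"), methodcaller("upper")))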

Not terribly readable, but I don’t find the proposed syntax an improvement over

a = "Apple".lower()
b = "Apple".upper()

in the first place. Not everything needs to be refactored into the least repetitive form possible.

I find

root.(left.(left, right), right) = (Node(1), Node(3)), Node(5)

to be far less readable than what (I assume) it replaces. I’d rather not flatten trees to lists in my head.

root.left.left = Node(1)
root.left.right = Node(3)
root.right = Node(5)

Stefan2 · February 19, 2024, 4:36pm

I find your rewrite much harder to read. The original has all the assignment targets neatly in a vertical line. With yours, I have to also search horizontally, the lines are longer, and it just looks like a mess.

gcewing · February 19, 2024, 11:06pm

Beomsoo Kim:

In the most usual cases, the interpretation should be straightforward.

my.(bar, baz) = (5, 6)
# Is equivalent to
my.bar, my.baz = 5, 6

If this expansion applies to the RHS as well, it implies that my.(bar, baz) is equivalent to the tuple constructor (my.bar, my.baz), which is inconsistent with your function call example:

print(my.(foo, bar, baz))

This would need to be written as

departure3560 · February 20, 2024, 6:45am

I have mixed feelings about whether this is a good enough proposal, which means it might be a good idea. I’ll try to defend the author.

I disagree. In this case, the new syntax makes it much clearer to the reader that everything being manipulated is a member of foo.
Although one might say that the syntax below achieves the same purpose, albeit with more lines:

# pretty clear that they're all members of foo
foo.a = 1
foo.b = 2
foo.c = 3

Chris Angelico:

bombs-kim:

class YourClass:
    ...

YourClass().(foo, bar) = 1, 2

# Should the above be equivalent to this? (Option 1)
YourClass().foo = 1
YourClass().bar = 2

No, it should definitely be the other option: single evaluation of the object.

Yeah, I prefer single evaluation, since it’s written like so. I feel such a throwaway class doesn’t make much sense, but what if it’s a getter (e.g. @property)?

class Foo:
    # ...
    @property
    def bar(self):
        # ... (returns some object with attributes a and b)
        ...

foo = Foo()

def new_syntax():
    foo.bar.(a, b) = 1, 2

# same as above
def old_syntax():
    # evaluate only once
    bar = foo.bar
    bar.a = 1
    bar.b = 2

I find it much easier to read.

randolf-scholz · February 21, 2024, 2:41pm

@bombs-kim @pf_moore One very neat use-case for this would be in combination with NamedTuple return types.

Generally speaking, functions that return a NamedTuple seem preferable to those returning a regular tuple, since the field names are a form of self-documentation. On the other hand, manually writing a NamedTuple class for every function that returns a tuple is kind of annoying.

In “An idea to allow implicit return of NamedTuples”, it was suggested to introduce some automation for returning a NamedTuple. I showed here how it could be done with a decorator.

If NamedTuple return types were more prevalent, this feature would be much more useful for concise unpacking by name. Chris Markiewicz remarked that currently it can be done via

from operator import attrgetter
U, S, Vh = attrgetter('U', 'S', 'Vh')(mySVD(A))

Henkhogan · February 23, 2024, 7:58pm

I’d like to see such a feature, but the boilerplate code for __init__ can already be avoided by using dataclasses.

remram44 · February 25, 2024, 3:20pm

The obvious use-case here is when you have many such assignments to do, but given enough of them you would want to go multiline again. And it’s not pretty:

# Old syntax
self.foobar = 1
self.quux = quux * 3
self.xyzzy = get_current_xyzzy()

# New syntax: names and values are far apart
self.(
    foobar,
    quux,
    xyzzy,
) = (
    1,
    quux * 3,
    get_current_xyzzy(),
)

I think if you really want to shorten the required typing, something like a with-block makes more sense. But that can be a function:

# Short form (from this discussion's title)
assign(obj, a=1, b=2, c=3)

# Long form
assign(
    foo.bar,
    quux=quux * 3,
    xyzzy=get_current_xyzzy(),
)

Maybe something to add to operator, though it can easily be rewritten.
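
A minimal sketch of such a helper, assuming the assign(obj, **attrs) shape used above. Nothing like this exists in operator today, and the Config class is only a placeholder for the demonstration.

```python
def assign(obj, /, **attrs):
    """Set each keyword argument as an attribute on obj, then return obj."""
    for name, value in attrs.items():
        setattr(obj, name, value)
    return obj


class Config:
    """Placeholder class used only for the demonstration."""


cfg = assign(Config(), a=1, b=2, c=3)
print(cfg.a, cfg.b, cfg.c)  # 1 2 3
```

Making obj positional-only means an attribute that happens to be called obj can still be passed as a keyword. Returning obj is a design choice that allows the call to be used inline.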

chepner · February 25, 2024, 4:50pm

Ironically, prior art for this would be the Pascal with statement (because of that association, it took me a non-trivial amount of time to understand Python’s with statement when it first came out).

{ Foo.A := 3; Foo.B := 5 }
with Foo do
    begin
    A := 3;
    B := 5;
    end;

cben · February 28, 2024, 11:55am

See also prior discussion of same idea (at least on left side for writing attributes).

This proposal adds the right side for reading attributes, which is good for symmetry.

But the example with function calls needs the scoping rules, and what other content is allowed, to be fleshed out. Most function calls require some arguments, so can you use any expression inside the call? Can you use operators at the top level of the tuple?

delimiter = 'n'
"Banana".(split(delimiter), replace(delimiter, '_'), removesuffix('nana') + removeprefix('B'))

Here, delimiter is meant to refer to a variable from the surrounding scope rather than an attribute of “Banana”, but there is no static way to know that. Would Python have to check dynamically whether it exists as an attribute and, if not, fall back to the variable? IIRC that’s what the JS with statement does, and it got deprecated over time. That can hide bugs, as the user likely had only one interpretation in mind: either a variable or an attribute, but not both.
But if you don’t allow variable references, your ability to call functions and build complex expressions is limited. That is arguably good for readability, but having to draw the line somewhere means users will need to learn that line by trying things and failing.

IMHO the whole scoping complication makes this too complex to be worth it.
A syntax covering only obj.(attr1, attr2) is reasonably simple, but is it adding enough value alone to be worth it?

cben · February 28, 2024, 11:58am

Also, since regular tuples and lists can be nested, including as assignment targets, it’s natural to expect it’ll be OK to nest the new syntax e.g. obj.(x, (y, z))?
Is then obj.(x, [y, {z: w}]) legal? How about obj.(x[y])? The general case does tend to sneak in…