New syntax `Trailing Block` for constructing objects with complex structure

guido · December 6, 2023, 10:46pm

Maybe I missed it, but what’s wrong with writing e.g.

Node("root", [
    Node("child1", [
        Node("grandchild1"),
        Node("grandchild2"),
    ]),
    Node("child2", [Node("gc1"), Node("gc2")])  # Compact form
])

Is it the extra brackets needed? Or the order of evaluation (where child nodes are created before their parents)?
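For readers following along, the evaluation-order point can be checked with a minimal sketch. The `Node` class and the `created` list here are hypothetical stand-ins, not anything from the proposal; arguments are evaluated before the call, so the innermost nodes are constructed first.

```python
created = []  # records node names in the order their constructors run


class Node:
    # Hypothetical stand-in for the Node class used in the example above.
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        created.append(name)


root = Node("root", [
    Node("child1", [Node("grandchild1"), Node("grandchild2")]),
    Node("child2"),
])

# Children are constructed before their parents; the root comes last.
print(created)
# ['grandchild1', 'grandchild2', 'child1', 'child2', 'root']
```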

Rosuav · December 6, 2023, 11:06pm

Guido van Rossum:

Maybe I missed it, but what’s wrong with writing e.g.

Node("root", [
    Node("child1", [
        Node("grandchild1"),
        Node("grandchild2"),
    ]),
    Node("child2", [Node("gc1"), Node("gc2")])  # Compact form
])

I’ve built a few UIs in this sort of way, and it DOES work, but it’s highly restrictive. In general, it leads to a coding style in which everything is required to be an expression rather than a statement, and thus awkward ternary conditionals, map-based iteration rather than for loops, and so on, become the standard way to do things. When you have a single gigantic expression representing your whole window, it’s way too far out-of-line to break something out all the way to the very top.

(And for the record, this is a problem even in languages that allow multi-statement inline functions, which allow for a measure of flexibility even inside a single expression. In Python, that’s even more restrictive, so the problem would be exacerbated.)

guido · December 7, 2023, 1:05am

But I’d like an answer specifically from the OP.

AlfredDU · December 7, 2023, 2:29am

It is a dream for me to discuss python programming with THE CREATOR. And now it comes true!!!

As Chris said, I have also written a lot of code this way when developing complex structures and UIs. Sometimes it is not only the child nodes but also the attributes that have nested structure:

class Node:
    # None defaults: a dict() default would be shared across all calls
    def __init__(self, name, children: list | None = None, attributes: dict | None = None):
        ...


Node("root", [  # list of children
    Node("child1"),
    Node("child2"),
], {  # dict of attributes
    "title": TitleNode("title1", [
        ParagraphNode("p1"),
        ParagraphNode("p2"),
    ]),
    "config": ConfigNode("configure1")
})

After writing a lot of code this way, I have the strong feeling that I am not writing Python code but JSON expressions, and that the expressiveness for complex structure is borrowed from JSON, not from the Python language itself. For the above example, it is nearly the same as writing

NodeStructure({
    "type": "node",
    "name": "root",
    "children": [
        {"type": "node", "name": "child1"},
        {"type": "node", "name": "child2"},
    ],
    "attributes": {
        "title": {
            "type": "title_node",
            "name": "title1",
            "children": [...]
        }
    }
})

And thus the disadvantages of a JSON data structure come along. Just as Chris said, it is one large expression rather than a sequence of statements, so

you cannot write a conditional block; you have to write ternary expressions;
you cannot write a loop; you have to use map();
you cannot define complex functions; you have to write lambdas, with limited expressiveness;
you cannot introduce intermediate variables and statements.
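To make the contrast concrete, here is a small sketch of the same tree built both ways. The `Node` class and both builder functions are hypothetical and only illustrate the style difference described in the list above.

```python
class Node:
    # Hypothetical minimal node type, used only for this comparison.
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def build_expression(items, show_footer):
    # Single expression: a generator replaces the loop and a conditional
    # expression replaces the if statement.
    return Node("root", [
        *(Node(f"item-{i}") for i in items),
        *([Node("footer")] if show_footer else []),
    ])


def build_statements(items, show_footer):
    # Ordinary statements: loops, conditionals and intermediate variables.
    root = Node("root")
    for i in items:
        root.children.append(Node(f"item-{i}"))
    if show_footer:
        root.children.append(Node("footer"))
    return root


a = build_expression(range(2), show_footer=True)
b = build_statements(range(2), show_footer=True)
print([c.name for c in a.children])
print([c.name for c in b.children])
```

Both calls produce the same children; the difference is only in how much of the language is available while describing them.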

Besides, indentation of such an expression is not enforced. You can write a well-indented expression (just like the pprint output of a JSON object), but you can also write a very long, hard-to-read single line. The syntax checker guarantees neither. Critics fault the Python language for its enforced indentation, which I think comes from a lack of understanding of the Zen and the unique elegance of this language. However, using a JSON-like constructor for complex objects may conflict with the language’s consistent style.

It is a huge honor for me to get your reply!!!

AlfredDU · December 7, 2023, 2:45am

Thank you very much, Chris! What you summarized above is worth days of my thinking.

AlfredDU · December 7, 2023, 3:39am

There are still too many extra keywords & operators.

Sometimes a combination of existing syntax can achieve the same goal, but new syntax can save a lot of work and bring better readability.

For example:

Use constants to define enumerations, or use enum (PEP 435)
Use many if and elif to define switch-case condition branches, or use match pattern (PEP 636)

encukou · December 7, 2023, 9:07am

Python is powerful enough for this:

with Node('root', parent=None) as root:
    Node.current.is_red = True

    Node('child')
    child = Node('named child')

    with Node('complex child'):
        @Node.event_handler
        def on_walk(self, distance_km):
            if distance_km > 10:
                print(f'{self.name!r} is tired!')

        Node('grand')
        Node('grand')

    Node('new child')

pprint(root)
assert root.is_red
root.children[2].on_walk(42)

output:

Node(name='root',
     children=[Node(name='child', children=[]),
               Node(name='named child', children=[]),
               Node(name='complex child',
                    children=[Node(name='grand', children=[]),
                              Node(name='grand', children=[])]),
               Node(name='new child', children=[])])
'complex child' is tired!

There’s some magic involved, but its effect is simple to explain:

Node.current depends on the enclosing with.
Node.__init__ takes a parent argument. If you omit it, Node.current is used.
Node.event_handler defines a method on Node.current.

IMO, the only thing that’s substantially less ergonomic than the OP example is that the attribute is defined with Node.current.attr = ... rather than simply attr = ....
Defining attributes by variable assignment wouldn’t be right. For example for i in range(n) sets i, but you probably don’t want it set to an i attribute. (And there’s a lot of other cases of variable assignment, not all of which are as clear-cut as for or attr=...)
This could be simplified to CURRENT.attr = ... (with a global CURRENT), at the cost of significantly more magic in the implementation. Not worth it, IMO. (See flask.g – popular, but full of sharp edge cases.)
(Of course in this particular case you could use root.attr = ..., since that node has a name.)

If I was making a library like this I’d like to be a bit more explicit and avoid magic, and make you always name nodes when you use the with statement – but that doesn’t work if you reuse names across levels:

with Node('root') as current:
    with current.add(Node('child')) as current:
        ...
    with current.add(Node('child')) as current:
        ...
    current.is_red = True  # oops! current now refers to the child!

So, I can’t find a way to avoid context – like wxWize from Andreas’ example. I used contextvars, and tried to ensure the magic remains contained to the Node class…
The other bit of magic is using a metaclass, which is needed to have Node.current rather than Node.get_current(), and also ensures instance namespace isn’t polluted unnecessarily (there’s no root.current or child.event_handler).
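As a side note on the metaclass point, a minimal sketch (with hypothetical names `Meta` and `Widget`) shows why a property defined on the metaclass is reachable as `Widget.current` but does not appear on instances:

```python
class Meta(type):
    @property
    def current(cls):
        # Looked up on the class object itself, because Meta is its type.
        return cls._current


class Widget(metaclass=Meta):
    _current = "placeholder"


w = Widget()
print(Widget.current)         # placeholder
print(hasattr(w, "current"))  # False: instance lookup never consults Meta
```

Instance attribute lookup walks the class and its bases, not the metaclass, which is what keeps the instance namespace clean.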

The magic:

from dataclasses import dataclass
from functools import partial
from pprint import pprint
import contextvars


_USE_CURRENT_ROOT = object()

class NodeMeta(type):
    @property
    def current(cls):
        try:
            return cls._root_context.get()
        except LookupError:
            raise LookupError(
                f"no current {cls.__name__!r}, use a with statement")

    def event_handler(cls, func=None, name=None, node=None):
        # this gimmick is not necessary for the main idea
        if func is None:
            return partial(cls.event_handler, name=name, node=node)
        if node is None:
            node = cls.current
        try:
            descr_get = func.__get__
        except AttributeError:
            pass
        else:
            func = func.__get__(node, type(node))
        return setattr(node, name or func.__name__, func)

@dataclass
class Node(metaclass=NodeMeta):
    name: str
    children: list

    _root_context = contextvars.ContextVar('_root_context')

    def __init__(self, name, *, parent=_USE_CURRENT_ROOT):
        self.name = name
        self.children = []
        if parent is _USE_CURRENT_ROOT:
            parent = type(self).current
        if parent is not None:
            parent.children.append(self)
        self._reset_tokens = []

    def __enter__(self):
        self._reset_tokens.append(self._root_context.set(self))
        return self
    
    def __exit__(self, *exc_info):
        self._root_context.reset(self._reset_tokens.pop())

AlfredDU · December 7, 2023, 9:48am

Thank you for the detailed example.

I don’t think this is the main difference.

I think the most significant difference between the proposal and the with-statement approach is this: since the __enter__ and __exit__ dunder methods cannot capture local variables defined inside the with block, to use those variables you have to put them somewhere else (e.g. class members), which may bring side effects (like threading issues). I’ve mentioned this in earlier posts.

This proposal offers an explicit dunder method to capture local variables. I think this proposal may conform better to the Zen of Python: explicit is better than implicit.

This design (variables within the block are assigned to same-named attributes) is mainly for function attributes. For assigning values of basic types it indeed does not look much different, but it is how the declarative programming pattern works.

encukou · December 7, 2023, 10:41am

Is there a case where you can’t put the extra information on the current Node instance? That shouldn’t have threading issues.
To get on the same page, could you post an example, using the syntax you’d prefer, of a case where my demo wouldn’t work for you?

ronaldoussoren · December 7, 2023, 10:55am

AFAIK Petr’s example doesn’t have a threading problem because it uses contextvars.
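A small sketch of that claim, using only the standard library: a value set on a `ContextVar` in one thread is not overwritten by another thread setting the same variable. The variable name `current` and the `worker` function are made up for the demonstration.

```python
import contextvars
import threading

current = contextvars.ContextVar("current", default=None)
seen = {}
barrier = threading.Barrier(2)


def worker(name):
    current.set(name)
    barrier.wait()  # both threads have set their value before either reads
    seen[name] = current.get()


current.set("main")
threads = [threading.Thread(target=worker, args=(n,)) for n in ("t1", "t2")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Each thread read back its own value, and the main thread's is untouched.
print(seen, current.get())
```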

The status quo is IMHO better w.r.t. “explicit is better than implicit” because your proposal introduces new implicit behaviour where there currently is none.

The big question is still what this proposed feature would bring. It is far from clear to me that the feature you propose would give us concrete improvements in writing clean and correct code. The barrier for adding new syntax to Python is pretty high (rightfully so).

encukou · December 7, 2023, 11:05am

Well there is implicit behaviour (which new syntax could solve): the contextvar leaks to all called functions. It’s not constrained lexically to the with block. IMO, it’d be better for the framework to require that:

tree-building functions explicitly retrieve the current node (ideally as a default, like Node.__init__), and
users pass the current node to other functions explicitly.
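The leak can be shown with a short sketch. `Scope` and both helper functions are hypothetical; the point is only that a function defined outside the with block still sees the active context, while the explicit variant makes the dependency visible at the call site.

```python
import contextvars

_current = contextvars.ContextVar("_current")


class Scope:
    # Hypothetical context manager that publishes itself via a ContextVar.
    def __init__(self, name):
        self.name = name
        self.children = []

    def __enter__(self):
        self._token = _current.set(self)
        return self

    def __exit__(self, *exc_info):
        _current.reset(self._token)


def implicit_helper():
    # Not lexically inside any with block, yet it finds the active scope.
    _current.get().children.append("added implicitly")


def explicit_helper(scope):
    # The caller decides which scope is modified.
    scope.children.append("added explicitly")


with Scope("root") as root:
    implicit_helper()
    explicit_helper(root)

print(root.children)
# ['added implicitly', 'added explicitly']
```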

But, the need to define the root node with an explicit parent=None limits the damage: it’s not easy to attach new nodes to unrelated trees.

Rosuav · December 7, 2023, 11:25am

That is exactly what a class block does, though. If you think about this as a with block, it’s obviously not right; if you think about it as a namespace, it’s equally obviously right.
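A quick illustration of the namespace behaviour of a class block (the `Panel` class is a made-up example): every name bound in the body becomes a class attribute, including a loop variable.

```python
class Panel:
    title = "Settings"

    for i in range(3):  # the loop variable is bound in the class namespace
        pass

    def on_click(self):
        return f"{self.title} clicked"


print(Panel.title)         # Settings
print(Panel.i)             # 2
print(Panel().on_click())  # Settings clicked
```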

AlfredDU · December 7, 2023, 12:36pm

I am sorry that I was caught up in thinking about the class-member implementation. I have not used the contextvars module much; I thought it was a built-in module for async programming.

I do not deny the feasibility of using contextvars to construct nested, complex objects. My concern is that the implementation may be too heavy. As I posted above

As far as I know, a contextvar is still not a purely local scope. Professional developers can handle it skillfully, with the thread and context model clear in mind, but that is a high bar for UI developers or others who just want to describe nested data.

AlfredDU · December 7, 2023, 1:17pm

I know the high standard for a PEP to be accepted, and I feel really honored that people are discussing my draft in this thread.

I also know it is not easy for all of us to accept this kind of radical proposal. What concerns me most, and perhaps the original motivation of this proposal, is that the combination of programming patterns, declarative and imperative, is sure to happen in Python programming. Whether this proposal is accepted or rejected, the trend is under way. The discussion here is itself meaningful for me and for the future I foresee. Maybe, after dozens of similar proposals and PEPs, a perfect syntax for declarative Python programming will be found.

I was just trying to be cute… In the post I was saying that the proposal is less implicit than the approaches with with statement. I acknowledge that it is more implicit than the approaches of the status quo (imperative programming).

AlfredDU · December 7, 2023, 1:29pm

I cannot agree more, Chris!!!

Module attributes, class attributes (instead of instance attributes) are both defined by variable assignment.

TomRitchford · December 7, 2023, 1:39pm

__trailing__ seems incomplete because it could only do one of pre-order (where children are initialized after their parents) or post-order (children are initialized before their parents).

In GUIs, sometimes you need post-, sometimes you need pre-, sometimes you need both. I don’t see a way to effectively handle that.

Also, the local variables implicitly appearing in __trailing__ is against “Explicit is better than implicit.”

[You can probably stop reading here.]

I got inspiration from popular, many-stars python UI framework kivy

I’ve been writing computer programs for almost fifty years, and I’ve never run into a worse designed system than Kivy, so bringing that up is a counterargument.

As an example, simply constructing a color has global side-effects in Kivy. It took me far too long to realize this because I simply couldn’t conceive of anyone doing that until I was sitting on that line in the debugger.

The Kivy “language” is the worst part. There’s no hint of a grammar. Instead of using an existing format like JSON, YAML or TOML, they made up yet another one. There are three different places in the language that Python language fragments can appear, each with slightly different rules. Worse, all symbols introduced in Kivy language documents are in one global namespace.

I could go on and on but I’ll stop before I froth.

I feel in the Kivy language their first idea for each feature was always, “Let’s make it a new syntax!” I disagree strongly with that philosophy.

AlfredDU · December 7, 2023, 1:49pm

It is not the design of Kivy itself that should be learned. It is the declarative way of programming that can be learned and has already proved successful (e.g. JSX and SwiftUI). It is a good thing that Kivy tries to bring this pattern to the Python world.

TomRitchford · December 7, 2023, 1:55pm

Declarative programming is great if done right - SQLAlchemy (dating five years before Kivy) is a very good example.

But SQLAlchemy did it in the Python language, so we get to keep our toolchains, linters, and type checkers.

Kivy does it in .kv documents with little chunks of Python which cannot be linted or type checked.

AlfredDU · December 7, 2023, 2:00pm

I use SQLAlchemy very often. It is a great library. I also got some inspiration from the way it composes SQL with native Python.

As for Kivy, as far as I know, building a UI framework is never an easy thing.

tim-mitchell · December 7, 2023, 9:09pm

Have you thought of the method chaining approach? (I’m not sure what the right name is for it.)

class Node:
    def __init__(self, *args, **kwargs):
        self.child_nodes = []

    def add_child(self, node):
        self.child_nodes.append(node)
        return self  # <-- this allows chaining


root_node = (
    Node()
        .add_child(Node()
                   .add_child(Node())
                   .add_child(Node()))
)