Simplistic block scope - a syntactic sugar

Python lacks the block scope that exists in C-like languages and is marked there by curly braces {}.

A common workaround is code like the following:

def _():
    # ...
_()

In this case, the contents of function _ live in a separate scope.
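
For reference, a minimal runnable sketch of this workaround. The names compute, results and tmp are hypothetical and only serve the illustration:

```python
def compute():
    results = []  # Common state (hypothetical name).

    def _():
        tmp = [1, 2, 3]           # Local to _; invisible to compute.
        results.append(sum(tmp))  # Mutating (not rebinding) needs no nonlocal.
    _()

    # Report whether the temporary leaked into compute's own scope.
    return results, "tmp" in locals()


print(compute())  # ([6], False)
```

The temporary exists only for the duration of the call to _, while the outer list is still reachable from inside it.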

A simplistic syntactic sugar for that could look like the following:

block:
    # ...

It would be approximately equivalent to the “function as block” workaround.
It would have the same known caveats of the “first assignment is declaration” rule (such as the need for the nonlocal statement).

Surprisingly, I cannot find any discussion about such simplistic syntactic sugar, so I decided to post it here.
If the general idea gains some traction, I (or someone else) may be motivated to write more examples here, discuss extended syntax (to extend it to situations where the “function as block” workaround isn’t applicable), write a test implementation, submit a PEP (that would include the rationale and a review of use cases), etc.

I remember watching a presentation about blocks in Python and the speaker mentioned that the with statement spawned from those discussions. Maybe you could go to the PEPs that proposed similar suggestions for previous discussions?

The last one has one particularly clear message at the end:


I’ve seen that particular discussion (apart from several others) before posting.
The changes to syntax discussed there (scoping rules for loops) are different from what I described here.

PEP 359 and “Ad-hoc scoping to organize code” are also about a different idea (the “namespaces”).

The designs may be slightly different, but the reasons for proposing them (btw, you haven’t mentioned why you want block scope) and the reasons for rejecting them will likely be the same.

It can be very instructive to read past discussions and decisions before suggesting ideas that have been considered before.


In your original message you said:

If you already found these obviously closely related proposals, please

  • link to them
  • point out how your proposal is different
  • argue against the points made in those threads.

Otherwise, all you are doing is wasting other people’s time by retreading the exact same discussions that were had before.


A typical case where one needs block scopes inside a function is when there are several blocks of code that each use variables specific to them, and it’s inconvenient that these variables can “leak”. This often leads to bugs, especially when existing code is updated.

Of course, a large function can be refactored in several ways so as to move such blocks of code into separate function scopes. But a programming language is a practical tool, so syntactic sugar may be preferable, because a developer’s time is limited (and for numerous other reasons).

Some examples of code.

A function that has several blocks of code with block-specific variables:

def f_01_a():
    # Common state.
    a = ...
    b = ...
    ...

    # Block of code #1 with block-specific variables.
    m = ...
    n = ...
    tmp = ...
    # ... Doing something with these variables, then saving results to common state ...

    # Block of code #2 with block-specific variables.
    n = ...
    s = ...
    for i in ...:
        # Iteration-specific variables.
        r = ...
        z = ...
        tmp1 = ...
        tmp2 = ...
    # ... Doing something with these variables, then saving results to common state ...

    # Block of code #3 with block-specific variables.
    m = ...
    n = ...
    while ...:
        # Iteration-specific variables.
        i = ...
        k = ...
        tmp = ...
    # ... Doing something with these variables, then saving results to common state ...

It can be seen that these variables leak outside the block of code that uses them. This can lead to some hard-to-spot bugs, when e.g. a wrong variable is used in another block of code.
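
To make the risk concrete, here is a small hypothetical sketch (the function and variable names are invented for illustration) in which a temporary from one block is silently reused by a later one:

```python
def stale_variable_demo(values):
    # Block 1: `tmp` holds the largest value.
    tmp = max(values)
    largest = tmp

    # Block 2: meant to use its own temporary, but the loop body never runs
    # when there are no negative values, so `tmp` still holds block 1's value.
    for v in [x for x in values if x < 0]:
        tmp = v
    last_negative = tmp  # Bug: silently reuses block 1's `tmp`.

    return largest, last_negative


print(stale_variable_demo([3, 7, 5]))  # (7, 7)
```

No error is raised; the second result is simply wrong, which is what makes this kind of leak hard to spot.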

A typical transformation to make these blocks scoped is the following:

def f_01_b():
    # Common state.
    a = ...
    b = ...
    ...

    def _():
        # Block of code #1 with block-specific variable.
        m = ...
        n = ...
        tmp = ...
        # ... Doing something with these variables, then saving results to common state ...
    _()

    def _():
        # Block of code #2 with block-specific variables.
        n = ...
        s = ...
        for i in ...:
            # Iteration-specific variables.
            r = ...
            z = ...
            tmp1 = ...
            tmp2 = ...
            # ... Doing something with these variables, then saving results to common state ...
    _()

    def _():
        # Block of code #3 with block-specific variables.
        m = ...
        n = ...
        while ...:
            # Iteration-specific variables.
            i = ...
            k = ...
            tmp = ...
            # ... Doing something with these variables, then saving results to common state ...
    _()

When one also needs to isolate variables between iterations, it can be written like this:

def f_01_c():
    ...
    def _():
        # Block of code #2 with block-specific variables.
        n = ...
        s = ...
        for i in ...:
            def _():
                # Iteration-specific variables.
                r = ...
                z = ...
                tmp1 = ...
                tmp2 = ...
                # ... Doing something with these variables, then saving results to common state ...
            _()
    _()
    ...

With the “simplistic block scope” syntactic sugar it would look like this:

def f_01_d():
    # Common state.
    a = ...
    b = ...
    ...

    block:
        # Block of code #1 with block-specific variable.
        m = ...
        n = ...
        tmp = ...
        # ... Doing something with these variables, then saving results to common state ...

    block:
        # Block of code #2 with block-specific variables.
        n = ...
        s = ...
        for i in ...:
            # Iteration-specific variables.
            r = ...
            z = ...
            tmp1 = ...
            tmp2 = ...
            # ... Doing something with these variables, then saving results to common state ...

    block:
        # Block of code #3 with block-specific variables.
        m = ...
        n = ...
        while ...:
            # Iteration-specific variables.
            i = ...
            k = ...
            tmp = ...
            # ... Doing something with these variables, then saving results to common state ...

As an example of extending the syntax to a situation where def _():_() is not directly applicable, the syntax could look like the following, where block: is added after the loop statement (in which case the variables are isolated between iterations):

def f_01_e():
    ...
    block:
        # Block of code #2 with block-specific variables.
        n = ...
        s = ...
        for i in ...: block:
            # Iteration-specific variables.
            r = ...
            z = ...
            tmp1 = ...
            tmp2 = ...
            # ... Doing something with these variables, then saving results to common state ...
    ...

(Not sure whether such syntax is possible with current lexer/parser.)

Why a syntactic sugar like block: may be preferable to def _():_()?

  1. Convenience. Apart from requiring less code (and making the intent clearer), it would also allow better behavior in a debugger (stepping, local variables, etc.).
  2. The syntax may be extended to situations where def _():_() is not applicable.
  3. Performance issues with def _():_() inside loops.
  4. The syntax may be given different scope semantics than function-in-function, including with respect to the requirement to use the nonlocal statement.
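
Regarding point 3, a rough way to observe the overhead is sketched below. This is only a measurement sketch with hypothetical function names; the actual numbers depend on the interpreter and machine, so no timings are claimed here. On CPython, the nested variant pays for creating a function object and making a call on every iteration.

```python
import timeit


def inline_loop(n):
    total = 0
    for i in range(n):
        sq = i * i  # Leaks into the function scope, but costs nothing extra.
        total += sq
    return total


def nested_def_loop(n):
    total = 0
    for i in range(n):
        def _():  # A new function object is created on every iteration.
            nonlocal total
            sq = i * i  # Local to this call of _.
            total += sq
        _()
    return total


# Both variants compute the same result; only the cost differs.
print(inline_loop(1000) == nested_def_loop(1000))
print("inline:", timeit.timeit(lambda: inline_loop(1000), number=200))
print("nested:", timeit.timeit(lambda: nested_def_loop(1000), number=200))
```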

This is a valid concern, thank you for pointing it out.

Among the links you provided, there are the following ideas:

  1. Should loops be in their own scope? [poll]
    It is different from what I posted here. It is specific to loops. It also has its own topics of discussion, e.g. making it backwards compatible.
  2. Lack of block scope
    This one discusses different syntax. It proposes making blocks inside constructions like if/else, while/for scoped. That specific discussion was destined to end up fruitless, because the syntax was confusing and not backwards compatible.
  3. Ad-hoc scoping to organize code (see also PEP 359)
    This one is different from what I posted. Namespaces are separate from block scoping and a large topic of their own. The rationale is also different.

How can you save results to common state variables without shadowing them?

If block: keeps the same scope semantics as def _():_(), then a nonlocal statement would be required for assigning to variables of the outer scope.
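
With today’s def _():_() workaround, a result can reach the common state in two ways, sketched below with hypothetical names: rebinding through nonlocal, or returning the value from the block function.

```python
def save_with_nonlocal():
    result = 0  # Common state.

    def _():
        nonlocal result   # Rebinds the outer variable instead of shadowing it.
        tmp = 20 + 1      # Block-specific; does not leak.
        result = tmp * 2
    _()
    return result


def save_with_return():
    def _():
        tmp = 20 + 1
        return tmp * 2
    result = _()          # The block hands its result back explicitly.
    return result


print(save_with_nonlocal(), save_with_return())  # 42 42
```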

But this is a case where different semantics could be implemented (without breaking backward compatibility). A possible rationale for having different scope semantics is to reduce confusion (as the current function-in-function scope semantics are well known for confusing newcomers).

To make the new scope semantics less confusing, I can think of the following ideas:

  1. (A less radical idea.) Emit an error on shadowing. This would possibly require adding a keyword local, the opposite of nonlocal, to permit shadowing explicitly.
  2. (A more radical idea.) A JavaScript-like evolution of scoping rules: new scoping rules for variables declared with a special keyword (in JavaScript it is let). This can be done in a backwards-compatible way.

More thoughts on the explicit variable declaration.

In both these ideas (1) and (2) we come to consider explicit variable declaration (local or let). (It should be noted that such evolution eventually happened in several programming languages that had the Python-like rule “first assignment is declaration”.) Explicit variable declaration may not be desirable in Python for ideological reasons. In the matter of backwards compatibility I think there should be no problems. Other possible concerns can be discussed.

It’s not going to happen. I also discussed an idea similar to this a while ago.

I hope perhaps they’ll implement macros in a manner that would allow you to create blocks like this, but even that is doubtful.

In the meantime, here is a solution that works pretty well:

a = 1
b = 2
class _:
    print(a, b) # 1 2
    c = 3
    b = 4
    print(a, b, c) # 1 4 3
    d = 2*c
    b = 2*d
    print(a, b, c, d) # 1 12 3 6

print(a, b) # 1 2
c = 0
d = 0

class _:
    print(a, b, c, d) # 1 2 0 0
    c = 5
c = _.c
print(a, b, c, d) # 1 2 5 0

Actually, it works much better for your desired block semantics than it did for mine.

If Python does move towards explicit declaration, I think it should be towards let and let mut. JavaScript made the mistake of making the default declaration mutable (rather than merely shadowable), and it’s really hard to come back from that.

But I don’t see that happening before Python4. And even then I’m not sure it’d be a good idea.
(Side note: I think it’d be cool if in Python4 we made the distinction between development/doodle code (in which the current rules apply) and a library/stable code (which would be strongly typed, and where declaring variables might make sense).)

Is it though? The only actual difference that I can see is that namespaces are named blocks, so you have to give each block a name. Right now you’re discussing how controlled leakage could work, and with named blocks it’s straightforward: it avoids messy special variable declarations and leverages existing and popular Python syntax instead:

def f_01_d():

    block a:
        # Block of code "a"... one could actually omit these kinds of comments,
        # since the block's name can just describe its purpose
        m = ...
        n = ...
        tmp = ...

    block b:
        n = a.n  # for example, to get access to specific variables defined within block "a"
        s = ...
        for i in ...:
            # Iteration-specific variables.
            r = ...
            z = ...
            tmp1 = ...
            tmp2 = ...

    block c:
        m = ...
        n = b.n
        while ...:
            # Iteration-specific variables.
            i = ...
            k = ...
            tmp = ...
    return c.m

Why do you think let + let mut is better than const + let?
const is a more traditional keyword for that meaning, which would prevent confusion.


The “named blocks” syntax you describe extends the lifetime of the variables to the lifetime of the reference to the block. This may be undesirable (because the variables may be of no use outside the block, yet their memory is not released).
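
The lifetime difference can be observed today with the two existing workarounds, using a class as the stand-in for a named block. This is a sketch under the assumption of CPython’s reference counting; Payload and the function names are hypothetical.

```python
import gc
import weakref


class Payload:
    """Hypothetical stand-in for a large temporary object."""


def with_function_block():
    def _():
        tmp = Payload()
        return weakref.ref(tmp)
    ref = _()
    gc.collect()
    # True if the temporary was released once the block finished.
    return ref() is None


def with_class_block():
    class a:
        tmp = Payload()
    ref = weakref.ref(a.tmp)
    gc.collect()
    # The class namespace still references the temporary, so it stays alive.
    return ref() is None


print(with_function_block(), with_class_block())  # True False
```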

If we use syntax like

def f():
    # Unnamed block (as posted originally).
    block:
        ...
    # Named block (as you suggested).
    block a:
        ...
    # Unnamed block, workaround using currently available syntax.
    def _():
        ...
    _()
    # Named block, workaround using currently available syntax.
    class a:
        ...

there is still the issue of referring to parent scope and shadowing:

def f():
    a = 0
    def _():
        b = a  # OK
    _()
    def _():
        a = 1  # Shadowing
    _()
    def _():
        a += 1  # Error: referenced before assignment.
    _()
    def _():
        nonlocal a
        a += 1  # OK
    _()
    def _():
        b = a  # Error: referenced before assignment.
        a = 1  # Shadowing
    _()
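
The cases above can be checked with a runnable variant that records what happens instead of stopping at the first error. The outcomes list and its labels are additions made only for this illustration:

```python
def f():
    a = 0
    outcomes = []

    def _():
        b = a  # Reads the enclosing `a`.
        outcomes.append(("read", b))
    _()

    def _():
        a = 1  # Creates a new local `a`; the outer one is untouched.
    _()
    outcomes.append(("after shadowing", a))

    def _():
        a += 1  # `a` is local here because it is assigned, so the read fails.
    try:
        _()
    except UnboundLocalError as exc:
        outcomes.append(("augmented", type(exc).__name__))

    def _():
        nonlocal a
        a += 1  # Rebinds the outer `a`.
    _()
    outcomes.append(("after nonlocal", a))

    def _():
        b = a  # Fails: the later assignment makes `a` local for the whole body.
        a = 1
    try:
        _()
    except UnboundLocalError as exc:
        outcomes.append(("read before assign", type(exc).__name__))

    return outcomes


for outcome in f():
    print(outcome)
```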

The traditional way to solve the “referring to variable of parent scope” problem (that is, to have easily understood, non-confusing syntax) is to have explicit variable declarations.

However, the idea of “namespaces” you provided may lead us toward something like the following:

def f():
    a = 0
    def _():
        __parent_scope__.a += 1 
    _()
    # a == 1

That is, a syntax to directly access the parent scope.
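
Something close to this can be approximated in current Python by keeping the shared state on an explicit namespace object. The sketch below uses types.SimpleNamespace; the name scope is a hypothetical stand-in for the proposed parent-scope accessor, not an existing feature.

```python
from types import SimpleNamespace


def f():
    scope = SimpleNamespace(a=0)  # Hypothetical stand-in for the parent scope.

    def _():
        scope.a += 1        # Attribute assignment; no nonlocal needed.
        tmp = scope.a * 10  # Still local to _.
    _()

    return scope.a


print(f())  # 1
```

The cost is that every shared variable must be written with the namespace prefix, which is exactly the explicitness the proposed syntax would build in.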

Because of my experience with JavaScript and Rust. People are lazy. I am too. If the mutable variant is at least as easy to type as the immutable one, people do default to using the mutable declaration. But it is super relaxed to program in a project where you know the majority of variables won’t be changed after assignment, because you can see only a minority of declarations use let mut.
So IMO you should add a bit of friction to declaring mutable variables, because it makes reading code significantly nicer.

Maybe const + let could work if people use a linter that automatically converts let to const. Maybe some people do. But I only really saw const used for global constants, where python uses ALL_CAPS.

Then it’s just an IDE setting. “l” key, “e” key, “enter”. People are even more lazy

I think you are referring to PEP 340 (Anonymous Block Statements), which eventually evolved into PEP 343 (The “with” Statement). Earlier proposals targeting the same issue were PEP 310 (Reliable Acquisition/Release Pairs) and PEP 319 (Python Synchronize/Asynchronize Block).

All these had the motivation to have a syntax for reliable release of resources. Something similar to C#'s using, Java’s try-with-resources. This was not primarily related to scope.
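
A short sketch showing that with manages resource release but does not introduce a scope. The resource context manager and the other names are hypothetical:

```python
from contextlib import contextmanager


@contextmanager
def resource(log):
    log.append("acquire")
    try:
        yield "handle"
    finally:
        log.append("release")  # Runs when the with block exits.


def demo():
    log = []
    with resource(log) as handle:
        inside = handle.upper()
    # Both names are still bound after the with block has ended.
    return log, handle, inside


print(demo())  # (['acquire', 'release'], 'handle', 'HANDLE')
```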

The motivation of “Simplistic block scope - a syntactic sugar” is to prevent variables from leaking inside a function (so it differs from the motivation of with). Unfortunately, I couldn’t find any PEPs targeting similar scoping-related issues.