Conditional collection literals

This looks stupidly cryptic and I love it, but I don’t think I would have loved this when I first learned Python.

flag = True

[
    1,
    2,
    *(3,)*bool(None),
    *(4,)*flag,
    *[5]*(not flag),
    *(6,)*(not flag),
    *["yolo"]*(flag is not None),
] == [1, 2, 4, "yolo"]

Reminds me of how “concatenate” is spelled as sum(..., ())
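For anyone who hasn’t seen that spelling, here is a minimal sketch of what it refers to. The variable names are made up for illustration; the only thing relied on is the built-in sum() with an explicit start value.

```python
parts = [(1, 2), (), (3,), (4, 5)]

# sum() starts from the empty tuple and applies + repeatedly,
# so the tuples end up concatenated.
flat = sum(parts, ())
print(flat)  # (1, 2, 3, 4, 5)

# The same trick works for lists, with [] as the start value.
flat_list = sum([[1], [], [2, 3]], [])
print(flat_list)  # [1, 2, 3]
```

It builds a new tuple on every step, so it gets slow for long inputs; itertools.chain.from_iterable is the usual non-cryptic alternative.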

Or, better yet, avoid evaluating elem_2. The OP’s use case is a perfect fit for generators.

def gen_list():
    condition = False
    elem_2 = lambda: 2

    yield 1
    if condition: yield elem_2()
    yield 3


my_list = [*gen_list()]
print(my_list)

Yeah this is cute and all but is definitely too cryptic to be considered a Pythonic practice. The proposed syntax is infinitely more elegant, allows skipping evaluations, and can be used for dicts and keyword arguments.

Having to write a separate generator and a yield statement for every item feels rather clunky to me.

@elis.byberi’s generator approach is also how I usually do it. It’s not at all clunky. Making a generator, that’s just a single line of code, and then you have the full power of Python available to you: You can add for loops and nested if statements to the generator as you need it, without having to invent special case syntax for each.
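A rough sketch of what that flexibility can look like. The function name, its parameters, and the flags are all hypothetical, chosen only to show a loop and a nested condition living inside the generator:

```python
def build_command(output_folder=None, verbose=0, inputs=()):
    """Yield command-line arguments one at a time (hypothetical example)."""
    yield "aProgram"
    if output_folder is not None:
        # Two related items behind one condition.
        yield "--output-folder"
        yield output_folder
    for _ in range(verbose):
        # Repeat a flag a variable number of times.
        yield "-v"
    for path in inputs:
        if path.endswith(".txt"):
            # Nested condition inside a loop.
            yield path


command = list(build_command("out", verbose=2, inputs=["a.txt", "b.bin"]))
print(command)
# ['aProgram', '--output-folder', 'out', '-v', '-v', 'a.txt']
```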

A problem with the proposed syntax is that the condition only applies to a single list element. If you want to make two consecutive elements conditional, then what? Do you repeat the condition, like this?

subprocess.call([
    'aProgram',
    '--output-folder' if output_folder is not None,
    output_folder if output_folder is not None,
    other_arg,
])

With a generator, you don’t have that problem:

def arguments():
    yield 'aProgram'
    if output_folder is not None:
        yield from ['--output-folder', output_folder]
    yield other_arg
subprocess.call(list(arguments()))

By the way, not that I’m recommending it, but my list*bool hack just happens to work well for this particular use case:

subprocess.call([
    'aProgram',
    * ['--output-folder', output_folder] * (output_folder is not None),
    other_arg,
])

Very good point. I suppose we can add unpacking to the proposed syntax to accommodate such a use case:

subprocess.call([
    'aProgram',
    *['--output-folder', output_folder] if output_folder is not None,
    other_arg,
])

If more complex logic is needed, then use a generator by all means. It’s why we have comprehensions and generator expressions to cover some of the most common use cases of container construction and generator usage in a more expressive way.

Adding to this, conditionally omitting arguments would probably become more frequent once sentinels are included in the standard library. Let’s say I have authored a library function with an (intended to be) super secret sentinel that I do not want others to use, e.g. _NOT_GIVEN. How does a user provide an argument to this parameter conditionally?

# mylib.py
_NOT_GIVEN = object()  # replace with sentinellib when available

def foo(x, *, y = _NOT_GIVEN, z):
    ...

Current ways:

# app.py
# method 1
from mylib import foo, _NOT_GIVEN  # problematic
foo(x, y=y if cond else _NOT_GIVEN, z=z)

# method 2
# most obvious but verbose
if cond:
    foo(x, y=y, z=z)
else:
    foo(x, z=z)

# method 3
# this is one class of problem the proposed method solves
# but even this could be difficult with complex signatures, e.g.
# def foo(a, b=s1, /, c=s2, *, d=s3, e): ...
kwargs = {"z": z}
if cond:
    kwargs |= {"y": y}

foo(x, **kwargs)

Whereas, the most natural form would be:

foo(x, y=y if cond, z=z)
# or
foo(
    x,
    y=y if cond,
    z=z,
)

This generalizes to complex signatures equally well, and makes it more readable IMO.

# def foo(a, b=s1, /, c=s2, *, d=s3, e): ...
foo(
    a,
    b if cond1,
    c = c if cond2,
    d = d if cond3,
    e=e,
)

I’m more inclined to think that the point is that the proposed syntax doesn’t generalise well to relatively simple extensions of the intended use case. Extending the syntax doesn’t really address this concern - quite the opposite, it demonstrates that the proposal isn’t flexible enough to support similar use cases without modifying the proposal.

Existing approaches may be a little more annoying in very simple situations, but they scale much better to accommodate extra complexity. I’m much more in favour of language constructs that scale to a wide range of use cases than ones that are tailored specifically to support a particular niche well.

If I were to skim through this code, I’d just see five arguments being passed. It would be difficult to keep track of the truthiness of cond1, cond2, and cond3 to determine how many arguments are actually being passed to the function.

Overall, the code not only becomes harder to follow but also borders on unreadable. It would be even more difficult than using *args and **kwargs, where at least you can print the full arguments to understand what’s going on.

In this example you can clearly follow the call tree:

if cond:
    foo(x, y=y, z=z)
else:
    foo(x, z=z)

During debugging:

import logging


def foo(x, y=None, z=None):
    print(x, y, z)


cond = True
y = 2
if cond:
    logging.info("Cond is True!")
    assert y

    # debug
    print('cond is True, passing y')

    foo(1, y=y, z=3)
else:
    foo(1, z=3)

I concede the point about debugging.

However, you picked the most complex signature to point out the flaw, but the simpler one to promote unrolled if-else.

What in your opinion is the best way to write the last case? I don’t see any straightforward way to do it. With three independent conditions, there are 2**3 = 8 possible ways to call the function.
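As a rough check of how many distinct calls three independent conditions can produce, the sketch below enumerates every combination. Each condition is a yes/no choice, so the count works out to 2**3 = 8. The string defaults stand in for the sentinels s1, s2 and s3, and call_foo is a hypothetical helper, not part of any proposal.

```python
from itertools import product


def foo(a, b="s1", /, c="s2", *, d="s3", e):
    # Return the arguments as received so that call shapes can be compared.
    return (a, b, c, d, e)


def call_foo(cond1, cond2, cond3):
    # Build the call dynamically instead of writing one branch per combination.
    args = ["a"]
    if cond1:
        args.append("b")
    kwargs = {"e": "e"}
    if cond2:
        kwargs["c"] = "c"
    if cond3:
        kwargs["d"] = "d"
    return foo(*args, **kwargs)


# Every True/False assignment to (cond1, cond2, cond3).
results = {call_foo(*conds) for conds in product([False, True], repeat=3)}
print(len(results))  # 8
```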

In any case, I just included the last example to show how this syntax is terse and (IMO) readable. I don’t think that example was representative of standard use-cases. I think it is uncommon, and probably a bad API if the caller has to do all the condition checking.

For the other case:

foo(x, y=y if cond, z=z)

Do you really find it that difficult to read?

Again, the point about debugging is well taken. But of course, in that case you wouldn’t write it that way. Comprehensions aren’t debuggable either, and neither are lambdas or ternary if-else expressions. I am not saying that makes this okay, but I don’t find it to be that big of a drawback either.

Because I think it is helpful, here is how it might look somewhat more realistically:

foo(
    a,
    b if b is not None,
    c = c if c is not None,
    d = d if d != 0,
    e=e,
)

Personally I find the line which supplies b annoying and confusing. The other lines I don’t have a strong opinion on.

This I think is the most readable alternative:

args = [a]
if cond1:
    args.append(b)

kwargs = {"e":e}
if cond2:
    kwargs["c"] = c
if cond3:
    kwargs["d"] = d

foo(*args, **kwargs)

I don’t find it any less or more easy to follow the call tree. Any other solution would lead to an explosion of branches.

Furthermore, I feel the proposed syntax could make it easier for type checkers to verify the corresponding types. Although I will admit, I am not qualified to make such a claim.

It reads as: call foo with the arguments x, y if cond (else?), and z. It’s still unclear whether the y argument will be omitted if cond is false. We need to explicitly state that, or avoid including that expression in the function call altogether.

You need to read it twice to form a clear mental picture. After that first read, it simply reads as:

if cond:
    foo(x, y=y, z=z)
else:
    foo(x, z=z)

The form above follows the flow and structure of at least Indo-European languages. I’m not a linguist, so I’m not sure how conditional sentences are structured in other language families.

I don’t see the point of including an if statement inside a function call. It doesn’t even save any lines of code. In real-world programming, you still need to expand it, adding debug code, assertions, logging, etc. I’m sure most users will simply comment out the debug code and leave the code in its expanded form. It sounds too good to be true. The thing is, it may seem like an optimization, but it doesn’t actually provide any real benefit in practical, everyday programming, where code often expands and requires debugging tools.

It’s probably my first post here expressing support for a new syntax proposal. I believe that this proposal adds significant value for the general Python audience. I just grepped through some code I have written recently, and I found at least 16 cases of such *([...] if cond else []) unpacking in {list,tuple,dict} literals, and that’s only counting those not wrapped by ruff into multiple lines (and obviously not including cases where such elements were last and I append instead). That’s quite a lot, and I’d be glad to see similar syntax implemented. I’m doing both scientific code and backend development, and I found this pattern in both kinds of projects.
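For reference, a small sketch of the pattern I mean, as it can be written today. The variable names and flags are invented for the example; only the unpacking idiom itself matters.

```python
verbose = True
output_folder = None

# List literal: unpack either the extra items or an empty list.
command = [
    "aProgram",
    *(["--verbose"] if verbose else []),
    *(["--output-folder", output_folder] if output_folder is not None else []),
    "input.txt",
]

# Dict literal: same idea with ** and an empty dict.
options = {
    "retries": 3,
    **({"output_folder": output_folder} if output_folder is not None else {}),
    **({"log_level": "DEBUG"} if verbose else {}),
}

print(command)  # ['aProgram', '--verbose', 'input.txt']
print(options)  # {'retries': 3, 'log_level': 'DEBUG'}
```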

I see some problems with the current suggestion of stmt if cond:

  1. It’s very similar to the ternary, so my eye immediately starts looking for else
  2. It’s utterly incomprehensible when put on one line. [1, "foo", "bar" if foo is not None, "qux"] already requires a lot of mental effort to parse, since commas usually don’t attract attention.

Point 2 is probably a simple linter rule (“always wrap lines when using conditional collection literals”), so not so bad. Point 1 may be a barrier for newcomers, but a couple of well-written StackOverflow answers should be good enough to solve that as well.

I generally am fine with how things are, though for fun: what about this syntax?

A = [
    1,
    2,
    3 if use_three else ...,
    4,
]

To be clear: use ellipsis to say continue on without this item.

Yep! That’ll work just fine. It’ll then put Ellipsis into your list for each of the items you want to remove, which you can then filter out with a quick step at the end. This is an entirely viable option; costs you one extra step, but otherwise it’s fairly tidy.
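A minimal sketch of that extra step, assuming Ellipsis never appears as a legitimate value in the list (if it can, this approach silently drops real data):

```python
use_three = False

A = [
    1,
    2,
    3 if use_three else ...,
    4,
]

# The quick step at the end: drop every Ellipsis placeholder.
A = [item for item in A if item is not Ellipsis]
print(A)  # [1, 2, 4]
```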

That’s exactly how I’ve been reluctantly doing it myself. The proposed syntax is both more readable to me and more efficient.

I think we can remedy both concerns with a leading keyword and a forced indented block instead:

{
    'a': 1,
    if bar is not None:
        'b': 2,
        'c': 3,
    'd': 4,
}

This allows insertion of multiple items on one condition without the need for an unpacking operator too, although unpacking can still be allowed:

[
    '-l',
    if bar is not None:
        '--',
        *map(str, numbers),
]

The only downside I can think of is that an implementation will require a bigger change to the parser because currently there’s no indentation enforcement whatsoever inside brackets.

That does look pretty neat for the simplest cases, but:

  1. It can’t be generalized for unpacking.
  2. It can’t be generalized for multiple items on one condition.
  3. It can’t be generalized for a tuple inside a subscript, or the existing Callable[..., str] would become Callable[str] (unless the implementation goes to extreme lengths to preserve the fact that the item value of ... is derived from a ternary operator).

I think I’ve personally seen enough real-world usage of these “a little more annoying” existing approaches to know that even the simple use cases alone aren’t a small niche.

But it would indeed be wonderful to be able to scale the idea to a wider range of use cases as you suggest, and I would love to hear your feedback on the block-based syntax I proposed above, which can help pave the way to an even more scalable language construct with more flow controls:

  1. elif/else clauses:
[
    'run',
    if environment == 'dev':
        '--whitelist',
        developers,
    else:
        '--blacklist',
        banned_users,
]
  2. for and while loops:
''.join([
    while stream.next_token: # a stream that can't be iterated over
        stream.fetch_next(),
])
  3. Nested blocks: All of the flow controls can be nested with a further indented block to allow more complex logic.

To avoid ambiguity, each item should be on a separate line and end with a comma, and a literal tuple value must be enclosed in parentheses.

And finally, the ultimate generalization of the idea would be to allow any statement inside brackets (try-except, with, def, class, and even assignments and function calls), with the only syntactic difference being that an expression that ends with a comma means an output of the value as an item (or items, if there’s an unpacking operator) for the enclosing brackets. This approach may sound audacious but might actually be easier to implement because the grammar inside brackets can be made identical to the normal grammar except for expressions that end with a comma.

I hope you find this generalization to scale to a wide enough range of use cases. :slight_smile:

No, it’s not fine. It would, as opposed to the OP’s proposal, break backwards compatibility. And why do you need else ...? In comprehensions you also write

    (i for i in range(10) if i%2 == 0)

and not as

    (i for i in range(10) if i%2 == 0 else ...)

Also this is quite a big change as Paul Moore and Aleksander Wąsowicz have already pointed out:

It is the same reason why in Python lambdas cannot have multiple statements. Indentation and newlines inside of expressions have no meaning in Python. The tokenizer will not even emit tokens for them! Changing this would be a big and breaking change. Ben, if you want that, you would need to specify your proposal exactly.
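You can see this with the standard tokenize module. A small sketch, with made-up source strings: inside brackets there are no INDENT, DEDENT or statement-ending NEWLINE tokens, however the lines are indented. The module does report NL tokens for the physical line breaks, but those are the non-logical kind that the grammar ignores.

```python
import io
import tokenize


def token_names(source):
    """Return the token type names for a piece of source code."""
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type] for tok in tokenize.generate_tokens(readline)]


# Deliberately uneven indentation inside the brackets.
inside_brackets = "x = [\n    1,\n        2,\n]\n"
# An ordinary indented block for comparison.
block = "if x:\n    y = 1\n"

bracket_names = token_names(inside_brackets)
block_names = token_names(block)

print("INDENT" in bracket_names)        # False
print(bracket_names.count("NEWLINE"))   # 1 (only the one ending the statement)
print("INDENT" in block_names)          # True
```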

You generally do not need any new syntax for choosing alternatives

[ 'run', '--whitelist' if environment == 'dev' else '--blacklist' ]

If you have several things to unpack:

[ 'run', *(('--whitelist', developers) if environment == 'dev' else ('--blacklist', banned_users))]

Did you read my entire post, or stop there and respond?