One of the annoyances when I jumped from Perl to Python many years ago was the lack of a qw-equivalent operator/literal that would make a common pattern of defining a long list of words/identifiers clean and easy, where the Perl expression:
qw(foo bar baz)
is semantically equivalent to the list:
"foo", "bar", "baz"
Wouldn’t it be nice if we could take advantage of a pair of symbols that currently has no meaning as a bracketing pair, say < and >, to denote a list literal where barewords are parsed not as names but as strings, with commas being optional?
Taking opcode.py as an example, instead of writing:
__all__ = ["cmp_op", "stack_effect", "hascompare", "opname", "opmap",
           "HAVE_ARGUMENT", "EXTENDED_ARG", "hasarg", "hasconst", "hasname",
           "hasjump", "hasjrel", "hasjabs", "hasfree", "haslocal", "hasexc"]
we can write:
__all__ = <cmp_op stack_effect hascompare opname opmap
           HAVE_ARGUMENT EXTENDED_ARG hasarg hasconst hasname
           hasjump hasjrel hasjabs hasfree haslocal hasexc>
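For comparison, the list semantics proposed here can be approximated in today's Python with a small helper. This is only a sketch: the name qw and the variable exported are hypothetical, chosen for illustration, and the words still have to live inside a string literal rather than being true barewords.

```python
def qw(words):
    """Split a whitespace-separated string into a list of words, like Perl's qw()."""
    # str.split() with no arguments splits on any run of whitespace,
    # including newlines, and drops leading/trailing whitespace.
    return words.split()


# Hypothetical stand-in for the __all__ list from opcode.py shown above.
exported = qw("""
    cmp_op stack_effect hascompare opname opmap
    HAVE_ARGUMENT EXTENDED_ARG hasarg hasconst hasname
    hasjump hasjrel hasjabs hasfree haslocal hasexc
""")
```

The helper removes the per-item quotes and commas, but editors and linters still see one opaque string, which is part of what a native literal would improve.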
Furthermore, Perl also allows keys in hashes (what we call dicts in Python) to be barewords when they conform to the rules of an identifier, so:
%hash = (foo => 3, bar => 8, baz => 5);
is semantically equivalent to:
%hash = ("foo" => 3, "bar" => 8, "baz" => 5);
So perhaps we can make < and > denote a dict as well, where bareword keys are parsed as strings, and while we’re at it, make commas optional when keys are on separate lines.
Taking opcode.py again as an example, instead of writing:
_cache_format = {
    "LOAD_GLOBAL": {
        "counter": 1,
        "index": 1,
        "module_keys_version": 1,
        "builtin_keys_version": 1,
    },
    "BINARY_OP": {
        "counter": 1,
    },
    ...
}
we can write:
_cache_format = <
    LOAD_GLOBAL: <
        counter: 1
        index: 1
        module_keys_version: 1
        builtin_keys_version: 1
    >
    BINARY_OP: <
        counter: 1
    >
    ...
>
or make it even cleaner by going fully YAML-like, with indentation implying dict nesting:
_cache_format = <
    LOAD_GLOBAL:
        counter: 1
        index: 1
        module_keys_version: 1
        builtin_keys_version: 1
    BINARY_OP:
        counter: 1
    ...
>
And while we’re at it, we can generalize indentation-implied nesting to lists as well, so that this dict of lists from _opcode_metadata.py:
_specializations = {
    "RESUME": [
        "RESUME_CHECK",
    ],
    "TO_BOOL": [
        "TO_BOOL_ALWAYS_TRUE",
        "TO_BOOL_BOOL",
        "TO_BOOL_INT",
        "TO_BOOL_LIST",
        "TO_BOOL_NONE",
        "TO_BOOL_STR",
    ],
    ...
}
can be optionally written as:
_specializations = <
    RESUME:
        RESUME_CHECK
    TO_BOOL:
        TO_BOOL_ALWAYS_TRUE
        TO_BOOL_BOOL
        TO_BOOL_INT
        TO_BOOL_LIST
        TO_BOOL_NONE
        TO_BOOL_STR
>
Note that only bareword keys are parsed as strings, while dict values are parsed normally as expressions.
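To make the intended semantics concrete, here is a rough sketch of how the indentation-based form could be interpreted, written as an ordinary function over a string. Everything in it is an assumption for illustration: the names parse_block, _parse and _indent are hypothetical, a block is treated as a dict when its first line contains a colon and as a list of bareword strings otherwise, and values are restricted to literals via ast.literal_eval rather than arbitrary expressions.

```python
import ast
import textwrap


def _indent(line):
    """Return the number of leading spaces on a line."""
    return len(line) - len(line.lstrip(" "))


def _parse(lines, start):
    """Parse the block beginning at lines[start]; return (value, next_index)."""
    level = _indent(lines[start])
    is_dict = ":" in lines[start]          # assumption: a colon marks a dict block
    result = {} if is_dict else []
    i = start
    while i < len(lines) and _indent(lines[i]) == level:
        entry = lines[i].strip()
        if not is_dict:
            result.append(entry)           # bareword list item becomes a str
            i += 1
            continue
        key, _, rest = entry.partition(":")
        key, rest = key.strip(), rest.strip()
        if rest:
            # Bareword key becomes a str; the value is parsed as a literal.
            result[key] = ast.literal_eval(rest)
            i += 1
        elif i + 1 < len(lines) and _indent(lines[i + 1]) > level:
            # "key:" followed by a deeper-indented block: recurse into it.
            result[key], i = _parse(lines, i + 1)
        else:
            raise SyntaxError(f"expected an indented block after {key!r}")
    return result, i


def parse_block(text):
    """Turn the indentation-based notation into nested dicts and lists."""
    lines = [ln.rstrip() for ln in textwrap.dedent(text).splitlines() if ln.strip()]
    value, _ = _parse(lines, 0)
    return value


nested = parse_block("""
    LOAD_GLOBAL:
        counter: 1
        index: 1
    BINARY_OP:
        counter: 1
""")

specs = parse_block("""
    RESUME:
        RESUME_CHECK
    TO_BOOL:
        TO_BOOL_BOOL
        TO_BOOL_INT
""")
```

Here nested comes out as a dict of dicts with integer values, and specs as a dict of lists of strings. A real implementation would live in the tokenizer and grammar rather than in a string parser, and would need to handle tabs, comments, and general expressions.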
Lastly, we can allow items and keys to be optionally quoted for more flexibility:
commands = <
    add
    find
    "list-servers"  # we can arguably make quotes optional here too, since dashes
    "list-clients"  # don't make these words syntactically ambiguous
    remove
>
Removing the noise of quotes and commas from these literals makes the definitions more readable and easier to maintain.
The current workaround for defining a long bareword list is to use textwrap.dedent and str.splitlines on a triple-quoted string:
import textwrap

commands = textwrap.dedent("""\
    add
    find
    list-servers
    list-clients
    remove
""").splitlines()
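A variant of this workaround that also honours optionally quoted items and trailing comments, as in the commands example earlier, is to lean on the standard library's shell-style splitter. This is a sketch under the assumption that shell quoting rules are close enough to what the proposal intends; shlex.split with comments=True is a real stdlib call, but its quoting and escape rules are those of a POSIX shell, not of Python string literals.

```python
import shlex

# shlex.split() splits on whitespace (including newlines), strips quotes
# from quoted items, and with comments=True discards "#" comments.
commands = shlex.split(
    """
    add
    find
    "list-servers"  # quoted item
    list-clients    # dashes need no quotes here
    remove
    """,
    comments=True,
)
```

This yields the same five strings as the dedent/splitlines version, with quotes needed only for items that contain whitespace.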
But still it would be cleaner if Python supported the usage natively.