My goal
My goal is to inspire people about what macros could do, and what macros could look like.
And to open up a conversation about what we’d like macros to look like in Python.
I’m not able to implement this myself, and I’m not expecting anyone to implement this for me.
But maybe we’ll eventually have better macros for having had this discussion.
Last time I posted in Ideas, and people thought I was rude because my idea wasn’t yet ready to be a PEP. I’m posting this in Help. If you think it deserves to be in Ideas, please move it. If not, please help me make this idea better.
Python Idea - Latex style macros
I have seen a couple of ideas and problems discussed that could potentially be solved by macros.
So far, most attention has gone to Rust-style macros, as with PEP 638 – Syntactic Macros, proposed in 2020, which currently has status “draft”.
I think it could be good to also consider another source of inspiration, Latex. Specifically, I would propose copying the following:
- Macros are identified by the character \
- Macros specify some pattern of text replacement
- Macros can be defined similar to Latex’ \NewDocumentCommand, as specified in the xparse documentation.
- When a Python source file is run or imported, all the macros are resolved to create an “expanded script file”, and this file is parsed by the current python interpreter.
Motivation
The following are some problems that could potentially be “solved” by macros:
PEP 505 – null-coalescing via ?., ??, and ??= (and ?[])
nested creation of dictionary keys
Introduce a “bareword” list/dict literal
Pseudo-Uniform Function Call Syntax
Assure keyword
Method for interpolate “normal” string like an f-string
c-strings
Deferred Evaluation (somewhat hackishly)
PEP 671 – Syntax for late-bound function argument defaults (extremely hackishly)
f-strings as docstrings: github implementation stackoverflow question about the lack-of-a-feature
There was also an adjacent discussion on discuss.python.org recently: DSL Operator – A different approach to DSLs
More generally, macros are an extremely powerful tool, and people are bound to find applications of them once they occur within python.
I do envisage them primarily as personal tools for customising Python to your preferences, that others can ‘disable’ by resolving them all, which would result in a clean macro-free python file.
Further details of implementation
arg specs (1)
To solve the problems listed above cleanly, macros need (at the very least) access to the preceding expression*, the following word*, expression*, line, or block, and to any text contained in user-specified boundary tokens.
Traditionally a latex macro is defined as
\NewDocumentCommand{\macro_name}{argument specs}{macro pattern}
The latex tradition is to use [] for optional arguments, and \NewDocumentCommand is a mouthful, so we could modify this to
\NewMacro{\macro_name}[preceding arg specs]{following arg specs}{macro pattern}
arg specs (2)
In the phrase \macro a.b.c, a would be the first word, and a.b.c would be the first expression. One could argue that I should call the former a token, or that I’m using the word expression incorrectly. This is meant as the opening for a constructive discussion, so I’m open to ideas for better naming.
arg specs (3)
xparse uses extremely terse notation for the argument specs, which takes a little getting used to. For the sake of making this easier to read without having to memorise a list of symbols, let’s write the argument specs as for example word($func), block($bl), and tokens""($string), with the $-variable a thing that the macro can act on. (In latex these are referred to inside the macro as #1, #2, etc, which won’t work well for a variety of reasons.)
Explanation via example: PEP 505
If you want to be able to write a if a is not None else b succinctly, PEP 505 proposes a ?? b. With these macros you could define
\NewMacro{\??}[expression($X)]{expression($Y)}{($X if $X is not None else $Y)}
so that c = d + a \?? b / 2 is equivalent to c = d + (a if a is not None else b) / 2.
Equally for =?? you could define
\NewMacro{\=??}[expression($X)]{expression($Y)}{
if $X is None:
    $X = $Y
}
((Note here the spacing could be ambiguous. One could foresee problems with the above definition and a use such as
def f(x=None):
    x \=?? 3
which a poor implementation would resolve as
def f(x=None):
    if x is None:
    x = 3 # Error
but I believe that problem can be solved.))
If you want an easy way to write a.b.c if (a is not None and a.b is not None) else None, as I understand one of the possible implementations of the proposed ?. operator to be, that could be defined via
\NewMacro{\?.}[expression($X)]{word($Y)}{($X.$Y if $X is not None else None)}
so that a\?.b\?.c is expanded into
((a.b if a is not None else None).c if (a.b if a is not None else None) is not None else None)
which I believe is equivalent if slightly less efficient.
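As a quick sanity check of that equivalence, here is a small sketch that compares the hand-expanded form of a\?.b\?.c with the long-hand conditional. The function names and the SimpleNamespace test objects are made up for illustration and are not part of the proposal.

```python
from types import SimpleNamespace


def expanded(a):
    # What a\?.b\?.c expands to when \?. is applied twice, left to right.
    # Note that a.b is evaluated twice, hence "slightly less efficient".
    return ((a.b if a is not None else None).c
            if (a.b if a is not None else None) is not None
            else None)


def longhand(a):
    # The long-hand form the macro is meant to abbreviate.
    return a.b.c if (a is not None and a.b is not None) else None


cases = [
    None,                                        # a is None
    SimpleNamespace(b=None),                     # a.b is None
    SimpleNamespace(b=SimpleNamespace(c=None)),  # a.b.c is None
    SimpleNamespace(b=SimpleNamespace(c=42)),    # nothing is None
]
results = [(expanded(a), longhand(a)) for a in cases]
```

Both functions agree on all four cases, including the ones where a or a.b is None.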
Explanation via example: nested creation of dictionary keys
What was requested was a way to do
tree = {}
tree.set(('customer', 'id', 'name'), "York")
tree['customer']['id']['name'] == "York"
Possible implementation with macros:
\NewMacro{\.set}[expression($dic)]{tokens()($args)}{
do_set($dic, $args)
}
from typing import Any

def do_set(d: dict, keys: tuple, value: Any) -> None:
    # Walk (or create) every intermediate dict, then assign at the last key.
    *most, last = keys
    for key in most:
        d = d.setdefault(key, {})
    d[last] = value
then
tree\.set(('customer', 'id', 'name'), "York")
resolves into
do_set(tree, ('customer', 'id', 'name'), "York")
which does the requested operation.
One could argue this is cheating, and the macro should actually resolve to
tree.setdefault("customer", {}).setdefault("id", {})["name"] = "York"
or
tree.setdefault("customer", {}).setdefault("id", {}).__setitem__("name", "York")
or even
tree.setdefault("customer", {}).setdefault("id", {})
tree["customer"]["id"]["name"] = "York"
I find this difficult, both with latex-inspired macros and with rust-inspired macros, because the pattern .setdefault("key", {}) gets interrupted at the end, and because working with vectors of indeterminate length is hard in both Latex and Rust. But any system of python macros needs to be able to accomplish tasks like this, in a fair manner, without growing too complicated in syntax.
A solution I can come up with involves the Latex E/e argspec, which tests whether a particular symbol is present. Let us write this argspec here as E{=}($is_end), which means $is_end is True if = is present, and False otherwise.
(This is a genuinely useful and powerful construct. In plain Latex it’s quite common and useful for \macro and \macro* to mean something subtly different, for example.)
Now we can define:
\NewMacro{\set}{tokens[]($key) E{=}($is_end) expression($maybe_value)}{
\if ($is_end) {.__setitem__($key, $maybe_value)}
\else {.setdefault($key, {})\set}
\fi
}
which should resolve tree\set["customer"]["id"]["name"] = "York" into
tree.setdefault("customer", {}).setdefault("id", {}).__setitem__("name", "York")
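To make the recursion concrete, here is a rough Python sketch that mimics what the recursive \set macro would do: emit one setdefault call per key and switch to __setitem__ on the last key (the branch where = follows the brackets). The helper name expand_set is hypothetical and only emulates the expansion; it is not how a macro engine would actually be written.

```python
def expand_set(keys, value_src):
    # Mimic the recursive \set macro: consume one key per step.
    key, *rest = keys
    if not rest:
        # Last key: the branch taken when "=" follows the brackets.
        return f".__setitem__({key}, {value_src})"
    # Otherwise emit a setdefault call and "re-invoke" \set on the rest.
    return f".setdefault({key}, {{}})" + expand_set(rest, value_src)


# Keys and value are passed as source text, as a macro would see them.
source = "tree" + expand_set(['"customer"', '"id"', '"name"'], '"York"')

tree = {}
exec(source, {"tree": tree})  # run the expanded line as ordinary Python
```

After this runs, source holds the single expanded line and tree holds the nested dictionary.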
Explanation via example: bareword list/dict
The OP of this topic wanted an easier way to write for example
__all__ = ["cmp_op", "stack_effect", "hascompare", "opname", "opmap",
"HAVE_ARGUMENT", "EXTENDED_ARG", "hasarg", "hasconst", "hasname",
"hasjump", "hasjrel", "hasjabs", "hasfree", "haslocal", "hasexc"]
and suggested
__all__ = <cmp_op stack_effect hascompare opname opmap
HAVE_ARGUMENT EXTENDED_ARG hasarg hasconst hasname
hasjump hasjrel hasjabs hasfree haslocal hasexc>
We can already write
__all__ = '''cmp_op stack_effect hascompare opname opmap
HAVE_ARGUMENT EXTENDED_ARG hasarg hasconst hasname
hasjump hasjrel hasjabs hasfree haslocal hasexc'''.split()
but the problem is that then not all IDEs parse it correctly.
This is one of the motivations why I suggest that eg PyLance should view the expanded script file, rather than the raw script (with macros in it).
It would seem like a good solution to me to include the \Eval macro, which evaluates python code but is not allowed to resolve imports (because macros with access to e.g. sys are too scary), so that you could simply write
__all__ = \Eval{'''cmp_op stack_effect hascompare opname opmap
HAVE_ARGUMENT EXTENDED_ARG hasarg hasconst hasname
hasjump hasjrel hasjabs hasfree haslocal hasexc'''.split()}
or if you want to hide the .split() method,
\NewMacro{\"""}{tokens{SELF}{"""}($text)}{
\Eval{"""$text""".split()}
}
__all__ = \"""cmp_op stack_effect hascompare opname opmap
HAVE_ARGUMENT EXTENDED_ARG hasarg hasconst hasname
hasjump hasjrel hasjabs hasfree haslocal hasexc"""
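A rough sketch of what the \Eval step could do at expansion time: evaluate the argument and paste the repr of the result back into the source. The function name expand_eval is hypothetical, and passing an empty __builtins__ mapping is only a stand-in for "no imports allowed"; it is not a real sandbox and should not be treated as one.

```python
def expand_eval(expression_src):
    # Assumption: an empty __builtins__ mapping stands in for "no imports".
    # This is NOT a security boundary; it only illustrates the expansion step.
    value = eval(expression_src, {"__builtins__": {}})
    # The repr of the result is what would be written into the expanded file.
    return repr(value)


raw = "'''cmp_op stack_effect\nhascompare opname'''.split()"
expanded_line = "__all__ = " + expand_eval(raw)
```

The expanded line is then a plain list literal, which is what an IDE or type checker would see if it reads the expanded script file.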
The OP of that thread also wants to be able to write
_specializations = {
    "RESUME": [
        "RESUME_CHECK",
    ],
    "TO_BOOL": [
        "TO_BOOL_ALWAYS_TRUE",
        "TO_BOOL_BOOL",
        "TO_BOOL_INT",
        "TO_BOOL_LIST",
        "TO_BOOL_NONE",
        "TO_BOOL_STR",
    ],
    ...
}
as something like
_specializations = <
    RESUME:
        RESUME_CHECK
    TO_BOOL:
        TO_BOOL_ALWAYS_TRUE
        TO_BOOL_BOOL
        TO_BOOL_INT
        TO_BOOL_LIST
        TO_BOOL_NONE
        TO_BOOL_STR
>
I don’t see a quick & easy way to do that with macros. My experience tells me a macro could achieve it, but that it will be ugly/complicated code.
But I do think enabling people to embed yaml into their python code would be a good thing because it makes python work for more people.
If other people need to work on it later and they don’t like the embedded yaml, it should be possible for them to resolve the macros, and continue working in plain python.
Explanation via example: Pseudo-Uniform Function Call Syntax
Could be implemented as
\NewMacro{\.}[expression($object)]{word($function) tokens()($args)}{
$function($object, $args)
}
then for example
"abcde"\.len() == len("abcde")
Explanation via example: Assure keyword
Here this system of macros runs into a spot of trouble.
I thought I had a solution, but I don’t think there’s much you can do with macros that goes beyond the functionality that you can achieve with
def assure(maybe_none):
    if maybe_none is None:
        raise ValueError
    return maybe_none
I mean you could define a macro so that you can call it as
a = \assure f(b)
instead of
a = assure(f(b))
but that doesn’t help the type checker.
Explanation via example: Method for interpolate “normal” string like an f-string
The problem is you have a string like f"I want a {robot} brain" that you want in multiple places.
The solution is to just use a macro instead of an assignment, and it works perfectly.
\NewMacro{\my_fstring}{}{f"I want a {robot} brain"}
x = \my_fstring
...
y = \my_fstring
There’s potential for a little awkwardness because the macro has to be defined at the global level of your file, so you can’t locally create it inside a function. But I think it’s worth the trade-off.
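The "resolve all macros, then run the result" step can be sketched for the simplest case, zero-argument macros like \my_fstring. This is only an illustration: the expand function, the MACROS table and the regex are my own assumptions, not part of the proposal, and a real implementation would need a proper tokenizer so that backslashes inside string literals are left alone.

```python
import re

# Hypothetical macro table: name -> replacement text (zero-argument macros only).
MACROS = {"my_fstring": 'f"I want a {robot} brain"'}

# A backslash followed by an identifier marks a macro use.
MACRO_USE = re.compile(r"\\([A-Za-z_]\w*)")


def expand(source, macros=MACROS):
    # Swap each \name for its replacement text; unknown names are left untouched.
    return MACRO_USE.sub(lambda m: macros.get(m.group(1), m.group(0)), source)


script = "robot = 'big'\nx = \\my_fstring\nrobot = 'small'\ny = \\my_fstring\n"
expanded = expand(script)

namespace = {}
exec(expanded, namespace)  # the expanded script is plain Python
```

Because the f-string text is pasted in at each use, x and y pick up whatever robot is at that point, which is exactly the behaviour the original request was after.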
Explanation via example: Deferred Evaluation (somewhat hackishly)
Is honestly the same as the above. You want an expression that evaluates to x+2*y whenever you assign to it? \NewMacro{\macro}{}{(x+2*y)}. It won’t have all the behavior that people who ask for deferred evaluation want, but honestly I don’t get the impression that there is an agreed upon proper definition of “deferred evaluation”. And with macros, fans of the concept can figure out what they want, modify the behaviour of the macro, and eventually maybe they’ll have something with well-defined behaviour that they’re happy with and they can tell the rest of us about.
Explanation via example: PEP 671 – Syntax for late-bound function argument defaults (extremely hackishly)
\NewMacro{\late}[word($argname) tokens:=($type_hints)]{tokens()($late_bound expression) line($rest_of_line)}{
$argname : $type_hints | None = None $rest_of_line
\newline
if $argname is None: $argname = $late_bound
}
so that
def f(a: str, b: list = \late [a]):
    ...
resolves into
def f(a: str, b: list | None = None):
    if b is None: b = [a]
    ...
and because macros resolve one at a time, from left to right,
def f(a: str, b: list = \late [a], c: list = \late []):
    ...
resolves into
def f(a: str, b: list | None = None, c: list | None = None):
    if c is None: c = []
    if b is None: b = [a]
    ...
so that does mostly work.
There is a ‘problem’ that
def f(a: str, b: list = \late [c], c: list = \late [a]):
    ...
works but
def f(a: str, b: list = \late [a], c: list = \late [b]):
    ...
resolves into
def f(a: str, b: list | None = None, c: list | None = None):
    if c is None: c = [b]
    if b is None: b = [a]
    ...
which does not.
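To see the failure concretely, here is the expanded function written out as plain Python and called with only a, next to a variant with the checks in declaration order. I use Optional[list] instead of list | None so the sketch also runs on older Python versions, and the return statements and the name f_in_order are added purely for inspection.

```python
from typing import Optional


def f(a: str, b: Optional[list] = None, c: Optional[list] = None):
    # Checks appear in reverse order, as in the left-to-right expansion above.
    if c is None: c = [b]   # b is still None at this point
    if b is None: b = [a]
    return a, b, c


def f_in_order(a: str, b: Optional[list] = None, c: Optional[list] = None):
    # Same checks in declaration order, which is what PEP 671 would give.
    if b is None: b = [a]
    if c is None: c = [b]
    return a, b, c


result = f("x")
result_in_order = f_in_order("x")
```

With the reversed checks, c ends up as [None] rather than [["x"]], because b has not been filled in yet when c is computed.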
I am also noticing here that there should be a good way to have optional arguments for macros, so that you could design a macro that works regardless of whether the late-bound variable has type hints. Latex has a pretty good system, but sadly I broke it in my attempt to remove the magic symbols such as o, O, m, R etc.
Explanation via example: f-strings as docstrings
The most common situation where this is desirable is when you have some CONSTANT that you want to mention in the docstring.
With my proposed system you could do
\NewMacro{\MyConstant}{}{100}
MY_CONSTANT = \MyConstant
def f():
    \Eval{f"docstring that mentions {\MyConstant}."}
which admittedly is much more awkward than what I expected to end up with.
You could also write (to use a classic Latex pattern):
\NewMacro{\MyConstant}{E{*}($is_string)}{
\IfBooleanTF{$is_string}{"100"}{100}
}
MY_CONSTANT = \MyConstant
def f():
    "docstring that mentions "\MyConstant*"."
which resolves into
MY_CONSTANT = 100
def f():
    "docstring that mentions ""100""."
but maybe (especially if you’re allowed to invent new macro rules) there are better solutions to be found.
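The reason the second pattern works is that Python joins adjacent string literals at compile time, so the joined literal still counts as a docstring, whereas an f-string in the same position does not. A minimal check (the function g is added here only for contrast):

```python
MY_CONSTANT = 100


def f():
    "docstring that mentions ""100""."


def g():
    # An f-string is an expression, not a literal, so it is not stored as __doc__.
    f"docstring that mentions {MY_CONSTANT}."


doc_f = f.__doc__
doc_g = g.__doc__
```

Here doc_f is the joined text "docstring that mentions 100." while doc_g is None.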
I envision macros only being used as tools of convenience
Macros are dangerous. There are good reasons why Latex wasn’t adopted as a multi-purpose programming language, even though it is Turing complete. When I look back at my old latex files, they make sense to me, but they’re not production code, and a lot of them could never be production code.
Someone pointed out in another thread about macros that there is a danger that use of macros transforms code from something that makes sense for everyone into something that only makes sense to the writer, because macros make a language so customizable. With copilot being on the rise, this is especially important.
On the balance I think this is the responsibility of individuals and organizations. You can already write python that is incomprehensible to most readers. For example by creating inheritance brambles. (I’m tempted to share a code base I saw recently, but I don’t want to embarrass/anger the author.)
But it would probably be best practice to use no more than 1 macro per file, and/or to make sure that future programmers can convert the whole thing to the “expanded script file”, forget about the macros, and continue from there. In other words, (in most cases,) after resolving the macros you should still have good (if slightly repetitive) Python code.
The potential role of (this style of) macros in evolving Python syntax
Python develops conventions.
For example, import numpy as np occurs in almost all code that imports numpy.
If a significant fraction of python users converges on the same macro, I think that that’s a sign that it should be considered for general syntax.
At the same time, I don’t expect that to happen quickly. And if (within a coding community) everyone knows what \set or \|> means, then that macro escapes the “this is a personalization that you should not expect other people to understand” niche. And then the prolonged use of that macro isn’t a big problem.
It would still be annoying that that macro gets resolved when you go through a resolve-all-macros process, so it would still be valuable to integrate such a macro into the general python syntax.