These approaches are already in use: jax uses the former and sympy uses approaches more like the latter. One thing that cannot be done this way is to intercept operations between builtin objects, e.g.:
In [1]: import sympy.abc as cas
In [2]: expr = cas.x**2 + cas.y**2 - 1/3
In [3]: expr
Out[3]:
 2    2
x  + y  - 0.333333333333333
It would be nice to be able to control the division, e.g. to preserve the exact fraction `1/3` rather than have it evaluate to a float. SageMath uses a “pre-parser” for this sort of thing.
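To make the problem concrete without depending on sympy, here is a minimal sketch using a hypothetical `Sym` class (not any library's API). It shows that operator overloading only sees the result of `1/3`, never the division itself:

```python
from fractions import Fraction


class Sym:
    """Hypothetical minimal symbol that records what is subtracted from it."""

    def __init__(self, name):
        self.name = name

    def __sub__(self, other):
        # By the time this runs, `other` has already been fully evaluated.
        return ("sub", self.name, other)


x = Sym("x")

# 1/3 is computed by float division before Sym.__sub__ is ever called.
lossy = x - 1 / 3

# The exact value survives only if it is requested explicitly.
exact = x - Fraction(1, 3)

print(type(lossy[2]).__name__)
print(exact[2])
```

The overloaded `__sub__` has no way to recover the `1` and `3` in the first case, which is why a pre-parser or macro is needed to change what `1/3` means.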
Greg’s suggestion would mean that:
expr = cas$(x**2 + 3*x - 1/3)
would become
expr = cas.sub(
    cas.add(
        cas.pow(cas.name("x"), cas.int(2)),
        cas.mul(cas.int(3), cas.name("x"))),
    cas.truediv(cas.int(1), cas.int(3)))
Then `cas.truediv` can control how to evaluate `1/3`. Similarly it could intercept something like `100**100**100`.
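As a rough sketch of what the receiving side might look like, the following builds a toy `cas` namespace with the function names used in the expansion above. Everything here is an assumption for illustration: the tuple-based tree, the use of `Fraction` for exact numbers, and the arbitrary exponent cutoff are not part of the proposal.

```python
import operator
from fractions import Fraction
from types import SimpleNamespace

MAX_EXPONENT = 1000  # arbitrary cutoff, chosen only for this sketch


def _folding(op_name, func):
    """Return a builder that folds exact numbers and keeps the rest symbolic."""
    def build(a, b):
        if isinstance(a, Fraction) and isinstance(b, Fraction):
            return func(a, b)
        return (op_name, a, b)
    return build


def _pow(a, b):
    # Only evaluate small non-negative integer powers of exact numbers;
    # anything else (symbols, huge exponents) stays as an unevaluated node.
    if (isinstance(a, Fraction) and isinstance(b, Fraction)
            and b.denominator == 1 and 0 <= b <= MAX_EXPONENT):
        return a ** b.numerator
    return ("pow", a, b)


cas = SimpleNamespace(
    name=lambda s: ("name", s),
    int=lambda n: Fraction(n),
    add=_folding("add", operator.add),
    sub=_folding("sub", operator.sub),
    mul=_folding("mul", operator.mul),
    truediv=_folding("truediv", operator.truediv),
    pow=_pow,
)

# The expansion of cas$(x**2 + 3*x - 1/3) from above.
expr = cas.sub(
    cas.add(
        cas.pow(cas.name("x"), cas.int(2)),
        cas.mul(cas.int(3), cas.name("x"))),
    cas.truediv(cas.int(1), cas.int(3)))

# The expansion of cas$(100**100**100); ** is right-associative.
big = cas.pow(cas.int(100), cas.pow(cas.int(100), cas.int(100)))

print(expr[2])  # the exact fraction, not a float
print(big[0])   # the outer power is left unevaluated
```

The point is only that the library, not the interpreter, decides what happens at each node: here `truediv` keeps `1/3` exact and `pow` declines to compute an astronomically large integer.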
Another example is if you want to support things like `if`, e.g.:
expr = cas$(x if y > x else y)
There is currently no way to avoid eager evaluation with something like that because the `if` forces the condition `y > x` to a truth value, so it cannot remain symbolically unevaluated, e.g. with sympy:
In [7]: x if x > y else y
...
TypeError: cannot determine truth value of Relational: x > y
Similar problems of eager evaluation arise with operators like `and` and `or` and builtin functions like `max`.
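The following sketch reproduces these failures with hypothetical `Sym` and `Relational` classes, loosely imitating the sympy behaviour shown above rather than using sympy itself. The comparison can be overloaded to return a symbolic object, but `if`, `and` and `max` all go on to ask for its truth value:

```python
class Relational:
    """Hypothetical unevaluated comparison such as x > y."""

    def __init__(self, op, lhs, rhs):
        self.op, self.lhs, self.rhs = op, lhs, rhs

    def __repr__(self):
        return f"{self.lhs} {self.op} {self.rhs}"

    def __bool__(self):
        # There is no sensible True/False answer for a symbolic comparison.
        raise TypeError(f"cannot determine truth value of Relational: {self}")


class Sym:
    """Hypothetical symbol whose comparisons stay symbolic."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name

    def __gt__(self, other):
        return Relational(">", self, other)

    def __lt__(self, other):
        return Relational("<", self, other)


def raises_type_error(thunk):
    """Run thunk() and report whether it raised TypeError."""
    try:
        thunk()
    except TypeError:
        return True
    return False


x, y = Sym("x"), Sym("y")

results = {
    "if": raises_type_error(lambda: x if x > y else y),
    "and": raises_type_error(lambda: (x > y) and (y > x)),
    "max": raises_type_error(lambda: max(x, y)),
}
print(results)
```

None of these constructs has a hook that a library can override, which is why the comparison itself working symbolically is not enough.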
Some applications like `numba.jit` use ast-manipulation to avoid the eager evaluation problems, but ast-manipulation is fragile. A macro-based approach as suggested by Greg could provide a more robust and ergonomic version of ast-manipulation where you build your own tree rather than needing to use the unstable trees produced by the `ast` module. You would need more than just inline expressions for it to be usable by the likes of numba, but the suggestion as it stands would work for e.g. numexpr.
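For comparison, here is a small sketch of the ast-manipulation route using only the standard library. The `to_tree` helper and its tuple output are hypothetical. The sketch handles just the node types needed for the examples in this post, and it illustrates the fragility: it is written against `ast.Constant`, whereas code written for older Pythons had to match `ast.Num` for numeric literals, which was deprecated in 3.8 and later removed.

```python
import ast


def to_tree(node):
    """Convert a small subset of Python expression AST into nested tuples."""
    if isinstance(node, ast.Expression):
        return to_tree(node.body)
    if isinstance(node, ast.Name):
        return ("name", node.id)
    if isinstance(node, ast.Constant):
        return ("const", node.value)
    if isinstance(node, ast.BinOp):
        op = type(node.op).__name__.lower()  # e.g. Div -> "div"
        return (op, to_tree(node.left), to_tree(node.right))
    if isinstance(node, ast.Compare) and len(node.ops) == 1:
        op = type(node.ops[0]).__name__.lower()  # e.g. Gt -> "gt"
        return (op, to_tree(node.left), to_tree(node.comparators[0]))
    if isinstance(node, ast.IfExp):
        return ("if", to_tree(node.test), to_tree(node.body), to_tree(node.orelse))
    raise NotImplementedError(type(node).__name__)


# Nothing is evaluated: the source is only parsed, so `if` and `/` survive.
conditional = to_tree(ast.parse("x if y > x else y", mode="eval"))
division = to_tree(ast.parse("1/3", mode="eval"))

print(conditional)
print(division)
```

This works on source text, so it needs the expression as a string (or access to the function's source), which is part of what a macro would avoid.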
There is some overlap here with t-strings. One thing that t-strings provide that would also be needed here is the ability to interpolate local variables into the expression rather than conjure up new symbols:
e1 = cas$(x**2 - 1)
e2 = cas$({e1}**2 + 2*y)
With t-strings I think you could do that as:
e1 = cas(t'x**2 - 1')
e2 = cas(t'{e1}**2 + 2*y')
Conceptually and in terms of implementation I prefer the “this is a symbolic macro” approach to “this is parsing a string”. Already the syntax highlighting looks off with t-strings because the expression is rendered as a string, and a runtime implementation that needs to parse the strings would be less efficient.