Currently, we have only unsigned imaginary literals with the following semantics:
±a±bj = complex(±float(a), 0.0) ± complex(0.0, float(b))
While this behaviour is well documented, most users would instead expect:
±a±bj = complex(±float(a), ±float(b))
i.e. that it follows the rectangular notation (see e.g. Complex number - Wikipedia) a+bi (or a+bj) for complex numbers. I think this is a POLA (principle of least astonishment) violation in the Python language. Things are a little worse, because the language itself has a kind of “split brain”: in the repr() output we do follow the rectangular notation.
Here are a few examples:
- signed zero in the real part
>>> complex(-0.0, 1.0) # (note funny signed integer zero)
(-0+1j)
>>> -0+1j
1j
>>> -(0.0-1j) # "correct" representation with Python numeric literals
(-0+1j)
>>> -(0-1j) # also "correct"
(-0+1j)
- signed zero in the imaginary part
>>> complex(1.0, -0.0)
(1-0j)
>>> 1-0j
(1+0j)
>>> -(-1 + 0j) # "correct"
(1-0j)
Apparently, complex.__repr__() uses a different meaning for the j symbol: it is not the same as the 1j literal. We also have another (related) problem: the eval(repr(x)) == x invariant is broken for the complex type. Quoting from the docs:
For many types, this function makes an attempt to return a string that would yield an object with the same value when passed to eval(); otherwise, the representation is a string enclosed in angle brackets
But the expression (-0+1j) does not evaluate to an object with the same value as complex(-0.0, 1.0). Nor do complex(1.0, -0.0) and 1-0j have the same value.
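A small check makes the difference visible. Note that == cannot see it, because -0.0 == 0.0, so the sketch below also compares the signs of the components. The helper name same_value is made up for this illustration.

```python
import math


def same_value(a: complex, b: complex) -> bool:
    """Compare two complex numbers component-wise, including the sign of zeros.

    Hypothetical helper for illustration; not part of the stdlib.
    """
    def key(x: float):
        return (x, math.copysign(1.0, x))

    return key(a.real) == key(b.real) and key(a.imag) == key(b.imag)


z = complex(-0.0, 1.0)
w = eval(repr(z))  # evaluates the expression (-0+1j)

print(repr(z), "->", repr(w))
print("equal under ==:        ", z == w)
print("same value, with signs:", same_value(z, w))
```

The real part loses its sign on the way through eval(), although the two objects still compare equal.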
Yet another instance of this is in the Sphinx docs for the complex class, and in its docstring as well:
class complex(real=0, imag=0)
…
Return a complex number with the value real + imag*1j or …
Simple counterexamples:
>>> complex(-0.0, -0.0)
(-0-0j)
>>> -0.0 + (-0.0)*1j
(-0+0j)
>>> complex(-0.0, 0.0)
(-0+0j)
>>> -0.0 + 0.0*1j
0j
Again, our docs rest on a wrong assumption here: that we have complex literals and that real + imag*1j is a representation of the complex number in rectangular form.
At first sight, this is a very minor issue. Clearly, it affects only “corner cases”, when either the real or the imaginary part of the complex number is -0.0 (signed zero). On the other hand, it’s a limitation that already bites us in the stdlib docs, see the note about branch cuts: we are forced to use verbose complex(-2.0, -0.0)-like constructions there, instead of literals (like -2-0j, which we could expect in mathematical texts). It’s not that we can’t express the same number with the current imaginary literals. But would an expression like -(-2+0j) be transparent to readers? Or -(-0.0 - 0j), where using floats in the real part is required? These “corner cases” are in fact common, because we want to talk about the behaviour of functions on branch cuts, and not surprisingly there is a long (not exhaustive) list of recurring issues:
- Inconsistent complex behavior with (-1j) · Issue #84450 · python/cpython · GitHub - was most helpful for me
- Addition/subtraction clear sign from signed 0j · Issue #107854 · python/cpython · GitHub
- edge case when parsing complex numbers · Issue #105027 · python/cpython · GitHub
- negative zero components are ignored in complex number literals · Issue #70026 · python/cpython · GitHub
- Bogus parsing of negative zeros in complex literals · Issue #66738 · python/cpython · GitHub
- Complex number representation round-trip doesn't work with signed zero values · Issue #61538 · python/cpython · GitHub
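To illustrate why the sign of a zero matters on a branch cut, here is a minimal sketch using cmath.sqrt, whose cut runs along the negative real axis. The names above and below are mine, chosen for illustration.

```python
import cmath

above = complex(-2.0, 0.0)   # a point on the cut, approached from above
below = complex(-2.0, -0.0)  # the same point, approached from below

print(above == below)        # the two inputs compare equal ...
print(cmath.sqrt(above))     # ... but the results land on opposite sides
print(cmath.sqrt(below))
```

The two results differ in the sign of the imaginary part, which is why the docs need a way to spell complex(-2.0, -0.0) unambiguously.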
Maybe we can do better?
Solution
Let’s use complex literals instead (like Scheme, since R3RS), i.e.
bj = complex(0.0, b)
±a±bj = complex(±a, ±b)
where a (nonzero) and b are floating point literals (or a decimal integer literal for b).
While this will make tokenization more complex, with the above change we could fix the eval(repr) issue without changing the repr output at all (well, except maybe in the case of a signed zero real component) or the arithmetic for mixed operands.
This replacement for the imaginary literal will also match common mathematical notation. I believe this is the most transparent solution for the end users of the complex type (i.e. people doing math). No changes are needed on their side, unless they are using funny notation like -(-0.0 - 0j) to represent the “corner case” complex(0.0, -0.0).
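For a preview of these semantics, one can look at the complex() constructor applied to a string: it already reads ±a±bj in the rectangular sense and keeps the signs of zero components. The sketch below compares it with evaluating the same text as source code; signs is a made-up helper name.

```python
import math


def signs(z: complex):
    """Return the signs (+1.0 or -1.0) of the real and imaginary parts."""
    return math.copysign(1.0, z.real), math.copysign(1.0, z.imag)


for text in ["-0+1j", "-0-0j", "-0.0+0j"]:
    parsed = complex(text)   # string parsing: rectangular notation
    evaluated = eval(text)   # source code: arithmetic on an imaginary literal
    print(f"{text:>8}  complex(str): {signs(parsed)}  eval: {signs(evaluated)}")
```

In other words, complex(repr(z)) keeps signed zeros today, and the proposal would make the source-code reading agree with it.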
Edit: A more detailed formalization of the above proposal, based on the discussion, with some code.
Perhaps it would be clearer if I emphasize that the proposal is restricted to Add/Sub BinOps with special arguments: the second is an imaginary literal and the first is a ±int or ±float literal. (We could also discuss whether to redefine the unary Sub of an imaginary literal as well.) For n-ary ± we should keep the current evaluation rules, i.e. a±b±c±d=(((a±b)±c)±d). If you want to place a complex literal somewhere in between, use parentheses! After all, maybe they are there on purpose in the complex.__repr__() output?
Here is an example of an AST transformation that does the above.
from ast import *
from ideas import import_hook


class ComplexLiteralTransform(NodeTransformer):
    def visit_BinOp(self, node):
        # Fold (±a ± bj), where a is an int/float literal and bj is an
        # imaginary literal, into the call complex(±float(a), ±b).
        match node:
            case BinOp(Constant(x), Add(), Constant(complex(imag=y))):
                match x:
                    case int(x) | float(x):
                        x, y = map(Constant, [float(x), y])
                        return Call(Name('complex', Load()), [x, y], [])
            case BinOp(Constant(x), Sub(), Constant(complex(imag=y))):
                match x:
                    case int(x) | float(x):
                        x, y = map(Constant, [float(x), y])
                        return Call(Name('complex', Load()),
                                    [x, UnaryOp(USub(), y)], [])
            case BinOp(UnaryOp(USub(), Constant(x)), Add(), Constant(complex(imag=y))):
                match x:
                    case int(x) | float(x):
                        x, y = map(Constant, [float(x), y])
                        return Call(Name('complex', Load()),
                                    [UnaryOp(USub(), x), y], [])
            case BinOp(UnaryOp(USub(), Constant(x)), Sub(), Constant(complex(imag=y))):
                match x:
                    case int(x) | float(x):
                        x, y = map(Constant, [float(x), y])
                        return Call(Name('complex', Load()),
                                    [UnaryOp(USub(), x), UnaryOp(USub(), y)], [])
        # Anything else keeps the current evaluation rules.
        return self.generic_visit(node)

    def visit_UnaryOp(self, node):
        # -bj becomes complex(0.0, -b).
        match node:
            case UnaryOp(USub(), Constant(complex(imag=x))):
                return Call(Name('complex', Load()),
                            [Constant(0.0), UnaryOp(USub(), Constant(x))], [])
        return self.generic_visit(node)


def transform_cl(tree, **kwargs):
    tree_or_node = ComplexLiteralTransform().visit(tree)
    fix_missing_locations(tree_or_node)
    return tree_or_node


def add_hook(**kwargs):
    return import_hook.create_hook(hook_name=__name__,
                                   transform_ast=transform_cl)
Alternative C version (a draft, no error checks, etc): GitHub - skirpichev/cpython at complex-literals-with-usub.
With André Roberge’s https://github.com/aroberge/ideas:
$ python -q -m ideas -a cl-transform
Ideas Console version 0.1.5. [Python version: 3.12.0rc1+]
ideas> 1-0j
(1-0j)
ideas> 1+0j
(1+0j)
ideas> -0j
-0j
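The same idea can be tried without the ideas package. The following is a compact sketch using only the stdlib ast module; the names FoldComplexLiteral and eval_with_complex_literals are hypothetical, and the unary -bj case is left out for brevity.

```python
import ast
import math


class FoldComplexLiteral(ast.NodeTransformer):
    """Hypothetical compact variant: rewrite (±a ± bj) as complex(±a, ±b)."""

    @staticmethod
    def _real(node):
        # Accept a or -a, where a is an int or float literal.
        negate = isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub)
        if negate:
            node = node.operand
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            value = float(node.value)
            return -value if negate else value
        return None

    @staticmethod
    def _imag(node):
        # Accept an imaginary literal bj and return b.
        if isinstance(node, ast.Constant) and isinstance(node.value, complex):
            return node.value.imag
        return None

    def visit_BinOp(self, node):
        real, imag = self._real(node.left), self._imag(node.right)
        is_add_sub = isinstance(node.op, (ast.Add, ast.Sub))
        if real is None or imag is None or not is_add_sub:
            # Not of the form (±a ± bj): keep the current evaluation rules.
            return self.generic_visit(node)
        if isinstance(node.op, ast.Sub):
            imag = -imag
        call = ast.Call(
            func=ast.Name(id="complex", ctx=ast.Load()),
            args=[ast.Constant(real), ast.Constant(imag)],
            keywords=[],
        )
        return ast.copy_location(call, node)


def eval_with_complex_literals(source: str):
    """Evaluate an expression with (±a ± bj) treated as a complex literal."""
    tree = FoldComplexLiteral().visit(ast.parse(source, mode="eval"))
    ast.fix_missing_locations(tree)
    return eval(compile(tree, "<complex-literals>", "eval"))


for text in ["1-0j", "-0+1j", "1+2j+3j"]:
    z = eval_with_complex_literals(text)
    print(text, "->", repr(z),
          (math.copysign(1.0, z.real), math.copysign(1.0, z.imag)))
```

Only the innermost ±a±bj pair is folded; in 1+2j+3j the remaining addition follows the usual left-to-right rules.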
In fact, I think we can consider (±a±bj) to be the true complex literal, whereas the ability to omit the parentheses sometimes (e.g. in a simple assignment like x=1+2j) is syntactic sugar.
Alternative
We could also solve the problem using an additional complex subtype (see this), the imaginary class (as e.g. the C11 standard does in Annex G).
There will be new special rules for mixed arithmetic (see section 5 of Annex G for details), e.g.:
float + imaginary = complex(float.real, imaginary.imag)
The new rules, however, alter only cases where mixed operands have nans, infinities or signed zeros in their components.
No new literal types, no changes in the parsing of source code and no altering of the complex.__repr__() output (just as in the above solution), but there is one “little” new thing:
>>> type(3.14j)
<class 'imaginary'>
On the other hand, as was mentioned by Serhiy Storchaka and Mark Dickinson in issue #84450, the new type could solve other “gotchas”. For example, currently in Python:
>>> complex(0, math.inf) * 1
(nan+infj)
will be
>>> complex(0, math.inf) * 1
infj
because multiplication of a complex by a real (or by an imaginary number) will be componentwise. For the same reason, multiplication by ±1j will be a correct rotation in the complex plane (multiplying any complex number z, not just a finite one, by 1j 4 times exactly recovers z).
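A rough sketch of how such a type could behave is below. The Imaginary class is a hypothetical stand-in written for this post, covering only the few operations discussed here; it is not the actual proposed implementation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Imaginary:
    """Hypothetical stand-in for the proposed imaginary type (value = imag * i)."""
    imag: float

    def __radd__(self, other):
        # real + imaginary: the components are simply placed side by side
        if isinstance(other, (int, float)):
            return complex(float(other), self.imag)
        return NotImplemented

    def __rsub__(self, other):
        # real - imaginary
        if isinstance(other, (int, float)):
            return complex(float(other), -self.imag)
        return NotImplemented

    def __rmul__(self, other):
        # complex * imaginary: (a + bi) * (ci) = -(b*c) + (a*c)i, componentwise
        if isinstance(other, complex):
            return complex(-(other.imag * self.imag), other.real * self.imag)
        return NotImplemented


j = Imaginary(1.0)

print(1.0 - Imaginary(0.0))   # the imaginary part keeps its sign

z = complex(0.0, math.inf)
print(z * 1j)                 # built-in complex multiplication
rotated = z
for _ in range(4):
    rotated = rotated * j     # four quarter turns with the sketch
print(rotated)
```

With the componentwise rule, no 0 * inf products appear, so the four rotations bring z back to where it started.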
Edit: a variant of the above is special treatment of complex(0, imag) instances in arithmetic ops, without introducing a new type.
Other
Finally, I would also mention attempts to solve only the eval(repr) issue for the complex type.
First, we could use a “verbose” form like complex(real, imag) in the repr() output (obviously, this was too verbose for Guido). A variant: use this form of the repr() output only for complex numbers with signed zeros in their components.
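The variant could look roughly like the sketch below. Both function names are hypothetical, and I take "signed zero" here to mean a component equal to -0.0.

```python
import math


def has_negative_zero(z: complex) -> bool:
    """True if the real or the imaginary part of z is -0.0."""
    return any(c == 0.0 and math.copysign(1.0, c) < 0.0 for c in (z.real, z.imag))


def roundtrip_repr(z: complex) -> str:
    """Hypothetical repr variant: verbose only when a component is -0.0."""
    if has_negative_zero(z):
        return f"complex({z.real!r}, {z.imag!r})"
    return repr(z)


for z in [complex(1, 2), complex(-0.0, 1.0), complex(1.0, -0.0)]:
    print(repr(z), "->", roundtrip_repr(z))
```

This check does not cover every problematic value: complex(0.0, -1.0), whose repr is -1j, has no negative zero component but is affected as well (see issue #84450).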
Alternatively, we could use a “hackish” form like -(-2+0j) for our “corner cases”, as Serhiy Storchaka did in PR #19593.
Both solutions make the repr() output even less uniform than it is now (currently we sometimes omit parens).