Currently, we have only unsigned imaginary literals with the following semantics:

``±a±bj = complex(±float(a), 0.0) ± complex(0.0, float(b))``

While this behaviour is well documented, most users would instead expect:

``±a±bj = complex(±float(a), ±float(b))``

i.e. that it follows the rectangular notation (see e.g. Complex number - Wikipedia) `a+bi` (or `a+bj`) for complex numbers. I think it’s a POLA violation in the Python language. Things are a little worse, because within the language itself there is some “brain split”: in the `repr()` output we do follow the rectangular notation.

Here are a few examples:
1. signed zero in the real part
   ```
   >>> complex(-0.0, 1.0)  # (note the funny signed integer zero)
   (-0+1j)
   >>> -0+1j
   1j
   >>> -(0.0-1j)  # "correct" representation with Python numeric literals
   (-0+1j)
   >>> -(0-1j)  # also "correct"
   (-0+1j)
   ```
2. signed zero in the imaginary part
   ```
   >>> complex(1.0, -0.0)
   (1-0j)
   >>> 1-0j
   (1+0j)
   >>> -(-1 + 0j)  # "correct"
   (1-0j)
   ```

Apparently, `complex.__repr__()` uses a different meaning for the `j` symbol: it’s not the same as in the `1j` literal. And we also have another (related) problem: the `eval(repr(x)) == x` invariant is broken for the complex type. Quoting from the docs:

> For many types, this function makes an attempt to return a string that would yield an object with the same value when passed to `eval()`; otherwise, the representation is a string enclosed in angle brackets

But `(-0+1j)` is not an object with the same value as `complex(-0.0, 1.0)`. Nor do `complex(1.0, -0.0)` and `1-0j` have the same value.
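The broken invariant is easy to demonstrate: `==` cannot see the difference (since `-0.0 == 0.0`), but `math.copysign` can. A quick check:

```python
import math

x = complex(-0.0, 1.0)
y = eval(repr(x))   # repr(x) is '(-0+1j)', which evaluates with a +0.0 real part

print(x == y)                       # True: == treats -0.0 and 0.0 as equal
print(math.copysign(1.0, x.real))   # -1.0: the original real part is a negative zero
print(math.copysign(1.0, y.real))   # 1.0: the sign was lost in the round trip
```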

Yet another instance of this is in the Sphinx docs for the complex class, and in its docstring as well:

> `class complex(real=0, imag=0)`
>
> Return a complex number with the value real + imag\*1j or …

Simple counterexamples:
```
>>> complex(-0.0, -0.0)
(-0-0j)
>>> -0.0 + (-0.0)*1j
(-0+0j)
>>> complex(-0.0, 0.0)
(-0+0j)
>>> -0.0 + 0.0*1j
0j
```

Again, our docs here rest on the wrong assumption that we have complex literals and that `real + imag*1j` is a representation of the complex number in rectangular form.
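The loss happens inside the `real + imag*1j` formula itself: `0.0*1j` is `complex(0.0, 0.0)`, and adding the float `-0.0` to it produces a real part of `-0.0 + 0.0`, which is `+0.0` under IEEE 754 rules. A quick check with `math.copysign`:

```python
import math

step = 0.0 * 1j     # complex(0.0, 0.0)
res = -0.0 + step   # real part: -0.0 + 0.0 == +0.0, so the sign of zero is gone

print(math.copysign(1.0, res.real))                  # 1.0
print(math.copysign(1.0, complex(-0.0, 0.0).real))   # -1.0
```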

At first sight, this is a very minor issue. Clearly, it affects only “corner cases”: when either the real or the imaginary part of the complex number is `-0.0` (signed zero). On the other hand, it’s a limitation that has already bitten us in the stdlib docs; see the note about branch cuts: we are forced to use verbose `complex(-2.0, -0.0)`-like constructions there, instead of literals (like `-2-0j`, which we would expect in mathematical texts). It’s not that we can’t express the same number with the current imaginary literals. But would an expression like `-(-2+0j)` be transparent to readers? Or `-(-0.0 - 0j)`, where using a float in the real part is required? These “corner cases” are in fact common, because we want to talk about the behaviour of functions on branch cuts, and not surprisingly there is a long (not exhaustive) list of recurring issues.
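For context, the sign of zero matters on branch cuts because it selects the side of the cut; e.g. `cmath.sqrt` is continuous from above the negative real axis, so the two signed-zero inputs give different results:

```python
import cmath

# approaching -2 from above vs. below the branch cut on the negative real axis
above = cmath.sqrt(complex(-2.0, 0.0))    # ~ +1.414j
below = cmath.sqrt(complex(-2.0, -0.0))   # ~ -1.414j

print(above.imag > 0, below.imag < 0)     # True True
```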

Maybe we can do better?

## Solution

Let’s use complex literals instead (as Scheme has, since R3RS), i.e.

```
bj = complex(0.0, b)
±a±bj = complex(±a, ±b)
```

where `a` (nonzero) and `b` are floating-point literals (or, for `b`, a decimal integer literal).
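Under the current rules, the subtraction in `1 - 0j` loses the sign of the imaginary zero, which is exactly what the proposed literal semantics would preserve; compare:

```python
import math

cur = 1 - 0j                # today: Sub of complex(1.0, 0.0) and complex(0.0, 0.0)
lit = complex(1.0, -0.0)    # what the proposed literal `1-0j` would mean

print(math.copysign(1.0, cur.imag))   # 1.0: 0.0 - 0.0 gives +0.0
print(math.copysign(1.0, lit.imag))   # -1.0
```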

While this will make tokenization more complex, with the above change we could fix the `eval(repr)` issue without changing the `repr` output at all (well, except maybe in the case of a signed-zero real component) and without changing arithmetic for mixed operands.

And this replacement for the imaginary literal will match the common mathematical notation. I believe this is the most transparent solution for end users of the complex type (i.e. those doing math). No changes on their side, unless they are using the funny notation `-(-0.0 - 0j)` to represent the “corner case” `complex(0.0, -0.0)`.

Edit: More detailed formalization of the above proposal, based on the discussion. With some code.

Perhaps it would be cleaner if I emphasize that the proposal is restricted to Add/Sub `BinOp`s with special arguments (the second is an imaginary literal and the first is a ±int or ±float literal). (We could also discuss whether we can redefine unary Sub of an imaginary literal as well.) For n-ary ± we should keep the current evaluation rules, i.e. `a±b±c±d = (((a±b)±c)±d)`. If you want to place a complex literal somewhere in between, use parentheses! After all, maybe they are there for a purpose in the `complex.__repr__()` output?

Here is an example of the AST transformation that does the above.
```python
from ast import *
from ideas import import_hook

class ComplexLiteralTransform(NodeTransformer):
    def visit_BinOp(self, node):
        match node:
            case BinOp(Constant(x), Add(), Constant(complex(imag=y))):
                match x:
                    case int(x) | float(x):
                        x, y = map(Constant, [float(x), y])
                        return Call(Name('complex'), [x, y], [])
            case BinOp(Constant(x), Sub(), Constant(complex(imag=y))):
                match x:
                    case int(x) | float(x):
                        x, y = map(Constant, [float(x), y])
                        return Call(Name('complex'), [x, UnaryOp(USub(), y)], [])
            case BinOp(UnaryOp(USub(), Constant(x)), Add(), Constant(complex(imag=y))):
                match x:
                    case int(x) | float(x):
                        x, y = map(Constant, [float(x), y])
                        return Call(Name('complex'), [UnaryOp(USub(), x), y], [])
            case BinOp(UnaryOp(USub(), Constant(x)), Sub(), Constant(complex(imag=y))):
                match x:
                    case int(x) | float(x):
                        x, y = map(Constant, [float(x), y])
                        return Call(Name('complex'),
                                    [UnaryOp(USub(), x), UnaryOp(USub(), y)], [])
        return self.generic_visit(node)

    def visit_UnaryOp(self, node):
        match node:
            case UnaryOp(USub(), Constant(complex(imag=x))):
                return Call(Name('complex'),
                            [Constant(0.0), UnaryOp(USub(), Constant(x))], [])
        return self.generic_visit(node)

def transform_cl(tree, **kwargs):
    tree = ComplexLiteralTransform().visit(tree)
    fix_missing_locations(tree)
    return tree

import_hook.create_hook(hook_name=__name__, transform_ast=transform_cl)
```

Alternative C version (a draft, no error checks, etc): GitHub - skirpichev/cpython at complex-literals-with-usub.

With André Roberge’s https://github.com/aroberge/ideas:

```
$ python -q -m ideas -a cl-transform
Ideas Console version 0.1.5. [Python version: 3.12.0rc1+]
ideas> 1-0j
(1-0j)
ideas> 1+0j
(1+0j)
ideas> -0j
-0j
```

In fact, I think we can consider `(±a±bj)` to be the true complex literal, whereas the fact that we can sometimes omit the parentheses (e.g. for a simple assignment like `x = 1+2j`) is syntactic sugar.

## Alternative

We could also solve the problem using an additional subtype of complex (see this): the imaginary class (as does e.g. the C11 standard, Annex G).
There would be new special rules for mixed arithmetic (see section 5 of Annex G for details), e.g.:

``float + imaginary = complex(float.real, imaginary.imag)``

The new rules, however, alter only cases where mixed operands have NaNs, infinities or signed zeros in their components.
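As a rough pure-Python sketch of the rule above (the class name `Imaginary` and its methods are illustrative only, not an actual stdlib API):

```python
import math

class Imaginary:
    """Illustrative-only imaginary type, following C11 Annex G mixed-mode rules."""
    def __init__(self, imag):
        self.imag = float(imag)

    def __radd__(self, other):      # float + Imaginary
        if isinstance(other, (int, float)):
            # the real operand stays untouched: no -0.0 + 0.0 happens
            return complex(other, self.imag)
        return NotImplemented
    __add__ = __radd__

    def __rsub__(self, other):      # float - Imaginary
        if isinstance(other, (int, float)):
            return complex(other, -self.imag)
        return NotImplemented

r = 1.0 - Imaginary(0.0)
print(r, math.copysign(1.0, r.imag))   # (1-0j) -1.0
```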

No new literal types, no changes in parsing of source code, and no altering of `complex.__repr__()` (just as in the above solution), but one “little” new thing:

```
>>> type(3.14j)
<class 'imaginary'>
```

On the other hand, as mentioned by Serhiy Storchaka and Mark Dickinson in issue #84450, the new type could solve other “gotchas”. For example, currently in Python:

```
>>> complex(0, math.inf) * 1
(nan+infj)
```

will be

```
>>> complex(0, math.inf) * 1
infj
```

because multiplication of a complex by a real (or by an imaginary number) would be componentwise. For the same reason, `±1j` would be a correct rotation in the complex plane (multiplying any complex number `z`, not just a finite one, by `1j` four times exactly recovers `z`).
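The difference is visible today: promoting the real operand to complex introduces `0.0 * inf = nan` cross terms, which componentwise multiplication avoids (a sketch, with a hypothetical helper `mul_real`):

```python
import math

z = complex(0.0, math.inf)
print(z * 1)            # (nan+infj): 1 is promoted to complex(1.0, 0.0)

def mul_real(z, r):
    # componentwise multiplication by a real, as the imaginary-type rules would do
    return complex(z.real * r, z.imag * r)

print(mul_real(z, 1))   # infj
```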

Edit: a variant of the above is special treatment in arithmetic ops for `complex(0, imag)` instances, without introducing a new type.

## Other

Finally, I would also mention attempts to solve only the `eval(repr)` issue for the complex type.

First, we could use the “verbose” form in the `repr()` output, like `complex(real, imag)` (obviously, this was too verbose for Guido). A variant of this: use that form of the `repr()` output only for complex numbers with signed zeros in their components.

Alternatively, we could use a “hackish” form like `-(-2+0j)` for our “corner cases”, as Serhiy Storchaka did in PR #19593.

Both solutions make the `repr()` output even less uniform than now (currently we sometimes omit parens).


@skirpichev Thank you for bringing this discussion to python-ideas!

Under your proposal, I assume that `1.0 - 0j` would be interpreted as `complex(1.0, -0.0)` (rather than `complex(1.0, 0.0)` as it is now). That’s all well and good, but how would each of the following be interpreted under your proposal?

```
(1.0) - 0j
+1.0 - 0j
0.0 + 1.0 - 0j
float(1) - 0j
x=1.0; x - 0j
```

If those aren’t all interpreted the same way as `1.0 - 0j` then we’ve lost referential transparency and code becomes harder to refactor and reason about.

```
Python 3.11.4 (tags/v3.11.4:d2340ef, Jun  7 2023, 05:45:37) [MSC v.1934 64 bit (AMD64)] on win32
>>> eval(repr(-0.0j)) == -0.0j
True
>>> eval(repr(1-0.0j)) == 1-0.0j
True
>>> eval(repr(1-0j)) == 1-0j
True
>>> eval(repr(0.0-0j)) == 0.0-0j
True
>>> eval(repr(1-0j)) == 1-0j
True
>>> eval(repr(-0+1j)) == -0+1j
True
>>> eval(repr(-0.0+1.0j)) == -0.0+1.0j
True
>>> eval(repr(1.0-0.0j)) == 1.0-0.0j
True
>>> (-0+1j) == complex(-0.0,1.0)
True
>>> (1-0j) == complex(1.0,-0.0)
True
>>> complex(-0.0, -0.0) == (-0-0j)
True
>>> complex(-0.0, -0.0) == -0.0+(-0.0)*1j
True
>>> complex(-0.0, 0.0) == (-0+0j)
True
>>> complex(-0.0, 0.0) == -0.0+0.0*1j
True
>>> 0.11111111111111111123456789
0.1111111111111111
>>> 0.11111111111111111123456789 == 0.1111111111111111
True
```

It seems the invariant holds and the values are considered the same. That it doesn’t always look the same seems unavoidable, and is also the case for other literals, like floats.


Would `1 - 2 + 3j` become `1 - complex(2, 3)`?


True. BTW, I think we shouldn’t omit the alternative proposal (the imaginary class) either. It wasn’t clearly rejected in the mentioned issues, and it is supported beyond the C and C++ languages (e.g. in Go too).

I think they will be the same.

1. `(1.0) - 0j` will be parsed as a `Sub` of the `float` `1.0` and `0j == complex(0.0, 0.0)`.
2. `+1.0 - 0j` as a complex literal `== complex(1.0, -0.0)`.
3. `0.0 + 1.0 - 0j` as `Sub(Add(0.0, 1.0), 0j)`.
4. `float(1) - 0j`: a variant of (1); here we have an integer literal `1` and `0j`.
5. `x=1.0; x - 0j`: again a variant of (1), same literals.

The formal syntax can be found in the R7RS standard, sec. 7.1.1, p. 62 (obviously, we would exclude the `@` notation for the polar form and the special handling of nan/inf literals).

In some sense it’s true: they are the same wrt the `==` op. Yet `complex(-0.0, 0.0)` and `complex(0.0, 0.0)` are not the same objects, just as `-0.0` and `0.0` are not (they behave differently):

```
>>> from math import copysign
>>> copysign(1.0, +0.0) == 1.0
True
>>> copysign(1.0, -0.0) == 1.0
False
>>> 0.0 == -0.0
True
```

Rather `Add(Sub(1, 2), 3j)`.
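This matches what the grammar already does: `+` and `-` at the same precedence level associate to the left, which a quick check with the stdlib `ast` module confirms:

```python
import ast

tree = ast.parse('1 - 2 + 3j', mode='eval').body
# i.e. Add(Sub(1, 2), 3j): the Sub binds first, then the imaginary constant is added
print(type(tree.op).__name__, type(tree.left.op).__name__)   # Add Sub
```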

Clearly, this is a subjective judgment. One argument is the number of recurring issues in our bug tracker. Another argument is that users of the complex type (and the cmath library) come with some background in mathematics, while we can’t assume they are familiar with some other computer language.


How about `1 - 2+3j`? Or `z - 2+3j`? Wouldn’t you argue that the user meant `2+3j` as one complex literal?

That’s no different from any other misuse of whitespace, like writing `2 * 3+4` and expecting a result of 14.

Don’t forget that the repr for a complex number includes surrounding parentheses. The order of operations would remain correct:

```
>>> x = 3+4j
>>> y = 5
>>> x * y
(15+20j)
>>> eval("%r * %r" % (x, y))
(15+20j)
```

So this is only going to be a problem when the parens are omitted, and no worse a problem than anywhere else.


But `3+4` is not a literal. And to be clear: I’m not saying that `2+3j` should be treated as one literal there. I’m asking whether Sergey thinks it should. Based on what they wrote, I think they might. And I’m wondering what criteria they apply to decide.

I would agree with @Rosuav in the first case. The second should be parsed as `Add(Sub(z, 2), 3j)`. I admit, parsing with imaginary literals (the present state of the art) is much simpler.

Evaluation order in `+`?

You forgot my leading sentence: “Clearly, this is a subjective judgment.” Would you argue instead that people will learn complex analysis from the Python docs? What is your guess about user expectations?

I didn’t forget that. But you stated it not as “subjective judgement” or as a “guess” but as a fact.

What’s my guess? I don’t have one. I don’t have enough data to make one.

These arguments also seem similar to the usual misunderstandings with floats. Users will have to learn a bit about Python and numerical computing at some point if they are curious about such details.

Even `1000` is not the same object as `eval(repr(1000))`: `id(1000)` gives a different result.

Ok actually I do have a guess now: most users … don’t care :-). Or never even notice.

Thank you for good presentation @skirpichev.

I am personally a great fan of the imaginary class. It looks simpler and more coherent in comparison with alternative solutions. Most changes are local to 1-2 classes:

1. A new `imaginary` class. Making it a subclass of `complex` makes many things easier. It needs to overload a bunch of methods: `__new__`, `__repr__`, `__reduce__`, `__neg__`, `__add__`, etc.
2. The `complex` class only needs a tiny tweak in `__repr__` (to represent a real negative zero as `-0.0` instead of `-0`) and to specialize arithmetic operations when the other operand is a real number.
3. The parser needs an update to produce imaginary instead of complex numbers.
4. A few parts of the compiler that expect an exact complex type (such as `_PyCode_ConstantKey()`) need updating.
5. A new `marshal` protocol version to support imaginary values. The marshal format has not changed for many years; it is a good opportunity to add other minor features, such as support for `slice` objects.

And most of the rest should just work.
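Item 1 could be sketched in pure Python along these lines (illustrative only; a real implementation would live in C and override more methods):

```python
import math

class imaginary(complex):
    """Illustrative-only sketch of an imaginary subclass of complex."""
    def __new__(cls, imag=0.0):
        return super().__new__(cls, 0.0, imag)

    def __repr__(self):
        return repr(self.imag) + 'j'

    def __neg__(self):
        # negation stays imaginary and preserves the sign of zero
        return imaginary(-self.imag)

print(repr(imaginary(3.14)))                        # 3.14j
print(math.copysign(1.0, (-imaginary(0.0)).imag))   # -1.0
```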

I see the main obstacle to this idea is that the imaginary type will not survive storing in `array.array` and NumPy arrays. Accordingly, the results of some operations on bare Python numbers and NumPy arrays will be different.


Don’t you think it will break more things? I.e. not just `float - 0j`, but also other cases of mixed operands (real op complex, imaginary op complex).

That’s a simple (few-line) change in `parsenumber_raw()`; that’s why it wasn’t mentioned.

Not sure I got you. The stdlib’s `array` type doesn’t currently support complex numbers.

My major concern with this solution is that this new class looks alien to the numeric tower (PEP 3141)…

Surely most users who care about using complex numbers in Python have some expectations of how this stuff should work…

While that PEP is relevant and useful prior art, it’s not clear to me that it conflicts with the addition of `Imaginary`.

`Decimal` is noted by the PEP as existing outside of the tower already. And part of what’s being defined there is how to make new, well behaved numeric types, like `Imaginary`.

Can the change be made easier on NumPy if `Imaginary` is added separately, before it is used by complex? Addition of `Imaginary` with any other numeric type can be defined to produce a complex number as the result.
Then complex can change to use imaginary internally later.

I’m not sure if that helps, or helps enough, to be worth the added complexity of introducing it more slowly.

Sounds like trading the astonishment that there are no complex literals for the astonishment that `+` and `-` in `a+bj` are no longer the operators `+` and `-`.

And unlike in `1.0e+3`, there is no `e` that tells you ahead of time that the `+` is not the operator.
In `a+bj` one would need to wait until the `j` at the end.
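The stdlib tokenizer shows the difference: in `1.0e+3` the `+` is absorbed into a single NUMBER token, while in `1.0+3j` it is an operator between two NUMBER tokens (a quick check):

```python
import io
import tokenize

def numbers(src):
    # collect the NUMBER tokens from a source string
    tokens = tokenize.generate_tokens(io.StringIO(src + '\n').readline)
    return [t.string for t in tokens if t.type == tokenize.NUMBER]

print(numbers('1.0e+3'))   # ['1.0e+3']
print(numbers('1.0+3j'))   # ['1.0', '3j']
```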


Isn’t `2*1+1j` the problem with complex literals?

Perhaps it would be cleaner if I emphasize that the proposal is restricted to Add/Sub `BinOp`s with special arguments (the second is an imaginary literal and the first is a ±int or ±float literal). (We could also discuss whether we can redefine unary Sub of an imaginary literal as well.) For n-ary ± we should keep the current evaluation rules, i.e. `a±b±c±d = (((a±b)±c)±d)`. If you want to place a complex literal somewhere in between, use parentheses! After all, maybe they are there for a purpose in the `complex.__repr__()` output?

An AST transformation to formalize this a little (and/or to play with):
```python
# cl-transform.py

from ast import *
from ideas import import_hook

class ComplexLiteralTransform(NodeTransformer):
    def visit_BinOp(self, node):
        match node:
            case BinOp(x, op, Constant(y)) if (isinstance(op, (Add, Sub))
                                               and isinstance(y, complex)):
                y = y.imag
                y = Constant(y) if isinstance(op, Add) else UnaryOp(USub(), Constant(y))
                match x:
                    case Constant(x) if isinstance(x, (int, float)):
                        return Call(Name('complex'), [Constant(x), y], [])
                    case UnaryOp(op, Constant(x)) if (isinstance(op, (UAdd, USub))
                                                      and isinstance(x, (int, float))):
                        if isinstance(x, int) and x == 0:
                            x = float(x)
                        x = Constant(x) if isinstance(op, UAdd) else UnaryOp(USub(), Constant(x))
                        return Call(Name('complex'), [x, y], [])
        return self.generic_visit(node)

def transform_cl(tree, **kwargs):
    tree = ComplexLiteralTransform().visit(tree)
    fix_missing_locations(tree)
    return tree

import_hook.create_hook(hook_name=__name__, transform_ast=transform_cl)
```

With André Roberge’s https://github.com/aroberge/ideas:

```
$ python -q -m ideas -a cl-transform
Ideas Console version 0.1.5. [Python version: 3.12.0rc1+]

ideas> (1-0j)
(1-0j)
```

In fact, I think we can consider `(±a±bj)` to be the true complex literal, whereas the fact that we can sometimes omit the parentheses (e.g. for a simple assignment like `x = 1+2j`) is syntactic sugar.

That’s somewhat of an implementation-dependent feature:

```
Python 3.9.16 (7.3.11+dfsg-2, Feb 06 2023, 16:52:03)
[PyPy 7.3.11 with GCC 12.2.0] on linux
>>>> id(1000)
16001
>>>> id(eval(repr(1000)))
16001
```

While signed zeros are a feature of IEEE 754.

As with imaginary literals, I think: no. See the formalization above with the AST transformation. It should be `Add(Mul(2, 1), complex(0, 1))`.


I was reading through this discussion trying to understand what the source of the problem might be. I’m not sure I have been able to follow everything, so, if possible, I’d like to ask for a brief summary. Presumably all the arithmetic operations between floats are “fine”, so I struggle to understand how it is possible that things suddenly break when dealing with a pair of floats.

As for the idea of introducing a dedicated type for “imaginary numbers”, I don’t quite see what the need would be. Even in mathematics there isn’t generally a need to define/use the set of imaginary numbers. What problems would this new type solve?
