Add complex literals?

Background (you probably already know this part, but others reading this thread may not): CPython, along with most other flavours of Python, represents floats using the IEEE 754 binary64 format, and complex numbers as pairs of floats (one for the real part, one for the imaginary part). One of the oddities of IEEE 754 floating-point arithmetic formats is that they have two zeros: negative zero and positive zero, which for clarity I’ll try to remember to write as -0.0 and +0.0 in what follows (though I’ll inevitably forget some of the + signs). The two zeros compare equal, and in most situations there’s no practical need to worry about the difference.

If you’re working with complex numbers but you don’t care about the signs of zeros in your real and imaginary parts, you can stop reading at this point. The “problem” being discussed here only arises when you start caring about those signs. (For some reasons why you might care about those signs in the context of complex arithmetic, take a look at Kahan’s “Much Ado About Nothing’s Sign Bit” paper.)

Assuming you’re still reading, try the following: enter 1.0 - 0j into a Python prompt, and hit return. Here’s what you’ll see:

>>> 1.0 - 0j
(1+0j)

The intention is to produce a complex number with real part 1.0 and imaginary part -0.0, but instead, as the repr indicates, we get a complex number with real part 1.0 (good!) and imaginary part +0.0 (what?!). This is essentially the problem under discussion.

So why do we get that +0.0 imaginary part? We’re subtracting a complex number (note that 0j is already of type complex, with real part +0.0 and imaginary part +0.0) from a float (1.0). Python first converts the float to type complex, then operates on the real parts and the imaginary parts separately. For the real part we get 1.0 - +0.0 = 1.0. For the imaginary part, we get +0.0 - +0.0 = +0.0 (because that’s what IEEE 754 specifies for the usual roundTiesToEven rounding direction - actually, that’s the result we get for all rounding directions other than roundTowardNegative).
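If you want to check the sign of a zero directly rather than reading it off the repr, math.copysign works well:

>>> import math
>>> z = 1.0 - 0j
>>> math.copysign(1.0, z.imag)   # +1.0 here means the zero is positive
1.0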

At this point it looks as though there’s an easy fix: let’s just special-case mixed-type float and complex addition and subtraction operations to not promote float to complex first - that way, we don’t have to invent an imaginary part for the float operand. For example, if f is a float and z = complex(x, y) is a complex number, we can evaluate f - z as complex(f - x, -y), f + z as complex(f + x, y), etc. For the particular case above, this would give us the expected result.
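As a sketch in Python itself (a hypothetical helper, not the actual C-level change), the special case for subtraction would look something like:

>>> def float_minus_complex(f, z):
...     # f stays a float: no imaginary part is invented for it
...     return complex(f - z.real, -z.imag)
...
>>> float_minus_complex(1.0, 0j)
(1-0j)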

Unfortunately, this turns out to be only half a fix. Consider a slightly different case:

>>> (-0.0) + 1j
1j

Here the intended result is a complex number with real part -0.0 and imaginary part 1.0. In this case it’s the real part that’s the problem: we get a real part of +0.0 where we were hoping for -0.0.

And this time the root cause of the issue is slightly different from before. It’s not the promotion of float to complex that’s the problem - it’s that 1j is already a complex number with real part +0.0 and imaginary part 1.0. Now when we add the real parts to get the real part of the result, we’re doing (-0.0) + (+0.0), which again under IEEE 754 rules gives +0.0.
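Again, math.copysign confirms the sign that the repr hides:

>>> import math
>>> z = (-0.0) + 1j
>>> math.copysign(1.0, z.real)   # we were hoping for -1.0 here
1.0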

Hence the proposed solution: if 1j simply didn’t have a real part - if Python had an ‘imaginary’ type, and 1j were an instance of that type - we could again special-case the addition of a float to an imaginary to give the expected complex result.

It’s a fairly elegant solution, and it comes with a whole lot of other benefits, too. To take just one example, multiplication by 1j has better properties: multiplying by 1j twice is exactly equal to negation, signed zeros and all, and x + y * 1j produces exactly complex(x, y).
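To see how today’s complex 1j fails that multiplication property, note that the zero signs get scrambled along the way:

>>> z = 1 + 0j
>>> z * 1j * 1j
(-1+0j)
>>> -z
(-1-0j)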

I’ve never tried to push this suggestion for Python, for two main reasons: (a) it’s a fairly large and involved change in comparison to the size of the problem it’s solving - it feels like a case of “Purity Beats Practicality”, and (b) the WG14 C standards committee already tried this with C99 (see C99 §7.3.1p3 and C99 Annex G), adding optional imaginary types and a macro I, with the intent that x + y * I would indeed produce a complex number with real part x and imaginary part y (with nans, infinities, signed zeros all behaving as expected). But adoption of the C99 imaginary types has been disappointing: last time I checked, none of clang, gcc and MSVC implemented those types. I feel that if we were to try to introduce an imaginary type in Python, we should at least first try to understand why the major compilers weren’t interested in implementing these in C.

Addendum: I’ve concentrated on the problems with arithmetic operations above, but the other aspect that confuses people (and the issue that @skirpichev is focusing on) is the string representation. Consider:

>>> z = complex(-0.0,  1.0)
>>> z
(-0+1j)

Here the representation of z that’s printed at the prompt is doing its best to indicate to the user that z has real part -0.0 and imaginary part 1.0. That part’s fine. The bad part is that the representation is a valid Python expression which, when evaluated, doesn’t recover the original complex number exactly: the real part is now +0.0 rather than -0.0:

>>> (-0+1j)
1j
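The failed round trip is easy to miss, because == can’t see it (recall that -0.0 == +0.0); math.copysign can:

>>> import math
>>> z = complex(-0.0, 1.0)
>>> eval(repr(z)) == z
True
>>> math.copysign(1.0, eval(repr(z)).real)
1.0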

@P403n1x87 Does the above help?


@mdickinson, thank you for a very detailed summary. My 2c below.

Not sure about MSVC (from the docs it looks like not), but it has worked in gcc for a long time, e.g.:

#include <complex.h>
#include <stdio.h>

int main(void) {
    _Complex double x = 1.0 - 0.0*I;
    printf("%+f %+f\n", creal(x), cimag(x));
#ifdef __GNUC__
    x = 1 - 0.0j;  /* GNU extension: j-suffix imaginary constant */
    printf("%+f %+f\n", creal(x), cimag(x));
#endif
    return 0;
}

The standard part in this example works with clang as well, but the GNU extension (the j-suffix) seems to be broken on clang-14 (it prints +0.0). (Not that this necessarily implies good quality of the mathematical functions in the above implementations.)
Edit: on the other hand, you are right - the _Imaginary keyword is not supported.

Try the second example. :slight_smile: That’s the one that needs imaginary types, and it looks like gcc still doesn’t have them.

Code:

#include <complex.h>
#include <stdio.h>

int main(void) {
    double complex z = -0.0 + 1.0 * I;
    printf("Real part: %f\n", creal(z));
}

For me, under Clang, this prints: Real part: 0.000000 - the sign is lost, just as in Python. Here’s a Godbolt link for gcc 13.2 on Intel 64: Compiler Explorer

Yep :frowning: I already hit a difference w.r.t. Annex G for x = -0.0*I. It seems float ± float*I works because gcc/clang do a proper specialization for just one mixed case: a complex & float pair.

Absolutely, thanks @mdickinson. So in a nutshell, the issue arises from the fact that -0.0 + 0.0 = 0.0 and that there are no literal equivalents of complex(-0.0, 1) and complex(1, -0.0), which leads to surprising results in certain cases.

I agree that an imaginary type would be an elegant way of solving this problem. However, in my opinion, this new type would only be useful for solving this particular problem. I don’t really see anybody making use of it (and maybe that’s why adoption in C implementations has been slow). Not only that, but I think it might actually create new issues with equally “astonishing” results. For example, with 1j now an instance of imaginary, if we let z = -0.0 + 1j, then isinstance(z, complex) is True, but isinstance(z, imaginary) is False. However, abs(z.real) == 0.0, which indeed makes z an imaginary number.

Personally, I would be in favour of @skirpichev’s proposal of special-casing Add and Sub (and complex.__repr__ if required) to make (a+bj) the literal equivalent of complex(a, b).

As for the other operations that are fixed by the new type, it seems to me that those are probably “bugs” in the current implementation that could be fixed by adjusting e.g. the multiplication operator between complex numbers (I believe the current results stem from using the generic multiplication expression, which would involve computations like math.inf * 0.0 = nan) to treat cases with abs(_.real) == 0.0 and abs(_.imag) == 0.0 in a special way.
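For instance, the inf-times-zero problem shows up directly in the current multiplication (under imaginary-type semantics one would instead expect a pure imaginary infinity here):

>>> import math
>>> complex(math.inf, 0.0) * 1j   # the generic formula computes inf*0.0 = nan
(nan+infj)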


Right, this is @skirpichev’s “complex literal” proposal, as I understand it. Assuming that this is the only change being proposed (i.e., no changes to semantics of general addition / subtraction operations between complex and float quantities), this would make (-0.0 + 1j) and I = 1j; (-0.0 + I) give different results - the former would be recognised as a complex literal at parse time and give complex(-0.0, 1.0), while the latter would be computed at run time as a float-plus-complex addition and give complex(0.0, 1.0). And various things in-between like (-0.0) + 1j or (-0.0 + +1j) might give either of those results, depending on the exact details of the proposal.

For me, this level of subtlety in behaviour, and the loss of referential transparency, make the complex literal proposal a no-go - I don’t want to work in a version of Python where -0.0 + 1j and I = 1j; -0.0 + I aren’t interchangeable.


This is currently the case with complex, no? complex(-0.0, 1) is not the same as (-0.0 + complex(0.0, 1)). So this is not an issue with literals.

That’s true, but for me that’s much less surprising than having a situation where given x = <expr1>; y = <expr2>; x + y (where <expr1> and <expr2> are expressions of your choice), I can’t then simply substitute for x and y and write <expr1> + <expr2> (with appropriate levels of extra parentheses as needed) without incurring a subtle change in semantics. If I has been defined as I = 1j and I have an expression that uses I, I think it would be extremely surprising if the value of that expression changed as a result of replacing I with 1j.

One other note: while it’s certainly worthwhile to explore the possible solution space, at some point any serious proposal is going to have to wrestle with the issue of backwards compatibility (see PEP 387). Changes to parsing logic (e.g., as required by @skirpichev’s proposal, or the proposal to introduce an imaginary literal and making 1j parse to something of type imaginary instead of type complex) should be somewhat manageable via __future__ imports. But making changes to the behaviour of infix operators in a backwards compatible way is really hard; I have no idea how one would do it without some fairly serious changes to the internal machinery. @skirpichev’s proposal at least has the advantage that it doesn’t entail any such behaviour changes. The imaginary literal proposal would require changes to existing mixed-type float-add-complex operations, if it were to achieve its goal.


Just like isinstance(1.0, int) is False. (But 1.0.is_integer() is True, and z.is_imaginary() could be as well.)
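A hypothetical z.is_imaginary(), mirroring float.is_integer(), could be as simple as this sketch:

>>> def is_imaginary(z):
...     # purely imaginary: the real part is a (positive or negative) zero
...     return z.real == 0.0
...
>>> is_imaginary(complex(-0.0, 1.0))
True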

I admit, this seems to be a crucial counterexample. While technically it’s possible to work around this with some AST transformer (just not a stateless one, see e.g. this, and not without a performance cost) - it would be hard to explain without a proper imaginary type.
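A minimal illustration of the idea (a stateless sketch that only handles explicit literals, so it’s simpler than the linked example; the class name here is made up):

import ast
import math

class ComplexLiteralFolder(ast.NodeTransformer):
    # Sketch: fold <float literal> +/- <j-literal> into a single complex
    # constant at the AST level, so the sign of a real zero survives.

    def visit_UnaryOp(self, node):
        self.generic_visit(node)
        # Fold -<numeric constant> so that -0.0 stays a signed constant.
        if (isinstance(node.op, ast.USub)
                and isinstance(node.operand, ast.Constant)
                and isinstance(node.operand.value, (int, float, complex))):
            return ast.copy_location(ast.Constant(-node.operand.value), node)
        return node

    def visit_BinOp(self, node):
        self.generic_visit(node)
        # <float constant> +/- <complex constant from a j-literal>:
        # build the complex directly instead of adding float + complex.
        if (isinstance(node.op, (ast.Add, ast.Sub))
                and isinstance(node.left, ast.Constant)
                and isinstance(node.left.value, float)
                and isinstance(node.right, ast.Constant)
                and isinstance(node.right.value, complex)):
            imag = node.right.value.imag
            if isinstance(node.op, ast.Sub):
                imag = -imag
            new = ast.Constant(complex(node.left.value, imag))
            return ast.copy_location(new, node)
        return node

tree = ComplexLiteralFolder().visit(ast.parse("-0.0 + 1j", mode="eval"))
z = eval(compile(ast.fix_missing_locations(tree), "<ast>", "eval"))
print(z, math.copysign(1.0, z.real))  # (-0+1j) -1.0: the sign survives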

The new type seems to be the only option :frowning: One argument for it is the C99 standard (and C11) itself. Python syntax is so close to C at this level (operator precedence, evaluation order) that I doubt we can invent something new here. It’s unlikely that the poor adoption is a consequence of bad design.
Edit: related LLVM issue (with a link to a gcc equivalent) - Imaginary types are not supported, violating ISO C Annex G · Issue #60269 · llvm/llvm-project · GitHub

I don’t think there is a solution without breaking some corner cases (the simplest example being the use of a “signed” integer 0 in complex.__repr__()). But for such situations people use the complex class constructor, not fancy tricks with literals like -(-0.0-0j) or -(-1+0j).

But it’s very likely that the poor adoption indicates that the problem this solves is very rare in practice.


Under the special-parsing proposal, would -0 written without a floating point be handled symmetrically between real & imaginary parts?

  • Today, -0 means integer zero (there is only one), which is different from -0.0 being negative float zero. [and this can never change]

  • Whereas -0j is exactly the same as -0.0j, because there are no “imaginary integers”: the j suffix forces it to be a complex, both of whose parts are always signed.
    However, neither form parses the sign as part of the literal; it’s not semantically (-0.0)j. Both forms parse as essentially -complex(0, 0): first take a complex +zero literal, then apply the unary - operator to both parts. The result is complex(-0.0, -0.0). Note how both parts got negated :frowning:

    • Is the - operator applied at runtime?
      I think semantically it could be? ast reports UnaryOp(op=USub(), operand=Constant(value=0j))
      but in practice it’s optimized by constant folding: dis.dis(lambda: -0j) reports a single LOAD_CONST of the final complex(-0.0, -0.0) value.

    • [Incidentally, the result again has a wrong repr, (-0-0j), which is the same as (+0-0j) and evaluates to complex(+0.0, +0.0). But repr is fixable separately; here I’m asking only about POLA parsing.]

  • OTOH, in complex() notation both arguments are [normally] int/float, so complex(-0, -0) is the same as complex(+0, +0)! Both arguments require a decimal point, at least complex(-0., -0.), to get a sign (see the snippet just below).
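A quick check of that last point:

>>> import math
>>> math.copysign(1.0, complex(-0, -0).real)    # int -0 is just 0
1.0
>>> math.copysign(1.0, complex(-0., -0.).real)  # float -0.0 keeps its sign
-1.0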

So, if you were to parse ±a ±bj as a single complex literal, would the j suffix now also confer “sign preservation” on the ±a part, whether it’s spelled -0 or -0.0?
I’d say that aspect becoming symmetric between the real & imaginary parts would be one more wart removed :+1:.
But would it be surprising compared to non-complex literals, or to (-0) + (-0j) (which I suppose must retain the old behavior), or to complex() notation? :thinking:
Just raising the question, I don’t have a formed opinion here…

Smaller question: imaginary -0j alone

If we make signs part of the syntax, then I suppose -0j would parse as complex(+0.0, -0.0)?
That’d cause a different small anomaly: -0j would become different from -(0j). I don’t think Python has had any such cases so far, because syntactically minus was never part of any literal; it was always a unary op, right?

The imaginary class idea does better here — 0j would be imaginary(+0.0) having no real part, and -0j would be imaginary(-0.0) still with no real part. -X ≡ -(X) invariant gets preserved.


My previous assumption is unfounded:

If we make signs part of the syntax, then I suppose -0j would parse as complex(+0.0, -0.0)?

I was thinking Bj must mean (+0.0 + Bj) for any B, and it’d be strange if only -0j had a negative real part.

But what I missed is that today, with the minus parsed as a negation operator, it affects the sign of the real part not just for -0j but for any -Bj:

>>> (lambda z: (z.real, z.imag))(-2j)
(-0.0, -2.0)
>>> (lambda z: (z.real, z.imag))(-1j)
(-0.0, -1.0)
>>> (lambda z: (z.real, z.imag))(-0j)
(-0.0, -0.0)
>>> (lambda z: (z.real, z.imag))(0j)
(0.0, 0.0)
>>> (lambda z: (z.real, z.imag))(1j)
(0.0, 1.0)
>>> (lambda z: (z.real, z.imag))(2j)
(0.0, 2.0)

This is actually probably good. I like the “-X ≡ -(X)” invariant. :balance_scale:

And this behavior could be preserved by a [±A] ±Bj parser treating signs as syntax, just as well as by a run-time imaginary class.

Well, since you ask:

Python 2.7.18 (default, Dec 12 2022, 03:19:42) 
[GCC Apple LLVM 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> z1, z2 = -0j, -(0j)
>>> z1.real, z2.real
(0.0, -0.0)

That was fixed, kinda sorta by accident, in Python 3.2. See ast_for_factor unary minus optimization changes AST · Issue #53257 · python/cpython · GitHub for the gory details.
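For contrast, on Python 3.2 or later both spellings agree:

>>> z1, z2 = -0j, -(0j)
>>> z1.real, z2.real
(-0.0, -0.0)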

I agree that -X and -(X) being indistinguishable is a good property to have. :slight_smile:


This is more or less irrelevant for the proposal, but I think -0j should behave as if it were an imaginary type instance. As you can see, the AST transformer example (see the updated post) overrides visit_UnaryOp() to do this. In this way we could “fix” the current behaviour to be as if we had true imaginary literals.

Unfortunately, as was pointed out above by Mark: only in cases where we have an explicit imaginary literal in an expression. This will not work for examples like (-0.0 + 1j) vs i = 1j; (-0.0 + i). Or rather, it will not work without some backtracking of assignments, etc. That looks too complicated to me compared to the alternative.

After some thinking, @mdickinson, I’m not sure I got this point. What is the major obstacle here? That the new rules will be hard to explain? (But right now, e.g., we are forced to have wrong statements in the docs, like complex(r, i) == r + i*1j.) I doubt this change is too complex to implement/review.

Or is the new type too much? In fact, I should mention that we can avoid it. A slight modification of the “imaginary type” solution: we could treat complex(0, float) instances (as well as floats) specially in complex arithmetic ops. Is this better or worse?

The proposals here seem like excessive complexity to me for the problem being solved. Negative zeros are problematic, and it would have been better if they had either never been invented or otherwise been invented properly. A proper implementation of signed zeros needs more than one sign bit, because there are 3 cases:

  • Negative zero (-0)
  • Positive zero (+0)
  • Actual zero (0)

Kahan’s suggested use case for negative zeros is handling branch cuts, but it is never possible to handle those in general if you don’t have all three types of zero: in some cases (0) should be treated like (-0), and in others like (+0). If you had all three zeros, then the arithmetic for -0 and +0 could be defined symmetrically:

(-0) + (0) -> (-0)
(+0) + (0) -> (+0)
(-0) + (+0) -> (0)

I haven’t worked through this in detail, but I think that most of the subtleties that cause the particular problems discussed above stem in some sense from this inability to distinguish (0) and (+0). We have signed zeros because it was decided to try to do something useful with the free sign bit, but more than one bit would have been needed to make this work properly.
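To make the table above concrete, here’s a toy sketch of those addition rules (a hypothetical Zero type, purely illustrative):

from enum import Enum

class Zero(Enum):
    NEG = "-0"     # negative zero
    POS = "+0"     # positive zero
    PLAIN = "0"    # actual (unsigned) zero

def add_zeros(a, b):
    # The table above: an unsigned zero is absorbed, equal signs
    # are kept, and opposite signs cancel to the unsigned zero.
    if a is Zero.PLAIN:
        return b
    if b is Zero.PLAIN:
        return a
    return a if a is b else Zero.PLAIN

assert add_zeros(Zero.NEG, Zero.PLAIN) is Zero.NEG
assert add_zeros(Zero.POS, Zero.PLAIN) is Zero.POS
assert add_zeros(Zero.NEG, Zero.POS) is Zero.PLAIN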


I’m no IEEE-754 expert, so I might not understand what you’re talking about, but I’ve never felt that floats do 0 wrong, apart from presentation where -0.0 can be confusing. Seems like adding a 3rd kind of 0 could make things even more complicated. Python definitely shouldn’t do anything beyond what the standards say.


Actually, the signed zero component is just one example of where we have problems. See Invalid "equivalents" of the complex type constructor in docs · Issue #109218 · python/cpython · GitHub. So all this is not just about signed zeros.

I don’t see a problem except in the documentation, where the + in a+bi is being confused with the , in the complex number (a,b)\in\mathbb{C} (as sets, equal to \mathbb{R}^2). Even this whole thread is motivated by that confusion. When one thinks of the binomial representation a+bi of a complex number as the same as (a,b), an identification is happening. However, when working with floating-point numbers instead of \mathbb{R}, there is no such identification between the two.

In my opinion, all that is needed is to avoid referring in the documentation to complex numbers as a+bj, and perhaps a better choice of string representation for complex numbers that forces the reader to think away from the + (and -), which should keep playing the role of an operator rather than being part of a literal or a string representation.


No major obstacle (except possibly backwards compatibility); just a lot of work. Introducing a new builtin type is certainly PEP territory.

This is a bit hand-wavy. What would the precise rules for complex + complex addition be under this special treatment? I doubt that it can be done in a way that doesn’t just shuffle the surprises around. In particular, I’d find it rather surprising if complex(a, b) + complex(c, d) weren’t exactly equivalent to complex(a + c, b + d).
