Include `math.sign`

Dear Oscar,

Thank you for your message. I appreciate your sincerity and understand the concerns regarding AI in modern development. I’m glad we could clear this up.

I’m preparing a detailed technical code audit for the forum to provide further clarity on the module’s architecture for everyone interested.

Best wishes,
Alexandru.


Then I think you’re going to have to drag the Steering Council into it. Python doesn’t have a uniform, coherent model for polymorphism, apart from classes implementing their own methods, and it’s not sign()'s job to solve that.

Now that math.integer exists, it’s mostly true that the functions that “belong in” math always return floats, but you don’t like that either.

I’m more “perfect is the enemy of the good”.

It’s not like numpy is consistent about matching input type either:

>>> import numpy, fractions
>>> q = fractions.Fraction(-30, 8)
>>> q
Fraction(-15, 4)
>>> numpy.sign(q)
-1
>>> type(_)
<class 'int'>

It would annoy people too if that returned a float or a Fraction. Proof: it would annoy me, to start with :wink:

Much less do I want to see math.sign(42) returning a float.

What do you suggest math.sign(42) return? Really? And then how do you propose that be generalized, in an implementable way, to all “numeric types”? It could be hacked to grow sufficient knowledge of all the core numeric types, but not for custom types. The latter requires some explicit way for types to “opt in” (typically by supporting a class method with some distinguished name, dunder or not).
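As a rough illustration of what such an opt-in could look like, here is a sketch in which a type advertises its own sign through a method. The method name `__sign__` is purely hypothetical (no such protocol exists in Python), and `generic_sign` and the toy `Money` class are likewise invented for this example.

```python
from fractions import Fraction


def generic_sign(x):
    """Return the sign of x, letting custom types opt in.

    `__sign__` is a hypothetical hook invented for this sketch;
    it is not part of Python.
    """
    hook = getattr(type(x), "__sign__", None)
    if hook is not None:
        return hook(x)
    # Fallback for types that did not opt in: compare against zero.
    return (x > 0) - (x < 0)


class Money:
    """Toy custom numeric type that opts in via the hypothetical hook."""

    def __init__(self, cents):
        self.cents = cents

    def __sign__(self):
        # The type itself decides what "sign" means and what it returns.
        return Money((self.cents > 0) - (self.cents < 0))


print(generic_sign(42))                  # built-in int, fallback path
print(generic_sign(Fraction(-15, 4)))    # Fraction, fallback path
print(generic_sign(Money(-250)).cents)   # custom type, opt-in path
```

The fallback path always produces an int, which is exactly the kind of inconsistency the paragraph above is worried about: only types that opt in get to choose their own result type.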

The thing that makes other functions reasonably friendly to math’s pseudo-polymorphism hacks is that they always return floats. There are a few exceptions, like sumprod(). But even then, any float in the input will force a float result.

>>> import math
>>> from decimal import Decimal
>>> math.sumprod([1, q], [2, q])
Fraction(257, 16)
>>> math.sumprod([1.0, q], [2, q])
16.0625
>>> math.sumprod([1, q], [1+1j, q])
(15.0625+1j)
>>> math.sumprod([1, q], [Decimal(1), q])
Traceback (most recent call last):
  ....
TypeError: unsupported operand type(s) for +: 'decimal.Decimal' and 'Fraction'

Floats are central to math’s view of the world. An “all things to everyone” sign() is a bad fit to that.

So, ya, my compromise is to let math.sign do what it’s good at: floats and only floats. Perfection is plain out of reach without more years of debate.


OTOH … that can be compromised away too, via a different kind of ugliness: let math.sign() always return a float. Then “works for anything that can be converted to a float” works the same way as nearly all of math’s polymorphism hacks.

If math.integer wants a different kind of result, fine, it can supply its own sign function. At the cost of annoying people who value “one obvious way to do it”.


NumPy doesn’t have a dtype that can handle Fraction so this is fallback generic code (dtype=object)

In [11]: np.array([q])
Out[11]: array([Fraction(-15, 4)], dtype=object)

In [13]: np.sign(-3.0, dtype=object)
Out[13]: -1

So in the dtype=object case (any possible unconvertible types) I guess there is a generic fallback that goes with int but from numpy’s perspective this is outside of its type system. The code that implements this cannot work reliably for different types of inputs so I would have just made it a type error to use sign with dtype=object much like for np.cos:

In [15]: np.cos(q)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
AttributeError: 'Fraction' object has no attribute 'cos'

The above exception was the direct cause of the following exception:

TypeError                                 Traceback (most recent call last)
Cell In[15], line 1
----> 1 np.cos(q)

TypeError: loop of ufunc does not support argument 0 of type Fraction which has no callable cos method

I also would not bother trying to get a .cos() method here since there is no established convention for such methods (even if arb has them).

The generalisation is that if a type has a __float__ method, then sign uses that. That is easy to implement and understand, and better defined than attempting a generic implementation or trying to dispatch over types.


I thought you were determined that

def sign(x: T) -> T:

be met. Apparently not :wink:.

I agree it’s the best fit to what’s already in place, although it’s quite a compromise. People will be surprised that sign(giant_int) blows up, and especially so because numpy.sign(giant_int) returns the dead-obvious result.
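To make the "blows up" case concrete, here is a small sketch comparing a float-converting sign with a comparison-based one. Both helper names are invented for the example, and neither is the proposed math.sign implementation.

```python
def sign_via_float(x):
    """Sketch of a float-only sign: convert first, then compare."""
    value = float(x)  # raises OverflowError for ints beyond the float range
    return float((value > 0) - (value < 0))


def sign_via_compare(x):
    """Sketch of a comparison-based sign: no conversion, int result."""
    return (x > 0) - (x < 0)


giant_int = 10**400  # far larger than the biggest finite float

try:
    sign_via_float(giant_int)
    float_route_failed = False
except OverflowError:
    float_route_failed = True

print(float_route_failed)
print(sign_via_compare(giant_int))
```

The conversion step is the only thing that fails here; comparing the int against zero never needs a float.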

And it doesn’t appear to match what anyone else does, although it would match JavaScript’s Math.sign() if it went on to keep the sign of 0.0.

JS’s sign() also blows up if you pass a giant int, although they’re currently arguing about adding a new BigInt.sign function.

It’s somehow disappointing when Python has to take JS as its model :wink:


I guess this is not obvious but the C implementation I showed above does actually handle many different input types via the __float__ method. It just always uses that and then returns a float. This is also what most of the functions in the math module do.

Here is a little demonstration:

>>> import math
>>> from sympy import sqrt
>>> e = 1 + sqrt(2)
>>> e
1 + sqrt(2)
>>> math.sign(e)
1.0

So we can pass a SymPy symbolic expression in and get a float out. This works because the C code uses Argument Clinic to get an input of type double. Argument Clinic calls PyFloat_AsDouble, which in turn looks for both __index__ and __float__ methods and even has special hooks to call those methods more efficiently for types that are implemented in C.

The __float__ method is the de facto standard method for interoperability between real numeric types in Python and is widely implemented and is basically the only usable part of the Real ABC if you actually want to work with multiple numeric types.
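A pure-Python approximation of that behaviour may help. This is not the C implementation discussed in the thread, only a sketch. The NaN and zero handling are assumptions made for the example, and `sign_as_float` and `Celsius` are hypothetical names.

```python
import math
from fractions import Fraction


def sign_as_float(x):
    """Sketch of a float-only sign: convert with float(), return a float."""
    value = float(x)      # uses __float__ (or __index__) on the argument
    if math.isnan(value):
        return value      # assumption of this sketch: NaN passes through
    if value > 0:
        return 1.0
    if value < 0:
        return -1.0
    return 0.0            # assumption of this sketch: sign of zero is dropped


class Celsius:
    """Hypothetical type whose only numeric hook is __float__."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __float__(self):
        return float(self.degrees)


print(sign_as_float(Fraction(-15, 4)))   # Fraction supports __float__
print(sign_as_float(7))                  # int converts too
print(sign_as_float(Celsius(21.5)))      # custom type via __float__
```

Any type that can convert itself to a float works without the function knowing about it, and the result type is always float.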


If you have a function with signature

def sign(x: T | others) -> T:

then you can also use it as if it were a function

def sign(x: T) -> T:

by simply only passing in arguments of type T. If you don’t pass anything else in then the | others part of the signature is irrelevant.

In generic code you want to allow many possible choices for T but whatever that choice is you will be very strict about ensuring that all of your variables are of that type. If T is float then all variables of type T need to be strictly of type float (an int is unacceptable). If T is Fraction then … and so on.

If math.sign sends float to float then I can write generic code that uses either math.sign with floats or np.sign with numpy arrays and so on. If math.sign sends float to int then that doesn’t work any more. That is the point I was making about T -> T being the necessary signature.
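The T -> T requirement can be checked mechanically. The sketch below uses two stand-in sign functions (both hypothetical, neither is math.sign) and tests whether the result type matches the argument type for a set of samples.

```python
def float_sign(x):
    """Stand-in for a sign that sends float to float."""
    return float((x > 0) - (x < 0))


def int_sign(x):
    """Stand-in for a sign that always returns an int."""
    return (x > 0) - (x < 0)


def preserves_type(sign, samples):
    """True if sign(x) has exactly the type of x for every sample."""
    return all(type(sign(x)) is type(x) for x in samples)


float_samples = [-2.5, 0.0, 3.0]
print(preserves_type(float_sign, float_samples))  # float in, float out
print(preserves_type(int_sign, float_samples))    # float in, int out
print(preserves_type(int_sign, [-2, 0, 3]))       # int in, int out
```

Generic code that insists on every variable being of type T can use the first function with floats, but not the second.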


The “Absurdity” of Quality: Code Audit of csignum-fast v1.2.2

Let’s try to understand why my code “for something as trivial as a signum function” seems too big.

I’ve performed a line-by-line audit of the 340 lines of my code to show exactly what they do.

The result: The core math is indeed just 4 lines. 43 lines are empty: being a human person☺, I want to read my code with some level of comfort. The other 293 Lines of Code (LOC) are what transform a school exercise in Python — def sign(x): return (x > 0) - (x < 0) — into a Gold Edition software tool.

High-Level Breakdown (by line categories)

| Category | Lines | Description |
|---|---|---|
| Safety & Guards | 102 | Handling NaN and type errors; memory leak prevention |
| Documentation | 52 | Full-line comments (moreover, 22 lines of code contain internal comments) |
| Performance Tuning | 52 | Static variables; optimization hints for compilers (42 of 52) |
| Readability | 43 | Empty lines for clarity and readability |
| Module Framework | 32 | Declarations; module structure and its registration in Python |
| Argument Processing | 28 | Handling one positional and three keyword arguments |
| Feature Logic | 27 | Handling preprocess and codeshift keyword arguments |
| Core Logic | 4 | return (x > 0) - (x < 0) |
| Total | 340 | |

If you are interested in the full line-by-line audit log, see “Anatomy of csignum-fast v1.2.2” below.

A Note on Micro-Optimizations

One might ask: “Why even use switches instead of multiplications?”

Because when provided with the optimization hints used in this code, modern compilers (GCC/Clang) generate much more efficient branchless machine code for switch blocks than for standard arithmetic multiplications. Every CPU cycle counts in this “Gold Edition.”

Fast statistics

I’m thrilled to see such interest in the project! 2,200+ downloads of v1.2.2 in its first 48 hours (Jan 5-6, 2026) prove that the community values high-performance, safe, and well-documented C-extensions.

Thank you for your support!

Anatomy of csignum-fast v1.2.2

| Lines | Technical Description | Category |
|---|---|---|
| 1 | Core Python API header inclusion (#include <Python.h>) | Module Framework |
| 10 | Static variable declarations for internal optimization | Performance Tuning |
| 3 | Function signature and scope delimiters {} | Module Framework |
| 14 | Subtotal: Global scope and module-level infrastructure | |
| 4 | Positional argument validation (strict single-argument check) | Argument Processing |
| 5 | Memory initialization for positional and keyword arguments | Argument Processing |
| 19 | Keyword argument parsing and validation logic | Argument Processing |
| 1 | Reference tracking variable to prevent memory leaks | Safety & Guards |
| 27 | preprocess logic: tuple validation and argument swapping | Feature Logic |
| 9 | Initial NaN detection via float conversion and C++ validation | Safety & Guards |
| 2 | Declaration of used variables (flags & results) | Module Framework |
| 1 | gt = (x > 0) via Python Rich Comparison protocol | Core Logic |
| 1 | Persistence of GT state for error and NaN handling | Safety & Guards |
| 1 | lt = (x < 0) via Python Rich Comparison protocol | Core Logic |
| 1 | res = (long)gt - lt; (final signum calculation) | Core Logic |
| 31 | Comprehensive error & NaN state analysis and encoding | Safety & Guards |
| 1 | Memory cleanup and resource release before return | Safety & Guards |
| 1 | Final return (applying codeshift if necessary) | Core Logic |
| 59 | Exception factory: forming and raising Python-level errors | Safety & Guards |
| 177 | Subtotal: Functional logic (“The Working Core”) | |
| 26 | Module registration: PyModuleDef, method tables, and PyInit | Module Framework |
| 203 | Subtotal: Pure CPython implementation code | |
| 42 | Low-level compiler hints and optimization instructions | Performance Tuning |
| 245 | Subtotal: Code (excluding comments and empty lines) | |
| 52 | Full-line comments | Documentation |
| 297 | Subtotal: Lines of Code (LOC) | |
| 43 | Empty lines for structural clarity and readability | Readability |
| 340 | Grand Total | |

@acolesnicov can you optimise the C code I showed above?

I traced MPFR history back to commit c79e31968, which says “This function is provided for compatibility with mpf”. The mpf type has very different semantics from the IEEE floating-point types. E.g. there is no signed zero, exponents are unbounded.

So I would guess that mpf_sgn() has an int return type just because mpf has no signed zero, and for consistency with other library functions (mpz/mpq_sgn).

That’s simpler to explain, as the docs say:

It implements a multiprecision equivalent of the C99 standard.

And there is no sign() function in Annex G of the standard. (Though a signum() function is specified in ISO/IEC 10967-3.)


JavaScript’s Math.sign() is the only implementation I’ve seen that cares about the sign of a zero (everyone else returns plain positive 0 for ±0 inputs), which includes numpy, which started life in a world with signed zeroes. As I’ve said before, I see no reason for a sign function to care about the sign of a zero.
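The two behaviours differ only in what happens to a zero argument. A sketch in Python (both function names are invented for the example), using math.copysign to make the sign of the resulting zero visible:

```python
import math


def sign_drop_zero_sign(x: float) -> float:
    """Return plain +0.0 for either zero (the behaviour most libraries use)."""
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    if x == 0:
        return 0.0
    return x  # only NaN reaches this line; pass it through


def sign_keep_zero_sign(x: float) -> float:
    """Return the argument itself for zeros, so -0.0 stays -0.0."""
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    return x  # +0.0, -0.0, or NaN returned unchanged


# -0.0 == 0.0 is True, so copysign is used to inspect the sign bit.
print(math.copysign(1.0, sign_drop_zero_sign(-0.0)))
print(math.copysign(1.0, sign_keep_zero_sign(-0.0)))
```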

Too glib :wink: This is GMP: if they think something is useful, they implement it, regardless of what standards say.

And even if useful, they may think z/abs(z) is a more obvious way to spell “vector with the same phase ending at a point on the unit circle”.

Regardless, nothing on the table here yet addresses complex.sign() (coercion to float will blow up).


I’m not really following this. So far as it goes, I see nothing in it to favor T=float over, say, T=int. If sign returned int and also accepted int (which it surely would), it fits the above too.

There are, of course, other reasons to prefer T=float.

In prior messages, the “| others” part was conspicuous by absence.

+1

The JS case seems to be here by accident, as that language has “just numbers” (well, now also BigInts). If we have no integer type but two zeros, there is not much choice.

BTW I found sign() specified with a {-1,0,+1} image in ISO/IEC 10967-1 (see §5.2.2 and §5.2.7). This standard says that “operations in §5.2.2 shall not distinguish -0 from +0” if the implementation conforms to IEC 559.

Different people are behind GMP and MPFR/MPC.

Though there are plans to implement a few more special functions beyond the standard, like mpc_cbrt() or mpc_erf(). But no mpc_sgn().


I would prefer to have sign return an int, because I wouldn’t like sum([sign(x) for x in vals]) == len(vals) to check 9.99999 == 10 and fail.

From numpy experience, I would also prefer NaNs to pass through. Suppose I create an image from a computation of some value, and I want pixels colored blue when the value is positive, red when negative, and purple when zero. I typically want any NaN that comes from improperly formatted data (e.g. out of acceptable bounds) to be distinguishable from properly computed zero values, and I probably want to display white pixels for them, so that the result looks like an incomplete image.
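That use case can be sketched in a few lines of plain Python. The function names and the color table are made up for the illustration; the only point is that a NaN coming out of sign stays distinguishable from a genuine zero.

```python
import math


def sign_nan_passthrough(x: float) -> float:
    """Sketch of a sign that returns NaN for NaN input."""
    if math.isnan(x):
        return x
    return float((x > 0) - (x < 0))


# Hypothetical color table keyed by the sign value.
COLORS = {1.0: "blue", -1.0: "red", 0.0: "purple"}


def pixel_color(value: float) -> str:
    s = sign_nan_passthrough(value)
    if math.isnan(s):
        return "white"  # missing or invalid data shows up as a blank pixel
    return COLORS[s]


row = [2.5, -1e-9, 0.0, float("nan")]
print([pixel_color(v) for v in row])
```

If sign mapped NaN to 0 instead, the last pixel would come out purple and the bad data would be hidden.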

And I do not think the sign of a complex type is useful (nor does it have a mathematical definition); the ‘angle’ (or ‘phase’) is usually the quantity of interest.


Indeed, I’d be interested to see a use case for that. Along with a justification for calling that operation “sign” rather than something more descriptive of what it actually does.

The complex sign in numpy actually does x/abs(x), or cos(angle(x)) + 1j*sin(angle(x)).
I guess this ensures x == abs(x) * sign(x) remains truthy.
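A quick check of that identity with the standard library only. `complex_sign` is a hypothetical helper written for this sketch, not numpy's implementation, and returning 0 for a zero argument is an assumption made here.

```python
import cmath
import math


def complex_sign(z: complex) -> complex:
    """Sketch of z / abs(z): the point on the unit circle with the same phase."""
    r = abs(z)
    if r == 0:
        return 0j  # assumption of this sketch: the sign of zero is zero
    return z / r


z = 3 - 4j
s = complex_sign(z)

print(s)                               # a point on the unit circle
print(math.isclose(abs(s), 1.0))       # its modulus is 1
print(cmath.isclose(abs(z) * s, z))    # x == abs(x) * sign(x), up to rounding
print(math.isclose(cmath.phase(s), cmath.phase(z)))  # same angle as z
```

The comparisons use isclose because the division can introduce rounding error, so exact equality is not guaranteed for arbitrary inputs.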

The Inner Logic of csignum-fast

From the very beginning, this project was designed to be type-agnostic. You won’t find any literal < or > operators in my C code. All comparisons are performed using the internal methods of the argument itself—its native __gt__, __lt__, and __eq__ implementations. In CPython, this is achieved via PyObject_RichCompareBool(x, Py_zero, Py_GT) and its counterparts for LT and EQ.

This approach works for any numeric type right out of the box. My sign function seamlessly handles Decimal, Fraction, and sympy numbers, even though I didn’t write a single line of specialized code for them. If someone implements an UltraDecimal or ExtraFraction a hundred years from now, this program won’t need a single change, as long as those types follow the numeric protocol.

The Trade-off: There is a minor overhead for converting Py_zero (Pythonic int(0)) to the type of x. A small price to pay for universal type independence.
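In pure Python the same idea looks like the sketch below: the function only asks the argument to compare itself with the int 0, so any type whose comparison operators accept an int works. This is the school-exercise version, not the csignum-fast C code.

```python
from decimal import Decimal
from fractions import Fraction


def sign(x):
    """Comparison-only sign: relies on the argument's own __gt__ and __lt__."""
    return (x > 0) - (x < 0)


for value in (-7, 2.5, Fraction(-15, 4), Decimal("0.00")):
    print(type(value).__name__, sign(value))

# A type that cannot be compared with 0 reports the problem itself.
try:
    sign("error")
    rejected = False
except TypeError:
    rejected = True
print(rejected)
```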

The “Ternary Logic” Challenge

If you use PyObject_RichCompareBool for comparisons, I have two pieces of news for you.

The Good News: The function returns a standard C int. Therefore, the calculation of (x > 0) - (x < 0) is performed without additional type casting or complex conversions.

The Not-So-Bad News: The result can be 1 (True), 0 (False), or -1 (Error). This is tricky because, without proper handling, you might get results like +2 or -2, or even conclude that sign("error") = 0. However, it’s “not-so-bad” because the argument itself handles the type validation. If the argument cannot be compared to zero, it will report the error to you.

To handle all possibilities, I implemented an Index of State (state_idx):

  1. I normalize the comparison result to 2 (True), 1 (False), or 0 (Error) by simply adding 1 to the output of RichCompareBool.
  2. I multiply the results of three comparisons (GT, LT, EQ). The result is either 0 or a power of two: 0, 1, 2, 4, 8.
  3. I then apply a bitwise & 3 operation (which is equivalent to % 4 but significantly faster) to determine the final state.

State Index Interpretation:

  • Index = 2: The ideal case. One “True” (2) and two “False” (1). The argument is a valid number; the result is gt - lt.
  • Index = 1: Three “False” results. This suggests the input might be NaN. I perform an additional x != x check to confirm.
  • Index = 0: (The mod 4 result of 0, 4, or 8). This indicates either a comparison error (0) or a logical contradiction (multiple “True” flags). This is treated as a non-numeric type error, and an exception is raised.
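The three steps and the interpretation table can be simulated in Python. This is only a model of the described logic, not the module's C source: `normalized_compare` imitates the 1 / 0 / -1 result of PyObject_RichCompareBool by catching exceptions, and all names are invented for the sketch.

```python
import operator


def normalized_compare(op, x):
    """Imitate RichCompareBool (1, 0, or -1 on error), then add 1."""
    try:
        raw = 1 if op(x, 0) else 0
    except Exception:
        raw = -1
    return raw + 1  # 2 = True, 1 = False, 0 = Error


def state_index(x):
    gt = normalized_compare(operator.gt, x)
    lt = normalized_compare(operator.lt, x)
    eq = normalized_compare(operator.eq, x)
    return (gt * lt * eq) & 3  # product is 0, 1, 2, 4 or 8; keep it mod 4


class AlwaysTrue:
    """Contradictory type: claims to be >, < and == 0 at the same time."""

    def __gt__(self, other):
        return True

    def __lt__(self, other):
        return True

    def __eq__(self, other):
        return True


for value in (5, 0, -3.5, float("nan"), "error", AlwaysTrue()):
    print(repr(value) if not isinstance(value, AlwaysTrue) else "AlwaysTrue()",
          state_index(value))
```

Ordinary numbers land on index 2, NaN lands on index 1 (all three comparisons false), and both the uncomparable string and the contradictory object land on index 0.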

Micro-Optimizations: Switches vs. Multiplications

Modular arithmetic tells us that (A * B * C) mod 4 == ((A * B) mod 4) * C mod 4. In the code, I replaced “heavy” multiplications with a switch, bitwise shifts (<< 1), and bitwise AND (& 3).

There are two such switches: for lt and for eq. If the value of the multiplier is 0, I go directly to error handling. If the value is 1, I don’t need to multiply, and I simply issue break;. To multiply by 2, I use bitwise shift, and then bitwise AND to truncate index mod 4.
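The arithmetic equivalence of the two formulations can be checked exhaustively, since each factor is 0, 1 or 2. The sketch below models the switch as an if-chain in Python. It says nothing about speed, and the function names are invented for the example.

```python
from itertools import product

ERROR = 0


def staged_index(gt, lt, eq):
    """Model of the switch version: shift-and-mask instead of multiplying."""
    idx = gt
    for multiplier in (lt, eq):
        if multiplier == 0:
            return ERROR              # comparison failed: go to error handling
        if multiplier == 2:
            idx = (idx << 1) & 3      # multiply by 2, truncated mod 4
        # multiplier == 1: nothing to do
    return idx


def multiplied_index(gt, lt, eq):
    """Direct version: multiply everything, then reduce mod 4."""
    return (gt * lt * eq) & 3


mismatches = [
    combo
    for combo in product((0, 1, 2), repeat=3)
    if staged_index(*combo) != multiplied_index(*combo)
]
print(mismatches)  # empty list means the two versions agree on all 27 inputs
```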

Multiplication consumes more CPU cycles than these bitwise operations and jumps. By providing explicit optimization hints (accounting for 12% of the total code), I ensure the compiler generates highly optimized machine code. Furthermore, this branchless/low-jump sequence is more likely to stay in the CPU cache, further accelerating execution.

For more technical details, please refer to the comments in the csignum-fast source code.


Your implementation uses Argument Clinic to pre-cast everything to double. This is good for the math module where only floats are expected. You are protected from TypeError.

My csignum-fast was designed with a different philosophy: Type Independence.

To optimize this specific snippet for mathmodule.c, you could shorten the if/else chain with copysign(1.0, x), but you’d still need to handle x == 0 and NaN.

The ‘optimization’ in my code isn’t about the speed of a single double comparison; it’s about the zero-cost abstraction for Decimal, Fraction, etc., without casting them to float. My 340-line code is not protected from inappropriate arguments by anything outside it, so I built that protection in myself, and my user is protected. That’s why there are 340 lines of code, even though the calculations themselves are extremely simple.


Sure there is! It’s trivial either way; the difference between

if (x == 0.0) return x; /* preserve sign */

// or

if (x == 0.0) return 0.0; /* lose the sign */

Standards just make stuff up too :wink: This version of a draft of that says something very different:

https://www.open-std.org/jtc1/sc22/wg11/docs/n507.pdf#:~:text=The%20bulk%20of%20ISO%2FIEC%2010967-1,be%20efficiently%20realized%20in%20software.

The operation signum returns a floating point 1 or −1, depending on whether its argument is positive (including zero) or negative (including negative zero).

Long, long ago, standards thought their job was to reconcile existing practices.

Caution is prudent, but not justified in context. In Python, a float can represent all ints with absolute value ≤ 2**53 exactly, and {+, -, *} on them is also exact provided the result stays in that wide range.

The inputs here (barring NaNs) are always in {-1.0, 0.0, +1.0} so you’d need len(vals) > 2**53 to get in trouble. Ain’t gonna happen.
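A few lines are enough to see both halves of that claim: sums of ±1.0 and 0.0 stay exact, and exactness only breaks down past 2**53. The sample data below is arbitrary.

```python
LIMIT = 2**53

# Every int up to 2**53 in absolute value converts to float exactly ...
print(float(LIMIT) == LIMIT)
# ... but 2**53 + 1 is not representable and rounds back to 2**53.
print(float(LIMIT + 1) == float(LIMIT))

# A float-returning sign only ever yields -1.0, 0.0 or +1.0 (barring NaN),
# so summing its results is exact integer arithmetic done in floats.
signs = [1.0] * 100_000
total = sum(signs)
print(total == len(signs))

mixed = [1.0, -1.0, 0.0, 1.0, 1.0] * 20_000
mixed_total = sum(mixed)
print(mixed_total == 40_000)
```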

All right - an actual use case! Thank you :smiley:
