Reliable way to detect whether the C version of `decimal` is installed

This should be shallow to someone :smile:. Long story short: _pylong.py has a very fast int -> str conversion function that relies on the C version of the decimal module for speed. But if the C version isn’t available, CPython falls back to using _pydecimal.py instead. That isn’t just slower, it doesn’t work at all: _pydecimal itself does str(int) on very big ints, which CPython asks _pylong to handle, which in turn asks _pydecimal to help, which in turn does str(int) on a giant int, which in turn … unbounded mutual recursion is possible.
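Spelled out, the cycle looks roughly like this (a sketch of the call chain, using the int_to_decimal_string entry point shown later in this thread, not exact CPython internals):

# str(huge_int)                        CPython defers huge conversions to _pylong
#   -> _pylong.int_to_decimal_string   which does its work via the decimal module
#     -> _pydecimal arithmetic         the pure-Python fallback is all we have
#       -> str(huge_int)               _pydecimal stringifies big ints internally
#         -> _pylong again ...         and around it goes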

So the idea is to supply a different (much slower, though still much faster than quadratic-time) int -> str function in _pylong, and use that instead, but only when the C decimal isn’t present.
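For concreteness, here’s one possible shape for such a fallback - a pure-Python divide-and-conquer sketch of my own for illustration, not the code that will actually land. It’s subquadratic because the giant divmod and pow calls are (given Karatsuba multiplication), though still well short of the decimal-based version:

def int_to_str(n):
    # Decimal-free sketch: split the digit string in half recursively.
    if n < 0:
        return '-' + int_to_str(-n)
    if n == 0:
        return '0'

    def inner(m, w):
        # Return exactly w decimal digits of m (requires 0 <= m < 10**w).
        if w <= 1000:
            return str(m).zfill(w)  # small enough for the builtin conversion
        w_lo = w >> 1
        hi, lo = divmod(m, 10 ** w_lo)
        return inner(hi, w - w_lo) + inner(lo, w_lo)

    # 302/1000 > log10(2), so w is at least n's true digit count.
    w = n.bit_length() * 302 // 1000 + 1
    return inner(n, w).lstrip('0')

A production version would cache the powers of 10 (recomputing 10 ** w_lo at every level is the biggest avoidable cost), but the shape is the point.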

So the question is how to spell this test inside _pylong.py:

import decimal
if we have the C version of decimal:  # how to spell this?
    # the current code
    def int_to_str(n):
        ...
else:
    # a new decimal-free implementation
    def int_to_str(n):
        ...

I believe hasattr(decimal, '__libmpdec_version__') works for this, but if there’s a more principled way I’m all ears.

2 Likes

I’d go with this:

import decimal
try:
    import _decimal
except ImportError:
    # a new decimal-free implementation
    def int_to_str(n):
        ...
else:
    # the current code
    def int_to_str(n):
        ...

Re-importing _decimal should be a dict lookup if it’s there. Might be a bit slower if it’s not (do we cache import failures?), but everything afterwards is also going to be slower, so I’m sure nobody will mind.
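That is, once decimal has successfully pulled in the C backend, it’s already sitting in the module cache, so the second import is served straight from a dict. A quick way to convince yourself (assuming a build where the C version exists):

import decimal
import sys

# After decimal.py's own 'from _decimal import *' succeeds, the C module
# is cached, so a later 'import _decimal' is just this lookup:
print('_decimal' in sys.modules)  # True when the C version is in use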

4 Likes

Another alternative would be to add some kind of _IMPLEMENTATION = "native" or _IMPLEMENTATION = "_pydecimal" to decimal.py depending on which path it takes. I believe we do something like that somewhere else, but I don’t recall where.
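A minimal sketch of what that could look like (the _IMPLEMENTATION name is made up; the try/except mirrors how decimal.py already picks its backend):

# decimal.py (hypothetical sketch)
try:
    from _decimal import *
    _IMPLEMENTATION = "native"
except ImportError:
    from _pydecimal import *
    _IMPLEMENTATION = "_pydecimal"

_pylong would then just test decimal._IMPLEMENTATION == "native".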

1 Like

Thanks! Speed doesn’t matter here - it will only be done once, when _pylong is first imported.

BTW, turns out that _pydecimal.py also has a __libmpdec_version__ attr, so my original idea was no good. Which I’ll spin as a positive, validating my intuition that a better approach was needed :wink:.

1 Like

I don’t know the answer to your question about checking for the C implementation, but thanks to you and others for working on this. I’m glad to see that there is improvement in asymptotic performance for this (and other integer operations).

I don’t want to derail this thread but I do want to ask one question and then leave it there: is this fast int -> str conversion still expected to be gated by sys.set_int_max_str_digits()?

Experience over time (for those of us using large integers) is that the int -> str direction is the more problematic side of limiting integer size in binary/decimal conversion.

1 Like

The limit set by that is enforced by CPython’s internal handling of str(int), before any conversion is attempted. So, yes, a “too large” int will raise an exception, and the _pylong function won’t be called.
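For illustration, here’s the gate in action (the exact message text varies across versions):

import sys

sys.set_int_max_str_digits(5000)  # lower the global limit
big = 10 ** 100_000               # far more than 5000 digits
try:
    str(big)                      # rejected before any conversion work starts
except ValueError as exc:
    print(exc)                    # mentions the digit limit and how to raise it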

You could override that by importing the undocumented _pylong yourself and calling its functions directly, but that’s in “consenting adults” territory.
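Something like this, using the int_to_decimal_string function quoted further down (it’s undocumented, so no guarantees it keeps that name):

import _pylong

# No digit-limit check happens on this path - consenting adults only.
s = _pylong.int_to_decimal_string(10 ** 100_000)
print(len(s))  # 100001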

I thought int -> Decimal was equally slow, but we decided that it was likely to be intentional and didn’t need protection the same way that int -> str does? (And Decimal -> str is fast, so it isn’t limited, so int -> Decimal -> str avoids all limits.)

3 Likes

Yes, decimal.Decimal(int) even now remains quadratic-time in int.bit_length(), and no limits are imposed.
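So the loophole described above really is wide open. A minimal demonstration (CPython 3.11+, where the limit exists):

import sys
from decimal import Decimal

sys.set_int_max_str_digits(5000)
n = 10 ** 100_000        # str(n) would raise ValueError here
d = Decimal(n)           # quadratic-time, but no digit limit applies
s = str(d)               # fast, and also unlimited: 100_001 digits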

Ironically enough, _pylong.py contains an internal int -> Decimal function that’s very fast. Indeed, that’s how it implements str(int):

def int_to_decimal_string(n):
    """Asymptotically fast conversion of an 'int' to a decimal string."""
    return str(int_to_decimal(n))

But after something like an hour of thrashing, I didn’t find a way to get the C implementation of decimal to use it.

BTW, str(a_decimal) is extremely fast - linear in the number of decimal digits - for the same reason that, e.g., hex(int) is extremely fast: in both cases the value is already stored internally in a power of the output base, so conversion just reads the digits off.

2 Likes