Faster float / string conversion (Ryu)

From my understanding, ryu and dragonbox achieve their performance while honoring that. Paraphrasing from the dragonbox abstract, the criteria are:

  1. Information preservation
  2. Minimum-length output
  3. Correct rounding

Ryu and dragonbox (among others) satisfy all three.
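These three criteria are also what CPython's own repr() has guaranteed since Python 3.1 (via David Gay's algorithm in dtoa.c), so each one can be checked in a few lines; this is just an illustration of the criteria, not of ryu/dragonbox themselves:

```python
x = 0.1
s = repr(x)

# 1. Information preservation: the output round-trips to the same double.
assert float(s) == x

# 2. Minimum-length output: the shortest round-tripping string is chosen.
assert s == "0.1"

# 3. Correct rounding, e.g. the smallest subnormal double 2**-1074
#    prints as the decimal string closest to it among the candidates.
assert repr(2**-1074) == "5e-324"
print(s)
```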

You seem to feel strongly about this. Maybe you are interested in submitting a PR?


That could just mean that str is also slow.

Given that json took 79 seconds with floats and 18 seconds with ints, but orjson took only 5 seconds with either, it looks like a mix of reasons, with float indeed contributing by far the most to the slowness.
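The float-vs-int split is easy to reproduce with a scaled-down micro-benchmark; this is a sketch (the 79s/18s vs 5s figures in the thread came from a much larger payload, and orjson is a third-party package, so it is only shown in a comment):

```python
import json
import random
import timeit

random.seed(0)
floats = [random.random() for _ in range(10_000)]
ints = [random.randrange(10**6) for _ in range(10_000)]

def bench(dumps, data, number=3):
    # Total wall-clock time for `number` serializations of `data`.
    return timeit.timeit(lambda: dumps(data), number=number)

t_float = bench(json.dumps, floats)
t_int = bench(json.dumps, ints)
print(f"stdlib json: floats {t_float:.3f}s, ints {t_int:.3f}s")

# Third-party orjson can be timed the same way for comparison:
# import orjson; bench(orjson.dumps, floats)
```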

Just for the avoidance of doubt, it’s worth pointing out that these proposals would not do away with the existing dtoa.c code, which handles more than just the shortest-string float-to-string conversion: dtoa.c also provides support for {:e}-style and {:f}-style formatting, and is used for the current round implementation (including the corner case where the second argument to round is negative - e.g., rounding to the nearest hundred / thousand / million), as well as for correctly-rounded conversions in the other direction, from strings to floats. So a complete dtoa.c replacement would be a substantially bigger project than introducing Ryu (or dragonbox, or whatever the latest state-of-the-art code is) for faster shortest-string float->str conversion.
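For concreteness, here is a quick tour of those other dtoa.c-backed conversions, visible from pure Python:

```python
# Fixed-precision formatting (not shortest-string):
print(format(1 / 3, ".3e"))   # scientific: 3.333e-01
print(format(1 / 3, ".3f"))   # decimal:    0.333

# round() with a negative second argument (nearest hundred):
print(round(12345.0, -2))     # 12300.0

# Correctly-rounded conversion in the other direction, str -> float:
print(float("0.1"))           # 0.1
```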

My guess - on the basis that most data should be read more often than it’s written - is that to see significant improvements we’d also need a faster correctly-rounded str->float conversion.

A point made in the dragonbox abstract is that floating-point I/O is largely asymmetric. Going from binary float to decimal string is the more complex direction, since there is a large space of candidate outputs to choose from (trailing zeros, varying exponent, etc.). So an algorithm might handle that direction slowly, as dtoa does.
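The asymmetry is easy to see from Python: parsing is many-to-one, while printing has to search the candidate space for the shortest round-tripping form (the long literal below is the exact decimal expansion of the double nearest 0.1):

```python
# str -> float is many-to-one: all of these parse to the same double...
candidates = [
    "0.1",
    "0.10",
    "1e-1",
    "0.1000000000000000055511151231257827021181583404541015625",
]
assert len({float(s) for s in candidates}) == 1

# ...while float -> str must pick the shortest string that still
# round-trips, which is the harder direction:
assert repr(float(candidates[-1])) == "0.1"
print("all candidates parse to", float(candidates[0]))
```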

“Most data should be read more often than it’s written” is reasonable, but it is a view centered on data storage, and specifically on the client’s side. For example, a server publishing read-only records over the network in JSON, or some other text format, does far more number-to-string than string-to-number conversion.

Anyway, I wouldn’t suggest taking an effort like this on (replacing the float repr implementation) unless there was evidence showing that a significant portion of real-world applications would benefit.

Interestingly, the frepr package, cited in this thread, monkey-patches PyFloat_Type.tp_repr to achieve its speedup. While it’s using a much slower algorithm (the cited 8x speedup is more like 2x for small-exponent values), and this approach doesn’t work for PyPy, it gave me a few ideas:

  • update this package to use ryu or dragonbox
  • add an instrument-only mode. Wrap the existing repr(), but collect stats on call count and time spent in the function. Then anyone can determine exactly how a 15 or 20x speedup of this function would affect their app, without correctness concerns.
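The instrument-only idea can be sketched in pure Python. Note this is a hypothetical API: frepr itself patches PyFloat_Type.tp_repr at the C level, which pure Python cannot do, so this wrapper would have to be called explicitly in your serialization path:

```python
import time

class ReprStats:
    """Wrap repr() and record call count and cumulative time spent."""

    def __init__(self):
        self.calls = 0
        self.seconds = 0.0

    def repr(self, x):
        t0 = time.perf_counter()
        s = repr(x)
        self.seconds += time.perf_counter() - t0
        self.calls += 1
        return s

    def projected_saving(self, speedup):
        """Time that would be saved if float repr were `speedup`x faster."""
        return self.seconds * (1.0 - 1.0 / speedup)

stats = ReprStats()
for v in (0.1 * i for i in range(10_000)):
    stats.repr(v)
print(f"{stats.calls} calls, {stats.seconds:.4f}s total; "
      f"a 20x speedup would save ~{stats.projected_saving(20):.4f}s")
```

From those numbers anyone can estimate what a 15-20x faster repr would buy their application, with no correctness risk.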

we’d also need a faster correctly-rounded str->float conversion.

If we’re talking about string → float conversion too … could such a thing be of help?

I didn’t try it, but it sounds promising.

[ edit ] Probably it’s already in, or will come shortly: “It is part of the standard C++ library under Linux (as of GCC 12)” (Daniel Lemire, Computer Science Professor). [ /edit ]
