Given that json took 79 seconds with floats and 18 seconds with ints, but orjson took only 5 seconds with either, it looks like a mix of causes, with float-to-string conversion indeed contributing by far the most to the slowness.
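For anyone who wants to reproduce that comparison locally, here is a minimal sketch of this kind of benchmark (the payload shape and sizes are my assumptions, not the original measurement):

```python
import json
import random
import time

floats = [random.random() for _ in range(1_000_000)]
ints = [random.randrange(10**6) for _ in range(1_000_000)]

for name, payload in (("floats", floats), ("ints", ints)):
    t0 = time.perf_counter()
    json.dumps(payload)
    print(f"json.dumps   {name}: {time.perf_counter() - t0:.2f}s")

try:
    import orjson  # third-party: pip install orjson
except ImportError:
    orjson = None

if orjson is not None:
    for name, payload in (("floats", floats), ("ints", ints)):
        t0 = time.perf_counter()
        orjson.dumps(payload)
        print(f"orjson.dumps {name}: {time.perf_counter() - t0:.2f}s")
```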
Just for the avoidance of doubt, it's worth pointing out that these proposals would not do away with the existing dtoa.c code, which handles more than just the shortest-string float-to-string conversion: dtoa.c also provides support for {:.e} and {:.f} formatting, is used by the current round implementation (including the corner case where the second argument to round is negative, e.g. rounding to the nearest hundred / thousand / million), and performs correctly-rounded conversions in the other direction, from strings to floats. So a complete dtoa.c replacement would be a substantially bigger project than introducing Ryu (or dragonbox, or whatever the latest state-of-the-art code is) for faster shortest-string float->str conversion.
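For concreteness, a quick Python illustration of the conversions listed above, all of which go through dtoa.c in current CPython:

```python
print(repr(0.1))           # '0.1' -- shortest string that round-trips
print(f"{0.1:.3e}")        # '1.000e-01' -- {:.e}-style formatting
print(f"{0.1:.20f}")       # {:.f}-style fixed-point formatting
print(round(12345.0, -2))  # 12300.0 -- negative second argument to round
print(float("0.1"))        # correctly-rounded str -> float
```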
My guess - on the basis that most data should be read more often than it's written - is that to see significant improvements we'd also need a faster correctly-rounded str->float conversion.
A point made in the dragonbox abstract is that floating-point I/O is largely asymmetric. Going from binary float to decimal string is the more complex direction, since there is a large space of candidate outputs to choose from (trailing zeros, varying exponent, etc.), so an algorithm that handles it, such as dtoa, can be slow.
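To make the asymmetry concrete: many distinct decimal strings parse to the same binary64 value, and shortest-repr generation has to pick one string out of that space:

```python
# All of these decimal strings parse (correctly rounded) to the same
# double; repr() must search this space for the shortest round-tripper.
candidates = ["0.1", "0.10", "0.1000", "1e-1", "10e-2",
              "0.1000000000000000055511151231257827"]
assert all(float(s) == 0.1 for s in candidates)
print(repr(0.1))  # '0.1'
```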
“Most data should be read more often than it's written” is reasonable, but it is a data-storage-centered view, and specifically the client's view. For example, a server publishing read-only records over the network in json, or some other text format, is doing much more number-to-string than string-to-number conversion.
Anyway, I wouldn't suggest taking on an effort like this (replacing the float repr implementation) unless there was evidence showing that a significant portion of real-world applications would benefit.
Interestingly, the frepr package, cited in this thread, monkey-patches PyFloat_Type.tp_repr to achieve its speedup. While it uses a much slower algorithm (the cited 8x speedup is more like 2x for small-exponent values), and this approach doesn't work on PyPy, it gave me a few ideas:
- update this package to use ryu or dragonbox
- add an instrument-only mode: wrap the existing repr(), but collect stats on call count and time spent in the function. Then anyone could determine exactly how a 15x or 20x speedup of this function would affect their app, without correctness concerns (a rough pure-Python sketch of this idea follows).
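A pure-Python approximation of that instrument-only idea (a hypothetical helper, not part of frepr, which hooks in at the C level via tp_repr): time the repr() calls a representative workload makes, then extrapolate what a hypothetically N-times-faster implementation would save:

```python
import random
import time

def estimate_repr_savings(floats, speedup=15):
    """Time repr() over a sample of floats and estimate the wall-clock
    saving if float repr were `speedup` times faster."""
    t0 = time.perf_counter()
    for f in floats:
        repr(f)
    spent = time.perf_counter() - t0
    return spent, spent * (1 - 1 / speedup)

sample = [random.random() for _ in range(1_000_000)]
spent, saved = estimate_repr_savings(sample, speedup=15)
print(f"repr() took {spent:.3f}s; a 15x faster repr would save ~{saved:.3f}s")
```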
> we'd also need a faster correctly-rounded str->float conversion.
If you're talking about string → float conversion too … could this be of help?
I didn't try it, but it sounds promising.
[edit] Probably it's already in, or will come shortly:
> It is part of the standard C++ library under Linux (as of GCC 12);

(Daniel Lemire, Computer Science Professor) [/edit]