Faster float / string conversion (Ryu)

Converting between floating-point numbers and strings is apparently a surprisingly common bottleneck, for example in JSON parsing and serialization.

CPython uses dtoa.c for this, but a newer algorithm, Ryū, is apparently much faster.

Microsoft's C++ standard library developers recently reported adopting it: “the speedups are indeed massive due to algorithmic improvements of Ryu.”

Would this be something CPython should adopt as well?


Do you have any benchmarks that show a bottleneck in dtoa.c? There’s lots of other overhead in Python. Before we start optimizing we should understand where the time is currently spent.
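To make the question concrete, here is a minimal micro-benchmark sketch (my own, not from the thread) that times float-to-string conversion against a conversion-free loop of the same shape, using only the standard library, to estimate how much of the time is interpreter overhead versus dtoa.c:

```python
# Hypothetical micro-benchmark: compare float -> str conversion against
# an equally sized loop that does no conversion, to isolate the cost of
# the conversion itself (which goes through dtoa.c internally).
import timeit

floats = [0.1234, 2.718281828459045, 1.7976931348623157e308] * 1000

# Time the conversion.
convert = timeit.timeit(lambda: [str(x) for x in floats], number=100)

# Baseline: same list construction, no conversion.
baseline = timeit.timeit(lambda: [x for x in floats], number=100)

print(f"convert:  {convert:.3f}s")
print(f"baseline: {baseline:.3f}s")
```

The gap between the two numbers gives a rough upper bound on what a faster conversion routine could save in pure-Python code; a real profile of a workload like JSON serialization would be more convincing.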


Thanks @petersuter for the links and for raising the question. Looking at Ryū’s repo and the associated paper, it does look like an interesting approach.

I agree with Eric that it would be helpful to know any benchmarks.

@vstinner FYI.


Unfortunately I have no benchmarks, and I agree that would be needed first. Thanks for your thoughts.

Ulf Adams claims in his paper:

> We present Ryū, a new routine to convert binary floating-point numbers to their decimal representations using only fixed-size integer operations, and prove its correctness. Ryū is simpler and approximately three times faster than the previously fastest implementation.

It performs better than dtoa on some inputs but the output is less human readable. For example dtoa and serde_json prefer to print 0.3 as 0.3, while Ryū currently prints it as 3E-1.
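The two outputs denote the same value; only the formatting differs. A quick illustration of what “shortest round-tripping string” means in today’s CPython:

```python
# Python's repr() (backed by dtoa.c) emits the shortest decimal string
# that round-trips to the same double.
assert repr(0.3) == '0.3'

# Ryū's "3E-1" denotes the same double; only the formatting differs.
assert float('3E-1') == 0.3

# The classic non-obvious case: 0.1 + 0.2 is not the double nearest to
# 0.3, so its shortest round-tripping representation is longer.
assert repr(0.1 + 0.2) == '0.30000000000000004'
```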

| Input f64 | dtoa | Ryū |
| --- | --- | --- |
| 0.1234 | 47 ns | 66 ns |
| 2.718281828459045 | 64 ns | 40 ns |
| 1.7976931348623157e308 | 73 ns | 42 ns |

Benchmark commit: dtolnay/dtoa@655152b

I’d definitely like to hear @mdickinson’s opinion on this topic.

An advantage of Ryū besides speed is simplicity. dtoa.c is a hairy piece of code, and it’d be nice to have something simpler and modern instead. On the other hand, dtoa.c is very battle-tested at this point.


FYI, there is a package, frepr, that makes repr(float) faster.


I’d be happy to see the change, provided it’s well tested. I’ve never been much of a fan of the complexity of dtoa.c. But dtoa.c won’t go away entirely: IIUC, Ryū is only for float-to-string conversions, right? And for performance, I’d consider that the less important direction - e.g., reading text files of numeric data happens more often than writing such files, and anyone with serious amounts of numeric data should be writing in a binary format rather than converting to string anyway.
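To illustrate the last point, here is a small sketch (mine, standard library only) contrasting text and binary serialization of doubles; the binary path never touches decimal conversion in either direction:

```python
# Text vs binary serialization of doubles. The text path runs every
# value through float -> string -> float conversion; the binary path
# stores raw IEEE 754 bytes and skips decimal conversion entirely.
import struct

values = [0.1, 2.718281828459045, 1.7976931348623157e308]

# Text: conversion in both directions, via dtoa.c.
text = '\n'.join(repr(v) for v in values)
assert [float(s) for s in text.split('\n')] == values

# Binary: little-endian IEEE 754 doubles, 8 bytes each, no conversion.
blob = struct.pack(f'<{len(values)}d', *values)
assert list(struct.unpack(f'<{len(values)}d', blob)) == values
```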

One unusual aspect of dtoa.c is that it supports fixed-point formatting (i.e., %f-style formatting) with a negative precision; that facility is used by Python for doing correct rounding with a negative second argument. It’s not clear to me at a glance whether Ryū supports this. Can anyone confirm or deny?
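For readers unfamiliar with this corner: negative precision is what makes a negative second argument to round() work, with correct rounding. A quick demonstration of the current behavior:

```python
# round() with a negative ndigits rounds to tens, hundreds, thousands...
# CPython implements this with correctly rounded fixed-point formatting
# at negative precision inside dtoa.c.
assert round(1234.5, -2) == 1200.0      # nearest hundred
assert round(123456.0, -3) == 123000.0  # nearest thousand

# Halfway case: 1250 is exactly between 1200 and 1300; correct rounding
# with round-half-even picks the even hundreds digit.
assert round(1250.0, -2) == 1200.0
```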

Finally, I’d caution that replacing dtoa.c won’t be a small amount of work.


I’m a bit worried by this part of the README:

Output can be slightly different from the native function, due to floating-point rounding

The shortest-string algorithm requirements are well-defined, so there’s exactly one “correct” answer for that algorithm (and absent bugs, that’s what repr produces). If frepr is giving different results, that doesn’t give me confidence that it’s a good replacement. At a minimum, I’d want to know more about what these different results are and under what circumstances they can occur.
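The round-trip half of that contract is easy to spot-check mechanically; a small sketch (mine, not exhaustive) that any replacement for dtoa.c would have to keep passing:

```python
# Spot-check the round-trip property: float(repr(x)) == x must hold for
# every finite double. A uniform sample is far from exhaustive, but any
# shortest-repr replacement must at least pass this.
import math
import random

random.seed(0)
for _ in range(1000):
    x = random.uniform(-1e308, 1e308)
    assert float(repr(x)) == x

assert float(repr(math.pi)) == math.pi
```

Checking the *shortest* half (no shorter decimal string round-trips) is harder and is exactly where the two implementations would need to agree digit for digit.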

My understanding is that the underlying library (Google’s double-conversion) always rounds ties away from zero:

> The buffer will choose the representation that is closest to ‘v’. If there are two at the same distance, then the one farther away from 0 is chosen (halfway cases - ending with 5 - are rounded up).

I don’t know if that is a concern for us or not.

But anyway, that caveat applies to frepr; the Ryū library itself seems faster still.