Faster float / string conversion (Ryu)

Converting between floating-point numbers and strings is apparently a bottleneck surprisingly often (for example, in JSON parsing and serialization).

CPython seems to use dtoa for this, but a newer algorithm, Ryu, is apparently much faster.

Microsoft C++ standard library developers recently reported using it and “the speedups are indeed massive due to algorithmic improvements of Ryu.”

Would this be something CPython should adopt as well?

Do you have any benchmarks that show a bottleneck in dtoa.c? There’s lots of other overhead in Python. Before we start optimizing we should understand where the time is currently spent.
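As a starting point for that kind of measurement, here is a minimal sketch (not from anyone in this thread) that times repr() on a few floats and json.dumps() on a list of them using timeit. The sample values, the repeat counts, and the helper names time_repr and time_json_dumps are all arbitrary assumptions. The numbers it prints include interpreter and call overhead, so they only give an upper bound on the time spent in dtoa.c itself; a profiler at the C level would be needed to isolate it.

```python
import json
import timeit

# Arbitrary sample inputs: a short value, a 16-digit value, and the largest finite double.
SAMPLES = [0.1234, 2.718281828459045, 1.7976931348623157e308]


def time_repr(value, number=200_000):
    """Average wall time in nanoseconds for one repr(value) call.

    The figure includes the overhead of the lambda call, not just the
    float-to-string conversion.
    """
    total = timeit.timeit(lambda: repr(value), number=number)
    return total / number * 1e9


def time_json_dumps(values, number=20_000):
    """Average wall time in nanoseconds per float when serializing a list."""
    total = timeit.timeit(lambda: json.dumps(values), number=number)
    return total / number / len(values) * 1e9


if __name__ == "__main__":
    for v in SAMPLES:
        print(f"repr({v!r}): {time_repr(v):.0f} ns per call")
    many = SAMPLES * 100
    print(f"json.dumps: {time_json_dumps(many):.0f} ns per float")
```

Comparing these figures against the same loop on a cheap operation (for example repr() of a small int) would give a rough idea of how much of the time is conversion rather than general overhead.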

Thanks @petersuter for the links and raising the question. Looking at Ryu’s repo and associated paper it does look like an interesting approach.

I agree with Eric that it would be helpful to see some benchmarks first.

@vstinner FYI.

Unfortunately I have no benchmarks, and I agree that would be needed first. Thanks for your thoughts.

Ulf Adams claims in his paper: “We present Ryū, a new routine to convert binary floating point numbers to their decimal representations using only fixed-size integer operations, and prove its correctness. Ryū is simpler and approximately three times faster than the previously fastest implementation.” https://doi.org/10.1145/3192366.3192369

It performs better than dtoa on some inputs but the output is less human readable. For example dtoa and serde_json prefer to print 0.3 as 0.3, while Ryū currently prints it as 3E-1.

Input f64                 dtoa     Ryū
0.1234                    47 ns    66 ns
2.718281828459045         64 ns    40 ns
1.7976931348623157e308    73 ns    42 ns

Benchmark commit: dtolnay/dtoa@655152b
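On the formatting difference mentioned above: both spellings denote the same double, so the choice is about readability rather than correctness. A small sketch in Python, assuming only the built-in repr() and float() (the helper name round_trips is made up for illustration), shows that CPython's current output is the human-friendly form and that the exponent form parses back to the same value:

```python
def round_trips(x: float) -> bool:
    """True if parsing repr(x) gives back exactly the same float."""
    return float(repr(x)) == x


# CPython's repr() produces the shortest string that round-trips.
print(repr(0.3))               # 0.3
print(float("3E-1") == 0.3)    # True: the exponent form is the same double

# The inputs from the benchmark table all survive a repr()/float() round trip.
for value in (0.1234, 2.718281828459045, 1.7976931348623157e308):
    print(repr(value), round_trips(value))
```

Any replacement for dtoa.c would need to keep this round-trip guarantee and, to avoid changing visible output, the same choice between fixed and exponent notation.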

I’d definitely like to hear @mdickinson’s opinion on this topic.

An advantage of Ryū besides speed is simplicity. dtoa.c is a hairy piece of code, and it’d be nice to have something simpler and more modern instead. On the other hand, dtoa.c is very battle-tested at this point.

FYI, there is a package that makes repr(float) faster.
