Int/str conversions broken in latest Python bugfix releases

Is it possible to quantify the reasoning behind the 4300 limit? It’s a peculiar number, which suggests either that it was the output of a computation - or was a SWAG that was made peculiar to make people think it was the output of a computation :wink:.

How fast is fast enough? For the 10-million character string "9" * 10_000_000, the asymptotically better str_to_int() in Neil’s PR today is better than 16x faster: the difference between roughly 400 and 24 seconds.
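For anyone who wants to reproduce this kind of measurement on their own box, here is a minimal timing sketch. The helper names `lift_digit_limit` and `time_str_to_int` are made up for illustration, and the digit counts are kept small so it finishes quickly. It times whatever `int()` the running interpreter has, not Neil’s `str_to_int()`.

```python
import sys
import time


def lift_digit_limit():
    """Turn off the int/str digit limit on interpreters that have one."""
    if hasattr(sys, "set_int_max_str_digits"):
        sys.set_int_max_str_digits(0)


def time_str_to_int(ndigits):
    """Convert a string of `ndigits` nines; return (value, seconds taken)."""
    s = "9" * ndigits
    start = time.perf_counter()
    n = int(s)
    elapsed = time.perf_counter() - start
    return n, elapsed


if __name__ == "__main__":
    lift_digit_limit()
    # Doubling the input size shows how the cost grows with length.
    for ndigits in (10_000, 20_000, 40_000, 80_000):
        _, elapsed = time_str_to_int(ndigits)
        print(f"{ndigits:>7} digits: {elapsed:.6f} s")
```

If conversion is quadratic, each doubling of the digit count should roughly quadruple the time once the inputs are large enough for overheads not to dominate.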

But that’s more than “a few megabytes”. How many megabytes is the implicit limit? On the 3-megabyte string "9" * 3_000_000, str_to_int() is better than 10x faster, about 36.5 seconds down to 3.5. Since we can already squash about 700 4300-character strings into 3 million bytes, presumably burning a second in all on converting them is not “a DoS vulnerability”. But is 3.5 seconds really that much worse?
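The “about 700” figure is just 3 million divided by 4300. A rough sketch of that arithmetic, plus a way to time the many-small-strings case yourself, could look like this (`time_many_small` is a hypothetical helper name, and no timing result is claimed here):

```python
import time


def time_many_small(payload_bytes=3_000_000, chunk_digits=4300):
    """Convert as many `chunk_digits`-long strings as fit in `payload_bytes`.

    Returns (number of strings converted, total seconds taken).
    """
    count = payload_bytes // chunk_digits  # 3_000_000 // 4300 == 697
    s = "9" * chunk_digits  # exactly at the default limit, so int() accepts it
    start = time.perf_counter()
    for _ in range(count):
        int(s)
    elapsed = time.perf_counter() - start
    return count, elapsed


if __name__ == "__main__":
    count, elapsed = time_many_small()
    print(f"{count} conversions of 4300 digits took {elapsed:.4f} s in all")
```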

We have a version of str->int that is asymptotically much better still, but the overheads are so high that it’s still slower than Neil’s current str_to_int() on a 10-million character input. It’s twice as fast at 100 million characters. CPython’s current str() takes well over 10 hours to convert it.
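To give a flavor of what “asymptotically better” can mean for str->int, here is a generic divide-and-conquer sketch. This is an assumption-laden illustration, not the code in Neil’s PR and not the faster version mentioned above: `dc_str_to_int` and the cutoff of 1000 digits are invented for the example, and it assumes the input is a non-empty run of ASCII digits with no sign or whitespace.

```python
def dc_str_to_int(s, cutoff=1000):
    """Divide-and-conquer decimal string -> int (illustrative sketch only).

    Splits the string in half, converts each half recursively, and glues
    the halves together with one big multiplication. Any speedup comes
    from the interpreter's subquadratic big-int multiplication.
    """
    if len(s) <= cutoff:
        # Small pieces go straight to the builtin; 1000 digits is well
        # under the default 4300-digit limit.
        return int(s)
    mid = len(s) // 2
    hi, lo = s[:mid], s[mid:]
    # value = hi * 10**len(lo) + lo
    return dc_str_to_int(hi, cutoff) * 10 ** len(lo) + dc_str_to_int(lo, cutoff)


if __name__ == "__main__":
    n = dc_str_to_int("9" * 100_000)
    print(n == 10**100_000 - 1)
```

A serious version would cache the powers of 10 rather than recompute them at every level, which is one of the places the overheads come from.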
