Int/str conversions broken in latest Python bugfix releases

Steve, I should have spelled this out: the regexp example I gave takes time quadratic in the length of the user-supplied string. That's the same O() behavior as int(user_supplied_str), although the regexp spelling has a much higher constant factor. There is no DoS argument you can make about one that can’t be made about the other, because they’re both ways of applying standard Python operations to user-supplied strings, with the same asymptotic behavior in the length of the strings.
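For the regexp case, here is a rough sketch of where the quadratic comes from. It is a simplified model of a naive backtracking search for `9*8$` over a string of all 9s, not a description of what the `re` engine actually executes, and `backtracking_steps` is a made-up name for illustration. The idea: from each start offset, `9*` grabs every remaining 9, then gives them back one at a time looking for an `8` that is never there.

```python
def backtracking_steps(n: int) -> int:
    """Count trial positions for '8' in a naive backtracking search
    of the pattern '9*8$' over the string '9' * n (simplified model)."""
    steps = 0
    for start in range(n + 1):   # the search is retried from every start offset
        run = n - start          # '9*' greedily consumes all remaining 9s
        steps += run + 1         # then backs off one char at a time: run + 1 tries
    return steps


small = backtracking_steps(5_000)
large = backtracking_steps(50_000)
print(small, large, large / small)
```

Under this model the total is (n + 1)(n + 2) / 2 steps, so 10 times the input costs very nearly 100 times the work.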

# try at 5,000
$ py -m timeit -s "s = '9' * 5000" "int(s)"
2000 loops, best of 5: 112 usec per loop

$ py -m timeit -s "import re; s = '9' * 5000" "re.search('9*8$', s)"
10 loops, best of 5: 24 msec per loop

# try at 50,000 - 10 times longer, so about 10**2 = 100 times slower
$ py -m timeit -s "s = '9' * 50000" "int(s)"
20 loops, best of 5: 10.1 msec per loop

$ py -m timeit -s "import re; s = '9' * 50000" "re.search('9*8$', s)"
1 loop, best of 5: 2.39 sec per loop
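As a sanity check on "quadratic", the exponent implied by the timings above can be worked out with a little arithmetic. This is just a sketch: `growth_exponent` is a hypothetical helper, and the inputs are the four numbers reported above, which will differ on other machines and Python versions.

```python
import math


def growth_exponent(t_small, t_large, n_small, n_large):
    """Estimate k in time ~ n**k from two (size, time) measurements."""
    return math.log(t_large / t_small) / math.log(n_large / n_small)


# Timings copied from the runs above, converted to seconds.
int_exp = growth_exponent(112e-6, 10.1e-3, 5_000, 50_000)
re_exp = growth_exponent(24e-3, 2.39, 5_000, 50_000)

print(f"int(s):    exponent ~ {int_exp:.2f}")
print(f"re.search: exponent ~ {re_exp:.2f}")
```

Both exponents come out close to 2, which is what quadratic-time behavior looks like when the input grows by a factor of 10.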

A difference is that a hostile user has to find a vulnerable regexp in the stdlib, and contrive a bad-case input for it. Which I expect they’ll do, now that I’ve mentioned it :wink:.
