I've been thinking about this also for things like Spyder, Jupyter, etc. One problem with that approach, though, is that code that seems fine in one context then fails in another.
That was my initial thinking, but it may be more practical to acknowledge that when a user is sitting in front of a terminal entering code interactively, a higher limit makes sense as a way to reduce friction.
For notebooks, which are higher level and have more control over their environment, one could even imagine automating the choice up front and recording it via an explicit sys.set_int_max_str_digits call in an initial cell of the notebook. Or automagically suggesting that an appropriate one be inserted upon encountering a ValueError from an attempted int repr?
This is algorithm 1.26 from the reference cited above:
from functools import lru_cache

@lru_cache
def integer_log(y: int, x: int):
    r = e = 0
    while y >= x:
        d = x
        m = 1
        while y >= d:
            y, rem = divmod(y, d)
            r = r or rem
            e += m
            if y > d:
                d *= d
                m *= 2
    return e
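The contract here is that integer_log(y, x) returns the floor of log base x of y, i.e. the largest e with x**e <= y. A naive reference by repeated division (naive_floor_log is a hypothetical helper, just for checking) illustrates the values it should produce:

```python
def naive_floor_log(y: int, x: int) -> int:
    # Reference implementation: O(e) divisions.
    # Fine for checking; too slow for huge y.
    e = 0
    while y >= x:
        y //= x
        e += 1
    return e

# Values the fast integer_log above should agree with:
assert naive_floor_log(1000, 10) == 3       # 10**3 <= 1000 < 10**4
assert naive_floor_log(999, 10) == 2        # 10**2 <= 999 < 10**3
assert naive_floor_log(2**77 + 1, 2) == 77
```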
# Algorithm 1.26:
#@profile
def format_int(A: int, B: int = 10) -> str:
    """Format integer A as a string in base B."""
    if A.bit_length() < 1000:
        # Here we use str for the base case, but really this should be
        # something that works for arbitrary bases using the standard
        # O(n^2) algorithm. The threshold should be tuned.
        return str(A)
    else:
        # Find k so that B**(2*k - 2) <= A < B**(2*k).
        e = integer_log(A, B)
        if e % 2 == 1:
            k = (e + 1) // 2
        else:
            k = e // 2 + 1
        # assert B**(2*k - 2) <= A < B**(2*k)
        Bk = B**k
        Q, R = divmod(A, Bk)
        r = format_int(R, B)
        q = format_int(Q, B)
        pad = '0' * (k - len(r))
        return ''.join([q, pad, r])
Unfortunately it doesn't seem to perform as well as algorithm 1.25 and appears to have quadratic run time. For large inputs, half of the time is in divmod and half in integer_log. Improving integer_log might not be difficult, but improving divmod seems harder. I'm wondering if divmod has the complexity I thought it had…
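One way to probe that suspicion empirically is a crude doubling benchmark (a hypothetical micro-benchmark; absolute timings are machine-dependent, but if the ratio between successive timings approaches ~4x as the input doubles, the behavior is quadratic):

```python
import time

def time_divmod(bits: int) -> float:
    # Time one divmod of a bits-bit number by a divisor of half its size.
    a = (1 << bits) - 1
    b = (1 << (bits // 2)) + 1
    t0 = time.perf_counter()
    q, r = divmod(a, b)
    elapsed = time.perf_counter() - t0
    assert q * b + r == a   # sanity-check the result
    return elapsed

prev = None
for bits in (50_000, 100_000, 200_000):
    t = time_divmod(bits)
    ratio = (t / prev) if prev else float("nan")
    print(f"{bits:>7} bits: {t:.6f}s  ratio={ratio:.2f}")
    prev = t
```

CPython's long division is the schoolbook algorithm, so for an n-bit dividend and n/2-bit divisor the work really is O(n^2) in this regime.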
Please keep hyperbole out of this discussion. Python has not become unusable. The maintainers clearly care about the success and safety of the language.
Itās possible to participate in this discussion without adding negativity or stress. Please show the same understanding and patience that the maintainers are showing in their responses.
Having a context that can locally override the max_str_digits setting would be nice. There's a draft PR, which still needs a bunch of polish. With it, you can do:
with _pylong.localcontext() as ctx:
    ctx.max_str_digits = 0
    n = (1 << 100_000) - 1
    s = str(n)
Likely _pylong wouldn't be a public API. Also, you probably want an easy way to go to unlimited mode, e.g.
To me, the really frustrating thing about this change is that its breaking nature could seemingly have been avoided entirely by just making the function faster. Moving from an O(n^2) algorithm to an O(n*log(n)^2) one would be enough to make it a completely infeasible DoS vector. Given this, and the relatively low severity of the attack (no known targets in the two years it has been known about), it seems very odd that the decision was to urgently make breaking changes in a patch release rather than just fix the algorithm.
Would you be willing to be personally responsible for any issues arising from a bug in the implementation of this algorithm, especially across all current usage of Python?
What if certain numbers cause incorrect branch prediction on the CPU, leading to runaway slowdowns or hardware crashes? What if certain compilers make incorrect assumptions, causing race conditions on memory writes, leading to incorrect results?
There is a huge benefit to using the existing implementation, as more than the explicit testing, millions of users have implicitly tested many scenarios.
This consideration applies less for unreleased versions of Python.
That's not the goal. The quadratic-time behavior of decimal-string <-> int conversions has been an ongoing background complaint for decades, from an odd mix of complainants: newbies and number-theory researchers.
Follow the links starting with what Neil gave, and you'll find issue reports going back to long before DoS complaints became a thing, and a trail of abandoned code.
The number-theory researchers should install gmpy2. Python will never compete with that on speed. And, if you follow all the code involved, GMP may well have more lines of C code (plus platform-specific assembler) supporting "state-of-the-art fast" conversions than Python's whole code base. We don't have a small army of numeric-expert volunteers willing to take on such a maintenance burden, but GMP does.
What's different now, which Neil is pursuing, is leaving asymptotically faster algorithms in Python, called by the C code when arguments get "too big". That enormously lowers the bars for design, correct coding (including avoiding all of the near-infinite possibilities for refcounting, memory-management, and error-checking mistakes in C), readability, and maintenance burden. The tradeoff: except in rare cases, these will certainly be slower than they could be in C. Don't care. If we can reduce a thing from O(n**2) to (say) O(n**1.6), that's far more valuable than cutting a constant factor of (say) 2.
Whether it will be enough to revert the DoS mitigation changes, I have no idea. How fast is fast enough? There's no limit on the size of Python ints or strings, except for the amount of VM available to store them, and even a linear-time algorithm may be called a "DoS vulnerability" if the inputs are fat enough. The algorithms in question will never be as fast as linear-time, not even if GMP is used.
I've been looking at writing algorithm 1.25 from above in C for CPython. It's actually not difficult to write that algorithm in C with the PyLong API. The difficult thing (at least from my perspective) is just that the existing code is somewhat spaghetti-like and needs a refactor:
It's taking me some time to figure out exactly what each macro etc. in the existing code does, so I can be sure what the implications of any potential change are.
I like the idea of making it possible to fix things like this in Python, but with the limit set at 4300, that suggests the main target for improvement is the small-ish to intermediate range of integer sizes, and at that level I expect a C implementation is still going to be significantly faster than a Python one.
Why? It looks like the Python code Neil has now (str_to_int() in gh-90716) is already faster, although likely with the same asymptotic behavior (basically the same algorithm, but top-down recursive instead of bottom-up). It's limited to decimal string inputs, though, because nobody in real life gives a rip about strings in bases other than 10 and powers of 2 (and the latter are already fast).
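For the flavor of that approach, here is a hedged sketch of the general top-down divide-and-conquer idea (not the actual code from gh-90716; dc_str_to_int and the 3000-digit cutoff are illustrative): split the digit string in half, convert each half recursively, and recombine with one large multiplication, so the heavy lifting lands in CPython's subquadratic Karatsuba int multiplication.

```python
def dc_str_to_int(s: str) -> int:
    # Hypothetical illustration of divide-and-conquer decimal parsing.
    if len(s) <= 3000:
        return int(s)   # small base case: builtin quadratic parse is fine
    mid = len(s) // 2
    hi, lo = s[:mid], s[mid:]
    # value = hi_digits * 10**len(lo) + lo_digits
    return dc_str_to_int(hi) * 10**len(lo) + dc_str_to_int(lo)

assert dc_str_to_int("123456") == 123456
assert dc_str_to_int("9" * 10_000) == 10**10_000 - 1
```

A production version would memoize the needed powers of 10 (rather than recomputing 10**len(lo) at every level) and tune the base-case cutoff.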
That limit doesn't make sense to me, and I haven't been able to find the reasoning behind it. Here with the currently released 3.10.6 on my box:
    $ py -m timeit -s "s = '9' * 4300; i = int(s)" "int(s)"
    5000 loops, best of 5: 84.8 usec per loop
    $ py -m timeit -s "s = '9' * 4300; i = int(s)" "str(i)"
    1000 loops, best of 5: 271 usec per loop
How can an operation that can be performed well over a thousand times per second be considered a vector for "a DoS attack"?
But it's also so small that asymptotic behavior becomes increasingly irrelevant. In general, the fancier the algorithm, the higher the overhead of keeping track of it all. That's why, e.g., nobody on Earth uses FFT-based integer multiplication until operands get "very" large.
For smaller inputs, I expect the code in Neil's PR gets more attractive, because it doesn't incur the overheads of building temporary lists. It builds a dict to memoize the powers actually needed, but that's O(log log n) space (tiny).
BTW, there's "a reason" for that: because it's restricted to base 10, it can compute base ** i as (5*2) ** i = (5**i) << i, and shifts are very much faster than multiplications of bigger integers. There's no end to pursuing speed in these areas, so no matter what you do it's going to expand, and expand, and expand. Another reason to keep it, if at all possible, in Python instead of C.
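Concretely, the identity is 10**i == (5**i) << i, so only the 5**i multiplication costs real work; appending the factor of 2**i is a single cheap shift (pow10 is a hypothetical helper name for illustration):

```python
def pow10(i: int) -> int:
    # 10**i == 5**i * 2**i, and multiplying by 2**i is just a left shift
    # by i bits, which is far cheaper than a big-int multiplication.
    return (5 ** i) << i

assert pow10(0) == 1
assert pow10(17) == 10**17
assert pow10(4300) == 10**4300   # arithmetic is unaffected by the str limit
```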
Picture a CPU core handling N things per second, which can include a few megabytes of data that could be full of a thousand integer strings (i.e. JSON). It's still possible to get something to spin an entire CPU core for a second in an application like that, even with the 4300-digit limit in place. What isn't possible anymore is getting it to spend unbounded time by sending the same amount of data. With the mitigation in place, an attack now needs to scale the data volume linearly with the desired CPU wastage, just like other existing boring avenues of resource waste.
Thanks @oscarbenjamin for raising this issue on Discuss.
RedHat is a global company that was valued at USD $34 billion (with a B) when IBM purchased them three years ago. Let's not blame overworked volunteers here.
If this were a high priority, or even a medium priority, for RedHat, the CVE details would have been populated less tardily.
This vulnerability is in a class of attack that has been public knowledge for over a decade: pass a huge chunk of data to the application and try to DoS it or overflow a buffer. The specific vulnerability (quadratic behaviour of str → int conversions) has been public for at least four years. RedHat reserved the CVE 30 months ago, and the Python security team has known about it for two years.
Similar XML DoS attacks have been known since before 2009, although I'm not sure whether specific str → int conversion attacks would work.
I don't buy the need to push out a fix to this without public discussion. This class of vulnerability was already public, and has been for years.
Even if it turns out that the chosen solution is the best, or only, solution, the secrecy and "trust us, we know best" attitude from the security team was IMO not justified.
I feel that string hashing attacks are much more serious. Applications can, if they choose, perform length checks on data much more easily than they can recognise hash collisions. And yet we discussed hash randomization publicly in 2011 and 2012 before pushing it out. If we could discuss that publicly, why couldnāt we discuss this?
It seems that the threat here was in no way severe and urgent enough to justify the lack of community discussion.