Int/str conversions broken in latest Python bugfix releases

4,300 happens to be small enough that int(s) completes in under 100 usec on modern hardware, and large enough to pass the unit tests of popular projects that were in our internal test systems. My initial value was 2,000, which is easier to remember and more than large enough for a 4k RSA key in decimal notation. We later bumped it up to 4,300 because the test suites of some popular Python projects like NumPy were failing with 2,000.
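
For anyone who wants to inspect or change that default on a patched release, the limit is exposed through sys. A minimal sketch, guarded on the hooks actually existing (only the fixed releases have them); the value 20_000 is just an example:

import sys

if hasattr(sys, "get_int_max_str_digits"):
    print(sys.get_int_max_str_digits())   # 4300 by default on patched releases
    sys.set_int_max_str_digits(20_000)    # raise the limit for this process
    sys.set_int_max_str_digits(0)         # 0 disables the limit entirely

The default can also be overridden at interpreter startup with the PYTHONINTMAXSTRDIGITS environment variable or the -X int_max_str_digits option.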

3 Likes

Neither 10 nor 100000000 is big enough to be affected by this fix; they are not even close to the number of digits where it kicks in. But computing 10**100000000 alone takes over five minutes on my PC, so that’s already a DoS.

If you think that’s a problem, what if they enter “4^4^4^4”? The answer has approximately 8.1e153 digits.
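
That digit count is easy to sanity-check in Python itself (a rough estimate; 4^4^4^4 groups as 4**(4**256) under right-associative exponentiation):

from math import log10

# Number of decimal digits of 4**(4**256) is about 4**256 * log10(4).
print(4**256 * log10(4))   # roughly 8.07e+153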

On-line calculators are faced with challenges beyond merely str/int conversion, and need their own solutions.

5 Likes

I have probably typed variations of the following code hundreds of times, never even considering that this might cause a DoS attack:

try:
    x = int(request.args["foo"])
except (ValueError, KeyError):
    raise BadRequest(...)

I very much welcome this fix.

9 Likes

On my computer, it takes about a million digits for that to become measurably slow (although I’d accept the argument that 100K digits would already be slow on a busier or lower-end system). Do you have any sort of request size limit?

TBH I think the limit chosen is extremely aggressive, and could be a lot more generous without causing major problems.

3 Likes

Many server machines are actually running less powerful hardware than your dev machine, and are always handling more requests in parallel. A long enough int to stretch a ~10ms request into a ~100ms request can reduce throughput for all users to 10% [1], possibly more if you’re also logging details about failed requests (which presumably will be the result of such a request).

This may be true, but as we’ve seen from the examples above, the people who are unhappy about the limit won’t be happy until it’s completely gone. So it may as well be low enough that they’ll discover and disable it for their apps, and those of us who mostly live in the “ints never exceed 64 bits” world dominated by other languages don’t have any worries at all about users exceeding reasonable limits.


  1. Edit: I made the “10%” up. Let’s just go with “drastically”, because I can’t be bothered doing the actual throughput calculation. Certainly one request taking 10x longer on the CPU is going to use up resources that are needed to process other requests, and cause congestion on the server. ↩︎

2 Likes

That’s true, but how long IS long enough to stretch a request until it becomes a problem?

For the record, I haven’t lived in the “ints never exceed 64 bits” world … ever. I grew up with ints being 16 bits, and then met REXX, where numbers have whatever precision you choose (kinda like the Decimal module in Python), before ever getting to 32-bit ints. So when Python showed me arbitrary-precision ints, I was right at home. You’re absolutely right; any limit will cause problems for at least some apps. But for a change that’s being rolled out in point releases, minimizing the damage should surely be a priority.

2 Likes

Maybe “10^1000000” then: the computation takes me 0.4 seconds and the conversion to str takes me 15 seconds.
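
A rough sketch of that measurement (numbers are machine-dependent, and on a release with the new limit the str() call would raise ValueError unless the limit is lifted first):

import sys, time

if hasattr(sys, "set_int_max_str_digits"):
    sys.set_int_max_str_digits(0)      # disable the limit so str() below can run

t0 = time.perf_counter()
x = 10 ** 1_000_000                    # computing the power is the cheap part
t1 = time.perf_counter()
s = str(x)                             # the decimal conversion is the quadratic part
t2 = time.perf_counter()

print(f"10**1000000: {t1 - t0:.2f}s")
print(f"str(x):      {t2 - t1:.2f}s")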

I didn’t think anything about it other than to show a counterexample to the claim “there is no way for them to apply the ‘int->str is quadratic’ part of the attack” that I replied to.

I haven’t seen a DoS-mitigation rationale quite like the current one. Like you, I was looking for some reason to believe that an individual conversion near the limit was waaay slow, but that’s not the issue here. On the currently released CPython, on my pretty capable desktop box:

$ py -m timeit -s "s = '9' * 50000" "int(s)"
20 loops, best of 5: 10.6 msec per loop

So even now it appears instantaneous to the human eye, with an input string more than 10x longer than the new limit.

But they don’t care about that. Instead there’s some argument about “well, what if they have megabytes worth of strings of this size, and go through all of them as fast as possible? No matter the speed of an individual conversion, the lower the limit on an individual string, the smaller the aggregate time they can consume for going through all of a fixed overall total size of input strings, because conversion per string takes worse than time linear in the length of the string”.

If this is to be consistently applied, then essentially any operation a server farm can be talked into doing, that can consume a few milliseconds given an input of no more than several thousand bytes, is equally a “DoS vulnerability”.

For example, this takes more than twice as long as the conversion above despite that the input string is 10 times shorter:

$ py -m timeit -s "import re; s = 'x' * 5000" "re.search('x*y$', s)"
10 loops, best of 5: 24 msec per loop

How can that not be a DoS vulnerability if converting a merely 5,000-digit string to int (which goes a couple hundred times faster) is?

I understand that users generally don’t get to supply the regexps an app uses for parsing. But they do supply the strings, and genuine expertise in writing regexps for an re engine like Python’s is rare. I’ve generally been the one to fix stdlib regexps that display exponential-time behavior in bad cases, but best I know nobody has ever even tried to probe for quadratic-time cases. Like converting a 50,000 digit string to int, one instance goes so fast nobody cares. Well, until now :wink:.

14 Likes

None of these operate in isolation, though. If an attacker is forced to send 100x the requests, that will be detected (and possibly handled) by other parts of the system, whereas one request that’s 100x the size will likely pass through as “normal” traffic.

Individual requests are also the granularity that matters here. With 100 x 10ms requests, other users get an opportunity every 10ms to get a request in. With 1 x 1000ms request, users have to wait over a second. So at least for this kind of DoS attack, we can make a big difference for those who are doing a fixed number of parses per request (e.g. Sebastian’s example above).

Those who allow an unlimited amount of data to be received and processed synchronously are in just as much trouble as they were before :wink:

2 Likes

Steve, I should have spelled this out: the regexp example I gave takes time quadratic in the length of the user-supplied string. The same O() behavior as int(user_supplied_str), although the regexp spelling has a much higher constant factor. There is no DoS argument you can make about one that can’t be made about the other, because they’re both ways of applying standard Python operations to user-supplied strings, with the same asymptotic behavior in the length of the strings.

# try at 5,000
$ py -m timeit -s "s = '9' * 5000" "int(s)"
2000 loops, best of 5: 112 usec per loop

$ py -m timeit -s "import re; s = '9' * 5000" "re.search('9*8$', s)"
10 loops, best of 5: 24 msec per loop

# try at 50,000 - about 10**2 = 100 times slower
$ py -m timeit -s "s = '9' * 50000" "int(s)"
20 loops, best of 5: 10.1 msec per loop

$ py -m timeit -s "import re; s = '9' * 50000" "re.search('9*8$', s)"
1 loop, best of 5: 2.39 sec per loop

A difference is that a hostile user has to find a vulnerable regexp in the stdlib, and contrive a bad-case input for it. Which I expect they’ll do, now that I’ve mentioned it :wink:.

9 Likes

I’ve opened a PR to refactor the PyLong_FromString function here:

3 Likes

What is the argument for limiting str(x) for integers? I understand the argument that int(s) can be used for a DoS attack, but that does not explain why str(x) should be limited. If a program doesn’t allow big integers in its input, then it generally won’t try to output big integers either, unless it actually intends to print a big int.

Another problem that arises from limiting str(x) is that this kind of program

x = int(input())
print(x + 1)

will now get a very unexpected error on the print line when given the input '9' * 4300. So the limit on str(x) essentially introduces overflow errors to Python. This wouldn’t have been the case if the limit had only been applied to int(s).
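
For concreteness, a small sketch of that failure mode (the exact ValueError wording varies by release; it assumes the default limit of 4300 is in effect):

x = int("9" * 4300)   # exactly at the default limit, so parsing succeeds
try:
    print(x + 1)      # x + 1 has 4301 digits, so the str() inside print() raises
except ValueError as exc:
    print("conversion refused:", exc)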

2 Likes

The CVE fix does not disable big integer support. It only prevents inefficient conversion from decimal text representation to integer and vice versa. Built-in Types — Python 3.12.1 documentation has a complete list of APIs that are limited. Octal, hexadecimal, and int from/to bytes conversions are not limited because they have linear complexity.

A possible attack vector is YAML as YAML supports hexadecimal integer notation. An attacker could send you a malicious YAML file with a very large hexadecimal number. If your application uses e.g. logging to log the value, then your application would be affected.

If you need to deal with very, very large numbers, then use hexadecimal notation. It’s much faster and unlimited. A decimal string with 1,000,000 digits takes about 4 seconds to convert on my system. A hexadecimal number string with 1,000,000 digits takes about 1.3 milliseconds to convert.
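
A small sketch of the difference (timings are machine-dependent; the commented-out decimal round-trip is the limited, slow path):

# Power-of-two bases and raw bytes are linear and never limited:
n = int("f" * 1_000_000, 16)                        # fast
h = hex(n)                                          # fast
b = n.to_bytes((n.bit_length() + 7) // 8, "big")    # fast
m = int.from_bytes(b, "big")                        # fast

# str(n) would be the expensive decimal path: ValueError under the
# default 4300-digit limit, and multi-second if the limit is lifted.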

2 Likes

As someone who’s still actively working on “Trojan Source” mitigations, I disagree. The main difference there is that the researchers were unwilling to wait for mitigations before making big announcements and trying to claim credit for it, but even despite this many of the biggest editors and renderers (the tools that need fixes, not the languages) had basic mitigations in place. Once Unicode publish more concrete guidance on how code renderers should handle Unicode, they’ll be able to move toward more targeted mitigations, but for now they’re using a similarly heavy hammer.

This doesn’t address the ability of libraries to use it and support their users who don’t want to upgrade. You can’t have an optional keyword argument that is ignored in older versions, and so every library that wants to use it ends up with a version-selected helper method anyway, likely until the end of time.
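
As a sketch of the kind of version-selected helper that paragraph is talking about: since no per-call keyword exists, a library today has to fall back to the process-global sys.set_int_max_str_digits() hook on patched interpreters (the helper name here is made up):

import sys

if hasattr(sys, "set_int_max_str_digits"):
    def parse_unbounded_int(s):
        # Temporarily lift the process-global limit on patched interpreters.
        old = sys.get_int_max_str_digits()
        sys.set_int_max_str_digits(0)
        try:
            return int(s)
        finally:
            sys.set_int_max_str_digits(old)
else:
    def parse_unbounded_int(s):
        # Older interpreters have no limit to lift.
        return int(s)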

It also doesn’t address the fact that one user’s DoS is another user’s intentional calculation, but that is the user’s decision, not the library’s. Libraries that know users will commonly end up converting very large integers to/from strings to the point where they would consider adding a utility function to do it unsafely can equally just inform their users that some/many/all will need to modify how their app is run.

But most importantly, nothing will solve all the issues brought up in this thread, other than going back in time to having an implicit limit on the size of int again. Just like one day when we get rid of the recursion limit, then likely decide to bring it back because it’s useful for catching runaway bugs, this is likely something that we will come to appreciate as a reasonable check that prevents runaway work from being done.

Since my more exhaustive post was flagged[1], the only point I’d like to reiterate is that the SC eminently has the power to allow backporting a keyword argument to int() in stable releases, which would solve this situation in a much less broken fashion.


  1. I’d be interested to hear how it can be considered off-topic or even discourteous… Relatedly, lack of transparency is not helped by hiding (even gentle) criticism. ↩︎

4 Likes

Because the conversion takes a very long time. If you have a way to induce someone else’s app to generate a very large integer, then you may have a way to force it to freeze up while it converts that number to a string (e.g. it lies about the length of some content that the app tries to put into a Content-Length header).

I agree this is the less exploitable side, compared to parsing an arbitrarily long string, but it’s still possible. It’s also a bit weird for int(repr(x)) on an integer to fail on one side but not the other (though that is what used to happen in the past anyway, so maybe not so weird).

You just typed an entire screen’s worth of numbers into a prompt, is it really that unexpected? :wink:

You were also about to get an entire screen’s worth of numbers in response. Is that really helpful?

The default length limit is pretty arbitrary, yes, but it’s also well beyond the length of a number that humans can comprehend. They are valuable as intermediate results in a calculation, but we as people can’t do anything useful with numbers that big. Rendering it as “<4300 digits>” is probably much more useful, even without the leading ‘9’.

FWIW, I didn’t flag it and don’t agree with it being flagged.

Yes, and I’m sure they considered it and decided against it. Likely for the reasons I mentioned above, which they’ve used before for not backporting certain changes.

The SC is our new BDFL - they can, but don’t have to post justifications of their decisions. And the rest of us have agreed to accept them. Anyone who disagrees is more than welcome to run for the next SC to have more influence over it, but by definition, their decision is final.

@steve.dower I was talking about malicious user input. I wonder how many Python programs will crash unexpectedly when given the input '9' * 4300.

The default length limit is pretty arbitrary,

What the exact limit is doesn’t matter. This “overflow” problem I’m describing would happen even if the limit were 200 or 2000000.

From the testing we did to determine the initial default, not very much will crash unexpectedly. That’s how we landed on that number.

From the testing we did to determine the initial default, not very much will crash unexpectedly. That’s how we landed on that number.

I don’t see how you could possibly run tests on this. What kind of tests existed, before this new limit was introduced, that checked whether maliciously giving '9' * 4300 unexpectedly crashes the program?

1 Like