How should we handle integer overflow?

Do I need to handle overflow here, as @storchaka pointed out in gh-118184: Support tuples for `find`, `index`, `rfind` & `rindex` by nineteendo · Pull Request #119501 · python/cpython · GitHub?
It's going to make the implementation messy.

#define FIND_CHUNK_SIZE 10000
...
Py_ssize_t len = PyUnicode_GET_LENGTH(str);
ADJUST_INDICES(start, end, len);
// This addition can overflow when start is close to PY_SSIZE_T_MAX
for (; result == -1 && start <= end; start += FIND_CHUNK_SIZE) {
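One overflow-safe shape for that loop is to bound the increment by the remaining range instead of adding blindly. A minimal sketch of the idea, with `count_chunks` as a hypothetical illustration (not the PR's code) and `ptrdiff_t` standing in for `Py_ssize_t`:

```c
#include <stddef.h>

#define FIND_CHUNK_SIZE 10000

/* Chunked scan over [start, end] that never overflows `start`:
   stop before incrementing once fewer than FIND_CHUNK_SIZE
   positions remain. Returns how many chunks were visited. */
static ptrdiff_t count_chunks(ptrdiff_t start, ptrdiff_t end) {
    ptrdiff_t chunks = 0;
    while (start <= end) {
        chunks++;
        if (end - start < FIND_CHUNK_SIZE) {
            break;  /* last chunk: start += FIND_CHUNK_SIZE could overflow */
        }
        start += FIND_CHUNK_SIZE;
    }
    return chunks;
}
```

With `start` near the maximum value, the guard takes the `break` before the addition, so the increment never wraps.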

I can’t even create such a long string:

>>> string = "a" * 9223372036854775807
Traceback (most recent call last):
  File "<python-input-0>", line 1, in <module>
    string = "a" * 9223372036854775807
             ~~~~^~~~~~~~~~~~~~~~~~~~~
MemoryError

Is this an issue given that Python supports 32-bit builds?

I don’t know, that’s why I’m asking it here.

My guess is that on a 32-bit build you can trigger the overflow in that code.
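For scale: on a 32-bit build `PY_SSIZE_T_MAX` is 2**31 - 1, so a string close to 2 GiB is at least addressable, and a `start` within the last `FIND_CHUNK_SIZE` positions would make `start += FIND_CHUNK_SIZE` overflow. A small illustration, using `int32_t` as a stand-in for a 32-bit `Py_ssize_t`:

```c
#include <stdint.h>

#define FIND_CHUNK_SIZE 10000

/* Would `start += FIND_CHUNK_SIZE` exceed the 32-bit maximum?
   (Signed overflow is undefined behavior in C, so we test
   without performing the addition.) */
static int increment_would_overflow(int32_t start) {
    return start > INT32_MAX - FIND_CHUNK_SIZE;
}
```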

Does the existing, released code also have the overflow issue in `find`?


Resolved, thanks for the help.

Is this the solution for this thread?

if (a > PY_SSIZE_T_MAX - b) {
    // handle overflow
}

Yes, that’s the solution.

Can we have some helper macro/inline function for add/mul with overflow check?


I checked some compiler output. It seems `if (a > ULONG_MAX - b)` is not replaced with a carry-flag test.

#include <limits.h>
#include <stdbool.h>

bool add_overflow(unsigned long a, unsigned long b, unsigned long *out) {
    if (a > ULONG_MAX - b) {
        return true;
    }
    *out = a + b;
    return false;
}

bool add_overflow2(unsigned long a, unsigned long b, unsigned long *out) {
    return __builtin_add_overflow(a, b, out);
}
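Before looking at the generated code, a quick boundary check that the manual comparison and the builtin agree (assumes GCC or Clang for `__builtin_add_overflow`; `add_overflow_manual` just repeats the check above under a different name):

```c
#include <limits.h>
#include <stdbool.h>

/* Same manual check as add_overflow above: report overflow
   without computing the wrapped sum. */
static bool add_overflow_manual(unsigned long a, unsigned long b,
                               unsigned long *out) {
    if (a > ULONG_MAX - b) {
        return true;
    }
    *out = a + b;
    return false;
}
```

Both agree at the boundary: `ULONG_MAX - 1 + 1` fits, `ULONG_MAX + 1` overflows.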

macOS Apple Clang

gcc -O3 -S

_add_overflow:                          ; @add_overflow
	.cfi_startproc
; %bb.0:
	mvn	x8, x1
	cmp	x8, x0
	b.lo	LBB0_2
; %bb.1:
	add	x9, x1, x0
	str	x9, [x2]
LBB0_2:
	cmp	x8, x0
	cset	w0, lo
	ret
	.cfi_endproc
                                        ; -- End function
	.globl	_add_overflow2                  ; -- Begin function add_overflow2
	.p2align	2
_add_overflow2:                         ; @add_overflow2
	.cfi_startproc
; %bb.0:
	adds	x8, x0, x1
	cset	w0, hs
	str	x8, [x2]
	ret
	.cfi_endproc


Python/pytime.c has pytime_mul_check_overflow(x, y), which checks whether x*y would overflow, and pytime_add(), which reports overflow and clamps the result to [PyTime_MIN; PyTime_MAX] on overflow. The problem is the size of the API if we consider many input types. Which types matter the most?
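The clamping behavior described for `pytime_add()` can be sketched like this (illustrative only, not the actual pytime.c code, with `int64_t` standing in for the time type):

```c
#include <stdint.h>

/* Add two int64_t values; on overflow, set *overflow and clamp the
   result to INT64_MIN / INT64_MAX (pytime_add()-style behavior). */
static int64_t add_clamped(int64_t a, int64_t b, int *overflow) {
    *overflow = 0;
    if (b > 0 && a > INT64_MAX - b) {
        *overflow = 1;
        return INT64_MAX;
    }
    if (b < 0 && a < INT64_MIN - b) {
        *overflow = 1;
        return INT64_MIN;
    }
    return a + b;
}
```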

This was regarding Py_ssize_t, but I think the problem was overcome by restructuring the code so that it never goes out of the input bounds.

Thank you for the information; this will surely be useful.

I care only about Py_ssize_t: `bool Py_AddSsize_Overflow(Py_ssize_t a, Py_ssize_t b, Py_ssize_t *out);` and `bool Py_MulSsize_Overflow(Py_ssize_t a, Py_ssize_t b, Py_ssize_t *out);`.
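A portable sketch of those two helpers (the names follow the proposal above; `ptrdiff_t` stands in for `Py_ssize_t`, and the fallback paths are assumptions, not existing CPython code):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Proposed-style add helper: true on overflow, result in *out otherwise. */
static bool Py_AddSsize_Overflow(ptrdiff_t a, ptrdiff_t b, ptrdiff_t *out) {
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_add_overflow(a, b, out);
#else
    if ((b > 0 && a > PTRDIFF_MAX - b) || (b < 0 && a < PTRDIFF_MIN - b)) {
        return true;
    }
    *out = a + b;
    return false;
#endif
}

/* Proposed-style mul helper; the fallback uses division because the
   wrapped product itself would be undefined behavior. */
static bool Py_MulSsize_Overflow(ptrdiff_t a, ptrdiff_t b, ptrdiff_t *out) {
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_mul_overflow(a, b, out);
#else
    if (a > 0) {
        if (b > 0 ? a > PTRDIFF_MAX / b : b < PTRDIFF_MIN / a) {
            return true;
        }
    }
    else if (a < 0) {
        if (b > 0 ? a < PTRDIFF_MIN / b : (b != 0 && b < PTRDIFF_MAX / a)) {
            return true;
        }
    }
    *out = a * b;
    return false;
#endif
}
```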

// listobject.c
static PyObject *
list_repeat_lock_held(PyListObject *a, Py_ssize_t n)
{
    const Py_ssize_t input_size = Py_SIZE(a);
    ...
    // current code
    if (input_size > PY_SSIZE_T_MAX / n)
        return PyErr_NoMemory();
    Py_ssize_t output_size = input_size * n;

    // will be
    Py_ssize_t output_size;
    if (Py_MulSsize_Overflow(input_size, n, &output_size)) {
        return PyErr_NoMemory();
    }

// unicodeobject.c
...
    if (len > PY_SSIZE_T_MAX / 8)
        return PyErr_NoMemory();
    v = PyBytes_FromStringAndSize(NULL, len * 8);

    // will be

    Py_ssize_t bsize;
    if (Py_MulSsize_Overflow(len, 8, &bsize)) {
        return PyErr_NoMemory();
    }
    v = PyBytes_FromStringAndSize(NULL, bsize);
...
    if (PY_SSIZE_T_MAX / (Py_ssize_t)sizeof(wchar_t) - 1 < size) {
        return -1;
    }
    unicode = PyMem_RawMalloc((size + 1) * sizeof(wchar_t));

    // will be
    Py_ssize_t alloc_size;
    if (Py_MulSsize_Overflow(size + 1, (Py_ssize_t)sizeof(wchar_t), &alloc_size)) {
        return -1;
    }
    unicode = PyMem_RawMalloc(alloc_size);

I am not sure this improves readability, because we need to add temporary variables in many places.

FWIW, mimalloc has a helper function only for mul.
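For unsigned sizes a multiply-only helper is short, since `count * size > SIZE_MAX` reduces to one division. A sketch in that spirit (not mimalloc's actual code; the name is illustrative):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* size_t multiply with overflow check: true on overflow,
   otherwise store count * size in *total. */
static bool mul_overflow_size(size_t count, size_t size, size_t *total) {
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_mul_overflow(count, size, total);
#else
    if (size != 0 && count > SIZE_MAX / size) {
        return true;
    }
    *total = count * size;
    return false;
#endif
}
```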