How should we handle integer overflow?

Do I need to handle overflow here, as @storchaka pointed out in gh-118184: Support tuples for `find`, `index`, `rfind` & `rindex` by nineteendo · Pull Request #119501 · python/cpython · GitHub?
It's going to make the implementation messy.

#define FIND_CHUNK_SIZE 10000
...
Py_ssize_t len = PyUnicode_GET_LENGTH(str);
ADJUST_INDICES(start, end, len);
// This addition can overflow when start is close to PY_SSIZE_T_MAX
for (; result == -1 && start <= end; start += FIND_CHUNK_SIZE) {
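One overflow-safe shape for that loop is to bound the increment by the remaining range instead of adding blindly. A minimal sketch of the idea, with `count_chunks` as a hypothetical illustration (not the PR's code) and `ptrdiff_t` standing in for `Py_ssize_t`:

```c
#include <stddef.h>

#define FIND_CHUNK_SIZE 10000

/* Chunked scan over [start, end] that never overflows `start`:
   stop before incrementing once fewer than FIND_CHUNK_SIZE
   positions remain. Returns how many chunks were visited. */
static ptrdiff_t count_chunks(ptrdiff_t start, ptrdiff_t end) {
    ptrdiff_t chunks = 0;
    while (start <= end) {
        chunks++;
        if (end - start < FIND_CHUNK_SIZE) {
            break;  /* last chunk: start += FIND_CHUNK_SIZE could overflow */
        }
        start += FIND_CHUNK_SIZE;
    }
    return chunks;
}
```

With `start` near the maximum value, the guard takes the `break` before the addition, so the increment never wraps.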

I can’t even create such a long string:

>>> string = "a" * 9223372036854775807
Traceback (most recent call last):
  File "<python-input-0>", line 1, in <module>
    string = "a" * 9223372036854775807
             ~~~~^~~~~~~~~~~~~~~~~~~~~
MemoryError

Is this an issue given that Python supports 32-bit builds?

I don’t know, that’s why I’m asking it here.

My guess is that on a 32-bit build you can trigger the overflow in that code.
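For scale: on a 32-bit build `PY_SSIZE_T_MAX` is 2**31 - 1, so a string close to 2 GiB is at least addressable, and a `start` within the last `FIND_CHUNK_SIZE` positions would make `start += FIND_CHUNK_SIZE` overflow. A small illustration, using `int32_t` as a stand-in for a 32-bit `Py_ssize_t`:

```c
#include <stdint.h>

#define FIND_CHUNK_SIZE 10000

/* Would `start += FIND_CHUNK_SIZE` exceed the 32-bit maximum?
   (Signed overflow is undefined behavior in C, so we test
   without performing the addition.) */
static int increment_would_overflow(int32_t start) {
    return start > INT32_MAX - FIND_CHUNK_SIZE;
}
```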

Does the existing, released code also have the overflow issue in `find`?


Resolved, thanks for the help.

Is this the solution for this thread?

if (a > PY_SSIZE_T_MAX - b) {
    // handle overflow
}

Yes, that’s the solution.

Can we have some helper macro/inline function for add/mul with overflow check?


I checked some compiler output. It seems `if (a > ULONG_MAX - b)` is not replaced with a carry-flag test.

#include <limits.h>
#include <stdbool.h>

bool add_overflow(unsigned long a, unsigned long b, unsigned long *out) {
    if (a > ULONG_MAX - b) {
        return true;
    }
    *out = a + b;
    return false;
}

bool add_overflow2(unsigned long a, unsigned long b, unsigned long *out) {
    return __builtin_add_overflow(a, b, out);
}
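Before looking at the generated code, a quick boundary check that the manual comparison and the builtin agree (assumes GCC or Clang for `__builtin_add_overflow`; `add_overflow_manual` just repeats the check above under a different name):

```c
#include <limits.h>
#include <stdbool.h>

/* Same manual check as add_overflow above: report overflow
   without computing the wrapped sum. */
static bool add_overflow_manual(unsigned long a, unsigned long b,
                               unsigned long *out) {
    if (a > ULONG_MAX - b) {
        return true;
    }
    *out = a + b;
    return false;
}
```

Both agree at the boundary: `ULONG_MAX - 1 + 1` fits, `ULONG_MAX + 1` overflows.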

macOS Apple Clang

gcc -O3 -S

_add_overflow:                          ; @add_overflow
	.cfi_startproc
; %bb.0:
	mvn	x8, x1
	cmp	x8, x0
	b.lo	LBB0_2
; %bb.1:
	add	x9, x1, x0
	str	x9, [x2]
LBB0_2:
	cmp	x8, x0
	cset	w0, lo
	ret
	.cfi_endproc
                                        ; -- End function
	.globl	_add_overflow2                  ; -- Begin function add_overflow2
	.p2align	2
_add_overflow2:                         ; @add_overflow2
	.cfi_startproc
; %bb.0:
	adds	x8, x0, x1
	cset	w0, hs
	str	x8, [x2]
	ret
	.cfi_endproc


Python/pytime.c has pytime_mul_check_overflow(x, y), which checks whether x*y would overflow, and pytime_add(), which reports overflow and clamps the result to [PyTime_MIN; PyTime_MAX] on overflow. The problem is the size of the API if we consider many input types. Which types matter the most?
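The clamping behavior described for `pytime_add()` can be sketched like this (illustrative only, not the actual pytime.c code, with `int64_t` standing in for the time type):

```c
#include <stdint.h>

/* Add two int64_t values; on overflow, set *overflow and clamp the
   result to INT64_MIN / INT64_MAX (pytime_add()-style behavior). */
static int64_t add_clamped(int64_t a, int64_t b, int *overflow) {
    *overflow = 0;
    if (b > 0 && a > INT64_MAX - b) {
        *overflow = 1;
        return INT64_MAX;
    }
    if (b < 0 && a < INT64_MIN - b) {
        *overflow = 1;
        return INT64_MIN;
    }
    return a + b;
}
```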

This was regarding Py_ssize_t, but I think the problem was overcome by restructuring the code so that it never goes out of the input bounds.

Thank you for the information; this will surely be useful.

I care only about Py_ssize_t: `bool Py_AddSsize_Overflow(Py_ssize_t a, Py_ssize_t b, Py_ssize_t *out);` and `bool Py_MulSsize_Overflow(Py_ssize_t a, Py_ssize_t b, Py_ssize_t *out);`.
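A portable sketch of those two helpers (the names follow the proposal above; `ptrdiff_t` stands in for `Py_ssize_t`, and the fallback paths are assumptions, not existing CPython code):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Proposed-style add helper: true on overflow, result in *out otherwise. */
static bool Py_AddSsize_Overflow(ptrdiff_t a, ptrdiff_t b, ptrdiff_t *out) {
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_add_overflow(a, b, out);
#else
    if ((b > 0 && a > PTRDIFF_MAX - b) || (b < 0 && a < PTRDIFF_MIN - b)) {
        return true;
    }
    *out = a + b;
    return false;
#endif
}

/* Proposed-style mul helper; the fallback uses division because the
   wrapped product itself would be undefined behavior. */
static bool Py_MulSsize_Overflow(ptrdiff_t a, ptrdiff_t b, ptrdiff_t *out) {
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_mul_overflow(a, b, out);
#else
    if (a > 0) {
        if (b > 0 ? a > PTRDIFF_MAX / b : b < PTRDIFF_MIN / a) {
            return true;
        }
    }
    else if (a < 0) {
        if (b > 0 ? a < PTRDIFF_MIN / b : (b != 0 && b < PTRDIFF_MAX / a)) {
            return true;
        }
    }
    *out = a * b;
    return false;
#endif
}
```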

// listobject.c
static PyObject *
list_repeat_lock_held(PyListObject *a, Py_ssize_t n)
{
    const Py_ssize_t input_size = Py_SIZE(a);
    ...
    // current code
    if (input_size > PY_SSIZE_T_MAX / n)
        return PyErr_NoMemory();
    Py_ssize_t output_size = input_size * n;

    // will be
    Py_ssize_t output_size;
    if (Py_MulSsize_Overflow(input_size, n, &output_size)) {
        return PyErr_NoMemory();
    }

// unicodeobject.c
...
    if (len > PY_SSIZE_T_MAX / 8)
        return PyErr_NoMemory();
    v = PyBytes_FromStringAndSize(NULL, len * 8);

    // will be

    Py_ssize_t bsize;
    if (Py_MulSsize_Overflow(len, 8, &bsize)) {
        return PyErr_NoMemory();
    }
    v = PyBytes_FromStringAndSize(NULL, bsize);
...
    if (PY_SSIZE_T_MAX / (Py_ssize_t)sizeof(wchar_t) - 1 < size) {
        return -1;
    }
    unicode = PyMem_RawMalloc((size + 1) * sizeof(wchar_t));

    // will be
    Py_ssize_t alloc_size;
    if (Py_MulSsize_Overflow(size + 1, (Py_ssize_t)sizeof(wchar_t), &alloc_size)) {
        return -1;
    }
    unicode = PyMem_RawMalloc(alloc_size);

I am not sure this improves readability, because we need to add temporary variables in many places.

FWIW, mimalloc has a helper function only for mul.
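For unsigned sizes a multiply-only helper is short, since `count * size > SIZE_MAX` reduces to one division. A sketch in that spirit (not mimalloc's actual code; the name is illustrative):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* size_t multiply with overflow check: true on overflow,
   otherwise store count * size in *total. */
static bool mul_overflow_size(size_t count, size_t size, size_t *total) {
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_mul_overflow(count, size, total);
#else
    if (size != 0 && count > SIZE_MAX / size) {
        return true;
    }
    *total = count * size;
    return false;
#endif
}
```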