PyObject_GetIter() segfaults when trying to get tp_iter

DanielLee343 · November 11, 2023, 11:56pm

Hi, I’m hacking on CPython 3.12. I’m calling PyObject_GetIter() within a loop to see whether each PyObject is iterable. If it is, I get its iterator and walk through it. Pseudocode:

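for (;;) // some conditions
{
    // each_op must point to a live object (one that has not been freed)
    PyObject *each_op = obtain_from_some_source();
    // Returns a new reference to an iterator, or NULL with TypeError set
    PyObject *iterator = PyObject_GetIter(each_op);
    if (!iterator)
    {
        return; // not iterable; the exception is left set for the caller
    }
    PyObject *inner_op;
    while ((inner_op = PyIter_Next(iterator)))
    {
        // do some logic
        Py_DECREF(inner_op); // PyIter_Next() returns a new reference
    }
    Py_DECREF(iterator);
    // PyIter_Next() returns NULL on exhaustion and on error alike
    if (PyErr_Occurred())
    {
        return;
    }
    ...
}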

For reference, here is the implementation of PyObject_GetIter():

PyObject *PyObject_GetIter(PyObject *o)
{
    PyTypeObject *t = Py_TYPE(o);
    getiterfunc f;

    f = t->tp_iter;
    if (f == NULL)
    {
        if (PySequence_Check(o))
            return PySeqIter_New(o);
        return type_error("'%.200s' object is not iterable", o);
    }
    else
    {
        PyObject *res = (*f)(o);
        if (res != NULL && !PyIter_Check(res))
        {
            PyErr_Format(PyExc_TypeError,
                         "iter() returned non-iterator "
                         "of type '%.100s'",
                         Py_TYPE(res)->tp_name);
            Py_SETREF(res, NULL);
        }
        return res;
    }
}

However, my call segfaults at the line f = t->tp_iter; and GDB shows:

(gdb) p o
$1 = (PyObject *) 0x7ffff772bac0
(gdb) p t->tp_iter
Cannot access memory at address 0xd7
(gdb) p t
$2 = (PyTypeObject *) 0xffffffffffffffff
(gdb)

It seems the PyObject o is a valid object (I guess?), but it does not seem to have a tp_iter field. I also don’t know what kind of object it is, since I cannot call PyObject_Print() or any other function on it.

Is this supposed to happen, and is there any way to debug it? Thanks for any replies.

kknechtel · November 12, 2023, 12:48am

The debug information implies that Py_TYPE(o) evaluated to -1 (all bits set, in two’s complement, thus 0xffffffffffffffff in the debugger). It should have resulted in a pointer to the object’s type (i.e., a PyTypeObject* that represents o’s class).

Py_TYPE is a simple static inline function (in 3.12; I think it used to be a macro). We conclude that the ob_type field of the object wasn’t set. In Python terms, we are trying to call the __iter__ method, but the object is somehow not an instance of any particular class, so method lookup (which assumes this information is set) crashes.

There is something wrong either in obtain_from_some_source or in the “source” that it’s processing. The comment above Py_TYPE tells us: “// bpo-39573: The Py_SET_TYPE() function must be used to set an object type.” Presumably, this didn’t happen.
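The address in the GDB error message is consistent with this reading. Below is a minimal sketch in plain C, assuming a 64-bit build in which tp_iter sits 0xd8 bytes into PyTypeObject. That offset is an inference from the GDB output, not something stated in the thread; on a real build it could be checked with offsetof(PyTypeObject, tp_iter). The helper names are hypothetical. The sketch shows that a type pointer with all bits set, plus that offset, wraps around to 0xd7.

```c
#include <stdint.h>

/* Assumption: offset of tp_iter inside PyTypeObject on a 64-bit build.
 * Inferred from the GDB output in this thread; verify on a real build
 * with offsetof(PyTypeObject, tp_iter). */
#define ASSUMED_TP_ITER_OFFSET ((uintptr_t)0xd8)

/* Hypothetical helper: the address read for a struct member, given the
 * struct's base address and the member's offset. Unsigned arithmetic
 * wraps around on overflow. */
static uintptr_t member_address(uintptr_t base, uintptr_t offset)
{
    return base + offset;
}

/* Address of t->tp_iter when t is the bogus pointer seen in GDB
 * (all bits set, i.e. UINTPTR_MAX). */
static uintptr_t bogus_tp_iter_address(void)
{
    return member_address(UINTPTR_MAX, ASSUMED_TP_ITER_OFFSET);
}
```

Under these assumptions bogus_tp_iter_address() yields 0xd7, the address GDB reported as inaccessible, so the fault comes from the type pointer itself rather than from a missing tp_iter slot.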

DanielLee343 · November 12, 2023, 6:35pm

Thanks. You’re right: my obtain_from_some_source() function was returning an object that had already been freed, which explains the crash.
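One common way for a "source" to end up handing out freed objects is to store a pointer without taking a reference of its own. The thread does not say whether that was the cause here, so the following is only a toy model in plain C, not CPython code. ToyObject, toy_incref, toy_decref, ToySource and source_add are hypothetical names standing in for PyObject, Py_INCREF, Py_DECREF and whatever container obtain_from_some_source() reads from. It sketches how a container that takes its own reference keeps an object alive after the creator drops its reference.

```c
#include <stdlib.h>

/* Toy stand-in for a reference-counted object; NOT CPython's PyObject. */
typedef struct {
    long refcnt;
    int payload;
} ToyObject;

/* Allocates an object; the creator owns the first reference. */
static ToyObject *toy_new(int payload)
{
    ToyObject *op = malloc(sizeof *op);
    if (op != NULL) {
        op->refcnt = 1;
        op->payload = payload;
    }
    return op;
}

static void toy_incref(ToyObject *op)
{
    op->refcnt++;
}

/* Drops one reference and frees the object when none remain.
 * Returns 1 if the object was freed, 0 otherwise. */
static int toy_decref(ToyObject *op)
{
    if (--op->refcnt == 0) {
        free(op);
        return 1;
    }
    return 0;
}

#define SOURCE_CAPACITY 8

/* Hypothetical "source" that hands objects out later. */
typedef struct {
    ToyObject *items[SOURCE_CAPACITY];
    int count;
} ToySource;

/* Stores op and takes a reference of its own, so the object stays alive
 * even if the caller releases its reference afterwards.
 * Returns 0 on success, -1 if the source is full. */
static int source_add(ToySource *src, ToyObject *op)
{
    if (src->count >= SOURCE_CAPACITY) {
        return -1;
    }
    toy_incref(op);
    src->items[src->count++] = op;
    return 0;
}

/* Releases every reference the source holds. */
static void source_clear(ToySource *src)
{
    for (int i = 0; i < src->count; i++) {
        toy_decref(src->items[i]);
    }
    src->count = 0;
}
```

If source_add() skipped the toy_incref() call, the creator's toy_decref() would free the object while the source still held its address, and any later read through that pointer would see whatever the allocator left behind.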
