Py_BuildValue function behavior

I don’t understand how the Py_BuildValue function behaves when the number of format units doesn’t match the number of arguments. It fails silently, with no error or warning, which makes the problem hard to notice if you have one. Am I missing something about this function?

static PyObject *
iter_search(PyObject *self, PyObject *args){
    PyObject *list = Py_BuildValue("[ssss]",
        "CPython",
        "Jython",
        "Cython",
        "CPython",
        "Cython",
        "CPython",
        "CPython");

    PyObject *cpython = PyUnicode_FromString("CPython");
    Py_ssize_t result = _PySequence_IterSearch(
        list,
        cpython,
        PY_ITERSEARCH_COUNT);
    /* Py_ssize_t is not necessarily a long, so use %zd rather than %ld. */
    fprintf(stdout, "CPython count: %zd\n", result);

    Py_DECREF(list);
    Py_DECREF(cpython);
    Py_RETURN_NONE;
}

Output:

CPython count: 2

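The format string tells Py_BuildValue what the arguments are, i.e. how many there are and the type of each one. You’ve told it that there are 4 arguments of type char*, so it reads exactly 4 values from the argument list it has been given and never looks at the remaining 3. The list it builds is therefore ['CPython', 'Jython', 'Cython', 'CPython'], which is why the count is 2.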

The C language itself has no way of telling at runtime how many arguments there are or what their types are.

Standard C functions such as printf and CPython functions such as Py_BuildValue rely on a format string to tell the function such things, and that string is parsed at runtime by the function each time it’s called. If there’s a mismatch, the function has no way of knowing. Extra arguments, as in your example, are simply ignored; too few arguments, or arguments of the wrong type, are undefined behaviour, so it may crash either then or at some point in the future.
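
To make that concrete, here is a minimal sketch of a format-driven variadic function. It is not how Py_BuildValue is actually implemented; count_matching is a made-up helper that only understands the 's' format unit, but it trusts its format string in the same way.

```c
#include <stdarg.h>
#include <string.h>

/* Hypothetical helper, not part of CPython: for each 's' in fmt it pulls
 * one const char * from the variadic list and counts those equal to needle.
 * It has no way to check how many arguments the caller really passed. */
static int
count_matching(const char *needle, const char *fmt, ...)
{
    va_list ap;
    int count = 0;

    va_start(ap, fmt);
    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p != 's') {
            continue;   /* this sketch ignores everything except 's' */
        }
        /* Only the format string says that another argument exists. */
        const char *arg = va_arg(ap, const char *);
        if (strcmp(arg, needle) == 0) {
            count++;
        }
    }
    va_end(ap);
    return count;
}
```

Called as count_matching("CPython", "ssss", ...) with the same seven strings as in your example, it consumes only the first four and returns 2. With "sssssss" it sees all seven and returns 4.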

The compiler might spot mismatches at compile time for those functions it knows about, ones that are part of the standard C library, but otherwise, it’ll have no idea and just compile the code as-is.
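
As an aside, GCC and Clang let you opt your own printf-style functions into that checking with the format attribute. The sketch below assumes one of those compilers; log_to and PRINTF_LIKE are names I made up for illustration. It only helps for printf-style format strings. Py_BuildValue uses its own format language, so as far as I know this kind of check doesn’t apply to it.

```c
#include <stdarg.h>
#include <stdio.h>

/* The format attribute is a GCC/Clang extension; expand to nothing elsewhere. */
#if defined(__GNUC__) || defined(__clang__)
#  define PRINTF_LIKE(fmt_idx, first_arg) \
       __attribute__((format(printf, fmt_idx, first_arg)))
#else
#  define PRINTF_LIKE(fmt_idx, first_arg)
#endif

/* Hypothetical wrapper: parameter 3 is the format string and the variadic
 * arguments start at parameter 4, so the compiler can check calls to it. */
static int log_to(char *buf, size_t size, const char *fmt, ...)
    PRINTF_LIKE(3, 4);

static int
log_to(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    int n = vsnprintf(buf, size, fmt, ap);  /* formats into buf */
    va_end(ap);
    return n;
}
```

With warnings enabled (for example -Wall), a call such as log_to(buf, sizeof buf, "%d", "oops") should then be flagged at compile time instead of misbehaving at runtime.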
