PEP 697 – Limited C API for Extending Opaque Types

Thanks for tracking this down! Adding a custom __doc__ member sounds like a perfectly fine workaround. I don’t think the extra space matters at all in this case.

I have adapted nanobind to use the functionality in this PEP via your extend-opaque branch. The result is committed in nanobind’s limited_api branch (wjakob/nanobind at limited_api on GitHub, specifically the top two commits). The set of changes entails:

  • Creating the static property type using a negative basicsize, along with a Py_tp_members entry for the __doc__ field whose offset is marked with the new PY_RELATIVE_OFFSET flag.
  • Using negative basicsize to add extra storage to instances of nanobind’s type metaclasses (nb_type, nb_enum).
  • Using PyObject_GetTypeData() to get access to said extra storage.
  • Generally refactoring the codebase a bit to rely on auto-inheritance of GC slots (tp_traverse, tp_clear) rather than manually copy-pasting them from parent to child classes via PyType_GetSlot and PyType_FromSpec.
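
The layout idea behind the negative basicsize and PyObject_GetTypeData() items above can be modeled in a few lines of plain C. This is only a toy sketch of the arithmetic, not CPython's implementation: toy_type, toy_align_up, toy_resolve_basicsize, toy_get_type_data and the TOY_ALIGN value are all hypothetical names and assumptions made up for illustration.

```c
#include <stddef.h>

/* Assumption: a fixed alignment for the extra storage in this toy model. */
#define TOY_ALIGN (2 * sizeof(void *))

/* Hypothetical stand-in for a type object: only the fields the model needs. */
typedef struct toy_type {
    const struct toy_type *base;  /* parent type, NULL for the root */
    size_t basicsize;             /* total instance size in bytes */
} toy_type;

/* Round n up to the next multiple of TOY_ALIGN. */
static size_t toy_align_up(size_t n) {
    return (n + TOY_ALIGN - 1) / TOY_ALIGN * TOY_ALIGN;
}

/* Model of a negative basicsize: spec_basicsize < 0 means
 * "whatever the base needs, plus -spec_basicsize extra bytes". */
static size_t toy_resolve_basicsize(const toy_type *base,
                                    ptrdiff_t spec_basicsize) {
    return toy_align_up(base->basicsize) + (size_t) (-spec_basicsize);
}

/* Model of a type-data lookup: the extra storage of cls starts right
 * after the (aligned) data of its base, which stays opaque to cls. */
static void *toy_get_type_data(void *obj, const toy_type *cls) {
    return (char *) obj + toy_align_up(cls->base->basicsize);
}
```

The point of the model is that the subclass never needs the base's struct definition at compile time: both the total size and the pointer to the extra storage are derived from the base's basicsize at runtime.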

With these changes, the test suite passes with a debug build of nanobind and a debug build of your branch :tada:. I had to slightly cheat by cherry-picking the PR that made outgoing vector calls limited API-compatible on top of your changes.

It wasn’t 100% clear to me what the final convention for inheriting basicsize and/or itemsize is. Can both be set to zero to request this? I am doing so now, and it appears to work. It’s possible that the PEP document is slightly out of sync with what is done in the PR; for example, Py_tp_inherit_itemsize is mentioned in the PEP but does not appear in your branch.

There are also some places in the implementation where I still need to call PyType_GetSlot(). It was not clear to me to what extent that is a potential limited API violation. To give a concrete case, consider the basic tp_dealloc implementation from the CPython documentation (Type Objects, Python 3.11.0 documentation):

static void foo_dealloc(foo_object *self) {
    PyTypeObject *tp = Py_TYPE(self);
    // free references and buffers here
    tp->tp_free(self);
    Py_DECREF(tp);
}

In the limited API, the tp->tp_free access isn’t legal, and one needs to use PyType_GetSlot(tp, Py_tp_free). Is that okay?

By the way: the one function from the PR that I am not using in my changes is PyObject_GetTypeDataSize(), but it could of course still be useful for other applications.