Thanks for tracking this down! Adding a custom __doc__ member sounds like a perfectly fine workaround. I don’t think the extra space matters at all in this case.
I have adapted nanobind to use the functionality in this PEP via your extend-opaque branch. The result is committed in nanobind’s limited_api branch (GitHub - wjakob/nanobind at limited_api, specifically the top two commits). The set of changes entails:
- Creating the static property type using a negative basicsize along with a Py_tp_members entry for the __doc__ field that is referenced using the new PY_RELATIVE_OFFSET flag.
- Using negative basicsize to add extra storage to instances of nanobind’s type metaclasses (nb_type, nb_enum).
- Using PyObject_GetTypeData() to get access to said extra storage.
- Generally refactoring the codebase a bit to rely on auto-inheritance of GC slots (tp_traverse, tp_clear) rather than manually copy-pasting them from parent to child classes via PyType_GetSlot and PyType_FromSpec.
With these changes, the test suite passes with a debug build of nanobind and a debug build of your branch. I had to cheat slightly by cherry-picking the PR that made outgoing vector calls limited API-compatible on top of your changes.
It wasn’t 100% clear to me what the final convention for inheriting basicsize and/or itemsize is. Can both be set to zero to request this? I am doing so now, and it appears to work. It’s possible that the PEP document is slightly out of sync with what is done in the PR: for example, Py_tp_inherit_itemsize is mentioned in the PEP but does not appear in your branch.
There are also some places in the implementation where I still need to call PyType_GetSlot(). It was not clear to me to what extent that is a potential limited API violation. As an example, consider the basic tp_dealloc implementation from the CPython documentation (the “Type Objects” page of the Python 3.11.0 documentation):
static void foo_dealloc(foo_object *self) {
    PyTypeObject *tp = Py_TYPE(self);
    // free references and buffers here
    tp->tp_free(self);
    Py_DECREF(tp);
}
In the limited API, the tp->tp_free access isn’t legal, and one needs to use PyType_GetSlot(tp, Py_tp_free). Is that okay?
By the way: the main function of the PR I am not using in my changes is PyObject_GetTypeDataSize(), but it could of course still be useful for other applications.