While working on some C code to implement a type somewhat like PyFunction_Type, but as a heap-allocated type, I ran into an issue with the handling of __doc__. My goal is, as with function objects, for the type’s docstring to be some constant and for every instance to have its own docstring.
In PyFunction_Type, this is implemented by setting tp_doc to the type’s docstring and having a PyMemberDef for __doc__ in tp_members that refers to the func_doc field in the backing structure.
In the PyType_Spec of a Demo type, I tried the same approach: setting the Py_tp_doc slot to some constant string, and having a __doc__ member in Py_tp_members using the offset of a char * in the instance struct. However, unlike with a regular function object, the __doc__ of an instance of Demo is Demo’s Py_tp_doc value, not the desired one (i.e., the instance’s own).
It turns out PyType_FromMetaclass (which I use to go from Demo’s PyType_Spec to a PyTypeObject) has some special handling for tp_doc, setting a __doc__ value in the constructed type’s __dict__: cpython/Objects/typeobject.c at a549f439384b4509b25639337ffea21c2e55d452 · python/cpython · GitHub
I wonder whether this causes the difference I observe. Compare
>>> def regular_function():
... """Regular docstring"""
...
>>> assert regular_function.__doc__ == "Regular docstring"
>>> assert type(regular_function).__doc__.startswith("Create a function object.")
>>>
>>> type(regular_function).__dict__["__doc__"]
<member '__doc__' of 'function' objects>
>>> type(type(regular_function).__dict__["__doc__"])
<class 'member_descriptor'>
with
>>> type(func.demo)
<class 'func.Demo'>
>>> assert func.Demo.__doc__ == "Demo doc"
>>> assert func.demo.__doc__ == "Demo instance doc"
Traceback (most recent call last):
  File "<python-input-12>", line 1, in <module>
    assert func.demo.__doc__ == "Demo instance doc"
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
>>> type(type(func.demo).__dict__["__doc__"])
<class 'str'>
In the PyFunction_Type case, the __doc__ entry in the type’s __dict__ is and remains a descriptor, whereas in the heap-allocated type case it is overwritten with a str.
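For what it’s worth, the same lookup behaviour seems reproducible in pure Python, without any C code. The sketch below is only an analogy to the heap-type case: PlainDoc and its _doc attribute are hypothetical names made up for illustration, standing in for Demo and its char * field.

```python
import types


class PlainDoc:
    """Type doc"""

    def __init__(self, doc):
        # Hypothetical per-instance docstring storage (stand-in for the C struct field).
        self._doc = doc


obj = PlainDoc("Instance doc")

# The class docstring is stored as a plain str in the type's __dict__ ...
doc_entry = PlainDoc.__dict__["__doc__"]
print(type(doc_entry))

# ... and a str has no __get__, so instance attribute lookup returns it as-is.
print(obj.__doc__)

# The built-in function type keeps a descriptor under the same key instead,
# which is what allows each function object to report its own docstring.
func_entry = types.FunctionType.__dict__["__doc__"]
print(hasattr(func_entry, "__get__"))
```

Since a str in the type’s __dict__ is not a descriptor, nothing gets a chance to redirect the lookup to per-instance storage, which matches what I see with Demo.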
Is this expected/desired? If so, is there a way to make this work? One way out I see is to construct a descriptor myself which “does the right thing” in its __get__, based on whether it’s invoked on the class or on an instance, put it in the type’s __dict__ (though it’s unclear how to do so, given that I’d like the type to be immutable and I couldn’t figure out how to access tp_dict using the limited API), and not set a Py_tp_doc slot. That seems rather inconvenient.
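To make the descriptor idea concrete, here is a minimal pure-Python sketch of the behaviour I have in mind. It does not answer how to install such an object from C with the limited API; DocDescriptor, the Demo class here and the _doc attribute are all hypothetical names used only for illustration.

```python
class DocDescriptor:
    """Hypothetical descriptor: type docstring on the class, own docstring on instances."""

    def __init__(self, type_doc):
        self._type_doc = type_doc

    def __get__(self, instance, owner=None):
        if instance is None:
            # Looked up on the class itself: return the constant type docstring.
            return self._type_doc
        # Looked up on an instance: return that instance's own docstring.
        return instance._doc


class Demo:
    # Placed in the type's __dict__ instead of a plain str docstring.
    __doc__ = DocDescriptor("Demo doc")

    def __init__(self, doc):
        self._doc = doc


demo = Demo("Demo instance doc")
print(Demo.__doc__)
print(demo.__doc__)
```

In this sketch the descriptor only defines __get__, so it is a non-data descriptor; the instance stores its docstring under a different name (_doc) so that nothing in the instance __dict__ shadows it.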