While working on some C code to implement a type somewhat like PyFunction_Type, but as a heap-allocated type, I ran into an issue with the handling of __doc__. My goal is, as with function objects, for the type’s docstring to be some constant and for every instance to have its own docstring.
In PyFunction_Type, this is implemented by setting tp_doc to the type’s docstring and having a PyMemberDef for __doc__ in tp_members that refers to the func_doc field in the backing structure.
In the PyType_Spec of a Demo type, I tried the same approach: setting the Py_tp_doc slot to some constant string and having a __doc__ member in Py_tp_members that uses the offset of a char * in the instance struct. However, unlike with a regular function object, the __doc__ of an instance of Demo is Demo’s Py_tp_doc value, not the desired one (i.e., that of the instance).
It turns out PyType_FromMetaclass (which I use to go from Demo’s PyType_Spec to a PyTypeObject) has some special handling for tp_doc, setting a __doc__ value in the constructed type’s __dict__: cpython/Objects/typeobject.c at a549f439384b4509b25639337ffea21c2e55d452 · python/cpython · GitHub
I wonder whether this causes the difference I observe. Compare
>>> def regular_function():
...     """Regular docstring"""
...
>>> assert regular_function.__doc__ == "Regular docstring"
>>> assert type(regular_function).__doc__.startswith("Create a function object.")
>>>
>>> type(regular_function).__dict__["__doc__"]
<member '__doc__' of 'function' objects>
>>> type(type(regular_function).__dict__["__doc__"])
<class 'member_descriptor'>
with
>>> type(func.demo)
<class 'func.Demo'>
>>> assert func.Demo.__doc__ == "Demo doc"
>>> assert func.demo.__doc__ == "Demo instance doc"
Traceback (most recent call last):
  File "<python-input-12>", line 1, in <module>
    assert func.demo.__doc__ == "Demo instance doc"
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
>>> type(type(func.demo).__dict__["__doc__"])
<class 'str'>
In the PyFunction_Type case, the __doc__ entry in the type’s __dict__ remains a descriptor, whereas in the heap-allocated type case it is overwritten with a str.
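The effect of such an overwrite can be mimicked in pure Python. This is only a minimal sketch of the lookup behaviour, not of what my C code does: InstanceDoc, Demo and _doc are hypothetical names, and assigning to Demo.__doc__ merely stands in for the str that ends up in the type’s __dict__.

```python
class InstanceDoc:
    """Hypothetical non-data descriptor returning a per-instance docstring."""

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed through the class
        return instance._doc  # accessed through an instance


class Demo:
    __doc__ = InstanceDoc()  # a descriptor sits in the type's __dict__

    def __init__(self, doc):
        self._doc = doc  # hypothetical per-instance storage


demo = Demo("Demo instance doc")
before = demo.__doc__  # resolved through the descriptor

# Stand-in for the overwrite: the __dict__ entry becomes a plain str.
Demo.__doc__ = "Demo doc"
after = demo.__doc__  # the instance now sees the class-level string
kind = type(vars(Demo)["__doc__"])

print(before, "|", after, "|", kind)
```

Once the entry is a plain str, there is no descriptor left to consult for instance access, which matches what I see with func.demo.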
Is this expected/desired? If so, is there a way to make this work? One way out I see is to construct a descriptor myself that “does the right thing” in its __get__, depending on whether it is invoked on the class or on an instance, put it in the type’s __dict__, and not set a Py_tp_doc slot. That seems rather inconvenient, and it is unclear to me how to get the descriptor into the __dict__, given that I’d like the type to be immutable and I couldn’t figure out how to access tp_dict using the limited API.
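To make the descriptor idea concrete, here is a rough pure-Python sketch of what I have in mind. ClassOrInstanceDoc, Demo and _doc are hypothetical names for illustration only; it relies on CPython calling the descriptor’s __get__ with instance set to None when __doc__ is read from a heap type, and it says nothing about how to install such an object from C using the limited API.

```python
class ClassOrInstanceDoc:
    """Hypothetical descriptor: constant doc on the class, own doc per instance."""

    def __init__(self, class_doc):
        self._class_doc = class_doc

    def __get__(self, instance, owner=None):
        if instance is None:
            # Invoked on the class: return the constant type docstring.
            return self._class_doc
        # Invoked on an instance: return that instance's docstring.
        return instance._doc


class Demo:
    # The descriptor takes the place of a plain string docstring.
    __doc__ = ClassOrInstanceDoc("Demo doc")

    def __init__(self, doc):
        self._doc = doc  # hypothetical per-instance storage


demo = Demo("Demo instance doc")
class_doc = Demo.__doc__
instance_doc = demo.__doc__
print(class_doc, "|", instance_doc)
```

In Python this gives the behaviour I am after for both the class and its instances; the open question remains how to get the equivalent into the __dict__ of an immutable type created from a PyType_Spec.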