Heap type handling of Py_tp_doc vs. static types

While working on some C code to implement a type somewhat like PyFunction_Type, but as a heap-allocated type, I ran into an issue with the handling of __doc__. My goal is, as with function objects, for the type’s docstring to be some constant and for every instance to have its own docstring.

In PyFunction_Type, this is implemented by setting tp_doc to the type’s docstring and having a PyMemberDef for __doc__ in tp_members that refers to the func_doc field in the backing structure.

In the PyType_Spec of a Demo type, I tried the same approach: setting the Py_tp_doc slot to some constant string and adding a __doc__ member to Py_tp_members, using the offset of a char * in the instance struct. However, unlike with a regular function object, the __doc__ of an instance of Demo is Demo’s Py_tp_doc value, not the desired one (i.e., the one of the instance).

It turns out PyType_FromMetaclass (which I use to go from Demo’s PyType_Spec to a PyTypeObject) has some special handling for tp_doc, setting a __doc__ value in the constructed type’s __dict__; see cpython/Objects/typeobject.c at commit a549f439384b4509b25639337ffea21c2e55d452 in python/cpython on GitHub.

I wonder whether this causes the difference I observe? Compare

>>> def regular_function():
...     """Regular docstring"""
... 
>>> assert regular_function.__doc__ == "Regular docstring"
>>> assert type(regular_function).__doc__.startswith("Create a function object.")
>>> 
>>> type(regular_function).__dict__["__doc__"]
<member '__doc__' of 'function' objects>
>>> type(type(regular_function).__dict__["__doc__"])
<class 'member_descriptor'>

with

>>> type(func.demo)
<class 'func.Demo'>
>>> assert func.Demo.__doc__ == "Demo doc"
>>> assert func.demo.__doc__ == "Demo instance doc"
Traceback (most recent call last):
  File "<python-input-12>", line 1, in <module>
    assert func.demo.__doc__ == "Demo instance doc"
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
>>> type(type(func.demo).__dict__["__doc__"])
<class 'str'>

In the PyFunction_Type case, the __doc__ entry in the type’s __dict__ is and remains a descriptor, whereas in the heap-allocated type case it is overwritten with a str.

Is this expected or desired? If so, is there a way to make this work? One way out I see is to construct a descriptor myself whose __get__ “does the right thing” depending on whether it is invoked on the class or on an instance, put it in the type’s __dict__, and not set a Py_tp_doc slot at all. That seems rather inconvenient, and it is unclear to me how to get the descriptor into the __dict__, given that I’d like the type to be immutable and I couldn’t figure out how to access tp_dict using the limited API.
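
To make the descriptor idea concrete, here is a minimal pure-Python sketch of the behaviour I have in mind. It is only an illustration of the lookup semantics, not the C implementation: the names DocDescriptor and _instance_doc are made up for this example, and it does not address the immutability or limited-API questions.

```python
class DocDescriptor:
    """Non-data descriptor: type-level doc on the class, per-instance doc otherwise."""

    def __init__(self, type_doc):
        self.type_doc = type_doc

    def __get__(self, instance, owner=None):
        if instance is None:
            # Looked up on the class itself, e.g. Demo.__doc__
            return self.type_doc
        # Looked up on an instance, e.g. demo.__doc__
        return instance._instance_doc


class Demo:
    # No class docstring here: the descriptor is the only __doc__ entry
    # in the class namespace, so nothing replaces it with a plain str.
    __doc__ = DocDescriptor("Demo doc")

    def __init__(self, doc):
        self._instance_doc = doc  # hypothetical per-instance storage


demo = Demo("Demo instance doc")
print(Demo.__doc__)
print(demo.__doc__)
```

With this, Demo.__doc__ gives the constant type docstring and demo.__doc__ gives the instance’s own one, which is the behaviour I would like from the C type as well.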

Could you open an issue? It looks like we shouldn’t add __doc__ if it already exists (the same way we skip __module__).

In 3.14 there’ll be a way to adjust a type before making it immutable, PyType_Freeze, but that also won’t help you now :(