Heap type with base type: what about tp_dealloc?

Hello,

In a PoC I’m working on, I’d like to use the Py_mod_create slot in a PyModuleDef to return a module of a custom type that subclasses PyModule_Type. An instance of this type can then carry extra data for use by Py_mod_exec, m_free and similar functions, while retaining the import machinery’s special-cased support for PyModule_Type modules.

I started implementing this type as a heap-allocated type (trying to target the limited API), using a static PyType_Spec. Since the object structure of moduletype isn’t public, I use a negative basicsize in the spec. Finally, I use PyType_FromSpecWithBases(&spec, (PyObject *)&PyModule_Type) to instantiate the type object.

This works, but I’m puzzled about implementing the Py_tp_dealloc slot for my type, for two reasons, both related to the fact I need to (also) call PyModule_Type->tp_dealloc from my custom tp_dealloc function (unless I’m mistaken and this wouldn’t be required?!):

  • module_dealloc calls PyObject_GC_UnTrack. I didn’t set Py_TPFLAGS_HAVE_GC in my custom PyType_Spec, but maybe that’s required given the base type has it? If I need the flag, I should also call PyObject_GC_UnTrack before doing anything else in my tp_dealloc hook (including before calling PyModule_Type->tp_dealloc). Would it be an issue if the function gets called twice, then? The docs aren’t conclusive about this.

  • Given this is a heap type, I have to Py_DECREF the type of an instance in tp_dealloc. Currently PyModule_Type is not a heap type, so I should not expect it to do so, but in some later implementation it might be transformed into a heap type, and then I should not decref the object’s type in tp_dealloc (otherwise the type gets decref’ed twice). What’s a common pattern to be “future-proof” here? Should I check the flags of PyModule_Type for the Py_TPFLAGS_HEAPTYPE flag, and if it is set, assume it’ll perform the decref for me?

Thanks in advance!

If you only need extra data, you can set PyModuleDef.m_size; you shouldn’t need a module subclass.

I didn’t set Py_TPFLAGS_HAVE_GC

Py_TPFLAGS_HAVE_GC is inherited from the base. A module subclass will have GC.

I should also call PyObject_GC_UnTrack before doing anything else in my tp_dealloc hook

Currently, PyObject_GC_UnTrack is idempotent. You can call it, clear out stuff, and then call PyModule_Type->tp_dealloc.
I’m not sure how things should be, but I don’t think we can change this detail now, so we should probably document and guarantee it.

Should I check the flags of PyModule_Type for the Py_TPFLAGS_HEAPTYPE flag, and if it is set, assume it’ll perform the decref for me?

Yes. Sorry for the inconvenience.

If your data includes references to other Python objects, you should also implement tp_traverse and can also implement tp_clear (the latter is optional if you are sure that your references cannot create reference cycles that can’t be broken at another link).

Consider also using m_free, m_traverse and m_clear slots instead of creating a new module subclass.

Thanks! No worries about the “inconvenience”, I don’t mind, just wanted to make sure I’m “doing it right”.

I can’t use m_size since

  • I need to run some code in the module’s tp_new, and some related code in tp_dealloc, and
  • What’s stored as extra data needs to be in place before the Py_mod_exec function(s) are called (the extra data contains something like a function pointer to “execute” the module, which I need to invoke in a (C) Py_mod_exec function). I guess I could do that after a call to PyModule_FromDefAndSpec in the create_module method of my loader, but that feels… not great. In a sense, this data isn’t related to the module (implementation) itself, but to the implementation of this “framework”.