Instantiating Python objects from C - segfault on `type(obj)`, `__class__` attribute missing

I’m learning the C API to embed Python in a game engine. I’ve got my little family of standard GUI objects, some of which are instantiated in the compiled project, and I’d like to expose both the types and instances to the REPL and scripts.

I’ve got my Python class defined:

  static PyTypeObject PyTextureType = {
      .tp_name = "mcrfpy.Texture",
      .tp_basicsize = sizeof(PyTextureObject),
      .tp_itemsize = 0,
      .tp_repr = PyTexture::repr,
      .tp_hash = PyTexture::hash,
      .tp_flags = Py_TPFLAGS_DEFAULT,
      .tp_doc = PyDoc_STR("SFML Texture Object"),
      .tp_init = (initproc)PyTexture::init,
      .tp_new = PyType_GenericNew,
  };

The methods pointed to in the PyTypeObject above are all defaults, or static methods in a C++ PyTexture class.

This works just fine from the REPL: I can call mcrfpy.Texture, provide arguments for init, and I get a fully functional Python object back. I’m having no issues with using the struct directly from C++.

texture = mcrfpy.Texture("assets/kenney_tinydungeon.png", 16, 16) # works fine

The other direction, instantiating the C++ object first and generating a Python instance from it, has some issues. The object mostly behaves normally, but it segfaults under some conditions:

procedure

What steps am I missing or doing incorrectly?

step 1. allocate.

PyObject* obj = PyType_GenericAlloc(&PyTextureType, 0);

I’ve had better luck with GenericAlloc. I originally called PyObject* obj = (PyObject*)PyTextureType.tp_alloc(&PyTextureType, 0); - this segfaults on the call to tp_alloc.

step 2. initialize the object; instead of calling init, I directly set the members of the PyTextureObject struct.

step 3. Add the objects to the module so they can be used from Python.

PyObject* PyInit_mcrfpy()
{
    PyObject* m = PyModule_Create(&mcrfpyModule);

    // ...

    McRFPy_API::default_texture = std::make_shared<PyTexture>("assets/kenney_tinydungeon.png", 16, 16);
    PyModule_AddObject(m, "default_texture", McRFPy_API::default_texture->pyObject());

    return m;
}

This module is being loaded into the embedded interpreter with PyImport_AppendInittab("mcrfpy", &PyInit_mcrfpy);

How to replicate the segfault

>>> type(mcrfpy.default_texture)
Segmentation fault (core dumped)

gdb points me to a problem with ob->ob_type.

Oddly enough, there seems to be a workaround:

>>> mcrfpy.default_texture.__class__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'mcrfpy.Texture' object has no attribute '__class__'. Did you mean: '__hash__'?
>>> mcrfpy.default_texture.__class__
<class 'mcrfpy.Texture'>
>>> type(mcrfpy.default_texture)
<class 'mcrfpy.Texture'>

For some reason, __class__ is not defined at first, but after one failed attempt to access it, it is, and the type function then works fine.

Py_SET_TYPE(obj, &PyTextureType); does not seem to be of any help. Explicitly giving PyTextureType the default base class with .tp_base = &PyBaseObject_Type doesn’t seem to have any effect either.

How can I instantiate these objects correctly? What other time bombs are lurking under the surface of these incompletely initialized objects I’m building?

Is there a reason you aren’t just using one of PyObject_Call* methods? AFAIK that is the intended way to create object instances. It makes sure that the normal object creation process is done properly.

Otherwise, look at the implementation of type.__call__/object.__new__ and make sure you follow all those steps. But this might break from minor version to minor version.
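For orientation, the normal creation path can be modeled at the Python level. This is only a rough sketch of what calling a class does, not CPython's actual C code; `call_type` and the pure-Python `Texture` stand-in are hypothetical names used for illustration.

```python
def call_type(cls, *args, **kwargs):
    """Rough Python-level model of calling a class."""
    # Step 1: construct the object; this corresponds to the tp_new slot.
    obj = cls.__new__(cls, *args, **kwargs)
    # Step 2: run the initializer only if __new__ returned an instance of
    # cls; this corresponds to the tp_init slot.
    if isinstance(obj, cls):
        type(obj).__init__(obj, *args, **kwargs)
    return obj


class Texture:
    """Hypothetical pure-Python stand-in for mcrfpy.Texture."""

    def __init__(self, filename, sprite_width, sprite_height):
        self.filename = filename
        self.sprite_size = (sprite_width, sprite_height)


texture = call_type(Texture, "assets/kenney_tinydungeon.png", 16, 16)
```

Anything that bypasses this path has to reproduce both steps itself, which is why going through a call is the safer default.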

In this case, it’s because the Python __init__ loads a file by name, and from C++ I want to share a file handle directly into the struct without making users of my Python API have to deal with those. But increasing the amount of “internal” use of the Python interpreter is a possible workaround.

I’m looking in Objects/typeobject.c at slot_tp_call, but this seems to be the code that dispatches a call to a Python-level __call__, not the implementation behind tp_call’s default value. Any idea where I should be looking? I have really been struggling to get search engines to give me anything other than tutorials in Python, or frameworks to replace the C API.

Yes, don’t look at slot_tp_call; look at type_call in the same file, which is what the tp_call slot of type points to.

It should be enough to call type->tp_new and then not call your type->tp_init and instead do the initialization in a different method that directly receives the file handle.
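At the Python level, the suggestion above looks roughly like the following sketch: create the instance through `__new__` only, then fill it in from a separate initializer that takes the handle. The `Texture` stand-in, `from_handle`, and `_load` are hypothetical names; in the C API the same idea would be calling the type's tp_new and then setting the struct members from C++.

```python
import io


class Texture:
    """Hypothetical pure-Python stand-in for mcrfpy.Texture."""

    def __init__(self, filename, sprite_width, sprite_height):
        # Public path: users pass a file name, as in the REPL example.
        with open(filename, "rb") as handle:
            self._load(handle, sprite_width, sprite_height)

    @classmethod
    def from_handle(cls, handle, sprite_width, sprite_height):
        # Internal path: create the instance via __new__ only, so __init__
        # (which expects a file name) never runs.
        obj = cls.__new__(cls)
        obj._load(handle, sprite_width, sprite_height)
        return obj

    def _load(self, handle, sprite_width, sprite_height):
        # Shared initialization used by both creation paths.
        self.data = handle.read()
        self.sprite_size = (sprite_width, sprite_height)


# An in-memory stream stands in for the engine's already-open file handle.
texture = Texture.from_handle(io.BytesIO(b"fake image bytes"), 16, 16)
```

Because the object still comes out of the type's own `__new__`, it carries complete type information even though `__init__` was skipped.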

Putting a conclusion here for posterity: my particular issue was caused by doing too much Python-y stuff before Py_Initialize was complete.

My module is built via PyImport_AppendInittab, which expects a function argument that returns a module object. All of my types are being added to the module, but I encountered this issue with instances not having correct type information. My solution is going to be slightly specific to embedded Python applications, but it should make sense to anybody who is calling Py_Main for their application too.

To correct it, I did this stuff:

  • Don’t call any types or try to create any instances inside the init function registered with PyImport_AppendInittab. In that function, only ready the types and add them to the module.
  • After Py_InitializeFromConfig has been called, Python is actually ready to use the module.
  • The module object and type objects I provided to initialize the module do not seem to be the same objects in use at the REPL. Maybe PyType_Ready copies the PyTypeObject struct somewhere else to apply defaults / inheritance? Instead, import the module: PyObject* mobj = PyImport_ImportModule("mcrfpy");
  • Instead of using the type objects I defined, get the finalized type out of the module: PyTypeObject* tobj = (PyTypeObject*)PyObject_GetAttrString(mobj, "Texture");

From there, as Cornelius stated, calling the type object would be a very straightforward solution. I was also able to use the type object’s tobj->tp_new / tobj->tp_alloc to side load or “factory function” my objects without any weird side effects.
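In Python terms, the sequence described in this post looks roughly like the sketch below. It is an analogy, not the C code: `importlib.import_module` plays the role of PyImport_ImportModule, `getattr` the role of PyObject_GetAttrString, and the `mcrfpy_demo` module and pure-Python `Texture` class are hypothetical stand-ins for the real built-in module.

```python
import importlib
import sys
import types


class Texture:
    """Hypothetical pure-Python stand-in for mcrfpy.Texture."""

    def __init__(self, filename, sprite_width, sprite_height):
        self.filename = filename
        self.sprite_size = (sprite_width, sprite_height)


# Stand-in for module init: only the type is added, no instances yet.
demo = types.ModuleType("mcrfpy_demo")
demo.Texture = Texture
sys.modules["mcrfpy_demo"] = demo

# Later, once the interpreter is fully initialized: import the module and
# fetch the type from it instead of using a separately held reference.
module = importlib.import_module("mcrfpy_demo")
texture_type = getattr(module, "Texture")

# Create the instance by calling the type, then expose it on the module.
default_texture = texture_type("assets/kenney_tinydungeon.png", 16, 16)
module.default_texture = default_texture
```

The point of the ordering is that instance creation happens only after the import, using the type object that the module actually exposes.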