Embedding Python in a C++ self-contained executable

My use case is a C++ framework with embedded Python acting more or less as a scripting engine for UI rendering: the structures are written in C++ and then exposed to the Python script via pybind11. As far as I understand, this would not involve any runtime extension modules, since everything is compiled in one go and the Python script is written afterwards against that fixed set of bindings. To extend the bindings, you would modify the original C++ code rather than load a module at runtime.

Does this sound correct? Would this work completely without extension modules when everything is exposed directly from the main C++ app? In that case I would disable extension modules completely, as they are of no use to me.

Yeah, the main thing that you lose by omitting extension modules is networking (socket, select and ssl) and foreign-function interfaces (ctypes/libffi). If you can live without these, and without any third-party stuff, you shouldn’t have a problem.

(I’d expect in your circumstances you’d want to provide your own networking anyway, so that it integrates properly with your UI loop.)

On Linux these can be statically linked if you want them. Windows is a bit more difficult because the third-party libs we depend on don’t statically link easily (and it only gets harder trying to do two levels of static linking…)

Yes, perfect, I see why that is. But the missing networking can actually be considered a good thing in my case, as networking shouldn’t be done in the UI loop but in the C++ base. It would thus prevent users from using the Python script for anything more than it was designed for.

Thank you for your help. I will keep the possibility in the back of my mind, but I think I will just ship the embedded distro and have Python UI scripting and self-contained executables exclude each other. The security gap is closed by noting that anything security related should be done in the C++ backend and not exposed to the Python UI loop anyway.

But I can’t really get the embedded distro to work. I basically want to compile the C++ application once using the Python C headers, link the Python library, and then run the executable on another machine while providing the extracted python embeddable distro, containing python311.dll, python311.zip and many .pyc files.

But how am I supposed to compile something for the embedded distro when there are no headers included? I really searched but can’t find good instructions on the web. The application should use python311.dll from the embedded package, but there are no headers or .lib files to compile with. And if I compile with the headers and libraries from the normal Python installation, but then provide the embedded distro when running, the program simply crashes. (The versions are the same.)

How am I supposed to use the embedded package when there are no headers? And when I use the normally installed Python files, how do I get an embeddable distro that fits together with the headers it was compiled with (instead of downloading headers and libraries from unrelated download links and hoping they will work)? Somehow the headers are missing in the embedded distro, or the embedded distro is missing in the normal installation. What am I supposed to do?

The crash will be related to something else, the binaries are identical. They’re all laid out as part of the same release process. (Any Python 3.11 install should be fine, they’re compatible enough that the precise version doesn’t have to match.)

So you did the right thing here, but will need to diagnose the crash independently.

You didn’t extract the zip with all the .pyc files did you? There’s no need, Python will read them directly from the zip file. But if you move them around, you’ll need to update the python311._pth file to point at the new search path.

Well, I extracted the downloaded embed zip file, and now I have a folder with python311.dll, python311.zip and many other files and some more DLLs.

I have compiled a small test program that includes and links against the normal global python install. It runs perfectly on the build machine.

For simplicity, I just copy the executable into the extracted embed folder on the other machine, so that it sits right beside python311.dll and python311.zip. However, when I execute it I simply get a Windows error message “The application could not be started correctly (0xc000007b)”. That could be anything, though; it usually means something is incompatible.

When I instead copy only python311.zip from the embedded distro and python311.dll from the global Python installation next to the executable on the other machine, it also runs there, until I try to import something like socket; then it crashes again. So I think something must be wrong with the DLLs.

EDIT: It also works for simple packages when I extract the embed zip but replace its DLL with the globally installed one, but still only for simple packages, not for socket. So the globally installed DLL behaves differently from the one that is part of the embedded distro.

You will either need to copy python311._pth or specify the module search paths as part of initialization. Without one of these, it will default to trying the usual search process, which is bound to fail. Specifying the paths during initialization is more secure, but way more complex than using the ._pth file.

If the DLLs from the two sources are different, it’s because you have mismatched packages. The binaries are identical - we only build and sign it once, and then package it up multiple times.

This is my python311._pth:

python311.zip
.

# Uncomment to run site.main() automatically
#import site

That seems to be correct, as python311.zip, python.exe and my compiled executable are all in the same directory. I want to note that python.exe (the interactive shell) works perfectly fine when double-clicking it (inside the embedded distro).
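
To make the role of the ._pth file above more concrete, here is a minimal sketch of how its lines can be interpreted. It is a simplified model of the rules as I understand them (blank lines and `#` comments are skipped, `import site` switches on the site module, everything else is a search path relative to the file's directory), not CPython's actual parser. `PthFile` and `parse_pth` are hypothetical names used only for illustration.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Result of reading a ._pth file (hypothetical helper type for this sketch).
struct PthFile {
    std::vector<std::string> search_paths;  // entries, relative to the ._pth file's directory
    bool import_site = false;               // true if an uncommented "import site" line is present
};

// Parse ._pth text: blank lines and lines starting with '#' are skipped,
// "import site" enables the site module, every other line is a search path.
inline PthFile parse_pth(const std::string& text) {
    PthFile result;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        // Trim surrounding whitespace, including a trailing '\r' from CRLF files
        const char* ws = " \t\r\n";
        std::size_t first = line.find_first_not_of(ws);
        if (first == std::string::npos) continue;  // blank line
        std::size_t last = line.find_last_not_of(ws);
        line = line.substr(first, last - first + 1);

        if (line[0] == '#') continue;  // comment
        if (line == "import site") {
            result.import_site = true;
            continue;
        }
        result.search_paths.push_back(line);
    }
    return result;
}
```

Applied to the file shown above, this yields the two entries `python311.zip` and `.`, with `import_site` left off because that line is commented out.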

My app still works with the global installation, but when I remove the global installation from the PATH and put the app into the embedded distro folder on the same machine, it crashes. It must be the combination of my app being compiled with the global installation, but then using the embedded version. The global installation works in itself, and the python.exe interpreter in the embedded distro also works. Just the combination of my app and the embedded DLL does not want to work.

Is there maybe something that must be set in the application to differentiate between global and embedded installations?

Also, here is my current code:
main.cpp

#include <pybind11/embed.h> // everything needed for embedding
namespace py = pybind11;

#include <fstream>
#include <iostream>
#include <iterator>
#include <string>

int main() {
    // Start the interpreter; it is shut down when guard goes out of scope
    py::scoped_interpreter guard{};

    // Load a file
    std::string path = "script.py";
    std::ifstream file(path);
    if (!file.is_open()) {
        std::cout << "Could not open file " << path << std::endl;
        return 1;
    }

    // Read the whole script into a string and run it
    std::string str((std::istreambuf_iterator<char>(file)), std::istreambuf_iterator<char>());
    py::exec(str);
    return 0;
}

CMakeLists.txt

cmake_minimum_required(VERSION 3.16)
project(pybind)

add_subdirectory(pybind11)

add_executable(pybind src/main.cpp)
target_link_libraries(pybind pybind11::embed)

That’s it. As far as I am aware, pybind11 internally uses FindPython.cmake, which basically finds the headers and library of the global installation.

Alternatively, I used this code directly:

#define PY_SSIZE_T_CLEAN
#include <Python.h>

int
main(int argc, char *argv[])
{
    wchar_t *program = Py_DecodeLocale(argv[0], NULL);
    if (program == NULL) {
        fprintf(stderr, "Fatal error: cannot decode argv[0]\n");
        exit(1);
    }
    Py_SetProgramName(program);  /* optional but recommended */
    Py_Initialize();
    PyRun_SimpleString("from time import time,ctime\n"
                       "print('Today is', ctime(time()))\n");
    if (Py_FinalizeEx() < 0) {
        exit(120);
    }
    PyMem_RawFree(program);
    printf("Works\n");
    return 0;
}

For this one I specified the include directory and library path manually. Both yield the exact same result: it works when the global installation is on the PATH, but crashes when the executable sits next to the embedded distro’s files…

This is real interesting, there shouldn’t be anything more to distinguish here.

You say it crashes - are you able to pinpoint where? Or if you can capture a dump from the crash, I can probably do some deeper analysis.

FWIW, I took your code sample above and compiled it myself using Store Python, then dropped in just python311.dll, python311._pth and python311.zip from the 3.11.3 64-bit embeddable package:

cl "/IC:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.11_3.11.1008.0_x64__3847v3x7pw1km\Include" crash.c /c
link /LIBPATH:"C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.11_3.11.1008.0_x64__3847v3x7pw1km\libs" crash.obj

It worked fine for me. Didn’t see anything that looked a likely culprit for causing a crash either. So any more info you can collect would be real helpful.

Alright, I think I need to get out the bigger guns. But just to confirm: is it the correct approach to compile with the headers from a global install and then run with any embedded distro of the same major version (the minor version being irrelevant)?

You have to run with the same minor version (3.11, or 3.10). The micro version (3.11.1 or similar) is irrelevant.
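
The rule above can be written down as a small check, for example to compare the version a build was configured with against the version of the distro being shipped. This is only a sketch under the assumption that both versions are available as plain strings such as "3.11.3"; `PyVersion`, `parse_version` and `same_minor_series` are hypothetical names, and the check says nothing about other requirements such as matching architecture.

```cpp
#include <cstdio>
#include <string>

// Parsed "major.minor.micro" version (hypothetical helper type for this sketch).
struct PyVersion {
    int major = 0;
    int minor = 0;
    int micro = 0;
    bool valid = false;
};

// Parse strings such as "3.11.3". A missing micro part ("3.11") is accepted.
inline PyVersion parse_version(const std::string& text) {
    PyVersion v;
    int fields = std::sscanf(text.c_str(), "%d.%d.%d", &v.major, &v.minor, &v.micro);
    v.valid = fields >= 2;
    return v;
}

// True when both versions share major and minor; the micro version is ignored.
inline bool same_minor_series(const std::string& built, const std::string& runtime) {
    PyVersion a = parse_version(built);
    PyVersion b = parse_version(runtime);
    return a.valid && b.valid && a.major == b.major && a.minor == b.minor;
}
```

With this, `same_minor_series("3.11.3", "3.11.1")` is true, while `same_minor_series("3.11.3", "3.10.9")` is false.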

Holy cow, I finally found the mistake while writing a 500-point list of every single action I take. I hadn’t noticed the bitness.

I always downloaded and used the 32-bit embedded distro for greater system compatibility, and never questioned it afterwards. However, I installed Python globally with the big download button, and neither that button nor the recommended download link on the Windows page says it is 64-bit, so I never thought about it. I have now downloaded the 64-bit embedded distro and everything works beautifully. Thank you for taking a look at this, although it was my fault.
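
For anyone hitting the same error: 0xc000007b is the status code STATUS_INVALID_IMAGE_FORMAT, which Windows commonly reports when a process tries to load a DLL built for a different architecture. A quick way to compare the executable and python311.dll is to read the Machine field from each file's PE header. The following is a minimal sketch that uses only the standard library; `read_le` and `pe_architecture` are hypothetical helper names, and only the three machine types relevant to this thread are recognized.

```cpp
#include <cstdint>
#include <fstream>
#include <string>

// Read a little-endian unsigned integer of `size` bytes (at most 4) at `offset`.
// Returns false if the file is too short.
inline bool read_le(std::ifstream& in, std::streamoff offset, int size, std::uint32_t& out) {
    unsigned char buf[4] = {0, 0, 0, 0};
    in.clear();
    in.seekg(offset, std::ios::beg);
    if (!in.read(reinterpret_cast<char*>(buf), size)) return false;
    out = 0;
    for (int i = size - 1; i >= 0; --i) out = (out << 8) | buf[i];
    return true;
}

// Return "x86", "x64", "arm64", "unknown", or "not a PE file" for the given path.
inline std::string pe_architecture(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return "not a PE file";

    std::uint32_t mz = 0, pe_offset = 0, signature = 0, machine = 0;
    // DOS header starts with "MZ"
    if (!read_le(in, 0, 2, mz) || mz != 0x5A4D) return "not a PE file";
    // Offset of the PE header is stored at 0x3C (e_lfanew)
    if (!read_le(in, 0x3C, 4, pe_offset)) return "not a PE file";
    // PE header starts with the signature "PE\0\0"
    if (!read_le(in, pe_offset, 4, signature) || signature != 0x00004550) return "not a PE file";
    // The 16-bit Machine field follows the signature
    if (!read_le(in, static_cast<std::streamoff>(pe_offset) + 4, 2, machine)) return "not a PE file";

    switch (machine) {
        case 0x014C: return "x86";    // IMAGE_FILE_MACHINE_I386
        case 0x8664: return "x64";    // IMAGE_FILE_MACHINE_AMD64
        case 0xAA64: return "arm64";  // IMAGE_FILE_MACHINE_ARM64
        default:     return "unknown";
    }
}
```

If `pe_architecture` returns different values for the application and for the python311.dll placed next to it, the two cannot be loaded into the same process.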

But I have one more question, now that I can successfully use it:
How can I guarantee that anyone building my project has an embedded distro that matches the version they have installed? I assume it would be quite cumbersome to detect the installed version and then download the right package, besides the fact that this makes the build non-reproducible.
I would much rather have both Python packages downloaded at CMake configure time, so that the project is always built with the same Python version.

Question: Is there any way to automatically download a zip file that contains the Windows installation (including the headers and the compiled library to link against, python311.lib), that is not an installer but a zip file I can extract locally? I need the headers and the .lib file to link against.
That way CMake can specify the Python version wanted, and the compilation is completely independent of what is installed on the developer’s system.

Yep, that is what I meant; I didn’t think of the term ‘micro version’.

Grab the package from Nuget. There’s a direct download URL, which you can pretty easily calculate, and a .nupkg is just a ZIP file with more metadata.

Alternatively, you could make your build calculate the embeddable distro URL from its own version and download that one.

Unfortunately in both cases, you’ll need to know the micro version (unless you use nuget.exe to do the download) - you can’t just ask for “the latest 3.11” for example.
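
As a rough sketch of what "calculating the URL" could look like, the two helpers below build the download addresses from a full version string. The URL patterns are not spelled out in this thread; they are the ones I understand nuget.org and python.org to use at the time of writing, so treat them as assumptions and verify them before relying on them. The function names `nuget_package_url` and `embeddable_zip_url` are hypothetical.

```cpp
#include <string>

// Direct download URL for a NuGet package (a .nupkg is a ZIP archive).
// `package_id` is e.g. "python"; `version` must be the full version, e.g. "3.11.3".
// Assumption: the nuget.org v2 download endpoint layout.
inline std::string nuget_package_url(const std::string& package_id, const std::string& version) {
    return "https://www.nuget.org/api/v2/package/" + package_id + "/" + version;
}

// URL of the Windows embeddable ZIP on python.org.
// `arch` is the suffix used in the file name: "amd64" (64-bit) or "win32" (32-bit).
// Assumption: the python.org FTP directory layout.
inline std::string embeddable_zip_url(const std::string& version, const std::string& arch) {
    return "https://www.python.org/ftp/python/" + version +
           "/python-" + version + "-embed-" + arch + ".zip";
}
```

For example, `embeddable_zip_url("3.11.3", "amd64")` produces `https://www.python.org/ftp/python/3.11.3/python-3.11.3-embed-amd64.zip`. Using the same version string for both calls keeps the headers and import library from the NuGet package in step with the runtime files from the embeddable ZIP.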

I can think of a few ways to streamline our set of installers that I’d kinda like to do, which would affect this. But I don’t really want to go breaking everyone until it’s all lined up in a way that isn’t going to be surprising, and also doesn’t leave us maintaining all the old packages forever. Probably when the next big “thing” happens in packaging I’ll revisit the distributions we provide on python.org, but we’re still waiting to see what that will be.

Ah, this is a good hint. Although this is only for 64-bit Windows as far as I can see, right? The project I am working on must be 100% cross-platform, so such a package would be needed for every supported platform and architecture. (I only support Windows and Linux so far, 32 and 64 bit.)

Needing to know the micro version is not an issue. As everything is supposed to be reproducible, the full version is specified, and thus only this exact version is going to be used until someone changes it intentionally.

For the sake of completeness, it would be nice to have a zip file for download which includes the files the installer installs, but without being an installer. This should in the best case be available for every installer that is downloadable, on every platform.

… oh wait, aren’t there any Linux packages for download? I know that Python is installed on almost every system, but the same applies here: the goal is to streamline everything on every platform. Isn’t there an embedded distro for Linux?

For now I am detecting the installed Python version, downloading the exactly matching embeddable distro and copying it to the target folder; it turned out to be extremely simple. I had planned to do the exact same thing for Linux, though…

pythonx86 and pythonarm64 are also available on Nuget.

Non-Windows platforms are still under development, as mentioned earlier. You’ll want to follow along with PEP 711: PyBI: a standard format for distributing Python Binaries for those.

Ahh, I know THAT feeling. Congrats on finding the problem. If you “feel dumb” because it turned out to be something trivially small, remember that there are ten thousand trivially small things that it could have been, and it takes a smart person (and/or a lot of extreme tedium) to figure out WHICH small thing it was!

I’ve read so many explanations on Stack Overflow for this very issue I feel like my eyes are going to start bleeding. Such a simple solution I feel like I may need to pursue an alternative career path lol. Thank you so much; this worked for me having the same issue with pybind11.