Creating a standalone CPython distribution

Author of GitHub - indygreg/python-build-standalone: Produce redistributable builds of Python here. (Thanks for the mention, Steve!)

Having somewhat solved this problem, I’ve thought a lot about it.

I have too much context lingering around in my head to capture in a single comment. But I’ll start with conveying some key points.

Foremost, there are both build system / binary portability issues as well as run-time issues. A lot of people (myself included) get transfixed over the obvious problem of how to build a Python distribution that can even run on other machines. This is indeed a hard problem (CPython’s build system doesn’t make this easy and cross-platform binary portability on operating systems like Linux make this harder than it deserves to be).

But even if you solve the build system problems to produce a standalone distribution (python-build-standalone is an existence proof this is possible), there are still a host of run-time problems that need to be tackled. OpenSSL needs to be pointed at trusted CA certificates. Terminfo needs to reference a database of terminal definitions otherwise the REPL (and anything else using readline) is completely broken. Extension module building (or anything else using sysconfig) is likely broken because the build-time settings don’t match the run-time environment. TCL files aren’t found, so tkinter is broken (CPython has Windows-only logic assuming the layout from the Windows MSI installer.). Even resolving the path to the stdlib can be finicky due to how some paths are baked into libpython.

So even if you produce binaries that are capable of running on other systems, there’s a lot that can go wrong when you invoke python. Today, CPython largely buries its head in the sand about these problems outside of Windows. For people like me existing outside CPython, that means you have to supplement the CPython runtime with addition logic to cajole OpenSSL, terminfo, tkinter, etc into working. (I wrote the pyembed Rust crate to handle this, and much more. And PyOxy is essentially the marriage of python-build-standalone + pyembed - via PyOxidizer - to produce a somewhat usable distribution. But there are still run-time corner cases this supplemental code doesn’t cover.)

If there were to be official standalone CPython distributions, you need to solve these run-time quirks to some degree. That requires maintaining a new pile of code somewhere. IMO a compelling case can be made for this logic existing in CPython + its stdlib itself. But that may be a contentious topic, as I’m sure people like Linux distro package maintainers may take issue with certain approaches. I’m not convinced it is possible to solve all the potential problems related to sysconfig on Linux: there’s just too much variance between machines. You may need to distribute your own compiler toolchain with your CPython distribution and point sysconfig at it to enable extension module compiling. (I always intended to do this with python-build-standalone but never found time to do it.)

I started python-build-standalone to support PyOxidizer. And my initial goal of PyOxidizer was to try to achieve single file Python applications, without any temporary files at run-time. I was therefore transfixed with having a single file executable for both the distribution and the Python application built on top. This is still a noble goal to have. But the reality is that while simple and convenient, single file distributions/applications probably aren’t as important as I initially thought. On Windows and macOS, the concept of installers is normalized among users. MSIs or exe installers on Windows. DMGs or pkg on macOS. And on Linux, BSDs, etc it is common to installs apps from a zip or tarball when bypassing your distro’s package manager. So in all cases your user base is likely familiar with some kind of install step to materialize a multi-file app. So having a single file executable only buys you so much.

I learned that single file Python distributions/applications break a lot of assumptions within CPython, the libraries it uses, and especially among Python packages in the wild. e.g. static OpenSSL libraries on Windows blow up in weird ways at run-time. And of course there are __file__ assumptions everywhere. As much as I desperately want to make it possible to have a single file Python distribution and application, they are a lot of pain. So my advice to others is to not get hung up on statically linking everything into a single binary. By all means make it possible to do this in the build system. But think long and hard before you recommend this as the shipping configuration, otherwise you are signing yourself up for a lot of funky bug reports.

I keep telling myself I’d like to upstream more and more of python-build-standalone into CPython. Ideally delete the python-build-standalone project completely, as official CPython distributions of the future would fulfill its use case. So I’m very much aligned with helping CPython produce official standalone distributions. Let me know how I can help further.