Deprecating the assumption that libpython is dynamically linked

Hi! When we switched python-build-standalone to statically linking libpython into bin/python, we ran into some code that made assumptions that libpython was dynamically linked and expected to be able to open libpython3.x.so.1 and have that be the running libpython. We had to introduce some poor workarounds to maintain compatibility with such code. I want to propose that CPython actively monitor for this assumption and raise DeprecationWarnings with the intention of breaking such code in the future.

As background, the Python interpreter executable, which I’ll call “bin/python”, is a short C program that just calls Py_BytesMain(argc, argv). The actual Python runtime is in libpython, which can be either statically or dynamically linked into bin/python. Static linking is better for performance, especially on the free-threaded build for reasons involving thread-local storage access, which is why we switched over. However, if you need to ship a libpython.so anyway (e.g. for use by embedders), this now doubles your install size, so some distributors prefer dynamic linking. (Note that this has nothing to do with “static linking” / “a static executable” in the sense of a fully-static binary that doesn’t use the system libc. In general, a fully-static binary cannot later dynamically load anything, so the problems in this post don’t arise, so this discussion is about dynamic executables.)

If your bin/python dynamically links libpython, that means that by the time the interpreter starts, your libc has already loaded a library named libpython3.x.so.1, and that’s where the interpreter’s implementation of Python comes from. Any later request to load libpython3.x.so.1 (i.e., directly or indirectly via dlopen) will be satisfied by the existing one, and return the functions and data in the real running Python interpreter.

However, if your bin/python statically links libpython, a request to load libpython3.x.so.1 will cause a separate copy of libpython to be loaded from disk. Depending on the details of how it is requested, libpython symbols might be returned from the main interpreter or might be returned from this second copy.
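One rough way to see which situation a process is in is to look at what the kernel has mapped. The sketch below is an illustration only, assuming Linux and the `/proc/<pid>/maps` text format; the helper name `libpython_mappings` is made up for this example and is not part of any library.

```python
import os


def libpython_mappings(maps_text):
    """Return the distinct mapped file paths whose basename starts with 'libpython'.

    `maps_text` is the contents of a /proc/<pid>/maps file (Linux only).
    """
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address range, perms, offset, device, inode, optional pathname.
        fields = line.split(None, 5)
        if len(fields) == 6 and os.path.basename(fields[5]).startswith("libpython"):
            paths.add(fields[5])
    return sorted(paths)
```

Calling it on the text of `/proc/self/maps` from inside the interpreter gives a quick picture: an empty list suggests libpython was linked statically into bin/python and nothing else has loaded a copy, while an entry appearing only after a `ctypes` call or an import suggests a second copy was pulled in.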

The two specific cases we’ve seen are:

  1. Extension modules that themselves declare a dependency on libpython3.x.so.1, because they were (incorrectly) built with -lpython3. The upstream python3-config command and pkg-config files properly distinguish two cases, the (default) extension module use case, where you shouldn’t use -lpython3 because you’re expecting to be loaded into an existing Python, and the embedding use case (a non-Python application or library that is pulling in Python), where you should. But there are various third-party build systems that don’t get this quite right.
  2. Pure-Python code that is accessing the CPython API via something like ctypes.PyDLL(f"libpython3.{sys.version_info[1]}.so.1").

On distributions like Fedora that have bin/python dynamically link libpython, such code works properly. On distributions like Debian that have bin/python statically link libpython but also ship a libpython3.x.so.1 on the default library search path, such code mostly works properly. Specifically, in case 1, the Linux/ELF ecosystem does not actually associate symbols with the libraries they came from, so the extension module has a request to load libpython3.x.so.1 and a totally unrelated request to find e.g. PyArg_ParseTuple from wherever it can be found, which happens to be found first in the main executable. Even in case 2, as long as you’re calling functions that do not operate on global data and have found the right libpython, you’re executing the same code and things mostly work. But “mostly” is of course pretty dangerous.

On distributions that do not ship a libpython3.x.so.1, this code will all fail to work—if you’re lucky. If you’re unlucky, you have a libpython3.x.so.1 on the search path from some other installation of Python, and things can get quite weird. We ran into this problem almost immediately where extension modules in case 1 worked in python-build-standalone provided that you happened to have that same version of Python installed systemwide, which was quite difficult to debug. Even though we did still ship a libpython3.x.so.1 for embedders to use, it was no longer loaded in by bin/python and the binary didn’t declare an rpath that listed its directory (because it didn’t need to), so the linker would find and load /usr/lib/libpython3.x.so.1 if it existed.

We’re currently working around this by setting an rpath on our bin/python to point at the directory containing our libpython, simply to make this class of code “work”—we don’t actually need that rpath ourselves in normal operation. But this only gets us to the status quo of builds like Debian, where such code is still loading a second copy of libpython and hoping for the best. There are other possible hacks, e.g., placing a fake libpython.so with no actual code on the search path, so that one is found but symbols must be resolved from the main executable. The upstream effort for prebuilt-cpython will need to figure out whether to adopt one of these hacks, risk breaking this type of code, or dynamically link libpython (and take the performance hit).

It seems to me that the best option is to try to get maintainers of such code to fix it, so I propose the following deprecations:

  1. When loading an extension module, we can parse the shared-object headers ourselves to see if the module declares a dependency on libpython3.*.so*. If it does, we can raise a warning that the module was compiled incorrectly, perhaps with a link to some docs explaining the problem, and then continue to attempt to import it and hope for the best.
  2. ctypes.pythonapi already exists, and is set (on non-Android UNIX) to ctypes.PyDLL(None), which means “resolve symbols from what’s already loaded into the process”. There is basically no valid use of ctypes.PyDLL("libpython3.x.so.1"). Either you’re loading the current Python interpreter, in which case your code should use None for the reasons described above, or you’re loading a different Python interpreter (another build, another version, etc.), which has no guarantee of working well because of symbol collisions. Doing ctypes.CDLL("libpython3.x.so.1") is also certainly wrong for GIL reasons. So I think we should have the CDLL constructor check if the basename of the argument is of the form /libpython3\.[0-9]+\.so(\.[0-9]+)*/ (or something like that) and raise a warning telling you to use ctypes.pythonapi instead.
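To make the second proposal concrete, here is a minimal sketch of the kind of name check it describes. This is not CPython code: the function name `warn_if_libpython`, the exact pattern, and the warning text are assumptions for illustration. The pattern also accepts an ABI-flag suffix such as the `t` in `libpython3.13t.so`, which goes slightly beyond the pattern proposed above.

```python
import os
import re
import warnings

# Hypothetical pattern: libpython3.<minor>[abiflags].so[.<n>...]
_LIBPYTHON_RE = re.compile(r"libpython3\.[0-9]+[a-z]*\.so(\.[0-9]+)*")


def warn_if_libpython(name):
    """Warn if `name` looks like a libpython shared object; return True if it does."""
    if name is None:
        # None means "symbols already loaded into the process", which is fine.
        return False
    base = os.path.basename(os.fsdecode(name))
    if _LIBPYTHON_RE.fullmatch(base):
        warnings.warn(
            f"loading {base!r} by name may load a second copy of libpython; "
            "use ctypes.pythonapi instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return True
    return False
```

A real implementation would have to decide where in the CDLL constructor this runs and whether an override flag exists; the sketch only shows that the basename match itself is small.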

In a few release cycles I’d like to see both of these be errors.

The downside of this change, of course, is we’re raising warnings and eventually hard errors on code that largely works right now. But I think all of this code has an alternative way of being written that would also work fine and works more reliably, so while it’s a migration cost, I think that’s defensible because that code currently has undefined behavior. There is also a small risk that people have private code that they know will only be run on a specific environment where these concerns don’t apply. The one I’m most worried about is a setup with a library that matches the libpython3.x.so.1 pattern which is, in fact, not CPython at all, or some very careful code loading and driving a different-version libpython via ctypes.CDLL("libpython3.y...") and they can convince themselves that there’s no risk of symbol conflicts in their use case. I’m not sure if these setups even exist; I suggest that if we get bug reports during the deprecation period, we can add some sort of flag to override the safety check, since I think this type of thing is much less common than code that wants to access its own libpython.

As a data point, auditwheel now complains about extension modules that load libpython, so for any extension modules that have been run through (a recent version of) auditwheel, this problem shouldn’t arise.

(For clarity, all of this discussion refers to UNIX-shaped platforms, namely Linux, macOS, etc. I’ve written .so but this generally applies to .dylib or the Python framework on macOS. I am pretty sure that this class of problem doesn’t arise on Windows because symbol resolution works very differently. I see that ctypes.pythonapi on Android is defined by explicitly loading libpython3.x.so, and I don’t yet know why it differs from other Linux and will figure that out before proposing a PEP or PR.)


FTR, it would arise (with ImportErrors) and you’d have to rebuild all extension modules to switch to a statically linked Python binary[1] - there’s no way to build once for both approaches. But a statically linked Python binary is only really interesting if you statically link all the extension modules as well, which means you’re outside of “use standard builds” territory and so it doesn’t actually impact anyone because virtually nobody is trying to do it (vs. just distributing the EXE and the DLL).

Given how symbol resolution works on *nix platforms, setting up extension modules to support both approaches by default seems like a good move. Whether this needs a deprecation or merely advice and feedback, I’m not totally clear, but if there are tools that build the “wrong” way that could be changed, I’d start by suggesting they change before we try and force them into it by changing upstream.


  1. Or at least do some nasty patching of their import tables.

How big is “some”? What order of magnitude of PyPI are we talking about here? I ask because …

… unless we know the magnitude. I definitely think it’s reasonable to document not to link to libpython3, but whether we need to go as far as “parse the shared-object headers ourselves” isn’t obvious to me.


We did hit two libraries within a day or so of shipping the switch to static linking last year, as reported in this bug report. I believe both (python-mscl and ucxx) are fixed now, and the auditwheel change probably helped catch any other packages that were keeping up with best practices. In an unscientific search of the majority of wheels of the latest versions of the top 500 packages on PyPI (I ran out of disk partway through extracting), I haven’t found any extensions linking libpython. It’s quite possible that older versions are more affected and I can try to do a more thorough search.

I don’t think parsing the ELF header is particularly much code nor particularly likely to be perceptible overhead (you iterate through the program headers to find the dynamic section, then you iterate through those and look at the name of each dependency, and also libc has to do this anyway so you benefit from the page cache), and CPython already has a good amount of this kind of object-file-parsing code in Python/remote_debug.h. So I think if there are no hits, that could be an argument in favor of introducing this check, in the sense that it’s not much code and won’t cause much user annoyance, just as much as an argument against it, in the sense that it’s not needed.
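For a sense of scale, the walk described above (program headers, then the dynamic section, then the dependency names) can be sketched in Python. This is an illustration under stated assumptions, not the proposed C implementation: it handles only little-endian ELF64, does no bounds hardening, and the function name `needed_libraries` is made up for this example.

```python
import struct

PT_LOAD, PT_DYNAMIC = 1, 2
DT_NULL, DT_NEEDED, DT_STRTAB = 0, 1, 5


def needed_libraries(data: bytes) -> list[str]:
    """Return the DT_NEEDED names of a little-endian ELF64 image."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("sketch only handles little-endian ELF64")
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)

    # Pass 1: collect loadable segments and locate the dynamic section.
    loads, dynamic = [], None
    for i in range(e_phnum):
        p_type, _flags, p_offset, p_vaddr, _paddr, p_filesz = struct.unpack_from(
            "<IIQQQQ", data, e_phoff + i * e_phentsize
        )
        if p_type == PT_LOAD:
            loads.append((p_vaddr, p_offset, p_filesz))
        elif p_type == PT_DYNAMIC:
            dynamic = (p_offset, p_filesz)
    if dynamic is None:
        return []  # no dynamic section, so no declared dependencies

    # Pass 2: walk the dynamic entries (16 bytes each: tag, value).
    name_offsets, strtab_addr = [], None
    dyn_off, dyn_size = dynamic
    for pos in range(dyn_off, dyn_off + dyn_size - 15, 16):
        d_tag, d_val = struct.unpack_from("<qQ", data, pos)
        if d_tag == DT_NULL:
            break
        if d_tag == DT_NEEDED:
            name_offsets.append(d_val)
        elif d_tag == DT_STRTAB:
            strtab_addr = d_val
    if strtab_addr is None:
        return []

    # DT_STRTAB is a virtual address; map it back to a file offset.
    for vaddr, offset, filesz in loads:
        if vaddr <= strtab_addr < vaddr + filesz:
            strtab_off = strtab_addr - vaddr + offset
            break
    else:
        raise ValueError("DT_STRTAB is not inside any PT_LOAD segment")

    names = []
    for off in name_offsets:
        start = strtab_off + off
        names.append(data[start:data.index(b"\0", start)].decode())
    return names
```

Given the bytes of an extension module's `.so` file, a result containing a name that starts with `libpython3.` would correspond to case 1 above.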

I don’t know of a good way to analyze how much code is out there doing ctypes.CDLL("libpython3.x.so"). I did find a decent handful of hits on Sourcegraph public code search. A couple are in actively-developed libraries from a major company, but those are all pre-loading libpython into the process to work around some inability to find it, which might all turn out to be workarounds for extension modules that link libpython; these hits are not actually trying to use the resulting CDLL object. (I believe one catches the failures and logs a warning, and another would actually be a hard failure.) I did see a couple of real hits in what looked like vibe-coded projects and it wasn’t clear to me if it’s worth my time to send them PRs. But overall I’m much more confident that this kind of code is out there.

One case that I need to understand better is a wheel called find_libpython, which is not per se unreasonable: getting the current interpreter’s libpython (if there is one) is a valid thing for embedders etc. to want to do. But it works internally by trying to ctypes.CDLL a bunch of candidates and seeing if they expose Python API functions, which would break with this change. There is probably a better way to do what it’s doing, and maybe an argument for something in sys or sysconfig to report this information accurately without heuristics, but I haven’t looked in detail yet. I have also seen some code using this package to proceed to actually construct a CDLL on the result and use it, which is more clearly wrong.
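For context on what sysconfig offers today, the sketch below collects the build-time variables that seem most relevant. The helper name `libpython_build_hints` is made up for this example. These values record how the interpreter was configured, not what is loaded in the running process, so they are still hints rather than the accurate report wished for above; on some platforms (e.g. Windows) they may simply be None.

```python
import sysconfig


def libpython_build_hints():
    """Collect build-time config variables that hint at where libpython lives."""
    names = ("Py_ENABLE_SHARED", "LDLIBRARY", "INSTSONAME", "LIBDIR")
    # get_config_var returns None for variables the build did not record.
    return {name: sysconfig.get_config_var(name) for name in names}
```

An embedder could combine `LIBDIR` and `INSTSONAME` into a candidate path, but would still have to check that the file exists and matches the running interpreter.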

I should also say, I’m worried a bit about modules that aren’t on PyPI and distributed privately (extensions shipped with some commercial product, in-house extensions, etc.) as those are least likely to use tooling like auditwheel, most likely to use some odd custom build system, and least likely to have been tested against a variety of builds of Python.

These aren’t the only criteria for “should we do something” - status quo is the default, and it’s definitely less code and less performance impact than adding this.

The more important factors here are “do the people getting it wrong know what they’re doing?” and “do the people who do this correctly know what they’re doing?”, and neither of those is necessarily solved or helped by Python trying to specifically catch them out at load time.

The two sides to these modules are that we shouldn’t actively break them, and we also can’t take responsibility for ensuring they don’t break. Personally, I prefer to assume that our users know what they’re doing and so whatever build options they’re using in private are the ones they need - we shouldn’t mess with them.

I don’t think we’re the ones responsible for whether there’s a libpython or not - that’s a distro choice, right? If they need us to say it’s okay to leave it out, I’m sure we can say that (and if they want to include one that prints a warning when used until they leave it out, that’s also fine), but other than nannying the developers who get things right, I don’t really see what more we can do on this side?


That tells me it’s not a big problem anymore.

That’s @pablogsal’s code, so I’m already scared. :wink:

I have the opposite reaction. Using my own analogy, that code would be like a puppy: you see a cute dog and all I see are vet bills and training headaches.

As I said previously, I’m quite happy to see documentation changed where you think it should go, but I don’t think this warrants maintaining some C code.

Hi,

In 2019, I modified Python (distutils stdlib module and python-config scripts) to no longer link extension modules to libpython, except on Windows, Android and Cygwin. There were multiple reasons for this change:

I also added to the documentation that libpython must not be loaded with RTLD_LOCAL but RTLD_GLOBAL on embedded Python.

I’m not sure about maintaining binary parsers to check if an extension module is linked to “libpython”. There are multiple binary formats for shared extensions (not only ELF), and it’s not trivial to write a (“correct”/“safe”) parser for each of them. Loading an external library for that task would avoid having to write such code ourselves, but it would add a dependency at runtime which can cause other unpleasant issues.

About emitting a warning on ctypes.CDLL("libpython3.y..."): I’m not sure that loading “libpython” is always a bad idea, there might be cases where it’s relevant.

I would prefer to remain in the current status quo where issues are fixed one by one when problematic code is detected, as we did so far.

Adding more checks in linters such as auditwheel is a good idea. It doesn’t require modifying Python.

What can be done safely right now is to add documentation to guide users facing issues with “libpython”. For example, guide users from ctypes.CDLL("libpython3.y...") to ctypes.pythonapi.
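Such documentation could include a short before/after along these lines. This is a minimal sketch rather than proposed doc text; `Py_GetVersion` is used only because it is a simple, side-effect-free C API call.

```python
import ctypes
import sys

# Fragile pattern from this thread: opening libpython by name may load a
# second copy of the runtime, or fail if no such file is on the search path.
#   lib = ctypes.PyDLL(f"libpython3.{sys.version_info[1]}.so.1")

# ctypes.pythonapi resolves symbols from the interpreter that is actually
# running this code, whether libpython was linked statically or dynamically.
api = ctypes.pythonapi
api.Py_GetVersion.restype = ctypes.c_char_p
api.Py_GetVersion.argtypes = []
running_version = api.Py_GetVersion().decode()
```

Here `running_version` should begin with the same version number as `sys.version`, since both come from the same running interpreter.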


+1 from me on the general direction here, and to head off @brettcannon’s puppy analogy: I’m happy to take on the ELF-parsing code if we decide to go that way :slight_smile: The code for this is relatively straightforward: walk the program headers, find the dynamic section, iterate DT_NEEDED, string-match. That’s it. If I had to guess I’d say it’s smaller than the rpath workaround discussion in this thread.

@vstinner’s point about multiple binary formats is fair in principle but I think overstated in practice for what we’d actually need to ship. The problem manifests on ELF-based UNIX, which is where the check would live. Mach-O has the same conceptual shape if we ever wanted to extend it, but we don’t have to, and Windows genuinely is a different problem as @steve.dower noted (you’d get an ImportError immediately rather than a subtle wrong-symbols-loaded situation, which is its own kind of correct).

On the question of how widespread the broken code actually is: I think @geofft’s data is suggestive but slightly undersells the private/in-house case. What worries me is the long tail of corporate extensions built with bespoke build systems that nobody outside the company ever sees. Those are exactly the ones that will silently break when their users eventually move to a statically-linked CPython, and a deprecation warning during the transition window is a much kinder failure mode than “it just stopped working and we don’t know why.” I deal with enough of these at $work to have opinions.

Then I’m indifferent about the warning if Pablo wants to own it.