PEP 778: Supporting Symlinks in Wheels

Mine aren’t. My argument against is based on the significant lack of platform-specific build steps in our build tools, which basically assures that all platforms will follow the same build steps, and on the inevitable surprise publishers will face when Windows doesn’t behave “like Linux”. (This is also partly why I favour source-only releases and having platform-specific distros do builds, but I’ve been told that that is “anti open-source” as well :man_shrugging: Nobody tell all the Linux distros…)

I think the middle ground we’re landing on here is workable though. If anyone is strongly opposed to “specify links in our own file, and the installer will link or copy the files/directories as appropriate for the target platform” then that’s the design to argue about.

4 Likes

The primary motivating use case in the PEP is for multiply-named .so files on Unix - and to a first approximation this means on Linux, since AFAIK macOS does something different. But @njs is arguing that these multiple names don’t even make sense for .so files packaged inside wheels. Nathaniel was instrumental in the manylinux spec, which lets us distribute compiled libraries on Linux, so I place a pretty high value on his opinion in this area.

Is it worth putting the details of how symlinks might work on ice and starting a discussion instead about how different scenarios with .so files in packages might best be handled?

(I’m interested in the ‘editable installs’ use case for symlinks, but that’s explicitly left for a future PEP. The other use case mentioned - alternative names for executables created by non-Python build systems - seems niche enough that I don’t imagine it would drive the PEP on its own.)

2 Likes

There also seems to be a use case around build systems producing symlinks, which we then either need Python-specific workarounds to remove, or symlink support in wheels so that the workarounds aren’t needed.

I’d like to see this expanded, as I don’t really understand the distinction being made here - wheels are built with build backends, whose job is to run build systems and process the results into a wheel. Is the issue here that someone has to deal with those symlinks, and the question is whether it’s build backends or the wheel format?

The build and runtime tools already understand the thing I’m proposing. Let me see if I can sketch this out in a bit more detail. Suppose you have native library libfoo, and a native library libbar that depends on libfoo.

The install process for libfoo (e.g. make install, or the equivalent for cmake or whatever else you may use) will create, in the target lib directory, a structure like

lrwxrwxrwx  1 root root     15 Jan  1 00:00 libfoo.so -> libfoo.so.1.2.3
lrwxrwxrwx  1 root root     15 Jan  1 00:00 libfoo.so.1 -> libfoo.so.1.2.3
-rwxr-xr-x  1 root root 123456 Jan  1 00:00 libfoo.so.1.2.3

The build process for libbar will expect a libfoo.so to exist. The linker will read that file, find an entry in the binary file headers stating that its soname is libfoo.so.1 (e.g. DT_SONAME for ELF), and produce an output file libbar.so.4.5.6 with a dependency on libfoo.so.1 (e.g. DT_NEEDED for ELF).
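
These entries are directly visible in the ELF dynamic section. Illustrative readelf output for this hypothetical libbar (the exact formatting varies across binutils versions):

$ readelf -d libbar.so.4.5.6 | grep -E 'SONAME|NEEDED'
 0x0000000000000001 (NEEDED)             Shared library: [libfoo.so.1]
 0x000000000000000e (SONAME)             Library soname: [libbar.so.4]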

I’m proposing two changes to this process.

First, at no point in this process is the name libfoo.so.1.2.3 actually needed. It’s present for human information, and I think it’s used in the case where you have multiple libfoo.so.1.* files in the sense that ldconfig will find the newest one and update the libfoo.so.1 symlink. But we aren’t calling ldconfig here because we’re not installing anything systemwide. So, it is safe to get rid of the filename libfoo.so.1.2.3 and make libfoo.so.1 an actual file, not a symlink.

Second, libfoo.so isn’t actually loaded as a dynamic library[1]. It’s just used as input to the (compile-time) linker, and the purpose is to cause the linker to figure out which actual name to use in the dependency (the soname) and keep track of which symbols are defined. For all the platforms that PyPI accepts binary wheels for, their linker supports an equivalent mechanism where libfoo.so is a text file. For GNU ld and things compatible with it, this is a linker script (with the same filename). For Apple ld, this is a .tbd file (with a slightly different filename). So, it is safe to get rid of the libfoo.so -> libfoo.so.1 symlink and replace it with this text file.
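
To make that concrete, here is what the transformed lib directory might look like on a GNU platform (sizes and dates illustrative; the INPUT directive is standard linker-script syntax, and glibc’s own libc.so is shipped as a linker script of this kind):

$ ls -l
-rw-r--r--  1 root root     19 Jan  1 00:00 libfoo.so
-rwxr-xr-x  1 root root 123456 Jan  1 00:00 libfoo.so.1
$ cat libfoo.so
INPUT(libfoo.so.1)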

Both of these replacements can be done as a postprocessing step after the build of libfoo is complete, and neither requires knowing anything about which build tool libfoo uses or how it works. You build the library via whatever means, then you look at the lib directory: you take any file whose filename does not match its soname but where there’s a symlink from the soname, and rename it to clobber the symlink. You also take any symlink whose name has no version number (e.g. .endswith(".so")) and replace it with a linker script or equivalent. And hopefully there are no symlinks left.
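
A minimal sketch of that postprocessing step for the single-library case, assuming GNU binutils (a real tool would loop over all candidate files, and would use .tbd stubs instead on macOS):

$ cd lib/
$ # read the soname out of the real file's dynamic section
$ soname=$(objdump -p libfoo.so.1.2.3 | awk '/SONAME/ {print $2}')
$ # rename the real file to its soname, clobbering the libfoo.so.1 symlink
$ mv libfoo.so.1.2.3 "$soname"
$ # replace the unversioned dev symlink with a one-line linker script
$ rm libfoo.so
$ printf 'INPUT(%s)\n' "$soname" > libfoo.so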

And the transformed lib directory is transparently compatible with libbar. The build process of libbar will call ld -lfoo, which will find the linker script and behave just as if it had found a symlink. The runtime of libbar will use the dependency on libfoo.so.1, open that file, and not mind that it is a real file and not a symlink. So you don’t need to care about the build tooling or implementation of libbar either. It also doesn’t matter if libbar does a dlopen("libfoo.so.1") instead of having a declared dependency, or if it’s a Python extension module instead of a generic C library, or if it’s a binary, or even if it’s Python code using something like ctypes.CDLL("libfoo.so.1") instead of native code; all of these cases just open the filename libfoo.so.1 and work whether or not it’s a symlink.

In other words I’m not proposing doing anything in the general space of pkg-config. If your library happens to use pkg-config, the exact same .pc file will work for consumers before or after the transformation.

Ugh, that’s a good point. Thanks.

Would it be untenable to rework the wheel-building process to do the packaging and the auditwheel-ing in a single step? This would require one fix per Python build backend, since the mechanism is that build backends directly produce wheels, but it still would not require one fix per C build tool.

Another option - accept this PEP as written, and then teach auditwheel to take a wheel 2.0 as input, see if it can get rid of all the symlinks, and if so produce a wheel 1.0 as output. That would resolve my concern about the practical impact of getting people to upgrade pip.

I’m not sure I agree with that: while the packages doing this are few and advanced, I think they will be widely installed. Examples given so far are CUDA, MKL, cupy, and arrow, which from at least my perspective (people doing scientific computing but not using Conda) are very common packages. Very few packagers will be doing something like this, but many users will be consuming their packages. And, also from my perspective, running an out-of-date version of pip without being aware that upgrading pip is a reasonable thing to do is very common as well.

This is also something of a dependency for moving away from statically linking common dependencies like OpenSSL, and there have been several discussions about how we ought to get to that world.

Apple ld supports text-based stubs, as I mentioned in my comment, which can accomplish the goal (see man tapi). Is this insufficient? I learned about these while writing that comment, so it is entirely possible I’m missing something. :slight_smile:
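
For reference, a .tbd stub is just a small YAML document. A rough sketch of what one might look like for a hypothetical libfoo (based on my reading of the tapi v4 format; the exact required keys may differ by toolchain version):

$ cat libfoo.tbd
--- !tapi-tbd
tbd-version:     4
targets:         [ x86_64-macos, arm64-macos ]
install-name:    '@rpath/libfoo.1.dylib'
current-version: 1.2.3
exports:
  - targets:     [ x86_64-macos, arm64-macos ]
    symbols:     [ _hello ]
...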

And aren’t symlinks only relevant for shared linking anyway? For static linking, my impression is people just have a single normal file libfoo.a and no libfoo.a.1 names or symlinks or anything.

There’s also a discussion to be had somewhere about having separate wheels for development libraries/headers and runtime libraries, and if the development wheels are 2.0, that seems fine. I have no objection to building wheels requiring the latest version of pip, as long as the average user trying to make some tensors flow has a good default experience.

For the same reason I think it’s also fine if we say that AIX users need to upgrade to the latest version of pip—that’s strictly better than requiring that everyone upgrade.

So I think this is convincing me of the merits of accepting this PEP to have the format well-defined and standardized, and, at least as an interim measure, having auditwheel try to generate 1.0 wheels as best it can. In a couple of years the need to support old pip versions will be less relevant, and we can drop this compatibility code and expect everyone to be able to consume wheel 2.0.

(And, yes, this is influenced a bit by what else ends up in wheel 2.0.)


  1. Unless consuming code is doing dlopen("libfoo.so") without the version suffix, perhaps via ctypes.CDLL("libfoo.so"). But this is technically incorrect and should be discouraged. Precisely because it doesn’t encode the soname (the compatibility version, equivalent to a semver major version), using libfoo.so at runtime is unsound. The same function name can change its ABI between libfoo.so.1 and libfoo.so.2, perhaps because a struct changed definition or some #defines changed. Many times this is done in a way that is backwards-compatible in API, i.e. recompiling (including CFFI’s API modes) would work and pick up the binary changes. But if you’re doing stuff with dlopen or equivalent (ctypes, CFFI’s ABI modes), there’s no compilation step and you’re expected to match the ABI, and you can’t do that. Moreover, because this is a development symlink, for system-wide libraries, it’s usually not installed by default by the system package manager, and the symlink is packaged separately. The practical effect is that runtime use of dlopen("libfoo.so") will return file-not-found unless the user installs a package named something like libfoo-devel, which will include header files, build dependencies, and all sorts of other things not needed at runtime. So your distro packager and their users will be happier if you dlopen("libfoo.so.1") instead. If you want to support multiple sonames and you are dealing with the ABI incompatibilities (the easy case is if some other function you’re not calling is the incompatible one), it’s totally okay to loop over all the sonames you know you can handle. ↩︎

7 posts were split to a new topic: Ways to update pip (and when updating is a good idea)

I suspect yes, that’d be a major change. Aside from the complexity of the logistics of the change: not everything works with auditwheel & co, nor is the repair step desirable for use cases other than distributing on PyPI or another index.[1]

To demonstrate, try running auditwheel on MKL. I had a hunch that that wouldn’t work, because MKL uses a pretty complex structure of dynamic libraries, some of which IIRC dlopen other ones. I get, for the 2024.1.0 x86-64 wheel:

$ auditwheel repair mkl-2024.1.0-py2.py3-none-manylinux1_x86_64.whl --plat manylinux_2_24_x86_64
...
ValueError: Cannot repair wheel, because required library "libmkl_core.so.2" could not be located

$ auditwheel lddtree mkl-2024.1.0-py2.py3-none-manylinux1_x86_64.whl
...
elftools.common.exceptions.ELFError: Magic number does not match

I’m not sure how the MKL wheels are originally produced (almost certainly not through a regular build backend), so it may be fixable. Or not - hard to know.

That could be an interesting exercise, and potentially useful indeed. I don’t think it should be required for the PEP to be accepted, but it would be helpful at least in the transition period when support in pip is new (or even not yet released).

I have no idea to be honest, I’ve never used it. But the general idea of linker scripts worries me, exactly because it’s so niche and impossible to predict where it’ll break.

In SciPy we actually do use a linker script for better symbol hiding. We check at build time whether binutils-style -Wl,--version-script is supported, and if so we use it. So I searched the issue tracker for:

Checking if "-Wl,--version-script" : links: NO

The first issue I found was scipy#19378 - “Cross-Compile scipy for riscv target”. Why a linker script isn’t supported for such a niche build config is something I’d really rather not spend any time investigating.
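
For context, the version script itself is tiny. A generic sketch (not SciPy’s actual file) that exports only the extension-module entry points and hides everything else, passed via -Wl,--version-script=link.map:

$ cat link.map
{
    global:
        PyInit_*;
    local:
        *;
};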


  1. Side note: on Windows it can be desirable though to have that option, because pip install . can produce completely broken wheels when you deal with external shared libraries without running delvewheel. ↩︎

Speaking with my PEP delegate hat on, I’d like to suggest some caution here. We’re getting way ahead of ourselves discussing details of symlink support, and especially of how installers will handle the transition, when there’s not even a proposal for Wheel 2.0 yet.

Also, the whole split between PEP 777 and PEP 778 seems misguided to me. We don’t agree new versions of standards independently of functionality. Rather, new functionality gets proposed, and if it needs a new version of the standard, the introduction of that new version is part of the proposal for the new functionality.

If there’s multiple proposals, they will each need their own version bump - or they are going to have to be presented and approved as a group, which will significantly complicate the whole process.

It may be that PEP 777 will make all this clear, and include a plan for incrementally adding functionality that makes my concerns unfounded. But honestly, I doubt it. Whatever happens, though, we should either be working on PEP 777 first, or this PEP should include the bump to wheel 2.0, and other PEPs adding functionality to the wheel format will need to define their own version bump.

(To be clear, I think one of the big problems here is that wheel format versioning needs a rethink - as the “what will pip do with version 2.0 wheels” subthread established, a major version bump is very disruptive, and yet no-one has come up with a plausible example of something that would be a minor version bump. So unless we come up with something better, every bit of new functionality is hugely disruptive, and we end up trying to make “kitchen sink” changes bundling all the functionality improvements that are currently blocked on a version bump into one change. And these current PEPs seem to be doing just that, although they are trying to hide that fact by splitting things into multiple PEPs - but if the PEPs can’t be implemented independently, they remain one change in practice).

3 Likes

One question that has come up is whether and why library symlinks might exist outside of system install directories managed by ldconfig. Here’s an example of how such a library could be produced, and some hopefully useful perspective on why it might be useful.

# CMakeLists.txt
cmake_minimum_required(VERSION 3.22)
project(example VERSION 0.1 LANGUAGES C)

add_library(ex SHARED example.c)
set_target_properties(ex PROPERTIES
    SOVERSION "${example_VERSION_MAJOR}"
    VERSION "${example_VERSION}"
)
# example.c
#include <stdio.h>

void hello() {
    printf("Hello, World!\n");
}

If we run cmake on this (cmake -S . -B build/ && cmake --build build/), we see

$ ls -l build | grep lib
lrwxrwxrwx 1 coder coder    10 May 24 01:35 libex.so -> libex.so.0
lrwxrwxrwx 1 coder coder    12 May 24 01:35 libex.so.0 -> libex.so.0.1
-rwxr-xr-x 1 coder coder 15216 May 24 01:35 libex.so.0.1

Setting aside the wheels question for the moment, I can think of a few reasons a project might be structured this way:

  • It allows end-users to build, install, and use the library without any knowledge of ldconfig. This is especially true if they are installing into a non-system prefix that ldconfig is not managing. Having the build system generate this structure means these libraries can be discovered by downstream builds that expect any of the three names to exist.
  • It allows usage of the build directory itself as the library source without installation, and in place of another version of the library already installed into a system directory (for instance, if you’re on a shared system and don’t have permissions to overwrite system libs).
  • It makes it easy to repackage into package formats other than those used by system package managers (e.g. apt) to install onto another system.

IMHO those three arguments translate fairly well to why we might want this in wheels, since what we’re proposing is essentially moving along the trajectory of facilitating the packaging of libraries (such as MKL, as @rgommers mentioned above) into wheels for consumption by other wheels. I would echo Ralf here: while a lot of the issues with such packaging could probably be addressed using a suitable collection of workarounds, such workarounds can be quite painful to work with and get right no matter how familiar you are with the Python packaging system.

I have previously prototyped essentially the four steps @njs proposed above in a few libraries (basically using ctypes/dlopen to load the library in the wheel at runtime; I realized later that this is basically the pynativelib proposal, but without some of the library naming bells and whistles that Nathaniel added to make things safer and more portable). That approach can address a number of use cases, but it’s difficult to see it addressing all of them, particularly those around using the libraries at build time as well as at runtime.

IIUC most of the counterpoints Nathaniel made are around runtime usage, but given the ecosystem’s move towards build isolation over time, it’s only going to become more important to be able to provide what’s needed at build time as well. Ideally we would want the build and runtime environments to be identical w.r.t. which libraries are used; otherwise we are forced to build against system libraries that have all the symlinks etc. that build tools expect, then use a library repackaged into a wheel at runtime, and hope for ABI compatibility. Additionally, there are further edge cases of runtime usage (like downstream consumers actually wanting to dlopen the library for use, not just access it via a DT_NEEDED entry, and such uses ought to always use the SONAME to ensure ABI compatibility) that presumably require more than one of these names to exist in the wheel.

On the topic of editable installs, I’m sure most people in this thread are already familiar, but FWIW PEP 660 does discuss how that limitation can be worked around fairly easily, and there are build backends that do this in the wild, which seems to suggest that it’s not crucial to solve that issue here.

1 Like

This is a very fair critique. I think that I had hoped to have a draft of PEP 777 to start discussion with at this point, but that hasn’t happened.

I agree this is very sub-optimal, but on the other hand I don’t want to bump the wheel major version 3 or 4 times and have each of those need a 3-5 year deployment, which I think we agree is not a great solution either.

Yeah, I hope there is a better way to handle this. I have created How to reinvent the wheel to discuss how to handle feature bumps in wheel. Let’s discuss over there how to proceed.

Worst case we end up back on “all at once changes are the only route” :grimacing:

2 Likes

Just to chime in: I have a functional implementation, “wheel-axle”, that is currently used in vivo and supports symlinks in a reasonable manner: