Symbolic links in wheels

The whole scheme of symlinks like libarrow.so → libarrow.so.14.0.0 is tightly coupled to how the Linux system linker searches for libraries. Wheels have a different and incompatible way of handling library searching. Including libraries like this inside a wheel is a dubious thing to do… I can see how it might solve some problems in the short term, but in the long term I think you'll hit unsolvable problems. For example, if the user also has a system copy of libarrow.so, and you're relying on the linker recognizing well-known names like this, then your package might end up using either the system's copy or the wheel's copy basically at random, which sounds like a recipe for obscure segfaults.

IMO if you want to ship shared libraries in a wheel, then you should take the search problem seriously, and not rely on the linker's naming scheme for system libraries. Auditwheel gives each vendored library a unique mangled name, which works well for that use case. From your post, I assume you also want the library to be usable by other packages. In that case, the best approach I've been able to come up with on Linux is:

  • Give your library a unique name that identifies both a specific ABI and the fact that it ships inside a Python wheel, like libarrow-wheel-14.so or similar.
  • Provide a Python API that lets other packages query for build-time configuration (linker flags, include dir, etc.), as well as the resulting wheel dependency (maybe if the third-party package is built against pyarrow 14.2.3, then that means its Requires-Dist metadata should include pyarrow >= 14.2).
  • Provide a Python API that lets other packages request that the library be available at runtime, doing whatever linker finagling is necessary to make that work. (On Linux, the simplest thing is to just dlopen("path/to/libarrow-wheel-14.so"); then any future requests for that shared library will be automatically satisfied without going through the normal library search.)
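
To make the last two bullets concrete, here's a rough sketch of what such an API might look like. Everything in it is made up for illustration: build_config, ensure_loaded, the pkg_dir argument, and the include/ layout are assumptions, not anything pyarrow actually provides. The runtime half uses ctypes.CDLL, which calls dlopen under the hood on Linux.

```python
import ctypes
import os

# Hypothetical ABI-specific library name, following the naming idea above.
LIB_STEM = "arrow-wheel-14"
LIB_FILE = f"lib{LIB_STEM}.so"

_handle = None  # keeps the loaded library alive for the life of the process


def build_config(pkg_dir, version="14.2.3"):
    """Return build-time settings for a package compiling against the wheel.

    pkg_dir is the directory where the wheel installed the library
    (assumed layout: headers under pkg_dir/include, the .so in pkg_dir).
    """
    major, minor = version.split(".")[:2]
    return {
        "include_dirs": [os.path.join(pkg_dir, "include")],
        "library_dirs": [pkg_dir],
        "libraries": [LIB_STEM],  # i.e. -larrow-wheel-14
        # Dependency the downstream wheel should declare.
        "requires": f"pyarrow >= {major}.{minor}",
    }


def ensure_loaded(pkg_dir):
    """dlopen the library by explicit path, once per process."""
    global _handle
    if _handle is None:
        # Raises OSError if the file is missing or cannot be loaded.
        _handle = ctypes.CDLL(os.path.join(pkg_dir, LIB_FILE))
    return _handle
```

A downstream package would call ensure_loaded(...) before importing its own extension module. As far as I know, this relies on the library's SONAME matching the name the extension was linked against, so that the dynamic linker recognizes it as already loaded.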