Native dependencies in other wheels -- how I do it, but maybe we can standardize something?

njs · February 16, 2023, 7:35am

I wrote up some notes on how to do this years ago, but stalled out do to lack of time + folks not seeming too interesting: wheel-builders/pynativelib-proposal.rst at pynativelib-proposal · njsmith/wheel-builders · GitHub
It’s exciting to see interest again!

The challenging part is making wheels that support arbitrary Python environment layouts. Your macOS solution works for the case where all the packages are installed into the same site-packages directory, and that should work… a lot of the time? But normally if package A depends on package B then all you need is for A and B to both be findable in sys.path – requiring that they both be installed into the same directory is an extra requirement that normal packages don’t need, and can break down when dealing with things like in-place installs, or environments that are assembled on the fly by just tweaking environment variables instead of copying files around (like I’ve been experimenting with).

However, this is solvable. It’s just frustratingly complicated.

For Windows and Linux: as you noted, if a shared library declares a dependency on some-lib, and a shared library named some-lib has been loaded into the process, then the dynamic loader will automatically use that some-lib to resolve the dependency. You do want to make sure that your libraries all have unique names, to avoid collisions with potentially-incompatible binaries from other sources – so e.g. you might to rename the version of libntcore to pynativelib-libntcore, in case the user has another copy of libntcore floating around under its regular name. Fortunately, patchelf can do this renaming on Linux, and machomachomangler can do it on Windows.

This part is relatively routine – the library vendoring tools auditwheel for Linux and delvewheel for Windows do the same renaming trick for similar reasons. That’s why I wrote the Windows renaming code in machomachomangler in the first place

For macOS: as you noted, it is absolutely impossible to convince the macOS dynamic loader to look up a library by anything other than an absolute or relative path. Super annoying. However! There is a trick. Explaining it requires a brief digression into obscure corners of Mach-O.

Say you’re building a library that uses cv::namedWindow from opencv. What normally happens is:

Your compiler “mangles” (this is a technical term) the C++ name into a simple string, probably something ugly like _ZN2cv10namedWindowEv.
It looks around at the libraries you have to link to, and finds that this symbol is exported by libopencv.dylib
It makes a note in your binary that when it wants to call cv::namedWindow, it should do that by first finding libopencv.dylib, and then looking inside it for _ZN2cv10namedWindowEv.

macOS calls this procedure the “two-level namespace”, because your binary stores two different pieces of information: the library to look in (as a filesystem path!), and the symbol to look for inside that library.

But! macOS also supports an alternative lookup procedure, the “single-level/flat namespace”. When using this, your compiler just writes down that it wants a symbol named _ZN2cv10namedWindowEv, without any information about where it should come from. And then when your binary gets loaded, the loader looks around through all the libraries that are loaded into the process for a symbol with that name. Now that pesky filesystem lookup is gone!

So all we need to do is:

make sure that our wheel-wrapped libopencv.dylib is already loaded before loading our library
…somehow make sure that every single symbol that our wheel-wrapped libopencv.dylib exports has a globally unique name, so it can’t accidentally collide with some other random binary, and that our library looks up symbols by those globally unique name.

Unfortunately, our libraries probably don’t have globally unique symbol names; that’s the whole reason why macOS added the two-level namespace stuff.

But, we can fix this! All we need to do is go through the symbol tables stored inside libopencv.dylib and inside our binary, and rewrite all the symbols to make them unique, so e.g. now libopencv.dylib exports _our_special_uniquified__ZN2cv10namedWindowEv, and that’s what our binary calls, and we’re good.

Of course rewriting symbol tables is pretty complicated, especially since macOS stores symbol tables in this weird compressed format where instead of a table, it’s actually a program in a special bytecode language that when executed outputs the symbol table, so you need to evaluate the program and then generate a new one? It’s kinda wild tbh.

Luckily, machomachomangler has code to do exactly that. Well, it’s not really luck; this is why I wrote it It’s been like 5 years since I looked at this last and I don’t think anyone has used that code in production, so it might need some tweaks to bring it up to speed, but that shouldn’t be too hard as long as someone is interested in making it happen.

Nice – if you read the pynativelib proposal I linked at the top, then I think we converged on a pretty similar design. The big thing I’d suggest adding is metadata for “if you build your wheel against this version of this package, then here are the install-requirements that you should add to your final wheel’s metadata, to make sure that an appropriate version of this package is available”. So e.g. building against ntcore 1.2.7 might suggest a runtime requirement of ntcore == 1.2.*, >= 1.2.7, or ntcore == 1.*, >=1.2.7, or even ntcore == 1.2.7 – whichever one matches ntcore’s ABI compatibility guarantees, which the ntcore distributors will understand better than their users.

This is doable – we already have package names and version constraints and all that to solve these exact problems for Python libraries We just need to figure out how to encode the C compatibility constraints using those tools.

(It’s even possible to support e.g. two incompatible versions of opencv. Just name the python packages opencv-v1 and opencv-v2, and make sure the binary mangling uses different names, so wheels depending on opencv-v1 get the v1 symbols and wheels depending on opencv-v2 get the v2 symbols.)