I’d like to discuss mechanisms for python packages/wheels to depend on native dependencies that were installed by other python wheels. Specifically, I’ve already solved this problem in a custom build tool, and I think the way it’s implemented would be useful for other projects if this pattern + implementation details existed in some standard/standalone form (maybe as part of packaging, for example?).
If there’s interest from the community in this topic, I’d be happy to work with others to make this a standard mechanism to use native dependencies across wheels. I’m also interested in hearing how others have solved this problem in different and/or similar ways.
Sidebar: Why not conda?
I’m not a data scientist, and I don’t use conda myself. When I first started going down this road in late 2019, my work colleagues who did use conda would often complain about issues they had with using conda, and the few times I had to interact with conda I didn’t have a particularly good experience.
I do recognize that Conda solves a lot of these problems that I had to deal with, but my customers are high school students and their (often non-software engineers) mentors. If my colleagues at work are running into a variety of issues when they use conda, there’s no way I was signing up for that support headache.
A lot of people use Conda to solve the binary dependency problem, but a lot of people can’t or won’t use it. This thread is about ways we can solve it when not using conda.
About Me
I am the primary maintainer of RobotPy, which is an open source project that allows high school students to use Python to program and control the robots they create for the FIRST Robotics Competition (FRC). Currently, FRC officially supports C++, Java, and LabVIEW as options for programming your robot. RobotPy has been an unofficial python alternative since 2011, but is expected to become an officially supported language option in 2024. There are 3000+ FRC teams; last season around 50 teams used RobotPy to program their robot directly, but many more used some of our libraries for interacting with the robot.
Motivation
From 2015-2019, RobotPy maintained a pure python port of the programming libraries needed to control an FRC robot. For a variety of reasons this was becoming untenable, and after considering many options I turned to pybind11 to wrap the existing C++ libraries.
The official C++/Java libraries live in a massive monorepo, with a handful of native libraries that depend on other native libraries. Additionally, there are vendors that provide advanced motor controllers and other sensors for use in FRC. These vendors publish binary-only releases of the libraries needed to interact with their motor controllers, so I needed something that could use native libraries that had dependencies on other native libraries.
For example: most of the vendor libraries depend on wpilib, which depends on hal and ntcore and wpimath, which depends on wpinet, which depends on wpiutil.
My goal was to make pip install robotpy Just Work. To solve all of this, I wrote robotpy-build, which parses C/C++ header files and semi-automatically generates pybind11-based wrapper libraries (and type stubs) that can be imported by python code. While a lot of what it does is very cool, that’s a whole separate topic. I will focus on a very narrow subset of what it does in this thread.
Challenge: make ‘import _myext’ work
Anyone who has tried to do this immediately runs into this problem: if my extension depends on a native library, how do I convince Python and/or the OS to find the correct library? If the library is installed to the system, this is easy enough – but if it lives inside another wheel in site-packages, the system loader isn’t going to look there automatically.
Often the naive solution to this is to modify the system path or LD_LIBRARY_PATH to force the system loader to find your library, but that solution doesn’t really feel right to me. Additionally, if the library you are trying to load exists on the system AND in a wheel somewhere, there is potential for the system to load the wrong library.
There are approaches that work for all the major operating systems, but they vary slightly.
macOS
There is only one way to do it on macOS. The system loader insists that it must be able to find any referenced libraries, and will not resolve symbols that aren’t in a referenced library. However, there is a nice way to tell the loader to find a library relative to the library that references it – @loader_path.
Since our wheels install to site-packages, we know where the libraries will be relative to our library, so we use delocate to modify where the libraries are loaded (here’s how robotpy-build does it).
Given this simplified site-packages for my ntcore package and its dependencies wpiutil and wpinet:
+- wpiutil
|  +- lib
|     +- libwpiutil.dylib
+- wpinet
|  +- lib
|     +- libwpinet.dylib
+- ntcore
   +- lib
   |  +- libntcore.dylib
   +- _ntcore.cpython-311-darwin.so
Here’s the (simplified) output of otool -L for the modified libraries:
$ otool -L ntcore/lib/libntcore.dylib
ntcore/lib/libntcore.dylib:
@loader_path/../../wpiutil/lib/libwpiutil.dylib
@loader_path/../../wpinet/lib/libwpinet.dylib
$ otool -L wpinet/lib/libwpinet.dylib
wpinet/lib/libwpinet.dylib:
@loader_path/../../wpiutil/lib/libwpiutil.dylib
$ otool -L ntcore/_ntcore.cpython-311-darwin.so
ntcore/_ntcore.cpython-311-darwin.so:
@loader_path/../wpinet/lib/libwpinet.dylib
@loader_path/../wpiutil/lib/libwpiutil.dylib
@loader_path/lib/libntcore.dylib
With this setup, an import of ntcore._ntcore will just work in any standard CPython installation, virtualenv or not (unless you mix system + virtualenv + user site-packages… but don’t do that).
One caveat: to modify the install name paths there has to be enough space in the binary’s header for the modified names. You can pad the header to the maximum by compiling your native libraries with -Wl,-headerpad_max_install_names.
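robotpy-build drives this rewriting through delocate, but the underlying idea is just editing the Mach-O load commands. Here’s a rough, purely illustrative sketch using install_name_tool directly; the original install names are assumed, and only two of the edits from the otool output above are shown:

# Illustrative sketch only (not robotpy-build's actual code): rewrite load
# commands so dependencies are found relative to the loading library.
import subprocess

def change_dep(lib: str, old: str, new: str) -> None:
    # Requires enough header padding in `lib` for the longer path, hence
    # linking with -Wl,-headerpad_max_install_names.
    subprocess.run(["install_name_tool", "-change", old, new, lib], check=True)

# Assumed original install names ("libwpiutil.dylib", "libntcore.dylib"):
change_dep(
    "ntcore/lib/libntcore.dylib",
    "libwpiutil.dylib",
    "@loader_path/../../wpiutil/lib/libwpiutil.dylib",
)
change_dep(
    "ntcore/_ntcore.cpython-311-darwin.so",
    "libntcore.dylib",
    "@loader_path/lib/libntcore.dylib",
)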
Windows
Windows doesn’t have a mechanism to tell the system loader to resolve libraries relative to a library, but it turns out that as long as the dependencies of a library are already loaded in the process, Windows will use those to resolve symbols and it Just Works. We can use ctypes.cdll.LoadLibrary() to manually load each needed library in the correct order, and when we finally do an import ntcore._ntcore it will load without any problems.
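As a minimal sketch of the idea (the DLL names and the assumption that everything lives in the same site-packages are mine, not the actual package contents), loading the example dependency chain bottom-up might look like this:

# Minimal sketch, assuming hypothetical DLL names: pre-load each dependency
# in order so the extension's imports resolve from DLLs already in the process.
import os.path
import sysconfig
from ctypes import cdll

site_packages = sysconfig.get_paths()["purelib"]

for rel in ("wpiutil/lib/wpiutil.dll",    # no native dependencies
            "wpinet/lib/wpinet.dll",      # depends on wpiutil
            "ntcore/lib/ntcore.dll"):     # depends on wpiutil + wpinet
    cdll.LoadLibrary(os.path.join(site_packages, rel))

import ntcore._ntcore  # now loads without any problems

In practice robotpy-build generates a per-package loader module that does this, which is shown in the Runtime section below.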
Linux
For Linux you can actually use either approach. You can modify the ELF to resolve libraries relative to the library (just like macOS), or you can take the approach we take for Windows and manually load each library in the correct order. My build tool takes the latter approach, but either is fine.
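For completeness, the ELF analogue of @loader_path is an RPATH containing $ORIGIN. Again, this is not what robotpy-build does, but a purely illustrative patchelf-based sketch for the example libraries might look like:

# Illustrative only: set an RPATH relative to each library itself ($ORIGIN is
# the ELF counterpart of macOS's @loader_path). robotpy-build instead loads
# the libraries in dependency order, like on Windows.
import subprocess

def set_rpath(lib: str, rpath: str) -> None:
    subprocess.run(["patchelf", "--set-rpath", rpath, lib], check=True)

set_rpath("ntcore/lib/libntcore.so",
          "$ORIGIN/../../wpiutil/lib:$ORIGIN/../../wpinet/lib")
set_rpath("ntcore/_ntcore.cpython-311-x86_64-linux-gnu.so",
          "$ORIGIN/lib:$ORIGIN/../wpiutil/lib:$ORIGIN/../wpinet/lib")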
How robotpy-build deals with native dependencies across wheels
There are several pieces that need to be solved:
- At build time: finding all the pieces needed to compile + link
- At run time: finding and loading native dependencies in the correct order before native python extensions are imported (see above discussion)
The FRC official libraries + headers are distributed in a maven repository, so we have a separate mechanism that downloads them and puts pieces in the right places for building a wheel. That won’t be discussed here – every project is going to obtain its native libraries in a different way, so below we assume that’s all figured out.
Build Time
At build time, we need something effectively like pkg-config, but one that works in the python ecosystem and only finds things that were installed by other wheels. The build system needs to find at least the following:
- library names
- link paths to the libraries
- associated include directories for header files
- (pybind11-specific) type caster header files
To find these, I chose to use setuptools entry points, using the robotpybuild entry point group. Each entry point has a name (used to identify which native dependency it provides) and a python module associated with it.
Let’s examine my pyntcore project, which both provides a library for others to use and uses other libraries. Here’s the entry_points.txt in the installed *.dist-info:
[robotpybuild]
ntcore = ntcore.pkgcfg
This pkgcfg file is generated by the build system when it’s generating a wheel (I’m not going to discuss how the build system figures these things out since that’s very build system dependent; the important part is that it can figure it out and generate the pkgcfg.py), and is distributed with the wheel. At build time, when resolving dependencies, the build system finds the associated entry point and directly execs ntcore.pkgcfg (while being careful to NOT import its parent, which wouldn’t work when cross-compiling). Here’s that file on macOS:
# fmt: off
# This file is automatically generated, DO NOT EDIT
from os.path import abspath, join, dirname

_root = abspath(dirname(__file__))

libinit_import = "ntcore._init_ntcore"
depends = ['wpiutil', 'wpinet']
pypi_package = 'pyntcore'

def get_include_dirs():
    return [join(_root, "include"), join(_root, "rpy-include")]

def get_library_dirs():
    return [join(_root, "lib")]

def get_library_dirs_rel():
    return ['lib']

def get_library_names():
    return ['ntcore']

def get_library_full_names():
    return ['libntcore.dylib']
Most of this information’s purpose is obvious (and similar to what pkg-config provides), but I’d like to call attention to several specific pieces:
get_include_dirs and get_library_dirs retrieve the locations of libraries and include files. I chose to include them in the wheel in the package directory because other ‘standard’ locations (in particular, the headers argument for setuptools) didn’t seem to work the way I would expect and sometimes would try installing to system locations, and IIRC didn’t work in editable installs (which is really important for my development setup because pybind11 takes FOREVER to compile for some of my template-heavy dependencies).
depends indicates other robotpy-build compatible native dependencies of this library, which can be looked up by finding the associated robotpybuild entry point and loading its pkgcfg file.
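For illustration only (this is not the actual robotpy-build code, and the helper name is made up), resolving a dependency roughly amounts to finding the entry point and exec’ing the pkgcfg module from its file without importing its parent package:

# Rough sketch, not the real implementation: look up a dependency's pkgcfg
# module via the robotpybuild entry point group and exec it standalone, so
# the parent package (which may contain extensions built for another
# platform) is never imported. Requires the Python 3.10+ entry_points API.
import importlib.metadata
import importlib.util
import os.path

def load_pkgcfg(dep_name: str):
    (ep,) = [ep for ep in importlib.metadata.entry_points(group="robotpybuild")
             if ep.name == dep_name]
    pkg_name, _, mod_name = ep.value.rpartition(".")  # e.g. "ntcore", "pkgcfg"
    # find_spec on the top-level package locates it without importing it
    pkg_spec = importlib.util.find_spec(pkg_name)
    path = os.path.join(pkg_spec.submodule_search_locations[0], mod_name + ".py")
    spec = importlib.util.spec_from_file_location(ep.value, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

ntcore_cfg = load_pkgcfg("ntcore")
print(ntcore_cfg.depends)  # ['wpiutil', 'wpinet']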
libinit_import specifies a module that MUST be imported before importing any other python package that tries to use the native dependency. This python file is responsible for loading the native library and its dependencies. This leads us very logically into the next section…
Runtime
When a user uses pyntcore, our goal is to make it so that they just need to import ntcore without needing to know all of the magic native dependency stuff that we discussed above. That package has an __init__.py that does a few things:
from . import _init_ntcore
from ._ntcore import (
    # Here we expose the symbols from the native extension, but
    # elided here for brevity
)
Because python always loads __init__.py first, the first part imports the libinit_import module mentioned above, which ensures that any native dependencies of the compiled python extension _ntcore are loaded before the extension itself is imported.
Let’s look at ntcore/_init_ntcore.py on Linux:
# This file is automatically generated, DO NOT EDIT
# fmt: off
from os.path import abspath, join, dirname, exists

_root = abspath(dirname(__file__))

# runtime dependencies
import wpiutil._init_wpiutil
import wpinet._init_wpinet

from ctypes import cdll

try:
    _lib = cdll.LoadLibrary(join(_root, "lib", "libntcore.so"))
except FileNotFoundError:
    if not exists(join(_root, "lib", "libntcore.so")):
        raise FileNotFoundError("libntcore.so was not found on your system. Is this package correctly installed?")
    raise FileNotFoundError("libntcore.so could not be loaded. There is a missing dependency.")
This accomplishes the runtime loading of native dependencies that we discussed above, which is needed for Windows and Linux. On macOS this isn’t strictly needed to resolve the native dependencies, but I keep it in there because it’s simpler, and as a side effect it also loads the python dependencies, which pybind11 needs to resolve types.
Once _init_XXX.py is imported, all native dependencies are loaded in the process, and the import of _ntcore.cpython-311.so (or whatever it is on the platform) will succeed; it can then be used just like any other native python extension.
Cross-compilation
The robot controller we use runs Linux on ARM, so we cross-compile all of our packages. I use crossenv to do this, and as long as I don’t try to import anything directly from the native compiled libraries at build time this scheme works fine.
My proposal
… well, I don’t quite have one yet. I’ve been using this method for 3 years now and all the pieces I’ve described have been fairly static. However, if nobody is interested in this, then it’s not really worth taking the time to separate it from robotpy-build.
Final Thoughts
The robotpybuild pkgcfg entrypoint stuff probably would need to be very different for a standardized version of this:
- Different name for the entry point (or maybe a better registration system?)
- Originally for the pkgcfg file I used a pure python file that cannot depend on anything other than the standard library, but I think a standardized version of this should just be a JSON or TOML blob instead (a rough sketch of what that might contain follows this list).
- A standardized version of the pkgcfg thing probably needs compile flags and other things that pkg-config already provides… though I haven’t needed them, certainly some projects might.
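Purely for illustration (the field names here are made up, not a proposal), the declarative equivalent of the ntcore.pkgcfg module shown earlier might carry something like this, shipped as JSON or TOML inside the wheel, with the relative paths resolved against the package directory at read time:

# Hypothetical, illustrative only: the same metadata as ntcore.pkgcfg,
# expressed as static data instead of executable python code.
PKGCFG = {
    "libinit_import": "ntcore._init_ntcore",
    "depends": ["wpiutil", "wpinet"],
    "pypi_package": "pyntcore",
    "include_dirs": ["include", "rpy-include"],  # relative to the package dir
    "library_dirs": ["lib"],                     # relative to the package dir
    "library_names": ["ntcore"],
    "library_full_names": ["libntcore.dylib"],
}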
Additionally, I am conscious that some of the things done here fly in the face of some ‘standard’ guidelines for python wheels (particularly with older versions of manylinux, and I’m sure it doesn’t pass auditwheel) – but it does work, and even a high school student can use the resulting wheels. Most of the issues teams have had when using RobotPy have been with my autogenerated code, and (almost) never with not being able to find library dependencies.
Want to see how this works in practice? There are a dozen or so RobotPy packages published to PyPI as wheels for macOS, Windows, and Linux for Python 3.7 - 3.11. Just pip install robotpy[all] and take a look.
I’m optimistic that we can leverage some of these ideas to make native dependencies work more easily in python. Thanks for reading!