Pre-PEP: Rust for CPython

I recently elected to try learning C++ after an entire professional career of Rust + Python, because I wanted to make a build system that can be bootstrapped in as few jumps as possible (so that it could be invoked from inside other build systems like cargo or pip). Having worked a lot on the pants build tool in python + rust, I have strong feelings about affordances the build system should provide to users—feelings I will try to quell to achieve a productive discussion.

The pants build tool has incorporated rust since Twitter’s “moonshot” rewrite from python-only, which ran from 2017 to 2019. I contributed the current iteration of our python-level interface for cacheable+parallelizable build tasks (Concepts | Pantsbuild), which uses pyo3 to hook up rust async methods as “intrinsics” that can be interchanged with “async def” coroutines (Stu Hood originally proposed and implemented this approach). We’ve had this interop since before async was stable and before pyo3 existed.

We used rust, so we had to use cargo. Most of a decade later, we haven’t managed to use our own build system (written in rust) to build the rust code at the heart of our system. I claim that one of the major reasons for this failure is that cargo is almost unique among build systems in providing absolutely no structured mechanisms for:
(1) communicating with other package build scripts in the same dependency graph
(2) communicating with the downstream user who invokes cargo (such as a distro packager, or a github actions pipeline)

This failure is especially notable given the powerful guarantees cargo does provide for rust-only dependency graphs, as described in OP:

Finally, Rust has an excellent build system. Rust uses the Cargo package manager, which handles acquiring dependencies

Unfortunately, cargo has no standard mechanism for declaring dependencies downloaded within a build script so that they can be audited or overridden. The one exception is the excessively restricted (non-transitive) and poorly documented links key in Cargo.toml, which requires that a build script link a native library into the resulting executable.
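To make the links channel concrete, here is a minimal sketch of how it behaves as I understand it. A package declaring links = "foo" prints a metadata line from its build script, and only the build scripts of its direct dependents see the value, as a DEP_FOO_* environment variable. The package name foo, the key abi_tag, and the helper functions are all hypothetical names for illustration. The name mangling (uppercase, '-' mapped to '_') is my reading of cargo's behavior, not something taken from OP.

```rust
// Sketch of cargo's `links` metadata channel between build scripts.
// `foo` and `abi_tag` are hypothetical names; the helpers are illustrative only.

/// The line a build script prints to stdout to publish one key/value pair.
/// Cargo only forwards it if the package declares `links = "..."` in Cargo.toml.
/// (Older cargo versions use the `cargo:KEY=VALUE` spelling instead.)
fn metadata_directive(key: &str, value: &str) -> String {
    format!("cargo::metadata={key}={value}")
}

/// The environment variable a *direct* dependent's build script would read.
/// Assumption: cargo upper-cases both parts and maps '-' to '_'.
fn dep_env_var(links: &str, key: &str) -> String {
    let envify = |s: &str| s.to_uppercase().replace('-', "_");
    format!("DEP_{}_{}", envify(links), envify(key))
}

/// Read a published value through an injected lookup, so the logic can be
/// exercised without a real cargo invocation.
fn read_dep_metadata(
    links: &str,
    key: &str,
    lookup: impl Fn(&str) -> Option<String>,
) -> Option<String> {
    lookup(&dep_env_var(links, key))
}
```

In a real build.rs the producer would println! the directive and the consumer would pass |name| std::env::var(name).ok() as the lookup. A package two hops downstream gets nothing unless every intermediate build script re-publishes the value by hand, which is the non-transitivity complained about above.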

cargo also has no standard conception of a “toolchain”, or even an ABI outside of rustc output. This can and does mean that build scripts will fail because a dependency was built for a slightly different ABI: not only is there no standard interface for downstream build configuration, there’s not even a standard interface for communicating structured data across build scripts. So rust devs end up doing the natural thing and using some location under ~/.cache, ~/.config, or elsewhere as undocumented mutable state.
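For reference, what a build script does receive is rustc's own view of the target, through environment variables such as TARGET and CARGO_CFG_TARGET_ARCH. The sketch below collects those into a struct and compares the triple against one advertised by a dependency. TargetInfo, target_info, and same_target are hypothetical names, and the dependency's triple would have to arrive through some ad hoc channel, since cargo defines none for it.

```rust
// Sketch: the target description a build script can reconstruct from the
// environment variables cargo sets for it. All type and function names here
// are hypothetical.

#[derive(Debug, PartialEq)]
struct TargetInfo {
    triple: String, // from TARGET
    arch: String,   // from CARGO_CFG_TARGET_ARCH
    os: String,     // from CARGO_CFG_TARGET_OS
    env: String,    // from CARGO_CFG_TARGET_ENV (empty on some targets)
}

/// Build a TargetInfo from an injected lookup; returns None if a required
/// variable is missing (for example, when run outside a build script).
fn target_info(lookup: impl Fn(&str) -> Option<String>) -> Option<TargetInfo> {
    Some(TargetInfo {
        triple: lookup("TARGET")?,
        arch: lookup("CARGO_CFG_TARGET_ARCH")?,
        os: lookup("CARGO_CFG_TARGET_OS")?,
        env: lookup("CARGO_CFG_TARGET_ENV").unwrap_or_default(),
    })
}

/// Naive compatibility check: exact string equality on the triple.
/// Anything subtler (libc version, C++ ABI, Python ABI tag) is not expressible
/// with what cargo provides.
fn same_target(ours: &TargetInfo, dep_triple: &str) -> bool {
    ours.triple == dep_triple
}
```

In a real build.rs the lookup would be |name| std::env::var(name).ok(). Exact string comparison is about the best a script can do here, which is part of why independent reimplementations of triple parsing keep appearing.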

I am relatively confident this isn’t an oversimplification, because bootstrapping the rust compiler itself ends up invoking multiple distinct reimplementations of LLVM target triple parsing, added at different times and never synced up. This is because rustc uses cargo, and cargo does not support structured communication across build scripts.

At the end of last year I proposed to NGI Zero some of how I want to help improve this situation. The C++ system I mentioned at the beginning is a competing approach, which would replace cargo instead of attempting incremental reform. I’m still not sure of the “right” answer to this, and I don’t think fixing cargo should be the purview of this PEP anyway.

But I am personally convinced that if CPython were to integrate rust (possibly even just at phase 1, with only external module support), we (CPython and pypa contributors, of which I am only the latter) would necessarily have to figure out a more structured way to thread ABI info through cargo, and potentially even institute a whole structured communication mechanism across the build script dependency graph. I think that will be a lot of hard work and we should prepare for it earlier rather than later.

I am very heartened to read in OP that there are steps being taken to interface with the rustc team to express CPython’s bootstrapping+portability requirements. It sounds like we’re on a good trajectory already to consider the above. I would just urge contributors to consider that pants has not solved cargo’s python packaging difficulties in many years and that it may be worth opening up a greater discussion about cargo affordances in order for cargo (not just rustc) to support CPython’s (and consequently pypa’s) needs.
