Sure, but as I said in the rest of my post, that “formal decision” is really “can we let some people be lacking certain stdlib modules”. The choice of language is fairly orthogonal, since it isn’t going to affect the design of the runtime at all - we might as well bundle up C++ extension modules in the same PEP, since a similar number of users will be unable to use those, and some extension modules would benefit.
Also, there’s only a small gap between “the stdlib may not have this module for you” and “if you want this module, choose to add it”. The latter is already possible with whatever language you want to use, so with the re-scoped proposal, we’re only slightly reframing things from the latter to the former. Once we’re officially okay with “the library is (core modules) plus (optional modules) plus (whatever your distro adds) plus (whatever you’ve installed)”,[1] then “optional modules” could be anything from anywhere (which means it’s really easy to say “yes, they could use Rust”).
For explicitness, the current situation is “the library is (core modules) plus (whatever your distro adds) plus (whatever you’ve installed)”. The only thing we’re adding here is “optional modules”. ↩︎
In fact, base64 is built on top of the binascii module, which is implemented entirely in C. So I think you’re right, we can try replacing the current binascii implementation with a Rust-based one.
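For anyone following along, that delegation is easy to verify from the REPL; a quick sanity check (nothing here is specific to the proposal):

```python
import base64
import binascii

# base64.b64encode is a thin wrapper around the C-implemented
# binascii module; both produce the same encoding.
data = b"hello"
assert base64.b64encode(data) == b"aGVsbG8="
assert binascii.b2a_base64(data, newline=False) == b"aGVsbG8="
```

So a Rust-backed binascii would transparently accelerate base64 as well.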
This post is about Rust, so it will attract considerable attention from Rust fans.
As a result, most of the comments are likely to be positive (since they come from Rust fans).
We still need to wait and see.
but eventually will become a required dependency of CPython and allowed to be used throughout the CPython code base
<del>
Additionally, using Rust (which relies on LLVM) means abandoning many niche platforms. Is this really an appropriate choice?
Many popular open-source projects use Python to some extent. If CPython drops support for certain platforms, those platforms will no longer be able to run many projects that depend on Python.
That would be truly unfortunate!
Rust may mitigate some memory-related issues, but… are you really in such a hurry to introduce Rust into CPython?
In any case, Rust should be made optional.
This way, maintainers of (Python-dependent) open-source projects can choose whether to use the Rust-dependent features or not. They deserve to have a choice!
Moreover, making Rust a mandatory dependency for CPython sounds like a proposal driven by Rust fans to push their own agenda.
</del>
Perhaps we can wait until the maturity of the following options:
C++26: hardening and contracts
Zig
gcc-rs
After that, we can conduct more comparisons before making a final decision.
Thanks for pointing that out, that’s a great point.
Personally, I thought of optionally introducing Rust more as a way to better understand how it would affect the community, and how well it would fit into the CPython codebase and developers’ workflows.
I also don’t see great value in allowing Rust in optional modules purely from a maintainer perspective, but I think it is a necessary step toward deciding whether to make Rust a hard build dependency.
That said, maybe someone else does have a use-case where Rust being available for optional components would be great. If that’s the case, please let us know.
Bear in mind that Zig isn’t aiming to serve the same niche as Rust as far as safety/correctness goes (See How (memory) safe is zig? for comparative details on its design philosophy) and posts like these suggest a worrying attitude toward Rust-level safety/correctness within the groups responsible for steering C++.
I’m having trouble remembering what I bookmarked it under so I can link it, but I’d also add the paper Stroustrup put out within the last few years which, to me at least, felt like a sign that the defensiveness various loud C++ developers feel toward the idea of Rust-like guarantees goes all the way to the top. (Which would be consistent with what On “Safe” C++ lays out.) I remember it being longer than A call to action: Think seriously about “safety”; then do something sensible about it.
You’re completely correct here: the pre-PEP does not get us to Rust being ubiquitous in CPython. I think we probably didn’t do a good enough job explaining why the approach is so conservative, and should say more about a potential path to making Rust ubiquitous, so let me expand on that.
We intentionally want to keep the initial test as conservative as possible, for a few reasons. Right now, the impact of introducing Rust at all is not fully known, so we want to make sure both that it is easy to back out of the change if it proves untenable, and that we understand which workflows would break when Rust becomes required. One thing this thread has reinforced for me is that CPython is used all over the place, often in places we on the core team are not aware of. The only way we can properly evaluate the potential impact of making Rust required is to start warning users that we hope to do so, and to hear about the places it would break and what would need to change to make Rust required.
As for using Rust outside of extension modules, I think it could be used for experimental, opt-in interpreter features such as the JIT. I want to be cautious here, though, because I don’t want the JIT to end up in a state where it should become the default but making Rust required is a blocker to that. As with everything else, whether migrating to Rust makes sense for the JIT should be considered on its own merits.
There is a temptation, when introducing a new language, to pursue an aggressive timeline to justify the investment with short-term gains. Such an approach, fixated on overly ambitious goals, or rapid, sweeping change, invariably carries elevated risks of failure and can actively disincentivize future adoption by disrupting roadmaps and competing with other business objectives.
A more powerful strategy, the one that proved so effective for Android, is to treat language adoption as a long-term investment in sustainable, compounding growth that supports other business objectives instead of competing with them. This approach patiently accepts initially lower absolute numbers to provide the necessary time for the new language to establish a foothold, build momentum, and achieve critical mass.
In summary, Android found that moving to Rust is a long-term investment: it may not see immediate payoffs, but it has proved a long-term success with significant ones. I realize this is a hard sell. But to me it makes a lot of sense, because there is a lot of learning and investigation to do around integrating Rust into CPython. We need to use the early period in which Rust is allowed to conservatively build out the experience and support infrastructure needed for Rust in CPython to succeed.
I could see a path where we start very conservatively with base64, then introduce a Rust version of a more central standard library module like json (as an optional replacement to the C version to start with), then make Rust required. But this is a guess at a plausible path forward, and the final path needs to be informed from experience.
I would push back against this. One of the theses of this pre-PEP is that contributors are not inclined to write complex C code because it is difficult to do, and thus projects like C accelerators are not getting written.
As an example, the performance of the json module has been significantly slower than other implementations for a long time. Yet a re-write in C would be daunting. I think Rust would excel in cases like this.
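As a rough illustration of how such a performance claim might be checked, here is a machine-dependent micro-benchmark sketch of the stdlib `json` module (comparing against a third-party encoder such as `orjson` would require installing it, so only the stdlib side is shown; numbers vary by machine and payload):

```python
import json
import timeit

# A synthetic payload purely for illustration.
payload = {f"key{i}": list(range(16)) for i in range(100)}

encode_time = timeit.timeit(lambda: json.dumps(payload), number=1_000)
decode_time = timeit.timeit(lambda: json.loads(json.dumps(payload)), number=1_000)
print(f"dumps: {encode_time:.3f}s  loads+dumps: {decode_time:.3f}s")
```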
Wholeheartedly agree with this, and that’s the goal with base64. We want to start somewhere, to better inform efforts elsewhere. Starting simply and minimally seems like the best path to allowing us to get an understanding without risk.
I think on the contrary we will see the greatest benefits for memory safety from making new code in Rust. A usenix paper found that the majority of vulnerabilities are in new code, as older code is battle-tested. So while I think we should enable replacing C code with Rust code where it makes sense, for example the json module, I think we will still see plenty of wins in new code too.
While most vulnerabilities may be in new code, I doubt they are in new modules. If we add a feature to (say) the JIT, or the JSON module, we have to write new C code.
To interact with the interpreter today, we need to interface with C APIs. Currently the proof of concept binds to the unstable Python APIs as well as internal ones. We definitely want to build safe abstractions over the raw C APIs for common use cases, and those could perhaps use an API like handles. But such an abstraction has to wrap the existing APIs, as HPy does for CPython, so these APIs necessarily need to be exposed to Rust.
To define a module, we need some way of defining a PyModuleDef, unless we want to introduce a new module initialization protocol (which is a large proposal in and of itself). Therefore we need a pointer to a PyModuleDef, whose first member must be PyModuleDef_HEAD_INIT, which internally expands to a structure whose first member is PyObject_HEAD_INIT. So some of this coupling is necessary as part of the module initialization protocol. Changing this protocol could be something we look into, but it seems like its own rabbit hole.
I realize this coupling is less than ideal, but any nicer HPy-like API would likely need to necessarily build on the existing C APIs, and I would like internal functions to be available to Rust modules just as they are to C modules in the standard library.
We absolutely need to write new C code for the JIT, the JSON module, or most other parts of CPython when making changes and adding features. However I think demanding immediate results across the code base from introducing Rust is infeasible. I’m curious as to your thoughts on the rest of my comment explaining why.
My thoughts are that a proposal that allows Rust only for optional extension modules is likely not useful enough to justify the cost of adding a new language. It can be a first step (in fact, it’s a very reasonable first step!), but that means there must be a plan to expand beyond it just being an optional part of building CPython. And that in turn means that we need to deal with the concerns that are being raised around bootstrapping and niche platforms.
There’s been a number of comments about “optional extension modules” in this discussion so far. I want to make sure we’re 100% clear on what we mean by this, as I’m concerned we could end up measuring the wrong thing otherwise.
As an end user, I have literally no interest in what language a stdlib module (or a core feature) is written in. It’s irrelevant to me. However, I have a strong interest in what is available in the stdlib and core. Optional modules are an awkward compromise here - can I use the module safely, or do I need to account for the possibility that it might not be available? This is exacerbated by the fact that the packaging ecosystem has no way to express a dependency on “Python >= 3.13, with the tkinter module available” (for example).
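Today the best a library can do is probe availability at runtime. A minimal sketch of that pattern (`tkinter` here is just one example of an optional module):

```python
import importlib.util

def module_available(name: str) -> bool:
    """Return True if a (possibly optional) top-level module can be imported."""
    return importlib.util.find_spec(name) is not None

# Packaging metadata cannot express "needs tkinter", so projects are left
# with runtime checks like this one.
if not module_available("tkinter"):
    print("tkinter is not available in this build")
```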
If we introduce Rust by using it for stdlib modules, and as a result make them (and anything that depends on them!) optional, then we risk getting negative feedback which will look like it’s a downside of Rust, when it’s actually a downside of optional modules.
I think the intention is not to do this, but rather to use Rust to create accelerators for existing pure-Python modules. But if that is the case, can we be clearer about this, so that people don’t get the wrong impression? Assuming we are talking about accelerators, I feel that @Jelle has a point. Using Rust to get a faster JSON module[1] will be a good way of finding out what’s involved in adding Rust to the core build process, but I don’t think it will provide much useful feedback on whether Rust provides sufficient benefit to be worth taking further.
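For context, the stdlib already has a convention for accelerators (see `heapq` or `json`): define the pure-Python implementation, then shadow it if a compiled module is importable. A Rust accelerator could slot into the same shape. A sketch, where `_double_rs` is a hypothetical compiled module name:

```python
def double(x):
    """Pure-Python reference implementation."""
    return x * 2

# Stdlib accelerator pattern: shadow the Python version with a compiled
# implementation when one is available. `_double_rs` is hypothetical and
# will not exist here, so the pure-Python fallback is used.
try:
    from _double_rs import double  # noqa: F811
except ImportError:
    pass  # keep the pure-Python implementation defined above
```

The nice property of this pattern is that behavior is identical either way; only performance differs.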
Which prompts the question - what would useful feedback look like? And how do we get it? I don’t think that’s been clearly established yet.
I have to say that I don’t think anyone cares about a faster base64 module… ↩︎
Destabilising the existing C API isn’t an option, and providing a Rust abstraction over the unstable APIs doesn’t make them stable - they’re unstable because we want to be able to change them. If we didn’t want that, we’d make them stable or limited APIs.
Presumably you can already use PyO3 with access to unstable APIs (perhaps without them being officially part of PyO3, but CPython isn’t preventing you from using them). If there are other APIs that are not currently public at any stability level that would be useful, we can discuss making them public.
These problems are not good motivations for bringing Rust into core, since we already have the processes to manage them. They are good motivations for contributing to PyO3, which seems to be doing just fine without the restrictions and limitations that it would “enjoy” if it were part of core, and proposing new public APIs to the C API WG (who definitely enjoy dealing with those limitations).
If a good first step to exposing subinterpreters (an existing core feature) was a module on PyPI, then I don’t see why a drop-in replacement for stdlib modules written in Rust can’t also start on PyPI. That way, distributors can immediately choose to include them instead of the core ones, and it’s much easier for core to later adopt an existing library than to approve what is currently a vague notion of “allowing” it.
Regarding optional extension modules, here’s a rough timeline I could see and why each step is chosen to be so.
The goals of this stage are to get the initial build system changes set up, and to start getting experience interfacing with the C code from Rust. Extension modules are a limited interface to the interpreter, which minimizes the interoperability work. Ideally we would also start warning users that we plan to eventually make Rust a hard requirement, and ask them to tell us if this would be a hardship for their environment. Only optional extension modules are allowed. As a further restriction, they should only be introduced where they would replace a C extension module (this could be things like base64 or json); this ensures we are not limiting access to new features to those who have Rust available. I picked the base64 module because it is simple but should exercise enough parts of the build system and C interop code, but we could just as well choose the json module.
The goals of this stage are to make progress on improving portability and bootstrapping workflows based on feedback from stage 1. If we don’t gather feedback in stage 1 we could also start doing so here. More modules may be ported here to get even more experience with the C <—> Rust interfacing and overall developer experience.
The goals of this stage are to ensure that we have resolved bootstrapping issues and to warn of the impending requirement on Rust. Make Rust an opt-out feature of the interpreter; this is an even stronger warning of the upcoming requirement. At this point the vast majority of users should be able to build CPython with Rust enabled, and bootstrapping issues should be resolved to the greatest extent possible. I would say that at this point new extension modules can be written in Rust.
The goal of this stage is to integrate Rust into core parts of CPython. Rust is required to build CPython. Rust can now be used across the CPython code base.
This is one hypothetical scenario. We’re not going to propose this as the specification in the PEP.
There is simply too much to consider in resolving the bootstrapping and portability problems to have a concrete plan now. I think that, like PEP 703, we could leave an open question with some version of the above roadmap, but I don’t think we have enough information at the moment. The only way to tell what the impact of introducing Rust to CPython would be is to introduce Rust to CPython. In the above plan we start very conservatively, to ensure we don’t break anyone’s workflows until we’re confident we can introduce Rust to core. I wouldn’t want to attach hard timelines to any of this, because we don’t yet know when we should move to the next step in the process. So, as with free-threading, I think it would be best to start with step 1, then have a follow-up PEP once we are confident we have the information we need to move to the other steps.
Surely this step was accomplished years ago, with the caveat that the Rust extensions aren’t usually “drop-in” replacements for an existing stdlib module, because that’s rarely what someone wanted to write. I don’t think anyone doubts that such a thing is possible, though?
Write one, prove it’s better, propose stdlib inclusion. It doesn’t even have to be a drop-in replacement, it can be a net-new module (but then the proposal needs to justify adding the module, which might make it too complicated).
This is the standing process for adding something to the stdlib, and all that’s different is it’s the first proposal that would also bring in a new language. But then we are at least debating the merits in the context of something concrete, rather than something that feels very hypothetical.
I think that this is missing the point of this proposal, though. The point is not “we really need a faster base64, and specifically it should be written in Rust”. It’s “we should think about introducing Rust. The minimally invasive way to do so is to start with an optional extension module[1]”.
that is, one that accelerates existing functionality ↩︎
That said, maybe it’s still too early for a PEP, and the real proof of concept is to set up a full build? (edit: the existing repo might be insufficient for that purpose if it doesn’t cover enough ground)
I think I may not have been clear in my earlier message, so apologies if so. I do not intend to destabilize the C API; of course that is a non-starter. And I don’t intend to abstract over any unstable APIs either.
What I imagine is a PyO3-like API abstracting the stable API. Then extension modules in the stdlib can also access the unsafe, unstable C APIs. The Rust FFI bindings will be kept up to date with the unstable C API so there is no issue with adding or removing functions.
I see two reasons:
The main reason subinterpreters started on PyPI, if memory serves, was to figure out the API design. PyO3 already has 8 years of experience for us to <del>steal</del> re-use in refining their APIs. I expect the standard library abstractions over the stable API to be very, very similar to PyO3, except without PyPy support, and perhaps with use of internal APIs.
A critical part of this proposal is understanding the impact of introducing Rust to CPython. We get no such information from a package existing on PyPI.
Great! So then once we have a prototype of the abstraction your concerns will be assuaged?
Absolutely, thank you for bringing this up. If you read over my earlier comment you will see that inclusion of base64 is not even a requirement for the overall goals of the pre-PEP.