Pre-PEP: Rust for CPython

I highly recommend reading the article from the Android Security team that the original post linked. In particular, the article focuses on a “near-miss” where they almost shipped one (1) memory safety vulnerability in Rust:

This near-miss inevitably raises the question: “If Rust can have memory safety vulnerabilities, then what’s the point?”

The point is that the density is drastically lower. So much lower that it represents a major shift in security posture. Based on our near-miss, we can make a conservative estimate. With roughly 5 million lines of Rust in the Android platform and one potential memory safety vulnerability found (and fixed pre-release), our estimated vulnerability density for Rust is 0.2 vuln per 1 million lines (MLOC).

Our historical data for C and C++ shows a density of closer to 1,000 memory safety vulnerabilities per MLOC. Our Rust code is currently tracking at a density orders of magnitude lower: a more than 1000x reduction.
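To make the arithmetic behind the quoted figures explicit, here is a small sketch. The function names are made up for illustration; the inputs are the numbers quoted above.

```rust
/// Vulnerabilities per million lines of code (MLOC).
fn density_per_mloc(vulns: f64, lines_of_code: f64) -> f64 {
    vulns / (lines_of_code / 1_000_000.0)
}

/// How many times lower `new` is than `baseline`.
fn reduction_factor(baseline: f64, new: f64) -> f64 {
    baseline / new
}

fn main() {
    // Quoted above: one near-miss in roughly 5 million lines of Rust.
    let rust = density_per_mloc(1.0, 5_000_000.0);
    // Quoted above: historical C/C++ density, already expressed per MLOC.
    let c_cpp = 1_000.0;

    println!("Rust density: {rust} per MLOC");
    println!("Reduction factor: about {:.0}x", reduction_factor(c_cpp, rust));
}
```

With these inputs the ratio comes out at roughly 5000, which is consistent with the "more than 1000x" wording.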

The primary security concern regarding Rust generally centers on the approximately 4% of code written within unsafe{} blocks. This subset of Rust has fueled significant speculation, misconceptions, and even theories that unsafe Rust might be more buggy than C. Empirical evidence shows this to be quite wrong.

Our data indicates that even a more conservative assumption, that a line of unsafe Rust is as likely to have a bug as a line of C or C++, significantly overestimates the risk of unsafe Rust. We don’t know for sure why this is the case, but there are likely several contributing factors:

  • unsafe{} doesn’t actually disable all or even most of Rust’s safety checks (a common misconception).
  • The practice of encapsulation enables local reasoning about safety invariants.
  • The additional scrutiny that unsafe{} blocks receive.
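As an illustration of the encapsulation point, here is a minimal sketch (the `Prefix` type is hypothetical, not taken from Android or CPython). The invariant is checked once in the constructor, the fields are private, and the single `unsafe` block can be reviewed by looking at this one module. Note that the borrow checker and type checker still apply inside the `unsafe` block; only a few extra operations are unlocked.

```rust
/// A byte buffer with a "logical length" that never exceeds the real one.
pub struct Prefix {
    data: Vec<u8>,
    len: usize, // invariant: len <= data.len()
}

impl Prefix {
    /// Refuses to construct a value that would break the invariant.
    pub fn new(data: Vec<u8>, len: usize) -> Option<Self> {
        if len <= data.len() {
            Some(Prefix { data, len })
        } else {
            None
        }
    }

    /// Safe API: no caller can trigger out-of-bounds access through it.
    pub fn as_slice(&self) -> &[u8] {
        // SAFETY: the fields are private and `new` checked
        // `len <= data.len()`, so the range is in bounds.
        unsafe { self.data.get_unchecked(..self.len) }
    }
}

fn main() {
    let p = Prefix::new(vec![1, 2, 3, 4], 2).expect("len is in bounds");
    println!("{:?}", p.as_slice());
    assert!(Prefix::new(vec![1], 5).is_none());
}
```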

I totally understand the concern, but there are so many big old C(++) projects that have integrated Rust, and “safety of the bindings” is an obvious concern everyone has, and it just… doesn’t end up being that big of a deal in practice?

In specific, practical terms, it’s often reported that Rust makes implicit ownership/lifetime constraints in C APIs explicit and easier to work with. The Rust bindings to C functions can include lifetimes that enforce these contracts (and yes, a lot of C APIs map onto lifetimes and ownership well).
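A minimal sketch of what "lifetimes that enforce these contracts" can look like. Everything here is hypothetical: `Widget` and `widget_name` stand in for a C object and a C accessor whose documentation says the returned pointer is only valid while the object is alive. In a real binding the accessor would be an `extern "C"` declaration rather than a Rust function.

```rust
use std::ffi::{c_char, CStr, CString};

/// Hypothetical stand-in for a C object that owns a NUL-terminated name.
pub struct Widget {
    name: CString,
}

/// Hypothetical stand-in for a C accessor documented as: "the returned
/// pointer is borrowed and stays valid only while the widget is alive".
///
/// # Safety
/// `w` must point to a live `Widget`.
unsafe fn widget_name(w: *const Widget) -> *const c_char {
    // SAFETY: the caller guarantees `w` points to a live `Widget`.
    let widget: &Widget = unsafe { &*w };
    widget.name.as_ptr()
}

impl Widget {
    /// Returns `None` if `name` contains an interior NUL byte.
    pub fn new(name: &str) -> Option<Self> {
        CString::new(name).ok().map(|name| Widget { name })
    }

    /// The documented C contract becomes a type signature: the returned
    /// `&CStr` cannot outlive `self`, and the compiler checks that.
    pub fn name<'a>(&'a self) -> &'a CStr {
        // SAFETY: `self` is a live `Widget`, and the returned string
        // lives as long as `self`, which is exactly the lifetime `'a`.
        unsafe { CStr::from_ptr(widget_name(self)) }
    }
}

fn main() {
    let w = Widget::new("spam").expect("no interior NUL");
    let name = w.name();
    println!("{}", name.to_string_lossy());
    // Moving or dropping `w` while `name` is still in use is a compile error.
}
```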

14 Likes

I’m aware. Happy to chat more with the people interested in driving this forward or evaluating/deciding on it, but there are more than enough opinions here already and I don’t see mine helping drive the discussion forward in any particularly meaningful way (especially given the response to my “keep it contained” concern was “actually, we’re going to make it less contained”).

8 Likes

It’s worth noting that Rust’s compile-time code execution (via build.rs) closely mirrors the style and trust model already assumed by Python packaging (via setup.py). I’m not extremely familiar with the history of build.rs, but it wouldn’t especially surprise me if setup.py (and Gemfile, etc.) were used as a reference point.

(I also think the risk of compile-time code execution in Rust is narrow compared to what already exists: how often do you read the autoconf that your C and C++ dependencies use to codegen shell scripts at build time? We have empirical evidence in the form of xz-utils that attackers find that very appealing!)

12 Likes

Still a few important questions:

  1. Who will lead this large-scale refactoring?
  2. Who will be responsible for designing the overall code architecture?
  3. Who will be making final decisions (for example, how will conflicts with the existing C implementation be resolved)?
  4. How can we ensure a stable core team will be able to contribute continuously to this long-term development effort?
  5. Is there a communication plan to ensure that progress, challenges, and design changes are transparently communicated to the entire community?
  6. How are the project’s key milestones planned? At what point in time or under what circumstances should we reassess the feasibility of these milestones or the overall direction of the project?
2 Likes

I am basically neutral on adopting Rust, and I think Ruby provides a good reference model. They introduced Rust for their JIT implementation as an optional component, and it is worth studying how they approached it (they use only the standard library, except in unit tests, and the unit-test modules are excluded from releases).

However, I am cautious about the idea of fully rewriting the CPython codebase in Rust. We have a lot of low level and very optimized C code that Rust cannot express safely. A good example is the computed goto dispatch in the interpreter, which would require a large amount of unsafe code or inline assembly if we tried to reproduce it in Rust.
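For reference, this is the kind of dispatch loop that safe Rust can express directly today: a single `loop` around a `match`. It is a toy stack machine with made-up opcodes, not CPython's bytecode. Unlike computed goto, where each opcode handler ends with its own indirect jump, every instruction here returns to one central dispatch point, and whether the compiler generates comparably fast code is not guaranteed.

```rust
#[derive(Clone, Copy, Debug)]
enum Op {
    Push(i64),
    Add,
    Mul,
    Halt,
}

/// Runs a tiny stack program. Returns the top of the stack at `Halt`,
/// or `None` on stack underflow or if the program runs off the end.
fn run(program: &[Op]) -> Option<i64> {
    let mut stack: Vec<i64> = Vec::new();
    let mut pc = 0;
    loop {
        // Central dispatch point: bounds-checked fetch, then match.
        match *program.get(pc)? {
            Op::Push(v) => stack.push(v),
            Op::Add => {
                let (b, a) = (stack.pop()?, stack.pop()?);
                stack.push(a.wrapping_add(b));
            }
            Op::Mul => {
                let (b, a) = (stack.pop()?, stack.pop()?);
                stack.push(a.wrapping_mul(b));
            }
            Op::Halt => return stack.pop(),
        }
        pc += 1;
    }
}

fn main() {
    // (2 + 3) * 4
    let program = [Op::Push(2), Op::Push(3), Op::Add, Op::Push(4), Op::Mul, Op::Halt];
    println!("{:?}", run(&program));
}
```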

Platform support is also still a concern, and that’s why people are waiting for gccrs: once it ships, Rust can cover the platforms that gcc and clang cover.

My view is that if we want to move forward, we should begin with an experimental approach. We can start by introducing new, non-essential modules written in Rust and evaluating the results. That feels like a reasonable and safe first step, and it allows us to focus on productivity rather than framing everything around memory safety.

I believe the CPython core team already maintains the C codebase as safely as possible, so while language level safety would certainly be beneficial, the current situation is not one where we are struggling or suffering. From what I understand, the Ruby team adopted Rust mainly because implementing their JIT in Rust was more productive than doing it in C. I think that was one of the major factors behind their choice.

9 Likes

People still remember the huge drama from early 2025, but I think we should pay more attention to what Torvalds and Greg K-H recently said at the Open Source Summit Korea just two weeks ago.

So, so that’s that’s one thing that has changed for me is that I actually feel like sometimes I need to encourage some of the other maintainers to be more open to to new ideas.

If we want to bring up Rust for Linux as an example, we should not only talk about how introducing a new and unfamiliar idea can create conflict among existing maintainers, but also emphasize that leadership plays a role in encouraging the community to embrace such changes.

11 Likes

I will state upfront I support this endeavour. I was thinking of trying this as a retirement project, so I’m glad Emma and Kirill are trying this much sooner than that!

As the current maintainer of PEP 11: it won’t require any changes, and it will naturally be assumed that Rust support is a minimum requirement, just as C11 support via PEP 7 is an implicit requirement.

So there’s the C code that calls into CPython’s APIs and the C code that stays on your side of things. You’re right that when we only talk about extension modules we are still crossing into the unsafe C code of CPython’s internals, but there’s plenty of code that’s just plain C that you could mess up that never crosses the C API barrier. And if Rust makes inroads into CPython internals then the safety benefits start to go deeper.

Emma and Kirill as the PEP authors along with any other core devs and folks who want to get involved and have appropriate Rust experience.

I don’t think these are really pertinent as they are things we deal with everyday already on the core team in general. Even knowing when to assess success will come down to the SC making a call.

I agree, but a “C codebase as safely as possible” is still less safe than a code base in Rust. And now that we have a decade-old systems language that’s safer than C, I think it behooves us to at least try and see if we can make it work.

20 Likes

I agree with everything Brett says above, but also wanted to add that I am going to spend some time over the next few days on community building around Rust in CPython with a goal of kicking off discussions around a lot of the topics brought up here.

7 Likes

I would go further and predict that this will never happen, and claim that it shouldn’t be a goal of the Rust-in-Python project. There are 10**oodles of person-years invested in the CPython core interpreter, and I just don’t see how that will ever be cost effective to rewrite, even as a Ship of Theseus.

But that’s not to say we shouldn’t go forward with this experiment, because we’ll better understand the costs and benefits[1].

One of the soft consequences I’m especially interested in is whether this attracts more or fewer core developers contributing to the Rust bits than the C bits. I don’t remember the numbers, but I vaguely recall some discussion about the number of core devs who are comfortable contributing to the C bits vs the Python bits. The former is surely a smaller number, and my guess is that even fewer are comfortable in Rust today[2].


  1. and better know the unknown unknowns ↩︎

  2. To be clear, I consider everyone’s contributions, regardless of where or what language, to be incredibly valuable and valued ↩︎

16 Likes

I think this is actually a great example where Rust could be a huge improvement over C. There is interest in the Rust community in implementing a safe state machine loop, e.g. this proposal: RFC: Improved State Machine Codegen by folkertdev (Pull Request #3720, rust-lang/rfcs). That proposal may not get into Rust, but given the interest I am sure some safe solution will be implemented eventually. And I would like to reiterate another point: we absolutely should not rewrite things that it doesn’t make sense to rewrite. Rewriting the core interpreter loop itself may not make sense for a while, but that doesn’t mean other important runtime pieces can’t be Rust, like thread state management, the parser, and others.

Well then you’ll be really glad with this blurb I was writing in response to Donghee’s post :laughing:. Speaking for myself here:

The goal of this project should not specifically be to re-write CPython in Rust, but rather iteratively move more C code to Rust over time and reap the benefits for those portions of code. This may end up meaning Python becomes entirely Rust! But I don’t think that will necessarily be the end goal.

I’m also quite interested in this! Given that we have seen people shifting to Rust for new 3rd-party extension modules I really hope that will translate into more contributors.

3 Likes

One thing I do think is worth saying: The highest ROI pieces are going to be modules and perhaps builtin types/functions, which can be implemented entirely in safe Rust with ergonomic, high-level APIs, reaping performance and developer experience/velocity wins.

Something like the GC is at the absolute nadir of the value of Rust: it inevitably requires a decent amount of unsafe and won’t benefit from Rust’s other advantages.

And then many other things (e.g., the parser or interpreter loop) are likely to be in the middle in terms of where I’d estimate the ROI is.

11 Likes

Just to clarify: I love using Rust, and I’m one of the people interested in bringing Rust into CPython. I’ve talked about this topic in the context of JIT because of its practical advantages.

@emmatyping What I’m curious about is this: in the current PoC, most of the code uses unsafe blocks. I understand this isn’t Rust’s fault but rather a limitation of the CPython API. Still, how do these modules become memory-safe in the PoC, and how much less do we need to worry about memory safety compared to writing the same code in C? For example, if you could say something like “X% of the code in the base64 module becomes memory-safe,” that would be a helpful metric to highlight.

Also, do you have any plans to remove the unsafe blocks in modules like base64? If so, could you include that plan in the PEP?

Another thing: could you compare build times and performance between the C version (with PGO + LTO) and the Rust build? I think that would make the PEP much more balanced and fair for reviewers.

2 Likes

Based on my recent experience in adding hardware prefetch to the free-threaded GC, I feel like writing in Rust could have provided some good benefit. For example, implementing the gc_span_stack_t data structure and associated methods would have been easier to write and to review. I would expect that it also would have prevented this memory leak bug in that code.
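A rough sketch of what a span stack could look like in Rust. This is not a port of gc_span_stack_t: the field names and the use of plain indices instead of object pointers are assumptions made for illustration. The relevant property is that the `Vec` owns its allocation and releases it when the stack is dropped, on every exit path, so the kind of leak where an early return skips a free does not arise.

```rust
/// A half-open range `[start, end)` of slots still to be visited.
/// (Hypothetical: indices stand in for object pointers.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

/// Hypothetical growable stack of spans.
#[derive(Debug, Default)]
pub struct SpanStack {
    spans: Vec<Span>,
}

impl SpanStack {
    pub fn new() -> Self {
        Self::default()
    }

    /// Pushes a span; empty spans are skipped.
    pub fn push(&mut self, start: usize, end: usize) {
        if start < end {
            self.spans.push(Span { start, end });
        }
    }

    /// Pops the most recently pushed span, if any.
    pub fn pop(&mut self) -> Option<Span> {
        self.spans.pop()
    }

    pub fn is_empty(&self) -> bool {
        self.spans.is_empty()
    }
}

fn main() {
    let mut stack = SpanStack::new();
    stack.push(0, 8);
    stack.push(8, 8); // empty, ignored
    stack.push(16, 32);
    while let Some(span) = stack.pop() {
        println!("visit {}..{}", span.start, span.end);
    }
    // `stack` is dropped here and its buffer is freed automatically.
}
```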

8 Likes

I think this section covers that:

The current implementation is really a proof of concept, so there are a lot of places it could improve. It started with me wanting to see how hard it would be to integrate Rust with cargo into our existing build system. A Rust _base64 module in CPython will, I hope, look a bit different from the current proof of concept. I expect extension modules will likely be able to be 80% safe hand-written code or more.

That being said, building out the safe abstractions will take time and effort to do properly. So I think I would say, if you want to see something like where I hope to end up, look at PyO3, where the vast majority of hand written code is safe.

@Eclips4 found that his hand-rolled implementation that does not use any SIMD is about 1.6x faster than the _binascii implementation in use today. It’s hard to make a “fair” compilation speed benchmark because there are many variables that can come into play and knobs that can be tuned. The added Rust code will also necessarily add more compile time because we aren’t removing code by introducing _base64.
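For readers who have not seen one, here is what a scalar (non-SIMD) base64 encoder can look like in entirely safe Rust. This is a generic textbook sketch of the standard alphabet with `=` padding, not @Eclips4's implementation and not the PoC's code, and it makes no performance claim.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Encodes `input` as standard base64 with `=` padding.
pub fn encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into one 24-bit group (missing bytes are 0).
        let b0 = chunk[0] as u32;
        let b1 = chunk.get(1).copied().unwrap_or(0) as u32;
        let b2 = chunk.get(2).copied().unwrap_or(0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;

        // Emit four 6-bit symbols, replacing unused ones with padding.
        out.push(ALPHABET[(n >> 18) as usize & 63] as char);
        out.push(ALPHABET[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 {
            ALPHABET[(n >> 6) as usize & 63] as char
        } else {
            '='
        });
        out.push(if chunk.len() > 2 {
            ALPHABET[n as usize & 63] as char
        } else {
            '='
        });
    }
    out
}

fn main() {
    println!("{}", encode(b"foobar"));
}
```

All indexing is bounds-checked and there is no `unsafe` anywhere, which is the kind of code the "80% safe" estimate above is about.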

3 Likes

Will the binaries written in Rust be able to share a single copy of dependencies and/or the Rust runtime? My recollection is that a stable ABI in Rust is a forgotten dream and that each library or executable ends up with a separate copy of all its dependencies plus a big fat core runtime lumped into it, turning a network of little libraries[1] into a network of bloatware. But it’s been a long time since my last (unsuccessful) attempt to get into Rust, so that might no longer be true (assuming that it was ever true).


  1. or extension modules in CPython’s case ↩︎

4 Likes

Rust has no problem using the C ABI; from experience, the norm when integrating Rust into existing C codebases is to retain existing ABI boundaries and perform dynamic linking in the same ways that the codebase would normally.

(There’s a separate issue, which is that fully separate Rust builds tend to prefer static linkage because there’s no stable Rust ABI. But the integration efforts of Chrome, Firefox, etc. are good examples of integration of Rust components into projects that assume the C/C++ ABIs.)

5 Likes

Hi, I’m one of the RustPython developers, and during work hours I maintain a tightly-coupled C++/Rust project of about 200k lines.

I’d like to comment on some of the points raised in the post and the thread. I’m still getting used to Discourse, so please excuse the missing quotes.

Questions about RustPython

RustPython isn’t something that can be considered in this PEP in the short term. RustPython and CPython are not semantically compatible across many layers of their implementation.
That said, RustPython has a number of pure-Rust libraries backing well-functioning Python stdlib modules, which could serve as a reference when introducing Rust versions of certain libraries. I don’t believe it is directly related to this PEP.

RustPython has its own approach to running without a GIL, but it’s not compatible with CPython’s nogil direction.

If there’s one aspect of RustPython worth highlighting in this PEP, it’s that it has achieved a surprising amount of CPython compatibility with a very small number of contributors. I rarely contribute directly to CPython’s C code, but I’m very familiar with reading it. After implementing equivalent features in RustPython, the resulting Rust code is usually much smaller, with no RC boilerplate, and error handling is much clearer.

bindgen

I think there must be good guidelines on how bindgen should be used. bindgen generates both data structure definitions and function bindings. Function bindings are usually reliable, but data structure definitions often are not. If we rely on bindgen for those, we must run the generated tests to verify compatibility.

In base64, the code currently uses a direct definition of PyModuleDef. To be safe, either:

  1. verify struct size via tests, or

  2. let C create the struct and only access it through FFI.

As far as I can tell, cargo test for cpython_sys currently doesn’t run the generated tests (I might have missed something).
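A sketch of what option 1 could look like. `ExampleDef` is a hypothetical struct, not the real PyModuleDef, and the two `C_*` constants stand in for values that a build step would obtain from the C compiler (`sizeof` / `_Alignof`); here they are derived from pointer width only so that the sketch is self-contained.

```rust
use std::ffi::c_char;
use std::mem::{align_of, size_of};

/// Hypothetical hand-written mirror of a C struct:
///     struct example_def { const char *name; ssize_t size; uint32_t flags; };
#[repr(C)]
pub struct ExampleDef {
    pub name: *const c_char,
    pub size: isize,
    pub flags: u32,
}

// Assumption: in a real build these would come from the C side.
const C_SIZEOF_EXAMPLE_DEF: usize = 3 * size_of::<usize>();
const C_ALIGNOF_EXAMPLE_DEF: usize = align_of::<usize>();

// Compile-time check: a layout mismatch becomes a build error
// instead of silent memory corruption at runtime.
const _: () = assert!(size_of::<ExampleDef>() == C_SIZEOF_EXAMPLE_DEF);
const _: () = assert!(align_of::<ExampleDef>() == C_ALIGNOF_EXAMPLE_DEF);

/// The same check in a form that a `#[test]` function could call.
pub fn layout_matches() -> bool {
    size_of::<ExampleDef>() == C_SIZEOF_EXAMPLE_DEF
        && align_of::<ExampleDef>() == C_ALIGNOF_EXAMPLE_DEF
}

fn main() {
    println!("layout matches: {}", layout_matches());
}
```

Size and alignment checks do not catch reordered fields of equal size, so field offsets would be worth checking as well.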

I’m not saying this PEP must adopt the following idea, but from experience, defining data structures on the Rust side and generating C headers with cbindgen can be safer than generating Rust code with bindgen. That said, while rust-in-cpython focuses on writing stdlib modules, where C doesn’t need to call Rust, there may be limited motivation to use cbindgen.

This perspective comes from my experience with mixed C++/Rust projects. CPython being a C/Rust project may lead to fewer issues.

clinic

All Python functions will end up exposed as extern "C". For now, I’d actually suggest considering cbindgen for this:
Each module could run cbindgen to produce a C header including all FFI functions with their original comments. Then maybe the clinic tooling could operate directly on those headers without major changes? I’m not totally sure, since I don’t fully understand clinic, but it seems it could require less modification than the other two suggested methods.

When Rust penetrates deeper than the module boundary and this approach breaks down, we’ll have better insight for future decisions anyway.

ABI

I’m not sure how far the Rust implementation will expand, but compared to Pants, CPython’s requirements seem much simpler. If we connect this with the clinic/cbindgen idea, we could enforce a policy that every exported symbol must be declared in a properly generated C header. Since the only stable ABI in Rust is the C ABI, having headers fully specify the exported interface remains reasonable until Rust APIs are officially exposed to users.
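To show the shape of such an exported symbol, here is a minimal sketch. `example_add` is a made-up function; the header line in the comment is what a generated C header would be expected to carry for it. A real export would also need the no-mangle attribute so the symbol keeps its name, which is left out here to keep the sketch independent of the Rust edition.

```rust
use std::ffi::c_int;

/// Matching C declaration, as a generated header might state it:
///     int example_add(int a, int b);
///
/// Only C-compatible types appear in the signature, so the header
/// fully describes how to call this function.
pub extern "C" fn example_add(a: c_int, b: c_int) -> c_int {
    a.wrapping_add(b)
}

fn main() {
    // The function can be stored as a C-ABI function pointer, which is
    // how it would typically end up in a method table.
    let f: extern "C" fn(c_int, c_int) -> c_int = example_add;
    println!("{}", f(2, 3));
}
```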

Build time

Ideally, Rust debug builds shouldn’t be too slow. But many Rust libraries lean heavily on proc-macros, which can significantly impact build times. For example, RustPython has far less code and functionality than CPython, yet it takes ~5× longer to build, and the gap is even bigger for incremental builds.

If build time is a major concern, guidelines that limit unnecessary proc-macro usage may help. Also, on the external tooling side, we can hope LLVM might support faster Rust debug builds later, since Python is a priority project for the LLVM project.

I don’t worry about generics in rust-in-cpython. Unlike RustPython, rust-in-cpython must generate a C interface, which discourages abusing generics.

From a build-time perspective, keeping one crate per module, as _base64 does now, is very appealing.

Using unsafe

In my opinion, completely eliminating unsafe from base64 isn’t the right goal.

Rust guarantees that code outside an unsafe {} block is safe. Anything the compiler cannot verify must be wrapped in unsafe {}. Wrapping unsafe internals in a “safe” API means the programmer is manually guaranteeing safety.

Some guarantees can be established through review and careful implementation, but FFI safety often cannot be fully guaranteed due to inherent interface limitations. If we hide unsafe behind safe APIs even where true safety can’t be guaranteed, then we lose track of which code must be treated with caution.

So instead of trying too hard to remove unsafe, it’s better to encourage properly marking the code that is actually unsafe and minimizing it where possible.
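A small sketch of that distinction, with made-up function names. `sum_raw` takes a raw pointer whose validity the compiler cannot check, so it stays an `unsafe fn` with a documented contract instead of being dressed up as safe. `sum` takes a slice, which the compiler can check, so it is safe.

```rust
/// Sums `len` bytes starting at `ptr`.
///
/// # Safety
/// `ptr` must be non-null and point to `len` initialized bytes that
/// remain valid and unmodified for the duration of the call.
pub unsafe fn sum_raw(ptr: *const u8, len: usize) -> u64 {
    // SAFETY: guaranteed by the caller, per the contract above.
    let bytes = unsafe { std::slice::from_raw_parts(ptr, len) };
    sum(bytes)
}

/// Safe counterpart: the slice type already carries the guarantees.
pub fn sum(bytes: &[u8]) -> u64 {
    bytes.iter().map(|&b| u64::from(b)).sum()
}

fn main() {
    let data = [1u8, 2, 3];
    // SAFETY: pointer and length come from the same live array.
    let total = unsafe { sum_raw(data.as_ptr(), data.len()) };
    println!("{total} {}", sum(&data));
}
```

The `unsafe` at the call site is the marker being argued for: it shows a reviewer exactly where a contract is being trusted rather than checked.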

Rust benefits vs. FFI cost

Rust reduces memory-related bugs, but across FFI boundaries, things can actually become less safe than using a single C compiler. The more FFI boundaries exist, the more type information is lost, and the more binding risk increases.

Usually, early Rust adoption increases the FFI surface area: it reduces problems in the Rust codebase but creates new problems at the same time. Then over time, as Rust takes over more internals, the boundary shrinks and things feel cleaner again.

From that perspective, starting with modules is a positive direction: a lot of code, limited boundaries.

Questions

Shipping strategy: Will the Rust extensions only support the nogil build? If so, that might help reduce some FFI complexity.

Duplication:

Python currently ships duplicate C and Python implementations for some modules. If this PEP considers moving some stdlib pieces to Rust, could Rust implementations also coexist as duplicates? If so, a guideline on having different implementations of the same feature would be great. Having separate module paths and build flags would allow experimentation, and then flipping Rust on by default once stable. It would work like a sort of feature-level incubator. If possible, I’d love to see this code used: GitHub - RustPython/pymath (While working on it, I learned how dealing with FMA is way nicer in Rust than in C. Thanks tim-one.)

If this proposal moves forward, I’m ready to dedicate a significant portion of my 2026 open-source time to it. As mentioned, I’m experienced with large-scale Rust FFI using bindgen, and I’m fairly familiar with Python internals as well. Please feel free to poke me if I can help.

Finally, I’m genuinely impressed that the CPython community is open to such a bold direction. I’m curious to see how this proposal plays out, and I’ll be following this thread with great interest. Cheers!

38 Likes

Hi,

Can you expand on this with some specific examples? I can only interpret this as a suggestion that Python releases would include uv and ruff, and it doesn’t make much sense to me.
These projects already exist, their adoption does not depend on CPython using Rust itself.
And formatters and linters are third-party projects, not developed by python-dev, and chosen by developers.
One thing with special status is pip, included via ensurepip, to solve the packaging bootstrapping issue.

2 Likes

This is great to hear, and we’ll definitely note this in the PEP!

I agree adding tests for the struct size is a good idea. And I’ll add a comment to get the bindgen tests working on the PR. Thanks for the feedback!

I expect this is a non-starter as the C API is the source of truth and will likely remain so - maybe indefinitely.

This is definitely an interesting approach! I will experiment with it and see how that goes.

Yeah, I expect this will need to be the case, especially since, as mentioned above, the C API is considered the source of truth.

Yes I think this has a few benefits, such as faster compile times and modularization.

Absolutely agree here. We probably won’t be able to make everything safe, but being principled about how we interact with unsafe will help significantly.

I was discussing this with Kirill and we’re thinking Rust modules should be required to support free-threading and sub-interpreters from the start. I don’t think we will have too much difficulty supporting the regular builds if we already support free-threaded.

I probably would say the Rust implementation should replace the C implementation, as having 3 implementations is rather a lot. But I’d be open to considering the path you propose. I think we’d need good motivation that people will use the in-incubation Rust versions if we were to consider that plan.

That’s fantastic to hear! I’ll definitely follow up about that.

5 Likes

Yeah on a re-read I think my earlier comment misinterpreted this message as discussing clippy and rustfmt (Rust tools). So I would say adding Python tooling that is written in Rust to CPython is out of scope for this proposal.

3 Likes