Pre-PEP: Rust for CPython

I’m a firm -1 on proceeding in this direction. The reference implementation CPython is called CPython for a reason, after all, :slightly_smiling_face:

By adding additional requirements, we make CPython less portable, make maintenance a lot harder, and complicate adoption in spaces where you need to recompile the whole package for other platforms such as WASM.

Besides, there is already the RustPython effort (GitHub - RustPython/RustPython: A Python Interpreter written in Rust). I’m sure they’d love to get more support.

If you want to use Rust for writing optional extensions, that’s perfectly fine, but please upload them to PyPI instead of requiring Rust in the CPython core.

26 Likes

Members of both the RustPython community and PyO3 have already expressed their interest in this approach, as pointed out in previous replies on this thread.

And conversely, would an ASAN/UBSAN build of CPython be able to see/instrument the Rust parts? Otherwise, not seeing the full program execution could impair the ability of the instrumentation to find bugs at runtime.

1 Like

Anecdata: I don’t plan on writing any (more) C code, so outside of all the non-coding ways that exist, I don’t see myself ever meaningfully becoming a contributor to CPython (‘s core).

On the other hand, I can’t write enough Rust to scratch the itch.

And I don’t think I’m particularly unique or special here :wink:

9 Likes

ASAN can definitely work, with the caveat that this is a nightly Rust feature at present (see “sanitizer” in The Rust Unstable Book).

UBSAN: I’m less sure. I suspect that the C parts would be instrumented and the Rust parts would not, but I would think this would not impact getting meaningful value from the sanitizer.

4 Likes

Hi from the Rust Compiler Team!

As someone who programmed in Python for several years previously, I’m really excited to see this!

I haven’t read the entire thread, but I have a minor concern that I previously raised personally with Kirill and wanted to bring here as well. After reading the PEP, one question remains regarding the specific version of Rust that will be used in Python. While Rust maintains excellent stability for the vast majority of users, large foundational projects like CPython often benefit from more conservative versioning strategies. Following approaches used by other large projects, would it make sense to pin specific Rust versions and update deliberately?

Additionally, the policy around nightly features isn’t entirely clear. These often contain some quality-of-life improvements that might assist development. Within the Rust compiler itself, we regularly rely on many nightly features, so their treatment remains an open question from the PEP.

These are the main points that I feel still need clarification after reading the proposal. Thank you for your tremendous work on integrating Rust into Python – it will be very exciting to watch this progress!

12 Likes

Thank you, and I am grateful for your help whenever we run into specific problems with Rust. Unfortunately, here the problem is Rust itself — for platforms it doesn’t support, all we can do is either drop the package from that platform (which generally means also dropping all the packages that require it) or remove the Rust dependency somehow. For the latter, it often means disabling tests (which is far from optimal, but there’s at least some hope that testing on other platforms will suffice for pure Python packages), and lately replacing uv-build with a pure Python build system (say, when cachecontrol started using it, given it’s required by pip and poetry).

5 Likes

Thank you Emma and Kirill for taking this on.

The kudos you deserve is beyond what can be expressed in words.

While I love reading the virtues of Rust extolled… there are perhaps some areas that the pre-PEP should address that got glossed over.

Rust’s approach to memory safety in multithreaded programs is very different from Python’s. In fact, I don’t think it can be used out of the box. Please make a plan or a PoC and show otherwise. Or set out an educated set of guard rails.

Looking at the sample module, this stood out to me:

#[inline]
fn encoded_output_len(input_len: usize) -> Option<usize> {
    input_len
        .checked_add(2)
        .map(|n| n / 3)
        .and_then(|blocks| blocks.checked_mul(4))
}

This is just Rust for the sake of Rust. A safe C equivalent would be two lines long. The moral is that not all valid Rust code belongs in CPython: just as with PEP 7, there needs to be a spec about which Rust features and idioms to use and which not to.

4 Likes

From subinterpreters, yes I agree there are differences. From freethreaded Python, it has so far felt very similar to me (atomic datatypes, locks etc).
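To make the comparison concrete, here is a minimal sketch of the primitives mentioned above (atomic datatypes and locks), using only the Rust standard library. It is not CPython or PyO3 code; `run_workers` is a hypothetical helper made up for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical helper: spawn `workers` threads that each bump a shared
/// atomic counter and append to a mutex-protected list `per_worker` times.
/// Returns (final counter value, number of logged entries).
fn run_workers(workers: usize, per_worker: usize) -> (usize, usize) {
    let counter = Arc::new(AtomicUsize::new(0));
    let log = Arc::new(Mutex::new(Vec::new()));

    let handles: Vec<_> = (0..workers)
        .map(|id| {
            let counter = Arc::clone(&counter);
            let log = Arc::clone(&log);
            thread::spawn(move || {
                for _ in 0..per_worker {
                    // Atomic increment: no lock needed for a plain counter.
                    counter.fetch_add(1, Ordering::Relaxed);
                    // The Vec is only reachable through the lock guard.
                    log.lock().unwrap().push(id);
                }
            })
        })
        .collect();

    // Joining each thread makes its writes visible to this thread.
    for handle in handles {
        handle.join().unwrap();
    }

    let logged = log.lock().unwrap().len();
    (counter.load(Ordering::Relaxed), logged)
}

fn main() {
    let (count, logged) = run_workers(4, 1000);
    println!("count = {count}, logged = {logged}");
}
```

The point of the sketch is only that shared mutable state has to go through an atomic or a lock to compile at all, which is close to the discipline free-threaded C code already needs.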

This code could be written in a one liner if really wanted, I wouldn’t pick at LOC as a relevant metric. Some Rust code is more verbose than C because it encourages checking, some Rust code is less verbose because it (e.g.) handles RAII for you.

#[inline]
fn encoded_output_len(input_len: usize) -> Option<usize> {
    (input_len.checked_add(2)? / 3).checked_mul(4)
}

2 Likes

clippy and rustfmt are fantastic tools (configurable) that enable a common standard of Rust to be used widely across the ecosystem with specific tailoring possible. I would think these will be great (possibly sufficient) starting points.

1 Like

FWIW, I’ve tried pretty hard, but I can’t find a way to write a (readable) 2-line version of this function in C that retains the overflow checking. (And any C version I do either relies on a magic sentinel like -1 for a return value or an out param, which is obviously more challenging for the caller).
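For reference, a small sketch comparing the two Rust versions quoted earlier in the thread and showing what the caller sees at the overflow boundary. The name `encoded_output_len_short` and the `main` driver are mine, added only to have both versions side by side.

```rust
/// Version quoted earlier in the thread (combinator style).
fn encoded_output_len(input_len: usize) -> Option<usize> {
    input_len
        .checked_add(2)
        .map(|n| n / 3)
        .and_then(|blocks| blocks.checked_mul(4))
}

/// One-line version quoted earlier in the thread (`?` operator).
/// The `_short` suffix is a hypothetical name used here for comparison.
fn encoded_output_len_short(input_len: usize) -> Option<usize> {
    (input_len.checked_add(2)? / 3).checked_mul(4)
}

fn main() {
    // usize::MAX overflows in the addition; usize::MAX - 2 survives the
    // addition but overflows in the multiplication.
    for &len in &[0usize, 1, 3, 4, usize::MAX - 2, usize::MAX] {
        let long = encoded_output_len(len);
        let short = encoded_output_len_short(len);
        assert_eq!(long, short);
        match long {
            Some(n) => println!("{len} input bytes -> {n} output bytes"),
            None => println!("{len} input bytes -> overflow"),
        }
    }
}
```

The caller has to handle the `None` case explicitly before it can use the length, which is the part that a sentinel value or out param in C leaves to convention.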

I think this is a good example of a dynamic with Rust: it definitely forces you to front load a lot of work. It’s more annoying for building POCs and playing with ideas. The trade-off is you get way less debugging and vulnerabilities on the back side.

7 Likes

It is very disappointing to see Rust being pushed into Python itself. That will break Python for all platforms where Rust is broken, which will hit users badly, since a lot of apps rely on Python. (And it will generally be a regression compared to the C implementation.)

Using it optionally, like Ruby does, is fine. I honestly hope it does not become obligatory.

6 Likes

To provide some numbers on Rust’s build performance: today, I can build the whole Rust compiler (600 kLOC) plus its ~200 dependencies (a couple more hundred kLOC) on my Zen3 16 core (8C+8HT) laptop in ~50s from scratch, in release mode with optimizations, with incremental rebuilds taking 5-20s (depending on how deep I modify something in the dependency tree).

While that is still slower than rebuilding CPython, especially for incremental builds, I don’t think that the initiative mentioned in this PEP would run into Rust build time performance issues soon, unless you somehow manage to write (or depend on) hundreds of thousands of lines of Rust code very quickly.

5 Likes

Rust’s memory safety guarantees have been formally proven by the RustBelt project for code that does not use “unsafe”.

There are CVEs in safe Rust.

4 Likes

Well, there are CVEs in pure Python too, that doesn’t mean that Python and C are equivalent when it comes to avoiding security vulnerabilities.

18 Likes

I think this is a great idea! So much so I finally signed onto the Discourse server to endorse it. Will follow developments with much interest.

3 Likes

No, but it does undercut the idea that the memory safety guarantees have been “formally proven” when there are longstanding known direct counterexamples.

For what it’s worth, Miri does catch that issue.

3 Likes

Currently, it is trivial to build Python on a computer that isn’t connected to the internet. IMO this must continue to be the case; it’s really important to many users in restricted environments.

20 Likes

I have too many concerns about the use of Python in various bootstrapping to be in favor of this currently.

I agree with the overall goal of increasing memory safety and making it easier to write code people can be confident in by default, and I like Rust for this, but I don’t see this as the right move without more supporting pieces that just aren’t there yet when considering how Python is used in the world.

It seems more advantageous to focus on which modules have both C and Python implementations that would highly benefit from the guarantees afforded. This also seems to have cleaner boundaries on a technical level, and doesn’t force people to evaluate Rust adoption as an all-or-nothing roadmap to be committed to before it is proven to work within CPython’s core development, and before seeing actual impact of even that smaller transition.

It’s also worth pointing out that there are options other than Rust which have stronger formal guarantees than C (some more than Rust), and which don’t require a Rust toolchain. Python is already using HACL* (GitHub - hacl-star/hacl-star: HACL*, a formally verified cryptographic library written in F*) for various cryptography functions, and getting more from doing so than it would have had a Rust implementation been chosen:

The code for all of these algorithms is formally verified using the F* verification framework for memory safety, functional correctness, and secret independence (resistance to some types of timing side-channels).

While Rust is certainly more popular than a purpose-chosen subset of F*[1], it serves as a point that it is possible to get the level of additional compiler-enforced safety that’s desired without compromising on the existing portability of CPython.

As CPython doesn’t support these unsupported triples either, Rust stabilizing user-provided JSON targets brings it to effective parity: “You’re on your own, but the build tools required have a stable way of doing it.”

I also want to be crystal clear, I don’t think it’s even remotely feasible to say “Rust has to support all target triples that have ever used or ever will use python.” There’s a limited amount of maintainer bandwidth in every project, and some hardware just isn’t being developed for by the core teams. It’s niche.


A probably less important issue, but one that I think hasn’t been mentioned directly[2], is that rust and rust-analyzer both use significantly more memory than existing tooling for C. I don’t think it’s an amount likely to be a significant contribution barrier, and don’t personally count this against the proposal, but would like to make sure all known impacts are considered.


  1. Low* ↩︎

  2. Compile times were mentioned, but there are workflows that avoid the brunt of this. ↩︎

16 Likes

Given that Rust isn’t standardised like C and C++ (ISO/IEC 9899, ISO/IEC 14882), isn’t this premature?

2 Likes