Whenever I see arguments like this, I get the impression that the people making such an argument forget what “volunteer” means.
If they’re not getting paid to work on Python, then it’s entirely possible that it’s not “Rust vs. C”, but “Rust vs. wear out and stop contributing”.
I know that situation is why I’m rewriting my Python projects in Rust. Even without memory safety on the line and even with strict-mode MyPy, the constant vigilance of a dynamic language with ubiquitous NULL/None/nil and exceptions introducing hidden return paths all over the place wears on me.
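As a rough illustration of what "hidden return paths" means here, consider the following minimal sketch. The function names and the `users` data shape are made up for the example. The first version has three exits that appear nowhere in its signature; the second spells every one of them out, which is closer to what a Rust `Option` return type forces you to do.

```python
from typing import Optional

# Hypothetical data shape, used only for this illustration.
Users = dict[str, dict[str, str]]


def email_domain(users: Users, name: str) -> str:
    """Looks like it always returns a str, but has three hidden exits."""
    user = users.get(name)       # silently None when the name is unknown
    email = user["email"]        # TypeError if user is None, KeyError if no "email"
    return email.split("@")[1]   # IndexError if the address has no "@"


def email_domain_explicit(users: Users, name: str) -> Optional[str]:
    """Same lookup, with every failure path visible in the code and the type."""
    user = users.get(name)
    if user is None:
        return None
    email = user.get("email")
    if email is None or "@" not in email:
        return None
    return email.split("@", 1)[1]
```

A type checker can flag the `None` case in the first version, but the `KeyError` and `IndexError` paths stay invisible to it, and that is the part that demands the constant vigilance.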
But at any rate, the pre-PEP has already been discussed exhaustively; it was just my sense of humour that found this hilarious in the context of a discussion where many of the points about integrating Rust were about memory safety.
This is a good point that doesn’t seem to be getting enough attention. I can’t speak for any of the core devs that routinely work on the C parts of the codebase, but I can say that personally, there were a number of improvements that I would have loved to make to the old py launcher, which I gave up on because I couldn’t face the manual memory management, and pointer manipulation, involved in string processing in C. If the launcher had been written in Rust[1], then I could have used Rust’s standard string type, and as a result I would have been motivated to work on those improvements.
So yes, a significant benefit of using Rust, even just for isolated parts of the Python stdlib, is that it could dramatically reduce the risk of developer burnout. Of course, we have to take care not to have the transition process burn people out as well, but I’m in favour of doing extra work to manage a short-term transition exercise in order to set up a better long term foundation.
An entirely reasonable possibility: it was self-contained, and its build process is isolated from the core build. ↩︎
I’d say that this is misleading. The Cloudflare outage actually has nothing to do with the memory safety guarantees of Rust or with inappropriately managing memory in general.
I’m a huge fan of this proposal. Thanks everyone for the effort. I volunteer to support the endeavor in any way needed.
The technical details can be discussed in due time, but one thing I’d love is for CPython API to define things around ownership and thread safety more formally. Right now this is delegated to documentation, but both C users and foreign language binding implementers (e.g. Rust and C++) would benefit if they could verify that a certain reference is meant to be owned or borrowed and so on.
I understand there are challenges on both the technical and the social side. I wish for both aspects to be taken seriously (and I renew my offer to help). It’s not the first time that adoption of Rust in a big project brings up some strong[1] resistance. I feel that there is space to adequately understand needs and worries, and to empower the people who have some stake in this change to influence the process.
For example, some people brought up a worry that Rust is chosen because it’s a trendy toy rather than an actual solution. It’s good to spend time identifying why this is considered a problem, why it is a worry, and what can be done to address it. In part this is being done now (thanks to everyone who took time to answer) by reassuring that there is technical merit and proper research into the solution. But maybe this is not the whole argument and people would like to hear something closer to their worry. In other contexts I’ve seen a resistance to Rust adoption as a sort of threat to job stability (growing as a proficient C programmer takes a huge investment and attention to a number of issues, and something like Rust seems to promise to automate away part of your skill set), and maybe that also needs to be treated with care and empathy.
And, on the other side, people wanting to see Rust introduced have their needs (that might not be entirely obvious besides the technical value) and might be faced with walls and gatekeeping and, as it happened in other projects, could be discouraged or burn out when they don’t see those needs understood, or their good will acknowledged.
So, please, let’s not disregard that this is a socially loaded topic, and people are already reacting in many ways. I hope that the discussion will be smooth but I will not bet on it.
and, sometimes, unexpectedly violent or vicious. Fortunately nothing of that sort is visible on this thread. ↩︎
To elaborate on “nothing to do with […] Rust”, the outage was down to:
We wrote a recipient service which preallocates memory as an optimization.
Since we wrote it to ingest data from one of our other services, we made the assumption the data would always be well formed and had it preallocate 3⅓ times as much space as we’re currently using.
We committed a bug to the sender service which caused it to produce data bundles that blew past that 3⅓ times safety margin.
It’s a logic bug. If they’d written it in C or C++ it would have been an ASSERT or SIGSEGV and, in Rust, it was doing the equivalent of dying with an uncaught AssertionError.
(Basically, a violation of “Assume ‘unreachable’ never is”.)
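To make the failure mode concrete, here is a minimal sketch of the pattern described above. It is not Cloudflare's code: `FeatureTable`, `load_trusting`, `load_checked` and the sizes are hypothetical, and Python stands in for whatever language the real services use. The point is only that the bug lives in the assumption about the input, not in how memory is managed.

```python
class FeatureTable:
    """A table whose capacity is fixed up front as an optimisation."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.rows: list[str] = []

    def load_trusting(self, rows: list[str]) -> None:
        """Assumes the sender never exceeds capacity; dies if it does.

        This mirrors the "uncaught AssertionError" behaviour: the invariant
        is checked, but a violation takes the whole process down.
        """
        if len(rows) > self.capacity:
            raise AssertionError("input exceeds preallocated capacity")
        self.rows = list(rows)

    def load_checked(self, rows: list[str]) -> bool:
        """Treats oversized input as an expected error and keeps the old data."""
        if len(rows) > self.capacity:
            return False
        self.rows = list(rows)
        return True


# Illustrative numbers only: 60 rows in use, capacity of 200 (3⅓ times 60).
table = FeatureTable(capacity=200)
table.load_checked(["feature"] * 60)      # accepted
table.load_checked(["feature"] * 201)     # rejected, previous 60 rows kept
```

Either variant is memory safe. The difference is purely whether "the sender misbehaved" was treated as unreachable or as a case to handle.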
I think that in Linux they have seen a number of improvements by better defining the ownership of various parts of the kernel, so this would be very nice. I think it may be a while before it is feasible, but we shall see!
Very important to remember! Going into working on this proposal I was trying to be very cognizant that this topic can engender strong feelings. I think overall I have been pleased with how the discussions have gone so far. I think so far folks have demonstrated that they can engage on this topic in a level-headed and cordial manner. I hope that trend continues. I also hope it has been clear in my own communications that I hope to understand and care about the views of people who may disagree with the proposal. I do not want to drive away current contributors. I will also re-iterate that if anyone would like to chat about concerns regarding the proposal not in a discussion thread, please feel free to DM me!
I don’t think so, for a couple of reasons. First, RustPython and CPython are semantically different enough that I think merging the semantics would be a significant undertaking:
Second, introducing Rust to CPython would see direct benefits to the millions of users CPython has today. Improving RustPython might see benefits to its current user base, but getting current CPython users to switch would have to be earned through proven compatibility over a long time period. That sounds like much more of an uphill battle with unclear benefits.
This is one of the points that Greg KH[1] explicitly calls out repeatedly as a benefit of the Rust-for-Linux effort, even if Rust went away again (here’s a recent talk with a timestamped link): By forcing the C APIs to define their semantics, the C APIs themselves often were improved.
one of the most veteran Linux contributors, maintainer of LTS kernels and involved in every security issue ↩︎
Git 2 → Git 3 sounds like Python 3 → Python 4. Git isn’t a programming language: you only need an implementation of Git’s wire protocol to interoperate with others, and you don’t even need to keep the on-disk format compatible. I don’t think Git’s choice necessarily tells us what we should do for CPython. Since the authors of this pre-PEP have already decided to make Rust optional for CPython, there isn’t much more to discuss about making Rust required.
Git 3 is bringing in support for SHA256 commit hashes, which is a protocol break. (From what I remember, Rust is used to write the part which allows seamless SHA1 → SHA256 transitioning within a repo.)
Question related to RustPython: they look to have made good progress on compatibility but not great progress on performance so far.
Is there anything to be learned from why?
I.e. if it’s just lack of time and contributors then it probably isn’t very interesting to this discussion, but if they were finding some optimisations difficult in Rust it might be useful to know.
Two main reasons, which are mirror images of each other.
First, the core language part of CPython is very well optimized. RustPython has adopted a few of those optimizations, but has not been able to finish the major ones.
Second, RustPython contributors (like me) haven’t had enough interest in performance, so performance work has happened only occasionally.
So… it’s not down to technical issues, just a lack of drive toward performance. If anyone is interested in improving RustPython’s performance, I will do my best to support them.
A sample clone in RustPython; it is not PyO3 though.
The technical details will differ, but once Rust for CPython gains sufficient abstraction, it will reach a similar level in terms of cognitive load.
Personally, I think this makes a lot of sense for extension modules and some components, but not so much for the core of the VM (the interpreter, gc, and core objects).
The safety, or lack of safety, of those components depends not so much on the language they are written in, but the design. If you feed dodgy bytecode to the interpreter it should crash, or worse. That’s how it is supposed to work.
What is the proposed API for Rust?
The section on implementation talks about using bindgen, but what is the underlying C API that you are binding? The limited API or the warts-and-all “unlimited” API?
You mention PY_OBJECT_HEAD_INIT but that just couples Rust code to the deep internals of the VM.
If we are introducing a new API, then it should be a good one. It should follow the design of HPy, or its successor PyNI. The linear typing of HPy handles should work very well with Rust’s ownership model.
Yes, I think this is the best we can look forward to if the plan to progressively replace the core runtime implementation with Rust were still on the cards (which I assume it is, just unofficially for now - it doesn’t make any sense at all to “put Rust in CPython” without that ambition, since you can already write an extension module using Rust without our permission[1]).
And we are already working to improve the semantics of our C APIs as quickly as we can without destroying our existing user base - adding Rust won’t help here (it’ll let us get away with more breakage for some people “because Rust”, but it’ll upset other people who still want minimal breakage, and on average I expect it’ll make no real difference).
What the current proposal comes down to is “can we make some stdlib extension modules optional for some people”, where the answer is obviously “yes” as we already make a number of them optional for users who don’t have/support/want certain third-party libraries.
So the slightly higher-level question is “are we willing to make some modules optional based on compiler choice, rather than based on access to the core functionality required by the module”. (For example, if I don’t want/have OpenSSL, then the _ssl module is obviously useless and best omitted. But if I don’t want/have Rust, why should I miss out on _base64?)
That question is the real, practical impact that we have to decide on. Everything else is hypothetical and achievable in this way or in other ways. But if we’re not willing to have a more inconsistent stdlib across our userbase, then the overall question seems to be moot, at least until Rust can be assumed to be as available as make+GCC.
I’ll also note here that the SC has previously approved putting obviously core functionality on PyPI (subinterpreters), so until we rotate through to an entirely new SC, I don’t think there’s a case for “has to be in core” other than taking advantage of our popularity (which has been earned through stability, so we shouldn’t ruin it by actively destabilising it). ↩︎
While I think some view that as an end goal, a PEP to allow Rust extensions in the stdlib still stands on its own merit–there should be a formal decision whether to allow that expansion.
Even if the ultimate goal was “let some optional extensions be Rust” it would still need a PEP, no?
I like the idea of moving CPython towards Rust, but I feel the current proposal is so conservative that it doesn’t really get us there.
The idea is to allow optional extension modules to be written in Rust. That basically means either new accelerators for modules currently written in Python, or completely new stdlib modules (which are rare). They have to be optional, which means that we must also maintain a Python (or C!) version, at least for existing modules. The concrete suggestion is base64, which is currently fully in Python.
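For readers unfamiliar with how an optional accelerator usually sits next to its fallback, here is a minimal sketch of the import pattern. The module name `_base64` is borrowed from this thread and is hypothetical, as is the assumption that it would export a `b64encode` function; the existing stdlib `base64` module stands in for the implementation that would have to be kept.

```python
import base64 as _fallback  # stand-in for the implementation that must be kept

try:
    # Hypothetical Rust-backed accelerator; absent on builds without Rust.
    from _base64 import b64encode as _accelerated_b64encode
except ImportError:
    _accelerated_b64encode = None

# Lets callers and tests see which implementation they got.
HAVE_ACCELERATOR = _accelerated_b64encode is not None


def b64encode(data: bytes) -> bytes:
    """Encode with the accelerator when present, else with the fallback."""
    if _accelerated_b64encode is not None:
        return _accelerated_b64encode(data)
    return _fallback.b64encode(data)
```

Results are identical either way; what varies between installations is speed, and the fallback branch is code that has to be maintained and tested for as long as the accelerator stays optional.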
The “optional” part means that this proposal will make the stdlib more inconsistent across users. There are many ways the stdlib is already inconsistent in this way, but I feel that’s generally a bad situation that we should avoid making worse: ideally, Python should be the same for all users, and users who use base64 shouldn’t have to think about the details of their platform to know its performance characteristics.
There might be another unfortunate consequence. Clearly, lots of people are very excited about getting Rust in CPython. But with this proposal, the only realistic way they can do that is by adding new accelerator modules. Those could be for modules that already have a C accelerator, but in that case we have to keep the C version too to avoid creating a regression for users without Rust. Or it could be for modules that are currently fully in Python. But I’m not sure how many such modules there are for which an accelerator really makes sense: in most cases, if we’d wanted a C accelerator, we’d have written one already. Therefore, we might end up with Rust code that is mostly replacing Python code, not C code, and that isn’t terribly high-impact.
To get safety wins from using Rust, we have to stop using C. Not necessarily everywhere, definitely not everywhere immediately, but at least somewhere. So for this proposal to be accepted, I feel there must be a clear path towards making it possible to replace existing C code with Rust, even if that means making Rust required.
IMHO that “clear path” must be conditioned on the bootstrap issues being solved. As in: bootstrapping CPython-with-Rust should not be harder than bootstrapping CPython-without-Rust.
Once that happens, then I’m fully +1 for Rust in CPython, including core parts.