In my personal test, the slowdown was 40%, but the Windows antivirus may be at fault.
We can’t actually test the performance, since the nogil branch is a fork of 3.9 with extensive modifications to counteract the slowdown from the extra locking and reference-counting overhead.
Consequently, comparisons to 3.9 are meaningless, and because the nogil branch lacks the improvements in 3.11, comparisons to 3.11 are also meaningless.
I was comparing Python 3.9 to nogil Python 3.9, on Windows. Unfortunately, there are no Windows benchmarks on the CI systems, nor on GitHub - faster-cpython/benchmarking-public: A public mirror of our benchmarking runner repository
It would be nice to measure it (even roughly) to assess how it behaves on ‘user’ Windows machines.
Very cool proposal! As an average Python dev who sometimes has had to work with concurrency and parallelism, I ask this (maybe obvious) question: if this gets implemented, why would we devs ever use multiprocessing over multithreading?
If you have the problem of a long-running program growing in memory, it is useful to kill off a worker process and spin up a new one.
I work on a service written in Python that uses this technique.
The service runs for months at a time without a restart (only restarting when we roll out updated code).
You cannot do that resource management with threads.
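A minimal sketch of this recycling pattern, using `multiprocessing.Pool`’s `maxtasksperchild` (purely illustrative; the actual service has its own worker management):

```python
from multiprocessing import Pool

def handle_job(job):
    # Stand-in for real work; any memory a worker leaks or
    # accumulates dies with the worker process.
    return job * 2

if __name__ == "__main__":
    # Each worker process is torn down and replaced after 1000 tasks,
    # putting a hard bound on memory growth. Threads have no
    # equivalent "kill and respawn" escape hatch.
    with Pool(processes=4, maxtasksperchild=1000) as pool:
        results = pool.map(handle_job, range(10_000))
```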
This is such a big change and breaks backward compatibility; to me it is very similar to what the Python community did when moving to Python 3.
Looking back, 2 to 3 was such a slow (and maybe painful) move due to the incompatibility, but I believe the community has learned so much, and this time we should be able to handle the transition much more smoothly.
I would think it is better to kick off Python 4 for this PEP. We can have Python 3 and Python 4 in parallel, just like we did for Python 2 and 3.
Developers have learned so much from the last transition, and I think we will be in much better shape if we do this again. It is a much cleaner way of doing such things, although it may appear there is more work to be done.
I think it’s possible to introduce this without needing to support two ABIs.
Maybe that is a necessary goal for this to be accepted?
A lot of people seem to compare this change to the transition from Python 2 to 3. It might be good to remind readers that the GIL removal has very little chance of breaking a Python-only codebase, unlike the transition from 2 to 3.
The breaking change impact seems an order of magnitude smaller if I read this document correctly.
The concern is that this is a breaking change for extension maintainers.
Distribution
This PEP poses new challenges for distributing Python. At least for some time, there will be two versions of Python requiring separately compiled C-API extensions. It may take some time for C-API extension authors to build `--without-gil` compatible packages and upload them to PyPI. Additionally, some authors may be hesitant to support the `--without-gil` mode until it has wide adoption, but adoption will likely depend on the availability of Python’s rich set of extensions.

To mitigate this, the author will work with Anaconda to distribute a `--without-gil` version of Python together with compatible packages from conda channels. This centralizes the challenges of building extensions, and the author believes this will enable more people to use Python without the GIL sooner than they would otherwise be able to.
What is the recommendation for distributors other than Anaconda? Are we encouraged to build and offer a separate nogil build of Python (similarly to how we currently build the debug build)?
After reading through the PEP, I stand by my initial reaction that this is exciting and impressive. However, I agree with those saying that the compiler option is problematic. My blunt summary of this PEP is that it proposes to introduce a second “officially blessed” Python version, like PyPy or micropython. While the PEP makes the desire for multithreading clear, does it need to be part of CPython?
What if it instead was introduced as another distribution, potentially called PyNG (for NoGil or “Next Generation”)? I think, likely naively, that it would make it easier for Python distributors to separate the projects, and it would help package maintainers advertise the scope of their package.
Furthermore, PEP 690 (Lazy Imports) was introduced for similar reasons (improving performance for a subset of Python applications), with a similar opt-in mechanism (the run-time flag `-P`), and it was rejected due to concerns that it would put a significant burden on package maintainers. I think this PEP would do well to address PEP 690 and its rejection in the Related Work section.
Also, what will be the process for the builds distributed by python.org? Specifically, will people using the standard installers on Windows (or “casual user” distributions like the Windows Store distribution) have access to nogil builds outside of Anaconda?
To put this another way, will nogil be targeted as a “specialist” option, which you should only be using if you have a specific need for it (e.g., scientific/ML use, which is what Anaconda targets)?
I understand that the compiler option is there to maintain a single CPython branch, but otherwise, given doubts about indirect effects, explicitness would be nice:
- Python-3.12…3.99 for gil and Python-4.12…4.99 for nogil are released in parallel,
- My_wheel-1.2.3-py3py4.whl
… universal wheels or not, it’s made explicit: no guesswork, no need for a dedicated site. And as long as nogil is not considered mature, you release binaries with a permanent rc tag.
It would handle the situation where nogil for Linux is ready and makes an environmental impact, but nogil for Windows or WASM is not.
It would also break Python-only codebases that (directly or indirectly) depend on GIL-only libraries.
I would accept a 10-20% slowdown from 3.11 (and perhaps even 3.10) speeds for single-threaded code.
I became a programmer at the tail end of the Python 2 line, so, with two caveats (A. my memory is fading, and B. I was not responsible for huge amounts of legacy Python 2 code), the following things seem worth recalling about the 2 vs 3 split.
- For quite a number of years, it wasn’t possible to run the same codebase in v2 and v3; `six` and later Python 3 versions eventually changed that, but it still required modifying the Python parts of the codebase.
- For quite a few years, many programmers didn’t see a clear advantage to moving to Python 3, especially those who didn’t have Unicode problems. Rather, one had annoyances, such as `print` becoming a function, and your code getting slower than 2.7.
- The ergonomics of Python 3.5+ really made the upgrade unquestionably worthwhile (at which point Python 3 was ~7 years old).
The incentive dynamics are different now, as I see it, at least.
- Running multiple interpreters for some part of the workload is not as difficult if both `enable-gil` and `disable-gil` interpreters can run the same Python codebase without `six`-type modifications.
- The pressure to utilize the entire CPU is intense.
- Any single-threaded slowdown is accompanied by easier multi-core programming, and memory savings.
- Thus the slowdown shouldn’t appear arbitrary to Python programmers, at least not the way the slowdown from Python 2.7 to Python 3000 did.
- The core devs understand the “existing code” problems vastly better than they did before.
- Breaking changes won’t be made as lightly as before.
- Many migration difficulties are likely to have much better documented solutions, and even ready-made PRs for 3rd-party packages.
I don’t want any of the above to be taken as making light of the risk of splitting the community in 2-vs-3 fashion, or as encouraging the core devs to be frivolous.
But I do think there’s some room to consider that the upside potential of a GIL-less Python is immense, and to presume that the community will be motivated to transition more quickly than in the 2 vs 3 era.
Hats off to the core devs for their rigorous blending of innovation and caution! Thank you for Python and increasingly, thank you for Faster Python!!
As maintainer of PyO3 (Rust bindings to create native extension modules as well as Rust binaries embedding Python), I’m hugely excited to see this! Safe concurrency is a common selling-point for Rust, and many times I’ve had users asking about how to do multithreaded Rust and Python.
It would take some work for PyO3 to support the new ABI proposed here. It should be possible to get to a point where all Rust extensions could be built for both ABI flags and PyO3 would (mostly) hide the details away from extension authors.
I think the resulting API which PyO3 would be able to offer would be simpler if the GIL were removed. (For anyone familiar with Rust: we model the binding of the GIL to a thread as a Rust “lifetime”, which is a somewhat advanced Rust topic, so we necessarily have to throw Python users in the deep end if they choose to use PyO3 to make their first foray into Rust.)
On the distribution side, rather than terming it “no GIL”, I suggest the new variant be termed something more like “multithreading optimized”. I’d speculate that would be easier to communicate to users less familiar with Python internals, which would help drive adoption.
I’m sure that PyO3 users would be keen to start experimenting with this functionality as soon as it were available, and I’m sure it would help with adoption if there were official distributions of this rather than leaving it to third-party distributions like Anaconda.
If we did not make this a 4.0, potentially python.org downloads could come in the existing default variant as well as the “multithreading optimized” variant, while extensions move towards supporting the new ABI. After a few Python releases, maybe the default can then be switched and the official downloads instead offer default or a “single-thread backwards-compatible” build.
Regardless of whether this becomes 3.13n or 4.0, I think it’s inevitable that PyO3 would have a protracted migration period (potentially even 5 years) while existing with-GIL Pythons reach end-of-life. Taking this pain seems worth it to me and we’d do what we can to make it easy for extensions built in Rust to straddle the two variants over that period.
I don’t see it as “huge”. To be more specific:
- unlike the py2/py3 transition, this would mostly concern C API code, not pure Python code (I have no doubt that some Python code out there may rely on the GIL being present, but it’s certainly a small minority of all Python code written)
- unlike the py2/py3 transition, the incompatibility is only in one direction (code that works without the GIL should work with the GIL as well), which largely eases the migration
- the changes required are much more limited than for the py2/py3 transition; in particular, you don’t have to think about redefining visible API semantics to accommodate the bytes/str separation
However, I do think this would be important enough to warrant a major version number bump to Python 4.
Critical sections don’t compose in the sense that starting a new critical section may suspend a previous one. If you need to lock two structures at once, you must use the Py_BEGIN_CRITICAL_SECTION2 function, which locks two mutexes “simultaneously” in one critical section. The APIs do not handle locking more than two mutexes at once, but I haven’t seen a need for that within CPython or in C API extensions.
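For intuition at the Python level, here is a rough analogy using ordinary threading locks (my own sketch, not the PEP’s C API): to protect two structures as one unit, the two locks have to be taken together in a canonical order, which is essentially what a two-mutex primitive packages up for you.

```python
import threading

def locked_pair_update(lock_a, lock_b, update):
    # Analogue of a two-mutex critical section: acquire both locks as
    # one unit, in a canonical (id-based) order, so that two threads
    # locking the same pair in opposite order cannot deadlock.
    first, second = sorted((lock_a, lock_b), key=id)
    with first, second:
        update()

# Usage: move an item between two lock-protected lists atomically.
lock_a, lock_b = threading.Lock(), threading.Lock()
queue_a, queue_b = [1, 2, 3], []
locked_pair_update(lock_a, lock_b,
                   lambda: queue_b.append(queue_a.pop()))
```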
Like Antoine, I don’t think the change is huge. I agree with his points and I’ll add a few extra bits of context:
- “pure” Python code doesn’t require any changes to continue working. The exception is that there are some bugs that are triggered rarely with the GIL (maybe 1 in a million runs) but are triggered much more often without the GIL, because thread “interleavings” happen much more often without the GIL. I’ve found and fixed a few of these bugs in CPython (and occasionally written a few of these sorts of bugs in packages I’ve worked on); a minimal example of this kind of bug follows after this list.
- Most C API extensions don’t require any changes, and for those that do require changes, the changes are small. For example, for “nogil” Python I’m providing binary wheels for ~35 extensions that are slow or difficult to build from source and only about seven projects required code changes (PyTorch, pybind11, Cython, numpy, scikit-learn, viztracer, pyo3). For four of those projects, the code changes have already been contributed and adopted upstream. For comparison, many of those same projects also frequently require changes to support minor releases of CPython.
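A minimal illustration of this class of bug (a contrived example, not one of the actual CPython bugs):

```python
import threading

counter = 0

def work():
    global counter
    for _ in range(100_000):
        # "counter += 1" is a read-modify-write, not an atomic step.
        # With the GIL, another thread rarely interleaves between the
        # read and the write, so the lost update shows up rarely;
        # without the GIL, interleavings are common and updates are
        # lost almost every run.
        counter += 1

threads = [threading.Thread(target=work) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 400000 expected; less whenever an update was lost
```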
I don’t have a strong opinion about version numbering, but if the PEP were adopted and labeled as Python 4, I would like to avoid using the version bump as an opportunity to introduce other backwards incompatible changes.
EDIT: six → seven projects
This PEP doesn’t introduce any large breaking API changes, but making it Python 4 would: a large number of scripts and instructions for Linux run Python as `python3`, which I assume wouldn’t work with Python 4.