PEP 703: Making the Global Interpreter Lock Optional

As maintainer of PyO3 (Rust bindings to create native extension modules as well as Rust binaries embedding Python), I’m hugely excited to see this! Safe concurrency is a common selling point for Rust, and I’ve often had users ask how to combine multithreaded Rust with Python.

It would take some work for PyO3 to support the new ABI proposed here. It should be possible to get to a point where all Rust extensions could be built for both ABI flags and PyO3 would (mostly) hide the details away from extension authors.

I think the resulting API which PyO3 would be able to offer would be simpler if the GIL were removed. (For anyone familiar with Rust: we model the binding of the GIL to a thread as a Rust “lifetime”, which is a somewhat advanced Rust topic, so we necessarily have to throw Python users in the deep end if they choose to use PyO3 to make their first foray into Rust.)

On the distribution side, rather than terming it “no GIL”, I suggest the new variant be called something more like “multithreading optimized”. I’d speculate that would be easier to communicate to users less familiar with Python internals, which would help drive adoption.

I’m sure that PyO3 users would be keen to start experimenting with this functionality as soon as it becomes available, and it would help with adoption if there were official distributions of this rather than leaving it to third-party distributions like Anaconda.

If we did not make this a 4.0, python.org downloads could potentially come in the existing default variant as well as the “multithreading optimized” variant while extensions move towards supporting the new ABI. After a few Python releases, the default could then be switched, with the official downloads instead offering the new default alongside a “single-thread backwards-compatible” build.

Regardless of whether this becomes 3.13n or 4.0, I think it’s inevitable that PyO3 would have a protracted migration period (potentially even 5 years) while existing with-GIL Pythons reach end-of-life. Taking this pain seems worth it to me and we’d do what we can to make it easy for extensions built in Rust to straddle the two variants over that period.

21 Likes

I don’t see it as “huge”. To be more specific:

  • unlike the py2/py3 transition, this would mostly concern C API code, not pure Python code (I have no doubt that some Python code out there may rely on the GIL being present, but it’s certainly a small minority of all Python code written)
  • unlike the py2/py3 transition, the incompatibility is only in one direction (code that works without the GIL should work with the GIL as well), which largely eases the migration
  • the changes required are much more limited than for the py2/py3 transition; in particular, you don’t have to think about redefining visible API semantics to accommodate the bytes/str separation

However, I do think this would be important enough to warrant a major version number bump to Python 4.

17 Likes

Critical sections don’t compose in the sense that starting a new critical section may suspend a previous one. If you need to lock two structures at once, you must use the Py_BEGIN_CRITICAL_SECTION2 function, which locks two mutexes “simultaneously” in one critical section. The APIs do not handle locking more than two mutexes at once, but I haven’t seen a need for that within CPython or in C API extensions.
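
To make that concrete, here’s a minimal sketch of how the two-object form might be used, assuming the names specified in the PEP (Py_BEGIN_CRITICAL_SECTION2 / Py_END_CRITICAL_SECTION2); transfer_first_item is a hypothetical helper, not code from CPython:

```c
#include <Python.h>

/* Sketch: move the first element of one list to another. Locking the two
 * objects with nested critical sections would not compose (entering the
 * inner section may suspend the outer one), so both are locked together
 * with Py_BEGIN_CRITICAL_SECTION2. */
static int
transfer_first_item(PyObject *src, PyObject *dst)  /* both assumed lists */
{
    int result = -1;
    Py_BEGIN_CRITICAL_SECTION2(src, dst);
    PyObject *item = PyList_GetItem(src, 0);        /* borrowed reference */
    if (item != NULL
            && PyList_Append(dst, item) == 0
            && PySequence_DelItem(src, 0) == 0) {
        result = 0;
    }
    Py_END_CRITICAL_SECTION2();
    return result;
}
```

The suspension behavior described above is also what keeps this deadlock-free: if the thread would block inside the section, the locks can be released and reacquired rather than held across the blocking call.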

Like Antoine, I don’t think the change is huge. I agree with his points and I’ll add a few extra bits of context:

  • “pure” Python code doesn’t require any changes to continue working. The exception is that some bugs that are triggered rarely with the GIL (maybe 1 in a million runs) are triggered much more often without it, because thread “interleavings” occur far more frequently. I’ve found and fixed a few of these bugs in CPython (and occasionally written a few of these sorts of bugs in packages I’ve worked on).
  • Most C API extensions don’t require any changes, and for those that do, the changes are small (see the sketch below). For example, for “nogil” Python I’m providing binary wheels for ~35 extensions that are slow or difficult to build from source, and only about seven projects required code changes (PyTorch, pybind11, Cython, numpy, scikit-learn, viztracer, pyo3). For four of those projects, the code changes have already been contributed and adopted upstream. For comparison, many of those same projects also frequently require changes to support minor releases of CPython.
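
To make the second point concrete, here’s a hypothetical example (not taken from any of the projects listed) of the typical shape of such a change: module-level state that the GIL happened to serialize becomes an explicit atomic.

```c
/* counted.c: a minimal, hypothetical extension module. Under the GIL,
 * `static long call_count = 0;` and `call_count++;` were safe because only
 * one thread executed this C code at a time; without the GIL that is a data
 * race, and making the counter atomic is the entire fix. */
#include <Python.h>
#include <stdatomic.h>

static atomic_long call_count;

static PyObject *
bump(PyObject *self, PyObject *Py_UNUSED(args))
{
    long n = atomic_fetch_add(&call_count, 1) + 1;  /* atomic increment */
    return PyLong_FromLong(n);
}

static PyMethodDef methods[] = {
    {"bump", bump, METH_NOARGS, "Atomically increment and return the counter."},
    {NULL, NULL, 0, NULL}
};

static struct PyModuleDef counted_module = {
    PyModuleDef_HEAD_INIT, "counted", NULL, -1, methods
};

PyMODINIT_FUNC
PyInit_counted(void)
{
    return PyModule_Create(&counted_module);
}
```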

I don’t have a strong opinion about version numbering, but if the PEP were adopted and labeled as Python 4, I would like to avoid using the version bump as an opportunity to introduce other backwards incompatible changes.

EDIT: six → seven projects

12 Likes

This PEP doesn’t introduce any large breaking API changes, but making it Python 4 would: a large number of scripts and instructions for Linux run Python as python3, which I assume wouldn’t work with Python 4.

6 Likes

I discussed this with Peter Wang at PyCon US back in May, but have not worked out all the details, like which channel it would be available in. The likely contenders would be “main”, “conda-forge”, or a channel I maintain specifically for this purpose. Either way, the recipes will likely be based on conda-forge recipes, so having support from the conda-forge team would be great. I’ll reach out to them and try to sync up with Peter.

I think the PEP should address this, but I’d like to gather more feedback before proposing something. What do you think the recommendation should be for other distributors? I don’t have a sense if distributing multiple versions of Python on Fedora, for example, is too much of a burden or provides sufficient value.

At the PyCon 2022 language summit, I suggested not distributing the --without-gil build on python.org initially, but after reconsidering this, I don’t see a good reason not to. Again, I’d appreciate suggestions and feedback on this.

You might be able to do a per-thread freelist, but that’s basically what mimalloc provides. Mimalloc’s freelists are based on block size instead of concrete type, but otherwise pretty similar. There might still be a small performance advantage of essentially re-creating the freelists outside of mimalloc to avoid some of the indirections involved in calling the allocator, but it might be a better use of effort to focus on optimizing those code paths.
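
For illustration, here’s a minimal sketch of the kind of per-thread freelist being discussed. All names are hypothetical, and it assumes one fixed block size per freelist (as you would have for a concrete type), which is the type-specific analogue of mimalloc’s size-class freelists:

```c
#include <Python.h>

#define FREELIST_MAX 80  /* cap the number of cached blocks per thread */

typedef struct freeblock { struct freeblock *next; } freeblock;

/* Thread-local, so push/pop need no locking. */
static _Thread_local freeblock *free_head = NULL;
static _Thread_local int free_len = 0;

/* `size` must be the same on every call (one freelist per concrete type)
 * and at least sizeof(freeblock). */
static void *
alloc_node(size_t size)
{
    if (free_head != NULL) {          /* fast path: reuse a cached block */
        freeblock *b = free_head;
        free_head = b->next;
        free_len--;
        return b;
    }
    return PyMem_RawMalloc(size);     /* slow path: call into the allocator */
}

static void
free_node(void *p)
{
    if (free_len < FREELIST_MAX) {    /* cache the block for this thread */
        freeblock *b = (freeblock *)p;
        b->next = free_head;
        free_head = b;
        free_len++;
        return;
    }
    PyMem_RawFree(p);
}
```

The only thing this saves over calling mimalloc directly is the indirection through the allocator entry points, which is the trade-off weighed above.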

1 Like

Yeah, I think this should be addressed in the PEP, but I don’t know what the behavior should be. I might lean towards issuing a warning. Do people have suggestions?

I think the --without-gil build should be available from python.org, and I think the PEP should cover how packages suitable for use with the nogil build would be distributed on PyPI (from what you’ve said, I’d imagine that many packages could just work without change, but some would need different builds - I’m not clear how you’d do that without a new packaging tag and requiring existing projects to rebuild or at least retag their wheels…)

After all, unless nogil is intended to forever remain a specialist option, you’ll need to address those matters at some point, and so why not in the PEP?

7 Likes

Some reasons:

  • It will double the number of 3.x builds (& testing!) required by any medium-and-up-sized Python project, which is a substantial burden.
  • It will also plague infrastructure that has been built with the assumption that there’s only one ABI per python (minor) version, which has been true ~forever.
  • I don’t know if the pip resolver would be capable of distinguishing the ABIs when trying to resolve the installation of a package, but I suspect not. This would be a huge usability issue (installing a package compiled for the wrong ABI, or falling back to source installs of packages not yet nogil-ready, etc.)

Not counting the debug ABI (which isn’t generally something people distribute), there has never been a Python version with two ABIs in production use, so the above list is just the start of a likely large number of unintended side effects that someone will need to fix. IMO the onus is on the PEP to argue why the benefits of having parallel ABIs outweigh all that work.

I understand that it sounds like an appealing option to have nogil be available more quickly, and you strengthen that point with…

… but IMO it’s going to take a while to digest this either way, and the acceleration provided by parallel ABIs risks ending up being a mirage that will cause very high costs for maintainers.

On that topic: by my reading of the stable ABI promises, if you intend to break them, a major version bump is unavoidable. And certainly, other pent-up changes will attach themselves to such an occasion – e.g. the packaging ecosystem is considering various flavours of large overhauls, cf.

But that’s to be expected IMO – nogil is an amazing piece of work, but it’s still not realistic to ask that all other brewing / pending changes in the rest of the Python ecosystem give nogil “exclusivity” on a major version bump.

4 Likes

Great! I have patches to PyO3 that I use to build a compatible version for the “nogil” fork. I’ll open a PR with those changes. I think the PR will be useful to discuss the implications for PyO3 in the context of the actual code.

6 Likes

Yes, I think it needs to be part of CPython to be useful. CPython is much more widely used than any other Python implementation. CPython solves a coordination problem: that’s where most of the work on improving Python will be done because that’s where the most users will get value. As a practical matter, I don’t think it will be possible to maintain this long term as a separate fork of CPython.

9 Likes

I’ll describe the process and changes to build wheels for --without-gil. I’m not sure which of these details should be in the PEP, because a lot of these tools are not usually covered by PEPs.

  1. The manylinux images should include a --without-gil build. For example, based on the PEP, there would be a Python installation in /opt/python/cp312-cp312n/ as well as /opt/python/cp312-cp312/ and the nine existing Python installations.
  2. The cibuildwheel GitHub action should support cp312n (the --without-gil build).
  3. The setup-python GitHub action should similarly support 312n.
  4. For a project that uses cibuildwheel, like SciPy, you might change this line to include the item [cp312n, 3.12].

For (2) it would be helpful to have installers available on https://www.python.org/ftp/python/.

If you were building a wheel by hand, you would just need to run pip wheel . with a --without-gil install, just like you would for other Python versions. Pip already handles the appropriate abi tags.

There aren’t any changes necessary to pip, wheel, or PyPI. For context, I’ve made most of these changes in forks of the mentioned projects (i.e., cibuildwheel, setup-python) to build C API extensions for the “nogil” proof of concept.

FYI, ‘main’ has Terms of Service restrictions that prevent certain commercial organizations that do not pay Anaconda from downloading: https://legal.anaconda.com/policies/en/?name=terms-of-service#anaconda-repository

conda-forge has a lot of good infrastructure but you may face the issue that two packages can’t have the same name and there’s no name-spacing on conda other than the channel name itself.

Irrelevant to this conversation (except that it means Anaconda can take full responsibility for making it work in their own repository, and we don’t have to solve their problems for them; we can focus on conda-forge’s needs).

There are build distinguishers all through conda. It ought to work just fine, though the user experience would likely be quite different from what users may expect. Having a separate repository would definitely be simpler.

1 Like

You don’t need different package names, because the metadata in conda(-forge) is good enough to allow distinguishing the ABI of a python version, and let packages depend on it. This is what allows conda-forge to publish builds for pypy, for example.

Conda is really the least problematic aspect here (which is why it’s a good candidate to deal with distribution through that) – if the main distribution channel were conda-packages, we’d be ready today[1] to do parallel ABIs with basically zero effort (aside from having 7 rather than 6 python flavours in the build matrix[2]).


  1. from a publishing & consumption standpoint, not source compatibility of course ↩︎

  2. currently CPython 3.7-3.11 & PyPy 3.8 / 3.9 – Pyston & GraalPy might be added at some point as well… ↩︎

3 Likes

Just an FYI, in case using ‘main’ was being thought of as the “blessed” way to get it: many organizations would not be able to participate in getting it. If this isn’t the case then, as you say, it’s irrelevant.

I’m not an expert, but my understanding from past experience is that build distinguishers are problematic because not everyone puts restrictions on them by default. So, for example, if a new version of the “nogil” Python came out before the “gil” version of Python, then users who were expecting the “gil” version might accidentally get the “nogil” version.

But maybe there’s a better way to handle it so users who haven’t restricted build distinguishers won’t get a “nogil” version accidentally.

Oh great!

None of the above details need to be in the PEP, IMO. However, you said that many extensions would not need changes, which I assumed meant that the existing binaries would work. That would need some discussion in the PEP, to clarify how projects would mark whether their binaries worked with nogil or not.

Conversely, if nogil requires separate binaries with a different ABI for all extensions, then I think a discussion of the issues mentioned by @h-vetinari is warranted. How will storage on PyPI be affected by essentially doubling the number of binaries? How will we ensure libraries are available for nogil, given the additional work it demands from maintainers? And so on.

As I say, not having python.org builds might delay the need to address these issues, but they will hit eventually, and they should therefore be covered in the PEP.

1 Like

Sure, not many extensions would need changes. That sounds plausible.

But how do you determine whether an extension needs changes? To me it sounds like looking for reliance on the GIL requires a full audit of the code. Also, test suites tend to be single-threaded and concurrency errors tend to be random/rare, so I’m afraid tests won’t help much in finding these issues.