PEP 703 (Making the Global Interpreter Lock Optional in CPython) acceptance

First, I want to distinguish between Python the language and the CPython implementation. The nogil/multicore work is an implementation detail; AFAIK it should have no effect on the language definition (except maybe clarifying the memory model). So technically speaking it is really “Multicore CPython”.

You say that CPython already works on multicore setups, but… it doesn’t? When running the Python language, CPython runs on a single core. If you’re referring to multiprocessing, I’d say it doesn’t count; the language as implemented by CPython is single-core.

What does “running the Python language” mean exactly?

If I add two Python integers, it invokes a C function that implements the addition of two Python integers. If I add two NumPy arrays, it invokes a C function that implements the addition of two NumPy arrays.

If adding two Python integers is “running the Python language”, why wouldn’t adding two NumPy arrays also be “running the language” as well?

Use Numpy, Pandas, Numba, Cython, Dask… and you can easily get the benefits of several CPU cores. You may not relate to that ecosystem, but it exists and is an essential part of Python’s growing popularity.

Why doesn’t it? multiprocessing has been part of the standard library for 10+ years, and is widely used. So is its cousin concurrent.futures.
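
For what it’s worth, here is a minimal stdlib-only sketch of the kind of thing I mean (the sum_range workload is just illustrative):

```python
# Minimal sketch: CPU-bound work spread across cores using only the stdlib.
from concurrent.futures import ProcessPoolExecutor

def sum_range(n):
    # CPU-bound loop; each call runs in its own worker process, so one
    # worker's GIL does not block the others.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(sum_range, [10_000_000] * 4))
    print(results)
```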

You don’t get to choose what is “Python” and what is “not Python”. There’s nothing special about the multiprocessing module that makes it less Python than the threading module. There’s nothing special about NumPy or Dask that makes it less Python than Pillow, Django or SQLAlchemy.

Yes, removing the GIL is a big leap forward and will help exploit CPU parallelism in more workloads. No, it doesn’t mean that previously CPython was “monocore”. Saying so is just misrepresenting the current state of the CPython implementation.

2 Likes

I’m not sure why that suggestion struck such a chord, but this is a warning to check the language you use when you respond. You can express that you disagree with something without suggesting that someone is deliberately misinforming, lying to, or insulting others.

2 Likes

@davidism Since you’ve edited my post, can I ask you to make it clear that you’ve edited it?

(I don’t think this is the first time that such a mention is requested on this forum, by the way)

3 Likes

There is an edit icon on every post that has been edited that shows the diff of each edit. Please get back on topic.

2 Likes

I am already confused by calling Python ‘Multicore’. I just did a test with a simple TCP server, and I can make use of each CPU core, at least by 50%, despite the Global Interpreter Lock (GIL). My assumption has been that Python already runs on multiple CPU cores.
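
The test was roughly along these lines (a sketch, not the exact server I used): a threaded TCP echo server, where each thread spends most of its time blocked in socket calls, during which the GIL is released.

```python
# Sketch of an I/O-bound threaded TCP echo server. Threads mostly block in
# socket calls, during which the GIL is released, so the process can keep
# several cores partially busy even with the GIL in place.
import socketserver

class EchoHandler(socketserver.StreamRequestHandler):
    def handle(self):
        for line in self.rfile:      # blocks without holding the GIL
            self.wfile.write(line)   # echo the data back

if __name__ == "__main__":
    with socketserver.ThreadingTCPServer(("127.0.0.1", 9000), EchoHandler) as srv:
        srv.serve_forever()
```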

1 Like

Isn’t any non-dry term for GIL removal, i.e. something with branding appeal, going to be technically inaccurate to some degree?

And I would push back on the idea that CPython can’t be associated with being “monocore”. In my experience as of Python 3.12, without spawning subprocesses, any pure Python code will almost certainly perform no worse, if not in fact better, if you reserve and pin it to a single core on your CPU.

Maybe I’m missing some obvious example where this isn’t the case though.
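
For concreteness, here is a sketch of the comparison I have in mind (Linux-only, since it relies on os.sched_setaffinity; the workload is illustrative):

```python
# Sketch (Linux-only): pin the whole process to a single core, then time a
# CPU-bound job split across two threads. With the GIL, the wall time is
# roughly the same whether the process is pinned or not.
import os
import threading
import time

def burn(n=5_000_000):
    total = 0
    for i in range(n):
        total += i * i
    return total

def timed_run():
    threads = [threading.Thread(target=burn) for _ in range(2)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

os.sched_setaffinity(0, {0})                         # reserve and pin to core 0
print("pinned to one core:", timed_run())

os.sched_setaffinity(0, range(os.cpu_count() or 1))  # allow all cores again
print("all cores allowed: ", timed_run())
```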

TBH, I’m not sure you would need any “branding appeal” for something that is already technically appealing. People who would benefit from it probably know what the GIL is, or have heard about it, and the implications of “no GIL” are more explicit and more easily understood than a vague “multicore”.

It depends on what you call “pure Python”. If it means that you don’t write non-Python code yourself, then it’s easy: just call into NumPy or any other numerical library that releases the GIL while doing its computations.

A more sophisticated example would be using Numba with nogil.
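
For example, a sketch along those lines (assuming Numba is installed; the reduction is just an illustrative workload): code compiled with nogil=True releases the GIL, so plain threads can run it on several cores at once.

```python
# Sketch (requires numba): nogil=True releases the GIL inside the compiled
# function, so ordinary threads can execute it on multiple cores in parallel.
import threading
import numpy as np
from numba import njit

@njit(nogil=True)
def sum_of_squares(a):
    total = 0.0
    for x in a:
        total += x * x
    return total

data = np.random.rand(4, 2_000_000)
threads = [threading.Thread(target=sum_of_squares, args=(row,)) for row in data]
for t in threads:
    t.start()
for t in threads:
    t.join()
```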

1 Like

As per this discussion: PEP 703: Making the Global Interpreter Lock Optional (3.12 updates) - #14 by bluetech

As per What is "Pure Python?" - Stack Overflow, which reflects what I think is commonly understood by “pure Python”, I mean:

mean it’s all implemented in Python, and not (as is sometimes done) with parts written in C or other languages

So no, by this definition of pure Python, I would exclude NumPy, or any third-party package whose source distribution requires a compiler, or a call to another process, to install.

I would think this isn’t true for asynchronous code, even if it’s pure Python?

Why would you exclude Numpy or any third-party package, but not CPython itself?

Are you even sure that the Python-level dependencies you’re using are all pure Python? They might have a C accelerator here and there.

In any case, even with only CPython and the stdlib, you can still benefit from multiple cores, for example using multiprocessing or concurrent.futures.ProcessPoolExecutor, or by calling into zlib or hashlib from multiple threads (both release the GIL around their heavy work).
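
A small sketch of the hashlib case (buffer sizes are illustrative): CPython releases the GIL while hashing large buffers, so a plain thread pool can keep several cores busy without any subprocesses.

```python
# Sketch: hashlib releases the GIL while hashing large buffers, so this
# thread pool can use several cores without spawning any subprocesses.
import hashlib
from concurrent.futures import ThreadPoolExecutor

payloads = [bytes(32 * 1024 * 1024) for _ in range(8)]  # 8 x 32 MiB buffers

def digest(data):
    return hashlib.sha256(data).hexdigest()

with ThreadPoolExecutor(max_workers=4) as pool:
    digests = list(pool.map(digest, payloads))
```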

1 Like

I think this is straying off topic. The SC will weigh in on a name choice for the C macro (we’ve been asked to), and given the community contention around the term multicore we’re more likely to pick something different.

No need to discuss what is and isn’t parallel here.

8 Likes

I still like the name Unlocked Python.

… but I’m likely in the minority.

It’s nice because it no longer has the GIL, and sounds swell.

For explaining it to people: “Traditional Python is typically locked to one piece of python code at a time in a single process. Unlocked removes that limit so multiple threads of python code can run at once in a single process.”

It sounds nice, makes sense, and explains nicely for folks unfamiliar.

1 Like

How about parallel python?

Considering the version removes a lock, how about something like “unlocked interpreter” or “unlocked python”?

I think “parallel” is a good word to use as a base, but it needs to be a little more specific: what removing the GIL gives us over the standard build is specifically parallel threading (that is, the threading module), right? Multiprocessing in Python can already run different operations in parallel, which some people here have already discussed vigorously, ad nauseam even. Threading, on the other hand, can’t achieve the same thing, since the GIL locks the whole interpreter instead of locking only the specific resources a thread actually needs.

So the gist of it is: with a standard build, threading can only be concurrent, while with no GIL it can actually run in parallel (a small sketch of the difference follows the list below).

Therefore, I suggest “parallel threading” or if you want it to be on the nose then “truly parallel threading” :smile:

Some potential issues with that name that I see are:

  • some people might not immediately recognize the difference between concurrent and parallel, but I’m not sure any other term would make that distinction clearer than “parallel” does.
  • “parallel threading” is two words, not one word, and so could perhaps be a bit long for use in the C macros/function names. So maybe there, “parallel” would be enough. But for the user-facing name (and in the documentation of those C macros/functions), I think using two words works all right. Prefixing it with the aforementioned “truly” could work to market it better in something like release highlights :slight_smile:
  • I suppose that on CPUs with only a single core/thread, “truly parallel threading” still ends up being concurrent, since there’s no hardware support to actually run multiple operations at the same time. I’m not sure that matters, though: for people running Python on such CPUs, both builds will end up behaving roughly the same, since on a thread switch the GIL gets released either way (the overhead of a switch could be a bit smaller with no-GIL, but I imagine that falls under micro-optimization).
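
Here is the sketch mentioned above (the busy-loop is illustrative): CPU-bound work on two threads, which a standard build serializes under the GIL but a no-GIL build can run truly in parallel.

```python
# Sketch: CPU-bound work on two threads. On a standard (GIL) build this takes
# roughly as long as running the two calls back to back; on a no-GIL build the
# threads can genuinely run in parallel on separate cores.
import threading
import time

def busy(n=10_000_000):
    total = 0
    for i in range(n):
        total += i
    return total

threads = [threading.Thread(target=busy) for _ in range(2)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"two threads took {time.perf_counter() - start:.2f}s")
```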

I like nogil, for the reason that it is the only term that is completely accurate and also what the build has traditionally been known as. I’m very unconvinced that we need a marketing term that avoids using a negative to convince people that it’s useful. People are going to use whatever build of Python comes with all the packages that they want. The people who need convincing are package authors who know what the GIL is.

I don’t think any of the alternatives are good:

  • “Free Threading” is pretty ambiguous and not a well-defined term. I’ve seen it refer to a programming model without any locks at all, which is not the case here. If you google “free threading”, many of the top results are discuss.python.org threads about nogil.
  • As people have mentioned, Python can already use multiple cores, even if you restrict yourself to the standard library.
  • “Unlocked” sounds like a marketing term.

6 Likes

I find the heated discussion a bit odd… it’s just a name, after all :slight_smile:

In the past, we’ve always called this “free-threading”… Greg Stein was the first (IIRC) to try such a patch back in 1999. And even Sam and the SC use the term, so why not simply stick with that instead of having heated discussions?

At the end of the day, it’s all going to be Python.

11 Likes

A name that will hopefully only be relevant for a release or two, at that!

2 Likes

FWIW, I appreciate the thought folks have put into naming this feature. I’m also confident that, at this point, the Steering Council has a good sense of an appropriate name to use, particularly for the technical aspects like the feature macro, as @gpshead said. Furthermore, I agree with @malemburg pretty much entirely.

One thing I want to clarify is that CPython already supports multi-core (AKA parallel programming), even with a GIL. I don’t just mean multiprocessing or Dask or releasing the GIL for blocking calls. I mean actually executing Python code in parallel, not just concurrently (which the GIL normally prevents in multi-threaded programs).

As of 3.12 you can use multiple interpreters (“subinterpreters”) that don’t share the GIL. (See PEP 684.) That means Python code can truly run in parallel in two threads if those threads are using different interpreters. Unfortunately, in 3.12 the feature is only accessible via the C-API.

I do have a PEP that proposes a stdlib module to expose the feature to Python code (PEP 554), but it didn’t make it in time for 3.12. (I’m also in the process of replacing PEP 554, since it is 7 years old and full of the accumulation of 7 years of discussion.) My plan is to target 3.13 and publish a module on PyPI in the coming month or two that can be used with 3.12.
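
To give a rough idea, here is a sketch of what that could look like from Python code, using names along the lines of PEP 554’s draft (the module and method names are illustrative and may well change as the PEP is revised):

```python
# Hypothetical sketch based on PEP 554's draft API (not in the 3.12 stdlib;
# names are illustrative and may change). Per PEP 684 each interpreter has its
# own GIL, so the two threads below could run Python code in parallel.
import threading
import interpreters  # hypothetical module name from the draft PEP

def work():
    interp = interpreters.create()
    interp.run("total = sum(i * i for i in range(10_000_000))")

threads = [threading.Thread(target=work) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```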

As @malemburg said, PEP 703 is strictly about supporting free-threading (with a single interpreter).

8 Likes