Alternate design for removing the GIL

The compiler issues I am thinking about are discussed at https://lwn.net/Articles/793253/ and in other related articles.
I assume that you will need solutions just like the Linux kernel does.

OK, but given that projects to eliminate the GIL have been ongoing for many years, and in particular @colesbury has spent a huge amount of time getting his current work to its current state, do you consider offering a new design at this point, with no implementation to back it up, to be a realistic suggestion?

What do you expect to happen here? Should the SC put PEP 703 on hold while someone tries to implement your idea? Should we implement PEP 703 and then rewrite it if your idea turns out to be a better approach? What about the other internals work (notably the “Faster CPython” work) that’s trying to get to grips with how PEP 703 affects their plans, approaches and timescales? Should they factor in yet another possibility that may or may not work when the details of implementation start getting tackled?

Talking about alternative approaches is fine, and very interesting (if you like interpreter internals :wink:) but this list is for proposals which are intended to be added to Python. I’m not sure your idea fits that classification. As a general “bouncing ideas around” conversation, I guess it’s fine, but don’t be surprised if the work on deciding whether PEP 703 gets accepted simply ignores it…

I somewhat tried to say this in my preamble. Sorry if I wasn’t too clear.

But I don’t really have any expectations.

Yes, this sort of code takes a while to build, and an order of magnitude or two more time to debug. And yes, it would be very disingenuous for a random person to just show up and expect someone else to do a lot of hard work. And that’s not even starting to talk about how it impacts other projects.

I did intend it to be more of a “bouncing of ideas”. I guess the complexity of something like this made me want to specify it more formally (to show that it is possible for this to work).

I would expect a group like “Faster CPython” to not even consider a large change without good performance numbers, let alone with no numbers at all.

Good question.

I guess looking at the 3 main upsides:

  • Not breaking the ABI
  • JIT optimizations
  • maybe performs better?

Assuming PEP 703 gets accepted, then breaking backwards compatibility would be a sunk cost. Also, I’m guessing the “Faster CPython” team would be happy with what they can JIT. So the top two no longer really matter. And at a guess, I don’t think PEP 703 is leaving enough performance on the table to justify a change in implementation.

So if PEP 703 does get accepted then I doubt there would be any reason to accept this into Python.

Why bother talking about this then? Two points:

  1. If PEP 703 does not get accepted then most of your arguments don’t apply anymore.

  2. There may be some ideas in this that Sam could use to improve on PEP 703 (e.g., how I’m abusing the fact that _Py_Dealloc isn’t part of the stable API to keep the ABI backwards compatible). I don’t know his design well enough to make any comments though.

I would be somewhat surprised if this isn’t ignored.

I’m sorry. From the informal description of the Ideas category I was under the impression that this post would have been OK.

Is there somewhere better that something like this should be moved to? Or maybe should the title be changed?

The NoGIL movement definitely needs a better design for removing the GIL. It seems unlikely that users would be willing to accept an XX% performance penalty for releasing the GIL.

On the other hand, in modern times, the cost of inter-process communication (IPC) and data serialization between processes in real-world applications is unlikely to account for a 10% reduction in processing time. If this is the case, maybe the algorithm is inherently serial and parallelization is not possible.

This is totally dependent on a) what the users are doing and b) what X is. Lots of users would gladly make that trade, at the right number.

Have you not seen the extensive discussions between the Faster CPython people and the PEP 703 people? There’s already a lot of debate about how 703 impacts the performance work. Your proposal is way behind where they are on that.

You don’t need to guess, there’s an extensive discussion here. And PEP 703 will be a bunch of extra work for the faster CPython team. So it’s far from no longer mattering.

Fair. But I doubt there would be any appetite for another run at the GIL for quite some time.

It’s not so off-topic that it’s inappropriate, I didn’t mean to give that impression. But it is unlikely to generate anything but speculation. If that’s all you want, then fine.

I have yet to see a real-world use case where using threads (plus nogil overhead) is faster than using multiprocessing. I have been following the discussions about nogil and I didn’t see any numbers, only a fib() algorithm, which can be done the same way using multiprocessing.

What I’m trying to say is that the “parallelization problem” would still be there in the nogil world. You still have to use sync primitives (queue, locks, etc…). I believe that free threading would give developers false hopes.
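To make the point about sync primitives concrete, here is a minimal sketch of a threaded worker pool. The names `worker` and `run` and the sum-of-squares workload are made up for illustration. It needs a `queue.Queue` to hand out work and a `threading.Lock` around the shared total, and nothing about that would change in a free-threaded build.

```python
import queue
import threading


def worker(tasks, totals, lock):
    """Pull numbers off the queue until a None sentinel arrives."""
    while True:
        n = tasks.get()
        if n is None:  # sentinel: no more work for this thread
            break
        square = n * n
        # The read-modify-write on the shared dict must be guarded,
        # with or without the GIL.
        with lock:
            totals["sum"] += square


def run(num_workers=4, upto=1000):
    """Sum the squares of range(upto) using num_workers threads."""
    tasks = queue.Queue()
    totals = {"sum": 0}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(tasks, totals, lock))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for n in range(upto):
        tasks.put(n)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return totals["sum"]
```

Calling `run()` gives the same result as `sum(n * n for n in range(1000))`; the structure of the code, not the speed, is the point here.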

Today’s your lucky day, there’s already an example linked in the other thread. He even compares it directly to using multiprocessing.

From what I can see, the people who are looking forward to free threading in Python are well aware of what that entails and how it works, because they’re already going to other languages to accomplish it. The point is that we’d rather not have to do that.

I did see this use case; it can be done the same way (using the same code) with multiprocessing.

What about the well-established user base? Are they willing to pay an x% nogil tax? The way I see it, they will need x% more CPU cores to offer the same service, i.e., an x% increase in CPU cost. And to think that many services operate on 0.X% net revenue.
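As a rough sketch of that arithmetic: assume the overhead is x% extra CPU time per request and that the workload stays single-threaded, so nothing is gained back from parallelism. The function name `cores_needed` and the example numbers are hypothetical, not measurements.

```python
def cores_needed(baseline_cores, overhead_percent):
    """Cores required to keep the same throughput when every request
    costs overhead_percent more CPU time (both arguments are ints).

    Uses integer ceiling division so that a 10% overhead on 100 cores
    gives exactly 110, with no floating-point rounding surprises.
    """
    scaled = baseline_cores * (100 + overhead_percent)
    return (scaled + 99) // 100


# Hypothetical fleet sizes, purely for illustration.
print(cores_needed(100, 10))  # 110
print(cores_needed(64, 5))    # 68 (67.2 rounded up to whole cores)
```

If the x% is instead quoted as a drop in throughput rather than extra CPU time, the multiplier becomes 1 / (1 - x), which is slightly larger.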

Note that I’m not against the free threading, but we have to think about its effects on the current user base.

Hmm, did you see the line where the author says:

Using a multiprocessing.Queue in the same place results in degraded performance (approximately -15%).
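One likely contributor to that gap, sketched here rather than measured: `queue.Queue` hands the consumer a reference to the very same object, while `multiprocessing.Queue` pickles each object on `put` and unpickles it on `get`. The helper names and the sample `graph` are made up, and `pickle` is used directly as a stand-in for the multiprocessing round trip so the snippet stays in one process.

```python
import pickle
import queue


def through_thread_queue(obj):
    """Pass obj through a thread-safe queue: no copy is made."""
    q = queue.Queue()
    q.put(obj)
    return q.get()


def through_pickle(obj):
    """Stand-in for a multiprocessing.Queue round trip, which
    serializes on put() and deserializes on get()."""
    return pickle.loads(pickle.dumps(obj))


# A small, made-up object graph.
graph = {"nodes": [0, 1, 2, 3], "edges": [(0, 1), (1, 2), (2, 3)]}

same = through_thread_queue(graph)
copy = through_pickle(graph)

print(same is graph)   # True: the consumer sees the original object
print(copy is graph)   # False: the consumer gets a rebuilt copy
print(copy == graph)   # True: equal content, paid for by serialization
```

The cost of the copy grows with the size of the object, which is why the difference matters more for large in-memory structures than for small messages.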

I prefer not to comment on the work of others, but theoretically speaking, that is not a parallel problem at all (the multiprocessing.Queue is redundant). Just download and write to disk.

The PEP has a section on multiprocessing, and the thread PEP 703: Making the Global Interpreter Lock Optional (3.12 updates) has had numerous people explaining why they would benefit from nogil while they don’t benefit from multiprocessing. If you want to comment constructively on this topic, you have to address those. It sounds like you’re ignoring them, which is not productive. Thank you.

I don’t want to fight with you, but if you actually read the backblaze post it clearly explains why this is not so.

I don’t believe that asking for a real-world use-case comparison between threads and multiprocessing is nonconstructive.

Nonconstructive is hoping that removing the GIL will solve parallelization problems, or thinking that the GIL is stopping us from solving parallel problems.

Read Mark Shannon’s comment: PEP 703: Making the Global Interpreter Lock Optional (3.12 updates) - #9 by markshannon

For the purposes of this discussion, let’s categorize parallel applications into three groups:

  1. Data-store backed processing. All the shared data is stored in a data store, such as a Postgres database; the processes share little or no data, communicating via the data store.
  2. Numerical processing (including machine learning) where the shared data is matrices of numbers.
  3. General data processing where the shared data is in the form of an in-memory object graph.

Python has always supported category 1, with multiprocessing or through some sort of external load balancer.
Category 2 is supported by multiple interpreters.
It is category 3 that benefits from NoGIL.

How common is category 3?
All the motivating examples in PEP 703 are in category 2.

Look closely at the performance overhead numbers. Also, we don’t know what the overhead will be in the algorithms that we hope will benefit from GIL removal, because I haven’t seen any numbers yet (if I have missed any, please show me). In other words, would the benefit outweigh the performance overhead?

Frankly speaking, why should I have to pay more for something I don’t use? It’s like paying a toll for a road I never use. If it weren’t for the performance overhead, we wouldn’t be discussing this at all, and the PEP would have been accepted quietly.

If you can demonstrate a GIL removal design without any overhead, I would gladly put ‘NOGIL’ on a t-shirt and wear it proudly all year round.

But if you’re just ignoring the existing examples and comments from people who work on various packages because you don’t believe them, this isn’t a recipe for a productive discussion.

These problems can be addressed in the same way using multiprocessing. I am referring specifically to problems that cannot be solved by multiprocessing alone.

Because there’s only one road, and all CPython users are on it. We have to make decisions and trade offs for everyone, not just you. (Or me: removing the GIL won’t benefit me in my current job.)

Will removing the GIL from CPython cause a slowdown for single threaded users? Almost certainly. Will it be better for the community as a whole? That’s the hard decision.

The PEP author and numerous people on the various discussion threads put effort into explaining the details of why nogil would work for them and why multiprocessing would not work, or would work less well. You make this sweeping claim without backing it up in any way. I’m not going to try to argue about this or respond further; just be aware that claims without justification are rarely convincing.

I’m literally asking for help here, I’m not making any claims. I’m trying to implement a “3. General data processing where the shared data is in the form of an in-memory object graph.” problem with real-life usage using free threads and multiprocessing, but IPC is not the bottleneck.

If free threading provides better performance results, I will also embrace it.

Please keep this topic focused on the alternative proposal to PEP 703, not on the merits of removing the GIL as a whole, as that would be considered off-topic. If the conversation continues to veer off, I will go through and hide all posts not related to the direct merits of this alternative proposal.
