tl;dr We have other viable options to support the multi-core needs of Python users. We don’t need to feel like free-threading is the only option.
(FTR, this is an expansion of what I proposed to present at the language summit this year. In fact, this first part is essentially the summary and outline I submitted.)
PEP 703 and the free-threaded build of CPython have brought a lot of attention to Python’s multi-core story, which has historically been murky at best. There are a variety of things we can do to improve that story, including writing better docs, exposing existing functionality, or even removing the GIL.
Most importantly, we need to clearly understand and agree on what users actually need solved. This is especially important before we make a final decision about PEP 703.
The main point here is that, when it comes to multi-core parallelism, we need to be clear about:
- what Python’s users really need
- what solutions we offer
- what solutions we could offer
- what their downsides are (and any secondary benefits)
Clarity in this area will help us both make good decisions and communicate better with users.
I’m not aware of any significant analysis along these lines (other than my own meager-but-best effort back in 2015).
Summary:
Part 1: status quo
- use cases for Python in a threaded application
a. in Python programs
b. in extension modules
c. in embedded Python
d. what data do users want to share between threads
- status quo solutions
- threads w/GIL
- multiprocessing
- distributed
- multiple interpreters (C-API-only)
- deficiencies in the status quo
- threads: GIL (incl. blocking embedded threads unnecessarily)
- multiprocessing: slow, extra system resources
- no stdlib module for subinterpreters (PEP 734)
- subinterpreter rough corners
- documentation (howto) - show users how to do concurrency with Python
Part 2: possible improvements
- stdlib “interpreters” module (PEP 734)
- better interpreter perf
- interpreter/threading helpers in stdlib (e.g. proxies, immutable)
- get rid of GIL (PEP 703)
Part 3: is the free-threading build necessary?
- being unnecessary does not mean PEP 703 should be rejected
- alternatives allow us to make better decisions
Regarding part 3, I think it’s important that we have consensus about any available alternatives to removing the GIL. We shouldn’t need to feel like we have to accept PEP 703. To be clear, I’m not against removing the GIL; I only think we should make an informed decision.
Status Quo: Use Cases for Python in a Threaded App
This is an area where I think we could have substantially more clarity.
- in Python programs
- ???
- in extension modules
- ???
- in embedded Python
- ???
- what data do users want to share between threads?
- ???
Status Quo: Solutions
- threads w/GIL
- multiprocessing
- distributed
- multiple interpreters (C-API-only)
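As a rough illustration of the first two items (not a benchmark), here is a minimal sketch that runs the same CPU-bound function through a thread pool and a process pool. The helper names `count_primes` and `run_with` are hypothetical and exist only for this example; the executors come from the stdlib `concurrent.futures` module.

```python
import math
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_primes(limit):
    """CPU-bound work: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count


def run_with(executor_cls, limits):
    """Run count_primes over `limits` using the given executor class."""
    with executor_cls(max_workers=4) as pool:
        return list(pool.map(count_primes, limits))


if __name__ == "__main__":
    limits = [20_000] * 4

    # Threads: cheap to start and share memory, but on a GIL build only one
    # thread runs Python bytecode at a time, so extra cores go unused here.
    threaded = run_with(ThreadPoolExecutor, limits)

    # Processes: can use several cores, but each worker is a separate
    # process, and arguments and results are pickled between processes.
    multiproc = run_with(ProcessPoolExecutor, limits)

    # Both approaches compute the same answers; they differ in cost.
    assert threaded == multiproc
```

The calling code is identical in both cases, which is part of why the choice between them is easy to overlook. What differs is where the cost shows up: GIL contention for threads, startup and serialization overhead for processes.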
Status Quo: Deficiencies
- threads: GIL (incl. blocking embedded threads unnecessarily)
- multiprocessing: slow, extra system resources
- no stdlib module for subinterpreters (PEP 734)
- subinterpreter rough corners
- users tend not to have a clear understanding of how free-threading will affect them
Possible Improvements
- stdlib “interpreters” module (PEP 734)
- better interpreter perf
- interpreter/threading helpers in stdlib (e.g. proxies, immutable)
- get rid of GIL (PEP 703)
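To make the first item a bit more concrete, here is a minimal sketch of what using a stdlib interpreters module can look like. It assumes Python 3.14 or later, where PEP 734 is available as `concurrent.interpreters`; on older versions the import fails and the sketch simply reports that. The function name `main_sees_flag` and the attribute `example_flag` are hypothetical, chosen only for this example.

```python
import sys

try:
    # Assumption: Python 3.14+, where PEP 734 landed as concurrent.interpreters.
    from concurrent import interpreters
except ImportError:
    interpreters = None


def main_sees_flag():
    """Set a flag on `sys` inside a new interpreter, then report whether
    the calling interpreter can see it.

    Returns None if the interpreters module is unavailable.
    """
    if interpreters is None:
        return None
    interp = interpreters.create()
    try:
        # This code runs in the new interpreter, which has its own `sys`.
        interp.exec("import sys; sys.example_flag = True")
    finally:
        interp.close()
    return hasattr(sys, "example_flag")


if __name__ == "__main__":
    print(main_sees_flag())
```

Each interpreter has its own copy of `sys` and other module state, so the flag set in the new interpreter should not be visible from the caller. That isolation is what lets interpreters run in parallel without sharing a GIL, and it is also why helpers for passing data between them (the third item above) matter.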
Is the Free-threading Build Necessary?
First of all, it’s important to note the following:
- Sam and his team have been responsive, highly collaborative, and never uncooperative
- it may be easier for some users to take advantage of free-threading than the alternatives
- being unnecessary does not mean PEP 703 should be rejected
- alternatives allow us to make better decisions
The question of necessity is partly a function of the following:
- who actually benefits from free-threading? (what are the motivating use cases?)
- how do those users benefit, and how much?
- what new costs offset those benefits?
- what new costs does everyone else face?
FWIW, PEP 703 does describe a number of motivating use cases. I don’t mean to suggest otherwise. Rather, my point is that it would help to have a clear, broad analysis of use cases for multi-core parallelism that’s independent of PEP 703. That would put us in a position where we could better assess the options for supporting all the use cases.
The other part of the equation involves what alternatives are available. I’m most familiar with the use of multiple interpreters, but that isn’t the only viable alternative. The same questions from just above should be answered for the alternatives, so we can measure where the different solutions overlap and where they don’t. I wouldn’t be surprised if there were a significant amount of overlap. (We just can’t be sure yet.)
Conclusion
We shouldn’t feel like we have to accept PEP 703. We have viable alternatives that don’t have the same downsides. I’m not opposed to us keeping free-threading, as long as we are deliberate about accepting the costs. However, we must not do it solely because there doesn’t seem to be any other way to meet certain users’ needs. That just isn’t the case.