PEP 779: Criteria for supported status for free-threaded Python

I got the impression that part of the reason people want to mark the free-threaded build supported is so that library authors will start working on making their code thread-safe and report issues caused by the free-threaded build. To the extent that they can’t do that without better documentation, I think it’s relevant.

If we can agree that “encouraging library authors to check their code for thread safety” isn’t a motivating factor for making free threading “supported”, then I’m happy to agree that this discussion is unrelated to the support status of free threading.

Ok, so to repeat what was already said: you can already write multi-threaded code even with the GIL. The GIL exists precisely so that users can write multi-threaded Python code without crashing the Python runtime. Many people do that, for all kinds of purposes.

What removing the GIL does is enable more parallelism while keeping the Python runtime crash-free [1]. It does not change anything about what library authors are expected to provide, guarantee, or test. Library authors are free to make their code thread-safe or not.

I would say there are two motivating factors:

  1. Have them check that their library works (e.g. passes its test suite) on free-threaded Python. This is not the same as being thread-safe. This is mostly about not depending on subtle scheduling details of GIL-enabled Python (and, perhaps, not doing gory low-level things with ctypes etc.).
  2. Have them distribute wheels for free-threaded Python.

  1. assuming all bugs are fixed, of course :slight_smile: ↩︎

It still affects uptake among developers trying to make their code no-GIL compatible. It’s hard to deal with free-threaded issues when you have this nagging voice in the back of your head asking awkward thread-safety questions about code that may or may not be broken, with or without the GIL, but you have no idea which. You either wrap everything in locks, wrap nothing in locks and hope someone else will tell you when you’re wrong, or put the whole free-threading migration off until you have more information.
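For concreteness, a minimal sketch (my illustration, with a hypothetical `Registry` class, not code from any particular library) of what the “wrap everything in locks” option looks like:

```python
import threading

class Registry:
    """A hypothetical shared registry made thread-safe by coarse locking."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = {}

    def register(self, name, obj):
        # Every access to the shared dict goes through one lock: safe on
        # any build, but it serializes all callers whether or not the
        # feared contention ever materializes.
        with self._lock:
            self._items[name] = obj

    def get(self, name):
        with self._lock:
            return self._items.get(name)
```

Safe but pessimistic, which is exactly the trade-off being described.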

It does not really, because “no-GIL compatible” does not mean “thread-safe”. See above, and please read carefully, because we’re rehashing the same thing over and over.

  1. Is there any need to do this if the library’s test suite doesn’t create multiple threads? Assuming not, this is simply a case of adding the free-threaded build to CI, which is fairly trivial, as it appears that the setup-python action supports the free-threaded build (see the sketch just after this list).
  2. Given that I’m only talking about pure Python projects here, this isn’t relevant as pure Python wheels are architecture-independent.
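As a concrete sketch of that first point (my example, not anything the PEP prescribes): a test suite running in CI can report which build it is actually exercising, since even a free-threaded build may re-enable the GIL when an incompatible extension is loaded:

```python
import sys
import sysconfig

# Was this interpreter compiled with free threading (Py_GIL_DISABLED)?
is_freethreaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

# sys._is_gil_enabled() is new in 3.13; on older versions the GIL is always on.
gil_enabled = sys._is_gil_enabled() if hasattr(sys, "_is_gil_enabled") else True

print(f"free-threaded build: {is_freethreaded_build}, GIL enabled: {gil_enabled}")
```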

OK. So if the authors of PEP 779 agree that they don’t expect pure Python projects to do anything[1], then I’m willing to drop this matter. I still think that the messaging around free threading is setting expectations too high in the mind of the average user, but that’s a different topic.


  1. with the exception of enabling a free-threaded run in CI if the project test suite uses threads ↩︎

I’m aware that no-GIL compatible != thread-safe but it still has an impact. As a library author, I find it hard to treat free threading thread-safety as actionable when I don’t even know what GIL-enabled thread safety means. Knowing that such issues were probably there all along doesn’t help.

Yes, I think it should be trivial for most libraries out there.

My experience (both as a library author and a CPython maintainer) is that GIL-enabled thread safety has never really existed for Python code. If you’re writing C code, then you will need to be more careful about shared state (assuming you have any) or borrowed references.

I’m not sure either. I’m asking what it means from the perspective of a Python programmer that list.append is atomic. The suggestion was to add that statement to the docs but I’m not sure what relevance it has to someone writing Python code: what exactly is the observable effect of list.append being atomic?

In the dict.setdefault case the observable effect of it being atomic is that it guarantees to return the same value to every thread that calls the method with the same key.
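A minimal sketch of that guarantee (my example, not from the docs): several threads race to initialize the same cache entry, and atomicity means they all end up sharing a single object:

```python
import threading

cache = {}
seen = []

def worker():
    # Each thread passes its own fresh list, but dict.setdefault atomically
    # keeps only the first one; every caller gets that same object back.
    value = cache.setdefault("key", [])
    seen.append(value)

threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert all(entry is seen[0] for entry in seen)  # one shared object for all
```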

Just a note that free-threaded support in setup-python is currently only on the main branch. It should hopefully be in the next release.

IMO while it’s nice that GitHub maintains setup-python, the Python community might want to consider an alternate implementation or a fork so the community has more control over shipping new Python versions via the action. Right now it takes quite a while for new versions and bugfix releases to show up.

You can also use setup-uv, which has supported free-threaded Python for a while and recently gained support for specifying a python-version argument, which makes it somewhat more of a drop-in setup-python replacement.


I would stress once again that it is generally unrelated to the discussion about free threading. Do you want to open a separate discussion for this, if it is important to you? There’s too much derailing going on here.


The effect is that if multiple threads are appending to the same list, the list won’t be corrupted, and all values will end up in the list. If it wasn’t atomic, that would be an invalid use of list.append, and it’d require a user-managed lock, but since it is, that’s totally acceptable without the lock (although not recommended at high frequency).
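A short sketch of what that looks like in practice (my example):

```python
import threading

shared = []

def append_range(start):
    for i in range(start, start + 1000):
        shared.append(i)  # atomic: no user-managed lock needed

threads = [threading.Thread(target=append_range, args=(n * 1000,))
           for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert len(shared) == 4000                  # no appends were lost
assert sorted(shared) == list(range(4000))  # and every value made it in
```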


The Python runtime does not crash.

As I understand the free-threading work, it gives you a Python runtime that will not crash itself. So with or without the GIL, the Python runtime will not crash.

If you have Python code or a Python extension that has threading bugs, those bugs will exist in both the GIL-enabled and free-threaded Pythons. But the GIL will have made it harder to hit those threading bugs in the past.

What I expect to see is users who do not understand threading or race conditions getting into trouble more often with free threading and needing help.
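The classic demonstration of such a bug (my sketch, not from this thread) is an unsynchronized counter: the read-modify-write in `counter += 1` is not atomic on either build, so an update can be lost whenever a thread switch (or true parallelism) lands between the read and the write:

```python
import threading

counter = 0

def increment():
    global counter
    for _ in range(1_000_000):
        counter += 1  # LOAD, ADD, STORE: another thread can interleave here

threads = [threading.Thread(target=increment) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# May print less than 4000000 on either build; with the GIL the race is
# merely rarer, not gone.
print(counter)
```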


This is the thing that’s worried me from the start.

It’s not that Python becomes less thread-safe; it’s that the absolute number of thread-safety issues that occur will rise, because people will start writing parallel code with only the lowest-level synchronisation primitives available.

Historically, Python had so little benefit to letting multiple threads manipulate shared objects that people would invest their time elsewhere. “Restructure your algorithm to fit numpy/etc. and use their parallelism” was the instruction (and I imagine still will be the best approach for many cases), and the parallelism experts would work on the building blocks for the rest of us to merely assemble. With free threading available, we’ll certainly have a lot more people trying to trivially partition an algorithm across threads (like Paul’s example) and they’ll start discovering that it was always unsafe.[1]
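For what it’s worth, the partitioning pattern itself can be done safely; a hypothetical sketch, where each worker is a pure function over its own chunk so there is no shared mutable state to race on:

```python
from concurrent.futures import ThreadPoolExecutor

def process_chunk(chunk):
    # Pure function over a private chunk: no shared state, so it can run
    # in parallel on a free-threaded build without any extra locking.
    return [x * x for x in chunk]

data = list(range(1_000_000))
chunks = [data[i:i + 250_000] for i in range(0, len(data), 250_000)]

with ThreadPoolExecutor(max_workers=4) as pool:
    # map preserves input order, so the partial results concatenate cleanly.
    results = [y for part in pool.map(process_chunk, chunks) for y in part]

assert results == [x * x for x in data]
```

The trouble starts in the cases where the partitions are not this clean and quietly share mutable state.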

Being ready to deal with these questions/complaints/concerns/etc. is a part of this process that we should not avoid or ignore, IMHO.


  1. Personally I like subinterpreters here because they enforce controlled data sharing between partitions, meaning the obvious way to do it is likely to have little to no contention. But that’s very off topic for this thread, so I’m only mentioning it in an un-quotable footnote so nobody replies to this part :slight_smile: ↩︎


Thank you. This is precisely the point I was trying to make with my example, and I completely botched it. I appreciate you making the point far more clearly.

Agreed. This sounds like a more reasonable way to word things in order to better manage expectations.

Not sure whether 3.15 would be a suitable target for the “supported” release, though. There’s still a lot of work ahead to get to that point. My list was just what I came up with in half an hour; there’s definitely more.


Thanks for the PEP! I appreciate the effort to set clearer expectations surrounding next steps.

Assuming you’re able to share, are there any specific learnings or experiences after joining the team that led to such a dramatic shift?

This comment just caught my attention, since you had already thought quite long and hard about this issue as an SC member (and fellow core dev). I myself fall into the “reasonably-skeptical-of-the-tradeoffs-but-it-feels-inevitable-so-I’ll-just-deal-with-it” camp, and would honestly love the peace-of-mind from being convinced that it will “absolutely work out” for us as CPython maintainers. :slight_smile:

The lottery doesn’t worry me much. I’m personally a bit more concerned that most (not all) of the current free-threading work and maintenance in the CPython core is, to my knowledge, thanks to Meta’s pledge to commit three engineer-years through the end of 2025. That post notes that in addition to landing PEP 703, this includes “ongoing improvements to the compatibility and performance of nogil CPython”… which I guess means a similar level of involvement to what we saw from Meta engineers pre-PEP-703, but now with an emphasis on free threading.

I know these people, and I don’t doubt their personal commitment to CPython’s success one bit. But they’re working full-time on maintaining the free-threading build right now, and I think we as a project should make sure that we can handle the ongoing burden once they’re no longer able to do this ~120 hours a week, especially once people start “demanding” that we keep it working. It might even make sense for the PEP to address this explicitly.


I was blocked by work yesterday and this thread has advanced very quickly, but since you addressed some of my points directly, here’s my response.

The issue is not whether 703 is a good idea or worth doing – I support that goal and want to see it succeed. But given its ~unprecedented impact, a lot of the discussions around the acceptance of PEP 703 dealt with how the community can handle such a change. To my mind, the technical feasibility was already proven at that point (even if a lot of details still needed to be hammered out).

We’re now in a phase where folks have proved the technical feasibility for complicated third-party libraries as well, but we’re still very far from being able to gauge “enough community support”: basically only the very lowest-level libraries have so far been made compatible, the vast majority of packages have not even attempted (much less released) a compatible version, and key pieces of Python’s installation infrastructure[1] are still missing.

Declaring the implementation stable is fine, but it’s way too early IMO to determine the overall outcome of the nogil experiment on an ecosystem level – if there were a progress bar of compatible packages, we’d be in the low single-digit percentages at best.

I think the communication from the SC has been clear that nogil is the intended future (barring unforeseen problems that we’re all trying to determine and shake out), so I think every community-driven project should take it as a mandate to attempt compatibility, and provide feedback where things don’t work. In some sense we’re missing a rung between “experimental” and “stable”, so I like the separation (exp.-alpha, exp.-beta, stable) that @ncoghlan proposed.

No, I’m not equating this with the default. What I’m saying is that stable releases are expected to have broad support. To this end, regular CPython releases get a lot of upfront testing, e.g. the Fedora team doing mass rebuilds and raising PRs to fix issues long before the release (to a degree, conda-forge started doing the same). I’m not aware of any large-scale effort so far to do this for the nogil builds, and I’m expecting (from previous experience) a lot of issues to be surfaced by this. Granted, a large portion of those will be routine and “just need to be fixed”, but given the vast amount of packages out there, we may also uncover trickier problems that require deeper surgery, or even changes in CPython.

When I say declaring things stable is premature, I mean that – at the very least – there IMO needs to be an attempt at a broad nogil rebuild in some distribution(s), to see what problems arise at scale, as well as the rate at which the community is able to absorb the required fixes.


  1. like being able to specify dependency constraints ↩︎


I agree. It seems like the crux of the issue is a disagreement about whether the new free-threading builds need to be more explicit about what is and isn’t guaranteed than the old GIL builds. I certainly can understand how it seems “unfair” to require that additional explicitness, but I think in practical terms it must be provided.

The fact that various things weren’t guaranteed but still worked with the GIL is an unfortunate consequence of a lot of things being loosely or incompletely specified in Python. We shouldn’t perpetuate that problem by allowing a similar ambiguity in the new builds. It will just lead to confusion and pain down the road.

Maybe part of the issue here is a difference in how we’re interpreting these terms like “experimental” and “supported”. Like, your statement above gives me the impression you see “supported” as closer to “experimental” than to “default”, but I would read “supported” as very close to “stable enough to be the default”. That means we shouldn’t call it supported until we’re within sight of making it the default.

That’s not what I’d call a “for users” framing. :slight_smile: PEP 387 is not user-facing documentation, it’s a specification of internal development processes. Most users don’t even know what PEP 387 is. What they know is what is in the docs on docs.python.org (if we’re lucky!). Even insofar as they do know about PEP 387, the issue is not the breaking of backward compatibility within the future evolution of free-threaded builds; it’s the breaking of compatibility when migrating to the free-threaded builds from the GIL.

As @pf_moore said earlier, this doesn’t mean that code can’t break during that transition, but it means that there has to be guidance on how to do it. In my mind, for the free-threading build to be “supported”, that transition has to also be explicitly supported and mapped out. That may mean a lot of detail about what you used to be able to get away with in threaded code that you no longer can get away with, and yeah, maybe that’s unfortunate, but I don’t see any way around that.


FWIW, 🧵 Free-Threaded Wheels does exist. The “333/360” headline figure is somewhat misleading[1] though, since the bulk of that proportion is the 310 pure Python packages in the top 360 (by download volume). Of the 50 packages with binary extension modules, 23 are already shipping free-threading compatible wheels in one form or another[2].


  1. Idea: only show the top 360 packages with binary extensions · Issue #15 · hugovk/free-threaded-wheels · GitHub ↩︎

  2. the scan only detects that nominally compatible wheels exist. It can’t tell if they re-enable the GIL or not, or if they actually work correctly. ↩︎

Thanks! I’m aware of that tracker; I believe we can agree though that the Python community is much, much larger than the top 360 packages by download volume[1]. There are too many fundamentally important packages missing from that list, even off the top of my head (pytorch, cupy, spacy, geopandas, etc.), but concrete examples obviously skew towards specific fields pretty quickly.


  1. quite heavily skewed towards workflows that generate lots of downloads, like cloud infrastructure (as also indicated by boto being at the top). ↩︎
