I think the discussions are heading in a few different directions, most of which aren’t really on point for PEP 779. I don’t mind relitigating PEP 703 (it’s not hard to show the positive impact it will have, and that it’s well worth the cost), and I don’t mind making a re-evaluation of PEP 703 part of the requirements listed in PEP 779 if the SC wants to re-visit their decision, but I don’t think it’s helpful to use this thread for that discussion. I would like to focus the discussion in this thread on the promotion from phase I to phase II.
Normally when we introduce potentially breaking changes, we don’t have phase I. We start at phase II: the feature is available, but off by default in case it breaks someone. Phase I for PEP 703 is explicitly to prove to the SC and the Core Devs that the implementation is sound. It’s also to stop people from relying on it too much, because the implementation was yet to be proven sound. I think the ‘experimental’ tag is now only going to delay and hamper further work.
Concretely, for PEP 779, I would like to hear from more people what they need from us (the people working on the PEP 703 implementation), to be convinced the implementation is sound enough (and maintainable, desirable, stable enough).
That said, I do think all the discussion has been informative so I’m going to respond to a few comments ;p
How would keeping the CPython implementation in a state of flux, with no guarantees for stability, improve that situation? As I mentioned, we have seen pushback from packages on PRs for free-threading compatibility, and I don’t blame them at all. Signing up to support this when the feature is explicitly experimental and subject to more change than PEP 387 normally allows is a big ask.
If your concern is that users will be too demanding, I’m not sure how PEP 703 is different from past features, be it asyncio or binary wheels or whatnot. I think it’s a good idea to remind everyone in all communication that it’s still going to take a long time for most packages to support free-threading in a meaningful way. I don’t think claiming the feature is experimental when it is not is the way to do that.
Concretely, what would you use as the criteria for determining whether this CPython feature is ready to start its life like any other CPython feature?
This sounds like you’re reading this PEP as proposing that the free-threaded Python build become the default. I would very much interpret the “this is premature” response to be because the feature is still experimental, which is the thing this PEP proposes to change. We are not talking about making it the default here.
Warning on improper concurrent access would be a really nice thing, but it’s expensive. It’s basically what ThreadSanitizer does for C/C++. I’m not sure we’ll ever be able to make that work as a -X flag, given how much overhead that will likely require, but I do hope we can eventually create something TSan-like for Python. It’s going to need use-cases – real-world threading problems – to motivate and guide it, though. It’s not really something you can make without knowing which actual problems you need to detect.
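To illustrate the kind of improper concurrent access such a tool would need to detect (a sketch, not anything that exists today): `counter += 1` is a read-modify-write, not an atomic operation, so concurrent increments can be lost; a lock makes the compound operation atomic.

```python
# Sketch: the classic data race a TSan-like tool for Python would flag.
# "counter += 1" compiles to separate load, add, and store steps, so two
# threads can both read the same old value and one increment is lost.
import threading

counter = 0
lock = threading.Lock()

def unsafe_increment(n):
    global counter
    for _ in range(n):
        counter += 1          # read-modify-write: racy under concurrency

def safe_increment(n):
    global counter
    for _ in range(n):
        with lock:            # the lock covers the whole read-modify-write
            counter += 1

threads = [threading.Thread(target=safe_increment, args=(10_000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# With the lock, no updates are lost.
assert counter == 40_000
```

The unsafe variant is shown only for contrast; its result is nondeterministic, which is exactly why tooling that spots such patterns automatically would be valuable.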
As for adding things to a list without a lock, I just want to mention that `list.append()` is safe. All the list operations are atomic. You will not lose items if multiple threads call `list.append()`. (Of course, something like `if item not in mylist: mylist.append(item)` is not safe, but that’s not something we can fix in list objects.)
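A small sketch of both points: concurrent `list.append()` loses nothing, while the check-then-act pattern needs an external lock around the whole compound operation.

```python
# Concurrent list.append() is atomic: no items are lost without a lock.
import threading

NUM_THREADS = 8
APPENDS_PER_THREAD = 10_000

shared = []

def appender():
    for i in range(APPENDS_PER_THREAD):
        shared.append(i)      # atomic list operation, safe without a lock

threads = [threading.Thread(target=appender) for _ in range(NUM_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert len(shared) == NUM_THREADS * APPENDS_PER_THREAD  # nothing lost

# By contrast, "if item not in mylist: mylist.append(item)" is a
# check-then-act race: another thread can append between the membership
# check and the append. The fix is a lock around the compound operation:
dedup = []
dedup_lock = threading.Lock()

def add_unique(item):
    with dedup_lock:          # lock covers both the check and the append
        if item not in dedup:
            dedup.append(item)

def worker():
    for i in range(100):
        add_unique(i)

threads = [threading.Thread(target=worker) for _ in range(NUM_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert sorted(dedup) == list(range(100))  # each item appears exactly once
```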
I think pushing the Stable ABI to 3.15, and starting it in an early alpha, is the prudent thing to do. I don’t think we need to wait for the Stable ABI to land for PEP 703’s implementation to be considered supported, though. (I wasn’t thinking of the Stable ABI, as such, when I wrote “provide feedback on APIs and ABIs”.)
Do you think Stable ABI support should be required for the PEP 703 implementation to be considered supported?

> The alternative, IMO, is to present free threading as “a specialised tool that will help people skilled with threading to use it more efficiently”, so that people don’t have unrealistic expectations.
Threading is absolutely a specialised tool! Nobody should ever doubt that. I’m sorry if anything I said suggested otherwise. (Please point me at it so I can fix that.) I tried to make it clear that free-threading is not a magic go-fast button and that code will probably have to be adapted, possibly redesigned, to make full use of it. Threading is definitely very complex, and not just because of thread-safety. Writing correct threaded code is hard (even without free-threading), and writing performant threaded code is doubly so.

> I think it’s important to be explicit about these things, like by saying “This build is experimental, which means [whatever] is not guaranteed” or “This build is supported, which means you can rely on [whatever]”. The PEP has some of these in the form of hard performance targets, but for my tastes too much is swept under the rug with “proven, stable APIs”. In particular everything I see in the PEP seems to be talking about internals, but I’d be a bit less nervous if some of this were surfaced to end-user documentation (e.g., “in pure Python code you can do this with lists in threads and it is guaranteed to work but you can’t do this other thing”).
I think for users, the language is pretty straightforward (even if the implications are not): the build is experimental, which means PEP 387 does not apply. Once it’s supported, PEP 387 applies.
PEP 779 isn’t targeted at users, though, it’s targeted at the Core Devs, because it’s specifically about whether they think the PEP 703 implementation is stable enough to support. So, yes, the PEP talks about internals, because that’s the thing that matters for phase II. That’s what phase I was about.
Documenting the exact semantics of concurrent access is something we should do, although it’s also something we have to figure out. We don’t have a concrete list (yet), because it’s very much a trade-off between performance and thread-safety (or thread-consistency), which we make with real-world use-cases in mind. For example, should using the same iterator from multiple threads guarantee sequential consistency (each item is produced once, and only once)? Even the people working on fixing these issues don’t always agree. Additionally, we don’t want to guarantee too much from the outset, because it’s very hard to revert that kind of promise once you’ve made it.
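Until those semantics are pinned down, code that needs the “each item produced exactly once” guarantee can serialise iterator access explicitly. A minimal sketch (the `ThreadSafeIterator` wrapper is hypothetical, not a stdlib class):

```python
# Sketch: explicitly serialising a shared iterator with a lock, so each
# item is handed out exactly once even across concurrent consumers.
import threading

class ThreadSafeIterator:
    """Wrap any iterator so next() calls are serialised by a lock."""
    def __init__(self, iterable):
        self._it = iter(iterable)
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        with self._lock:      # one thread at a time advances the iterator
            return next(self._it)

shared = ThreadSafeIterator(range(10_000))
results = []
results_lock = threading.Lock()

def consume():
    local = []
    for item in shared:       # StopIteration ends each consumer cleanly
        local.append(item)
    with results_lock:
        results.extend(local)

threads = [threading.Thread(target=consume) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Sequential consistency, made explicit: every item produced once,
# and only once, across all consumer threads.
assert sorted(results) == list(range(10_000))
```

A `queue.Queue` fed by a single producer achieves the same effect; the point is only that the guarantee is something you can build yourself today, whatever the builtins end up promising.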