@pf_moore With the exception of making and sharing an iterator, they are thread-safe in that they will not crash. Some aspects of using them have stronger properties; dict.setdefault, as people have already pointed out, is consistent. As an example of an only partially stronger property: deque’s pop and popleft will currently never result in two threads popping the same item, but if you need a stable ordering with append or appendleft, you have to synchronize yourself. Some other parts of the standard library rely on that, so I’m okay with using it as an example of something else that could be documented and made a guaranteed property of the language. But I don’t want to enumerate much further for someone to rely on incorrectly, because the stronger properties aren’t guaranteed, and I haven’t checked all currently supported versions of CPython, let alone other implementations.
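To make the two properties above concrete, here is a small sketch (the worker names are made up for illustration): all threads calling setdefault on the same key observe a single consistent winner, and concurrent popleft calls on a deque never hand the same item to two threads, though the final ordering of results still needs your own synchronization.

```python
import threading
from collections import deque

# Property 1: dict.setdefault is consistent under contention --
# every thread gets back the same value, whichever call won.
shared = {}
observed = []

def claim(worker_id):
    observed.append(shared.setdefault("owner", worker_id))

threads = [threading.Thread(target=claim, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert len(set(observed)) == 1  # one consistent winner seen by all

# Property 2: deque.popleft never pops the same item twice, so a deque
# can be drained from multiple threads without a lock around the pops.
work = deque(range(1000))
popped = []
results_lock = threading.Lock()  # protects our own list, not the deque

def drain():
    while True:
        try:
            item = work.popleft()  # atomic: no duplicate pops
        except IndexError:
            return
        with results_lock:
            popped.append(item)

threads = [threading.Thread(target=drain) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert sorted(popped) == list(range(1000))  # each item popped exactly once
```

Note that only the pop itself is atomic: `popped` has no guaranteed order, which is exactly the "stable ordering needs your own synchronization" caveat.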
The lack of guarantees at the language level makes it very hard to use many builtin datatypes confidently right now, and that probably has to change for freethreading adoption to happen.
Historical precedent: IIRC when Jim Hugunin began building JPython, he and Guido[1] had many conversations about corners of underspecified Python-the-language semantics. There was a strong desire not to define the language by the implementation[2]. That’s not to say some things weren’t left as implementation-specific, just that each of those decisions was a deliberate choice. Other behaviors were locked down in the language semantics, which provided guidance to other alternative implementations.
I see lots of parallels[3] here too, although I don’t have a sense of the scope of the work to do that.
It is good not to define the language by the implementation. However, for the language to be more than a collection of possibly compatible implementations, some things need to be defined as part of the language itself. At the very least, it is good to know where the boundary lies between what is and is not specified, even if many things remain unspecified.
That’s exactly what I’m trying to say. I think we’re basically going to have to decide which behaviors are defined by the language and which are defined by the implementation. It’s probably a good idea to start collecting a list somewhere[1] of Things That Will Need Deciding.
e.g. a GitHub issue if one doesn’t already exist ↩︎
I wasn’t aware that was documented anywhere; does that count as a guarantee of behavior? Atomicity is what provides consistency across concurrent uses.
For ThreadPoolExecutor (and thread pools in general), I think the benefit statement is fairly straightforward:
- with the GIL, running CPU-bound code in a thread pool will still pretty much be limited to one core (outside libraries that delegate to native code and release the GIL while doing so). To use all cores with pure Python code, you have to use a process pool instead, and incur either the much higher overheads associated with data serialisation, or the higher complexity associated with cross-process memory sharing.
- with a free-threaded CPython, and zero changes to already thread-safe user code, running CPU-bound code in a thread pool will be able to expand effectively across all available cores without incurring the overhead of communicating between subprocesses (or even the lower overhead of communicating between subinterpreters)
While you don’t get the same memory safety guarantees that subprocesses or subinterpreters give you [1], there are a lot of CPU bound problems where that concern doesn’t come up because the pools are processing pure call-and-response functions with no stateful side effects.
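As a sketch of the "pure call-and-response" case (the prime-counting function is my own illustrative stand-in, not from the thread): the function below takes an argument, returns a result, and touches no shared state, so the identical code is safe in a thread pool today and could simply spread across cores on a free-threaded build, with no pickling or subprocess plumbing.

```python
from concurrent.futures import ThreadPoolExecutor

def count_primes(limit):
    """CPU-bound, side-effect-free: pure call-and-response work."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

# With the GIL, these four tasks effectively share one core; on a
# free-threaded CPython the same unchanged code can use four cores.
with ThreadPoolExecutor(max_workers=4) as pool:
    totals = list(pool.map(count_primes, [10_000] * 4))

print(totals)  # four identical counts of primes below 10,000
```

Switching this to a ProcessPoolExecutor would also use all cores under the GIL, but only by paying the serialisation and process-startup costs the bullet points above describe.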
The key point I take from Mark’s initial post in this thread is to emphasise just how much work that “already thread safe” constraint is doing in my statement of the potential benefits above. It isn’t that no qualifying code exists; it’s that much of the code that is believed to qualify does not in fact do so. Since the GIL can mask many potential thread safety errors by ensuring data races always have a consistent winner, making the free-threaded interpreter the default will expose those problems as multiple CPU cores get simultaneously involved.
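The classic instance of code that is believed to qualify but doesn’t is an unsynchronised counter increment: `counter += 1` compiles to a separate read, modify, and write, so even under the GIL a thread switch between the read and the write can lose updates. A minimal sketch of the problem and the explicit-lock fix:

```python
import threading

counter = 0

def unsafe_increment(n):
    # Looks atomic, isn't: a read-modify-write that can lose updates
    # when another thread runs between the read and the write.
    global counter
    for _ in range(n):
        counter += 1

safe_counter = 0
lock = threading.Lock()

def safe_increment(n):
    # Explicit synchronisation makes the whole update atomic.
    global safe_counter
    for _ in range(n):
        with lock:
            safe_counter += 1

threads = [threading.Thread(target=safe_increment, args=(100_000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert safe_counter == 400_000  # the locked version is always correct
```

The unsafe version may happen to produce the right total on a given run and interpreter version, which is precisely how the GIL masks the bug; the free-threaded build just makes the latent race far more likely to actually bite.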
(Where I really hope we see subinterpreters shine is in offering low overhead enforced implementations of the goroutine-style Communicating Sequential Processes model that inspired Eric in the first place, as well as the Erlang/Elixir Actor based concurrency model) ↩︎