How to know what is safe in threaded code, round 2

fonini · June 30, 2024, 2:30am

This is a sequel to How to know what is safe in threaded code (a thread in which I did not participate).

I skimmed over most of that thread, but it didn’t seem to go anywhere. On the contrary: the discussion seemed to go in (very small) circles most of the time. My impression was: on one side, a few people (aware that writing concurrent code is hard, that there are a lot of surprising pitfalls, and that to do it right you need to know what guarantees the system you’re using, i.e. Python, provides) were asking for better documentation about Python’s memory model and thread-safety guarantees; on the other side, some people argued that (my wording) Python is a high-level language and therefore it does what a naive user would expect, so that documenting it explicitly would be unnecessary and would only clutter the docs.

Right now, I don’t see any official documentation on the thread-safety of Python’s builtin types.

Did I fail to find this documentation, or does it really not exist?
Does this mean that the community’s feeling really is that “Python does whatever a naive user would expect” is enough specification?

han-solo on IRC was kind enough to point me to Library and Extension FAQ #What kinds of global value mutation are thread-safe. Even in the “dev (3.14)” version, this FAQ entry looks outdated because it doesn’t acknowledge free-threading: it basically says (my wording) “because of the GIL, operations on objects of builtin types that look atomic really are”, which again seems to point in the direction of “you can just trust your gut” as Python’s only documentation about thread-safety.

To bring this discussion back to life, let me present a few examples. For each of the examples below, I believe Python should provide official documentation somewhere either stating the guaranteed behavior, or stating that the behavior is implementation-defined. The emphasized “somewhere” here means that such specification doesn’t need to be scattered all around the docs, cluttering everything: it could be concentrated on a separate “stdlib thread-safety guarantees” page in the Language Reference for example.

1. Ordering of memory operations

This was mentioned by the OP linked above, and I believe was the most controversial issue.

global_value = 0
global_flag = False

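def thread1():
    # Without this declaration the assignments below would create locals
    global global_value, global_flag
    global_value = read_something()
    global_flag = True
    do_something_else()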

def thread2():
    while not global_flag:
        some_work()
    print(global_value)

Is the print on the last line guaranteed to see the value returned by read_something()? Please note that this is not trivial: if thread1 assigns to the variables global_value and global_flag in this order, usually this does not imply that thread2 will see these assignments happening in the same order.

Suggested documentation update: somewhere in the docs for the threading module, I would add one of the following phrases (depending on which of them is actually correct):

The synchronization primitives offered by this module are not required if all you need is sequential consistency: the evaluation of all Python statements is always sequentially consistent across threads. For example, if you only ever use the methods set, clear, and is_set in a specific Event object, then this Event could be just a shared boolean variable.

or:

All synchronization primitives offered by this module provide sequentially consistent ordering. E.g. if you do something before unlocking a Lock, then this something will be visible to any other thread that successfully acquires this Lock after that. Note that plain Python code by default does not provide this guarantee.
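For comparison, here is a minimal sketch of example 1 rewritten so that it leans only on a threading primitive rather than on the ordering of plain assignments. It assumes, as the second suggested phrase would state, that Event.set() in one thread makes earlier writes visible to a thread that returns from Event.wait(); read_something here is a hypothetical stand-in that just returns a constant.

```python
import threading

shared_value = 0
ready = threading.Event()  # replaces the plain boolean flag


def read_something():
    # Hypothetical stand-in for the function in example 1.
    return 42


def writer():
    global shared_value
    shared_value = read_something()
    ready.set()  # publish: the assignment above precedes this call


def reader(results):
    ready.wait()  # blocks until writer() has called set()
    results.append(shared_value)


results = []
t_reader = threading.Thread(target=reader, args=(results,))
t_writer = threading.Thread(target=writer)
t_reader.start()
t_writer.start()
t_writer.join()
t_reader.join()
print(results)
```

Whether the plain-variable version of example 1 is equally safe is exactly the open question; this version only sidesteps it.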

2. `list(global_mutable_container)`

As mentioned here, this idiom is common when one wants a snapshot of a mutable container. E.g., instead of for element in global_mutable_container: ..., usually one will instead do for element in list(global_mutable_container): ... to avoid iterating over an object that could be concurrently modified by another thread.

Now, is this really safe, or is the race window just very small? There are other similar questions. Is the .copy() method of builtin containers thread-safe? What about dataclasses.replace? What about copy.copy and copy.deepcopy? I agree that most of these are probably clear enough: if the docs don’t say they’re thread-safe, then they’re not. However, as far as I’m aware, currently a few of these are actually atomic under the GIL (constructing builtin containers from other builtin containers, and calling their .copy() methods), and these atomicity “guarantees” are being ported to the new free-threaded CPython, because otherwise there would be no way of taking a snapshot of a mutable container that you don’t “own”.

If list(global_mutable_container) is the (maybe de-facto) recommended idiom for that, why not document it explicitly as thread-safe?
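Until that is documented, the conservative alternative is to not depend on list() being atomic at all. The sketch below assumes you control every access to the container (which is precisely what you cannot do for a container you don’t “own”); items, items_lock, add_item and snapshot are illustrative names.

```python
import threading

items = []                     # shared mutable container
items_lock = threading.Lock()  # every access goes through this lock


def add_item(x):
    with items_lock:
        items.append(x)


def snapshot():
    # The copy is made while holding the lock, so no writer can
    # interleave with it regardless of what list() itself guarantees.
    with items_lock:
        return list(items)


def producer(n):
    for i in range(n):
        add_item(i)


threads = [threading.Thread(target=producer, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
snap_during = snapshot()  # taken while producers may still be running
for t in threads:
    t.join()
snap_after = snapshot()
```

The cost is that every reader and writer has to cooperate, which is why a documented guarantee for the bare list(...) idiom would still be valuable.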

3. `dict.setdefault`

This is right on the edge of what I would consider “looks atomic” in Python. It’s a read-and-update operation, so at first you might be wary. However, it’s a very simple operation on a builtin type that can’t call custom user code. And indeed, when you look at the source code, that Py_BEGIN_CRITICAL_SECTION(self) looks very much like acquiring a lock to me. I guess that this is one of the cases in which the operation was atomic because of the GIL, and now there’s a lot of code in the wild that relies on this atomicity, so the new free-threaded Python needs to keep it atomic.

If users could be convinced of either side, why not explicitly document dict.setdefault as either guaranteed or not to be thread-safe?
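For code that cannot wait for an answer, a get-or-create helper guarded by an explicit lock avoids relying on setdefault’s atomicity. This is only a sketch; cache, cache_lock and get_or_create are hypothetical names, not an existing API.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

cache = {}
cache_lock = threading.Lock()


def get_or_create(key, factory):
    # The check and the insert happen under one lock, so at most one
    # value is ever stored per key, whatever dict.setdefault guarantees.
    with cache_lock:
        if key not in cache:
            cache[key] = factory()
        return cache[key]


with ThreadPoolExecutor(max_workers=8) as pool:
    # Eight workers race to create the value for the same key.
    values = list(pool.map(lambda _: get_or_create("k", object), range(8)))

print(all(v is values[0] for v in values))
```

A side benefit is that factory is only called when the key is missing, whereas the default passed to setdefault is always evaluated before the call.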

pitrou · June 30, 2024, 9:43am

Well, the problem is that list(global_mutable_container) itself invokes iteration under the hood, and since iteration can invoke arbitrary Python code, that is not thread-safe either. There is also the issue that the GC can theoretically run at any point (is that still the case currently? @markshannon ) and a GC run can invoke arbitrary Python code, such as finalizers.

Perhaps this could be solved by introducing a separate __list__ protocol where the container would lock itself before building the list object, preventing any concurrent mutation.
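A rough sketch of how such a protocol might look from user code. Note that `__list__` is hypothetical: Python has no such hook today, so the to_list helper below stands in for what list() would have to do, and LockedBag is an illustrative container.

```python
import threading


class LockedBag:
    """Toy container; __list__ is a hypothetical protocol, not a real hook."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = []

    def add(self, item):
        with self._lock:
            self._items.append(item)

    def __list__(self):
        # The container locks itself while the copy is built.
        with self._lock:
            return list(self._items)


def to_list(obj):
    # Stand-in for list(obj) if the protocol existed: prefer the hook,
    # fall back to ordinary iteration otherwise.
    hook = getattr(type(obj), "__list__", None)
    if hook is not None:
        return hook(obj)
    return list(obj)


bag = LockedBag()
bag.add(1)
bag.add(2)
bag_snapshot = to_list(bag)
```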

A dict lookup can definitely call custom user code by virtue of calling __hash__ and __eq__ on keys.
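A small demonstration of this point: NoisyKey is an illustrative class whose constant hash forces a collision, so the lookup inside setdefault has to fall back to `__eq__`.

```python
calls = []


class NoisyKey:
    def __init__(self, name):
        self.name = name

    def __hash__(self):
        calls.append(("hash", self.name))
        return 1  # constant hash: every key collides

    def __eq__(self, other):
        calls.append(("eq", self.name))
        return isinstance(other, NoisyKey) and self.name == other.name


d = {NoisyKey("a"): 1}
calls.clear()  # ignore the calls made while building the dict

d.setdefault(NoisyKey("b"), 2)
print(calls)  # records both a hash call and an eq call
```

Both methods are ordinary Python functions here, so they could just as well release locks, mutate d, or block.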

fonini · June 30, 2024, 1:50pm

Yes, of course, I was talking about constructing builtin containers from other builtin containers. My understanding is that this is already safe, and relied-upon to be safe internally, it’s just not documented anywhere.

Well, even if that is true, any other code that wants to access the specific builtin container being iterated over will need to acquire some kind of lock (either the GIL, or the post-GIL per-object lock), so this iteration is still guaranteed to see a “snapshot” of the object, right?

And even if I’m wrong and list(some_other_set) could either fail or return an inconsistent result, ^[1] then that’s all the more reason to make that very explicit in the documentation, because it would be very surprising.

Oops, true. However, I still feel that CPython should be able to promise that if setdefault doesn’t call external Python code then it is atomic.

^[1] Again, I’m pretty much sure it can’t, because of that link to #116621 I posted above.

pf_moore · June 30, 2024, 2:25pm

Every time this discussion comes up, people seem to have these sorts of “Oops, true” moments. That says to me that it’s harder than it looks (and decidedly non-obvious) to document the guarantees - even if we restrict the discussion to things that are guaranteed atomic by the code right now, and ignore questions of whether it’s an intended guarantee or an accident of the implementation.

So maybe the way forward is for someone (I don’t know if this is something you’d be interested in taking on) to write a “language/stdlib thread safety guarantees” document of the form you suggest, and publish it for review and correction by the community. Once there’s some level of consensus that the document is accurate and useful, then it could be submitted for inclusion in the documentation.

Otherwise, we’ll likely continue to have this sort of thread, where people discuss the situation, but nothing concrete really comes from it.

BrenBarn · June 30, 2024, 7:16pm

That sounds good, but an important question is: if after it undergoes review, a later “oops true” occurs and it turns out the document did not accurately characterize the thread safety of Python, will that be considered a doc bug or a code bug? In other words, once such pronouncements get into the docs, are they descriptive or prescriptive?

pf_moore · June 30, 2024, 7:58pm

That’s for the document to state up front. It will be easier to get approval for a descriptive document, as it doesn’t demand any sort of commitment from the core devs. On the other hand, a prescriptive document might be what people are hoping for - although I’d be inclined to say that practicality beats purity here, as with many other things.

flyinghyrax · June 30, 2024, 9:01pm

This all hits on something I learned in the last thread. ^[1] Which is that one reason the docs aren’t there yet for these things is that the questions just aren’t settled yet. I imagine that as more people try the free threaded builds there will be cases where behavior may be changed if it causes too much breakage.

So we could document whatever the current behaviors are, but would that be premature?

@fonini thank you for collecting these example cases!

^[1] caveat, of course I may have learned entirely the wrong thing

MegaIng · June 30, 2024, 9:26pm

Some of the questions aren’t settled. But for example, the semantics of example 1 are 100% clearly defined, and if they are ever changed a lot of stuff will break. But making this argument doesn’t seem to convince people, so I am not going to argue it again/further.
