Understanding "list(set) should be atomic in the free-threaded build"

Stefan2 · March 12, 2024, 1:59pm

I just saw this issue and am wondering.

The code

def _cleanup(): 
     # check for processes which have finished 
     for p in list(_children): 
         if (child_popen := p._popen) and child_popen.poll() is not None: 
             _children.discard(p)

is said to construct a list from a set, while the set may be concurrently updated, and that in the free-threaded build, this may lead to RuntimeError: Set changed size during iteration.

How so? Is list(_children) becoming lazy, somehow iterator-like? So that it isn’t fully built anymore before the for statement starts the iteration?

@colesbury

jamestwebber · March 12, 2024, 2:10pm

The call to list itself must iterate over _children to create the list. If I understand the issue correctly, the problem is that _children can be modified by another thread during that operation, leading to an error.

That is to say, ignore the body of the for statement entirely. The error happens during list(_children)

Stefan2 · March 12, 2024, 2:12pm

And that other thread is already executing the loop body before the list is fully built?

Stefan2 · March 12, 2024, 2:16pm

Wait, you mean the “concurrently updated” doesn’t refer to the updates shown in that code (those _children.discard(p) calls)? The issue is about any list(some set)? Then I find that example misleading…

jamestwebber · March 12, 2024, 2:16pm

The other thread could be doing something totally different, this issue is unrelated to the for loop. I think Sam was just providing an example from the stdlib to show that this needs to be fixed, but it ended up being confusing.

You have two threads that use a set, call it _children. In one thread you create a list from it with list(_children). If another thread adds or removes elements from _children while you are creating the list, you get a RuntimeError because list(_children) is iterating over the set and it changed size. That’s why Sam raised the issue to make list(some_type) atomic ^[1].

(all of this is just my best understanding of the issue, I might be wrong! But I think this is it)

the problem can affect other builtin objects too ↩︎

Stefan2 · March 12, 2024, 2:22pm

Yeah ok, if it’s not about the updates in that loop, then it’s clear. Creating a list from a set seems like an ordinary thing to do (ok, probably less common than the other way around), so the way the issue is written, specifically calling out that “pattern” in the code, made me think it’s about those updates shown in that code.

barry-scott · March 12, 2024, 2:39pm

Tha claim in the issue seems wrong to me.

It assumes that the list(set) will retain a reference to the set and if the set changes then the iteration will fail.

But that is not how it works. The list is fully formed from the set and then the set is no longer referenced. You are right that there is an assumption that list(set) is lazy.

I think you can reply with that analysis and close the issue.

kknechtel · March 12, 2024, 3:04pm

This doesn’t matter.

Introducing free-threading means that another thread could modify the set during the process of the list “being fully formed” from the set. At the Python level, creating a list from a set is a single, currently-non-interruptible operation. But at the machine level, it is very many operations, and by default there is nothing to prevent another thread from using the set in any way in between.

Avoiding that sort of problem (and many others) is what the GIL got you - at the cost of making threading essentially useless for actually boosting performance (by taking advantage of additional processors).

The claim has nothing to do with the for loop in the Python code. That’s the point of the rest of the discussion in the thread.

barry-scott · March 12, 2024, 4:49pm

Totally agree with you, and I assume a lock must be used to prevent this happening.

Isn’t this a case of assuming the core devs will not do a sane jobs for free threading? Surely this is something to verify once there is a version of free threading to code review and test?

jamestwebber · March 12, 2024, 4:59pm

This topic is literally about core devs fixing the issue in CPython so that this works correctly.

colesbury · March 12, 2024, 4:59pm

What @jamestwebber and @kknechtel wrote sounds correct to me!

barry-scott · March 12, 2024, 11:02pm

Okay… I was not aware that there was enough code in place to know this would be broken yet.
I’m clearly not keeping up. I’ll step back.