In the free-threading build, several iterators are not thread-safe under concurrent iteration. Work is in progress to prevent concurrent iteration from corrupting the interpreter state, but it is considered acceptable for an iterator to return “funny” results. For example, concurrent iteration over enumerate(range(10)) can yield a pair like (5, 6) instead of only pairs of the form (a, a). For more information see gh-124397 Strategy for Iterators in Free Threading and PEP 703 Container thread-safety.
To make an iterable thread-safe, we will create a new object in itertools. The current working name is itertools.serialize, but we are not happy with that name. A rough Python equivalent is:
```python
from collections.abc import Iterator
from threading import Lock

class serialize(Iterator):
    def __init__(self, it):
        self._it = iter(it)
        self._lock = Lock()

    def __next__(self):
        with self._lock:
            return next(self._it)
```
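To make the intent concrete, here is a sketch of how such a wrapper would be used, with the class redefined locally since itertools.serialize does not exist yet (the name and behavior are the proposal's, not a shipped API). Several threads drain one shared iterator; the lock guarantees each value is handed out exactly once:

```python
import threading
from collections.abc import Iterator
from threading import Lock

class serialize(Iterator):
    """Local stand-in for the proposed itertools.serialize."""
    def __init__(self, it):
        self._it = iter(it)
        self._lock = Lock()

    def __next__(self):
        with self._lock:
            return next(self._it)

# One shared, serialized iterator, drained by four threads.
safe = serialize(range(10_000))
seen = []
seen_lock = Lock()

def worker():
    local = []
    for value in safe:        # StopIteration ends each thread's loop cleanly
        local.append(value)
    with seen_lock:
        seen.extend(local)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every value was consumed exactly once, with no duplicates or gaps.
assert sorted(seen) == list(range(10_000))
```

On the GIL build this is merely redundant locking; on the free-threaded build it is what keeps the underlying iterator's state consistent.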
I would think that usually it will be passed an existing iterable, so spelling it thread_safe() (or something like that) would be intuitive to me, i.e. “give me a thread-safe version of the iterator I want to use”.
It’s a tricky bit of naming because of how iterables are implicitly coerced into iterators in a lot of code.
The fact that it lives in itertools already suggests that it has to do with iteration, but a somewhat longer, more explicit name could also be an option.
Especially when it is imported via from ... import .... Then, somewhere far down in the code,
b = thread_safe(a)
might lack the context needed to quickly figure out what it does, since such a general name could just as easily belong to some local helper function.
Thus, I would not be against (and would probably be in favour of) a longer, more explicit name.
We should definitely do this - nobody should need to be wrapping enumerate or range.
But we probably also need a primitive for wrapping up pure-Python iterators that weren’t written with threading in mind (currently that’s practically all of them, and anyone trying to run free-threaded code is going to be blocked if they don’t have their own mechanism).
In the GIL build, trying to use the same generator simultaneously from different threads raises ValueError (“generator already executing”). So these wrappers would be useful even in the GIL build.
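This failure mode can be reproduced deterministically even with the GIL. In this sketch the sleep inside the generator body (a stand-in for any GIL-releasing work) keeps one thread “inside” the generator long enough for a second thread to call next() on it and hit the error:

```python
import threading
import time

def gen():
    time.sleep(0.5)  # releases the GIL while the generator body is executing
    yield 1
    yield 2

g = gen()
errors = []

def consume():
    try:
        next(g)
    except ValueError as exc:  # "generator already executing"
        errors.append(exc)

t1 = threading.Thread(target=consume)
t2 = threading.Thread(target=consume)
t1.start()
t2.start()
t1.join()
t2.join()

# Exactly one thread got in; the other was rejected mid-execution.
assert len(errors) == 1
assert "already executing" in str(errors[0])
```

A locking wrapper like the proposed one would make the second thread wait instead of fail, which is why the wrapper is useful on both builds.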
I agree that 5% is acceptable for reliable semantics. I also don’t expect raw iterator performance to be a major bottleneck in overall performance numbers.
Now if one wanted to introduce non-thread-safe iterators for performance, then I could see such a library existing on PyPI.
But personally, I think the stdlib should prefer unsurprising semantics over raw performance when we have to choose between the two.