Remove DummyThread from threading._active when thread dies

fabioz · December 17, 2023, 6:13pm

This has bothered me quite a bit already: If some code calls threading.current_thread() in a thread that’s not created from the threading module it lingers at threading._active forever (or until another thread with the same thread.ident is created, which isn’t that uncommon).

My suggestion would be to remove it when the thread dies.

Some sample code to do that (in threading.py) would be something as:

_thread_local_info = local()

def _remove_dummy_from_thread_active(ref, thread, tident):
    with _active_limbo_lock:
        if _thread_active.get(tident) is thread:
            _thread_active.pop(tident, None)

class _TrackRef:
    '''
    A helper class to track when a thread dies.
    '''

# When the DummyThread is created, setup an automatic teardown so that the dummy thread dies when the related cpython thread has its thread local info collected:
_thread_local_info._ref = _TrackRef()
_thread_local_info._on_die_ref = weakref.ref(_thread_local_info._ref, partial(_remove_dummy_from_thread_active, thread=t, tident=t.ident))

Does that sound reasonable? Is there any reason why that shouldn’t be done?

guido · December 17, 2023, 8:43pm

Could you explain why the dummy thread object lingering in _active is a problem for you?

Regarding your solution, I’m guessing that your solution works because local() has sufficient magic that it can track the lifetime of the underlying (Posix or NT) thread even if it wasn’t created by Python. I trust that the weakref stuff works – your code snippet looks like you took it from some library where you’ve had it working for a long time. I tried for a little bit to understand how local() works but there’s a maze of #ifdef and other indirections that makes this a bit hard to follow. Probably @eric.snow is the expert here.

Anyway, assuming you can motivate this enough and that someone like Eric (or @storchaka ?) approves of the solution I don’t see why we can’t address this, especially if you submit a PR with a test case (you might have to add something to _testcapi or _testinternalcapi to set up the proper scenario).

fabioz · December 18, 2023, 10:29am

My use-case is the pydevd debugger and I just stumbled into a bug related to that on a case where:

User created a thread using an unmanaged thread (say, thread.start_new_thread so, it’s not that difficult to reproduce), then that created a dummy thread in threading._active, then the user exited that thread and created a threading.Thread which then had the same thread.ident from the unmanaged thread (that is very common to happen). The debugger (which can intercept things very early in the thread creation using sys.monitoring) tried not to disrupt things creating dummy threads unless it’s absolutely sure it’ll be a dummy thread, so, it used threading._active[get_indent()] and ended up with the wrong thread reference (which was later cached internally in a thread local storage, which had bad repercussions because the new thread then proceeded to create a new entry in threading._active which became unsynchronized with the existing one).

Another bad thing is that those dummy threads are always re-reported as still living when in fact they aren’t (as it’s an ever-growing list), so, in cases where the user creates many unmanaged threads after having thousands of entries in threading._active makes the debugger slower on cases it has to list those threads.

My solutions at this point would be either fixing it with that patch living in the debugger to fix the standard library or not using it anymore (maybe using sys._current_frames() to identify the living threads, but still referencing it afterwards to identify thread names).

As a note, that code isn’t that tested that much right now (I’ve just created it and I’ll probably add it to pydevd to track the thread lifecycle – I can do a release in the debugger and let it live there for a while before providing a PR to Python… one change I’ll probably do is using __del__ instead of the weakref.ref callback as I’m worried that the callback could be collected before the actual reference which’d make the ref callback not be called).

guido · January 16, 2024, 11:10pm

Hey @fabioz, have you worked out the kinks in your solution yet? I agree that this is worth fixing, and if you’d like it fixed in 3.13 I suggest that you create an issue in the CPython tracker and submit a PR that fixes the issue – we’ll bounce ideas back and forth on that PR about the exact right solution. (Make sure to add me as a reviewer.)

fabioz · January 22, 2024, 2:13pm

Ok, @guido I’ve created gh-114423: Remove DummyThread from threading._active when thread dies by fabioz · Pull Request #114424 · python/cpython · GitHub for that. I don’t think I have the power to add you as a reviewer though…