Should we deprecate `threading._PyRLock` in 3.13?

sobolevn · June 1, 2023, 8:22am

While working on `threading.RLock` must not support `*args, **kwargs` arguments · Issue #102029 · python/cpython · GitHub I was thinking about whether this is good idea or not. Right now I think that deprecating threading._PyRLock is good idea indeed, here are my points.

For quite a long time threading._PyRlock does not work as it should, see https://github.com/python/cpython/issues/57906
TLDR:

If a signal occurs during _Py_RLock.acquire() or release() and then operates on the same lock to acquire() or release() it, process hangs or assertions can be triggered.
This bug has existed for ages. People who want sane behaviour should just switch to Python 3. Closing.

Moreover, since Python3 we have _thread.RLock that is used in several places of stdlib directly:

So, any other interpreters that are using CPython’s stdlib already have to adapt _thread module. Example from RustPython: https://github.com/RustPython/RustPython/blob/058f8c55005a8ae93275957952e20a0ae9bf6968/Lib/functools.py#L21 and https://github.com/RustPython/RustPython/blob/058f8c55005a8ae93275957952e20a0ae9bf6968/vm/Lib/python_builtins/_thread.py (they are using _dummy_thread for now).

And the last thing: for CPython users won’t feel anything, because _thread module is a required module for CPython: https://github.com/python/cpython/blob/201440e97a3433ad7691163c4a17d4c2721fd7c6/Modules/Setup.bootstrap.in#L22 and since it is always present, regular usages of threading.RLock won’t be changed.

The only thing that will change with new deprecation notice is direct creation of _PyRLock (right now the stdlib does not have any usages of it). However, there are some usages of _PyRLock in the wild, but mostly they are used to monkeypatch _PyRLock: https://github.com/search?type=code&q=_PyRLock() and they can easily silence the warning / ignore the missing class.

So, to sum things up, we have:

Legacy implementation with known problems
That almost no one uses
With a working default substitution that people rely on
While legacy implementation can be easily (from the technical point of view) deprecated

Are there any arguments to keep things as-is?
Are there any valid use-cases for _PyRLock that we should continue to support?

Please, share your opinions

vstinner · June 1, 2023, 11:21am

I suggest to remove it, it’s impossible to handle properly signals with locks in pure Python code. There are still race conditions around _tstate_lock in the threading module because of that, whereas it’s implemented in C.

See Thread spuriously marked dead after interrupting a join call · Issue #90882 · python/cpython · GitHub and Race condition in Thread._wait_for_tstate_lock() · Issue #89437 · python/cpython · GitHub

Currently, theading.py uses the following code:

        try:
            if lock.acquire(block, timeout):
                lock.release()
                self._stop()
        except:
            if lock.locked():
                # bpo-45274: lock.acquire() acquired the lock, but the function
                # was interrupted with an exception before reaching the
                # lock.release(). It can happen if a signal handler raises an
                # exception, like CTRL+C which raises KeyboardInterrupt.
                lock.release()
                self._stop()
            raise

IMO it should be rewritten in C to handle properly signals and make sure that the thread and its _tstate_lock remain in a consistent state.

eric.snow · June 1, 2023, 3:35pm

This might be a broader question. How much of threading.py is the way it is because we used to support “dummy” threads (for platforms without thread support)? Threads-less builds haven’t been supported since 2017 (see https://github.com/python/cpython/pull/3385), so should we consider generally what parts of threading.py would be more appropriately implemented in C at this point? (I’m only suggesting we think about it, not that we jump into a blanket a re-write.)