The warnings module currently doesn’t offer a thread-safe (or async-safe) way of filtering warnings. Most seriously, the catch_warnings context manager is not thread-safe.
Fixing this without breaking existing code is hard to do. There is much code that accesses, mutates and even assigns warnings.filters. So, my initial strategy to fix this was to add an additional layer of filtering, using a contextvar. See my draft PR for a prototype. The design is similar to what’s done for decimal contexts. There is a local_context() context manager that can be used like this:
with warnings.local_context() as ctx:
ctx.simplefilter(...)
...
In order to provide easier forwards compatibility (code that can work with both old and new versions of Python), the context class supports an API that overlaps with the warnings module itself. That way, you can do things like this:
import warnings
try:
warn_ctx = warnings.get_context
except AttributeError:
def warn_ctx():
return warnings
with warn_ctx().catch_warnings():
warn_ctx().simplefilter(...)
...
with warn_ctx().catch_warnings(record=True) as log:
...
Some initial feedback on this approach suggests that the try/except block is ugly and there are many more uses of catch_warnings() compared to code that accesses warnings.filters. Searching the top 500 PyPI projects, there are only about six (6) lines of code that touch warnings.filters while there are 700+ lines that use catch_warnings(). So requiring all that code to switch to a new thread-safe context manager would seem quite disruptive compared with fixing the code that monkeys with warnings.filters.
Based on this observation, I created an alternative PR that makes catch_warnings() thread-safe. It does so by changing it to use a contextvar (thread-local, async safe). The amount code changes to make the CPython unit test suite pass was actually quite small. There are a number of potential issues with this approach though. Code like this would not work as expected, for example:
with warnings.catch_warnings():
warnings.simplefilter(...)
...
warnings.filters.pop(0)
...
In current CPython, threads don’t inherit the context from the spawning thread. So your warning filters would go back to the default when you spawn a thread. I suggest that we change that so threads inherit the context by default.
When I put on a “what would this do to existing code?” hat and ponder the catch_warnings() use of a contextvar and new threads inheriting a copy of the current context by default changes… I’m expecting that will trip some existing code up, but I don’t have specific examples of how in mind.
pondering: When a concurrent.futures thread pool is being used, should the context also be plumbed into that from the location of the pool work item submission? It feels like an overreach. (but the threads would now inherit a copy of the context active at pool creation time - that feels more natural)
I left a perhaps gross practical suggestion on that PR: Only change this default behavior immediately in still experimental free-threading builds?
IMHO it should. It’s quite “natural” that the context for a pool-submitted task is inherited from the task-submitting code, rather than from the pool-creating code.
By the way, what kind of state is currently stored in contextvars?
That probably just means nobody has bothered to port anything else over to it yet.
Anything that currently relies on thread locals[1] really ought to use contextvars now (and we should be ensuring that contextvars are separated on separate threads, i.e. they implement the equivalent of thread locals for us).
You might also want to grep for PyContextVar_New, NumPy uses that. That said, I don’t see any uses on GitHub outside of NumPy, CPython, and one use in HPy.
This default change is a behavior change. Expect that to trip someones existing code up. I don’t have the context (pun intended) as to how disruptive that could be. Per the PEP-387 breaking change policy we’d want to wait at least two releases. Which isn’t so satisfying given the reasons you want this.
But a compromise could be considered (unsure if this is really wise) if needed: Change the default sooner than the deprecation when running in an experimental free-threading cpython build?
Changing the default behavior on the free-threaded build is an idea I was thinking of as well. Willingly creating free-threaded vs non-free-threaded behavioral differences seems bad, in general. However, maybe it’s acceptable if non-free-threaded behavior eventually matches?
We could make a new flag that affects both the behavior of Context inheritance and also warnings.catch_warnings. For people who want to opt-in sooner, set the env var or -X option, e.g. PYTHON_CONTEXT_INHERIT=1. For free-threaded builds, maybe it’s okay if that was turned on by default?
File "/home/realkumaraditya/cpython/main.py", line 7, in work
print(f"Processing request {req_id.get()}")
~~~~~~~~~~^^
LookupError: <ContextVar name='req_id' at 0x7facd5892b30>
I think this behavior should be changed to inheriting the context by default regardless of it being used to implement thread-safe warnings.
Bear in mind that loop.run_in_executor would still need to be changed with this new behaviour because otherwise the context will be from where the thread in pool is spawned not from where the task is submitted