Thread safe warning filters

nas · January 3, 2025, 10:54pm

The warnings module currently doesn’t offer a thread-safe (or async-safe) way of filtering warnings. Most seriously, the catch_warnings context manager is not thread-safe.

Fixing this without breaking existing code is hard to do. There is much code that accesses, mutates and even assigns warnings.filters. So, my initial strategy to fix this was to add an additional layer of filtering, using a contextvar. See my draft PR for a prototype. The design is similar to what’s done for decimal contexts. There is a local_context() context manager that can be used like this:

with warnings.local_context() as ctx:
    ctx.simplefilter(...)
    ...

In order to provide easier forwards compatibility (code that can work with both old and new versions of Python), the context class supports an API that overlaps with the warnings module itself. That way, you can do things like this:

import warnings
try:
    warn_ctx = warnings.get_context
except AttributeError:
    def warn_ctx():
        return warnings

with warn_ctx().catch_warnings():
    warn_ctx().simplefilter(...)
    ...

with warn_ctx().catch_warnings(record=True) as log:
    ...

Some initial feedback on this approach suggests that the try/except block is ugly and there are many more uses of catch_warnings() compared to code that accesses warnings.filters. Searching the top 500 PyPI projects, there are only about six (6) lines of code that touch warnings.filters while there are 700+ lines that use catch_warnings(). So requiring all that code to switch to a new thread-safe context manager would seem quite disruptive compared with fixing the code that monkeys with warnings.filters.

Based on this observation, I created an alternative PR that makes catch_warnings() thread-safe. It does so by changing it to use a contextvar (thread-local, async safe). The amount code changes to make the CPython unit test suite pass was actually quite small. There are a number of potential issues with this approach though. Code like this would not work as expected, for example:

with warnings.catch_warnings():
   warnings.simplefilter(...)
   ...
   warnings.filters.pop(0)
   ...

In current CPython, threads don’t inherit the context from the spawning thread. So your warning filters would go back to the default when you spawn a thread. I suggest that we change that so threads inherit the context by default.

gpshead · January 3, 2025, 11:59pm

When I put on a “what would this do to existing code?” hat and ponder the catch_warnings() use of a contextvar and new threads inheriting a copy of the current context by default changes… I’m expecting that will trip some existing code up, but I don’t have specific examples of how in mind.

pondering: When a concurrent.futures thread pool is being used, should the context also be plumbed into that from the location of the pool work item submission? It feels like an overreach. (but the threads would now inherit a copy of the context active at pool creation time - that feels more natural)

I left a perhaps gross practical suggestion on that PR: Only change this default behavior immediately in still experimental free-threading builds?

pitrou · January 4, 2025, 9:35am

IMHO it should. It’s quite “natural” that the context for a pool-submitted task is inherited from the task-submitting code, rather than from the pool-creating code.

By the way, what kind of state is currently stored in contextvars?

nas · January 6, 2025, 6:27pm

In CPython, its basically just decimal that uses it. Looking at the top projects from PyPI, I got the following list of references to ContextVars.

steve.dower · January 6, 2025, 9:57pm

That probably just means nobody has bothered to port anything else over to it yet.

Anything that currently relies on thread locals^[1] really ought to use contextvars now (and we should be ensuring that contextvars are separated on separate threads, i.e. they implement the equivalent of thread locals for us).

Python ones, not C/native thread locals. ↩︎

ngoldbaum · January 6, 2025, 10:13pm

You might also want to grep for PyContextVar_New, NumPy uses that. That said, I don’t see any uses on GitHub outside of NumPy, CPython, and one use in HPy.

nas · January 6, 2025, 10:24pm

Greg’s comment on the PR to change context inheriting:

This default change is a behavior change. Expect that to trip someones existing code up. I don’t have the context (pun intended) as to how disruptive that could be. Per the PEP-387 breaking change policy we’d want to wait at least two releases. Which isn’t so satisfying given the reasons you want this.

But a compromise could be considered (unsure if this is really wise) if needed: Change the default sooner than the deprecation when running in an experimental free-threading cpython build?

Changing the default behavior on the free-threaded build is an idea I was thinking of as well. Willingly creating free-threaded vs non-free-threaded behavioral differences seems bad, in general. However, maybe it’s acceptable if non-free-threaded behavior eventually matches?

We could make a new flag that affects both the behavior of Context inheritance and also warnings.catch_warnings. For people who want to opt-in sooner, set the env var or -X option, e.g. PYTHON_CONTEXT_INHERIT=1. For free-threaded builds, maybe it’s okay if that was turned on by default?

kumaraditya303 · January 8, 2025, 1:17pm

I think the current behavior is a bug rather than the intended behavior.
For example the following code feels “natural” to work but it doesn’t:

import asyncio
import contextvars

req_id = contextvars.ContextVar("req_id")


def work():
    print(f"Processing request {req_id.get()}")


async def main():
    req_id.set(42)
    loop = asyncio.get_event_loop()
    // performs blocking operation in another thread
    await loop.run_in_executor(None, work)


asyncio.run(main())

Error:

  File "/home/realkumaraditya/cpython/main.py", line 7, in work
    print(f"Processing request {req_id.get()}")
                                ~~~~~~~~~~^^
LookupError: <ContextVar name='req_id' at 0x7facd5892b30>

I think this behavior should be changed to inheriting the context by default regardless of it being used to implement thread-safe warnings.

guido · January 9, 2025, 4:29am

Curious what @yselivanov’s view is on this.

graingert · January 9, 2025, 3:30pm

Bear in mind that loop.run_in_executor would still need to be changed with this new behaviour because otherwise the context will be from where the thread in pool is spawned not from where the task is submitted

nas · February 19, 2025, 5:32am

I’ve created a new PR that makes the warnings.catch_warnings() context manager safe for use in concurrent programs (multi-threaded or using async coroutines and tasks). It introduces two new flags (set by command-line -X options or by env-variables). The flags will be set by default on the free-threaded build of CPython. I added a new section to the “warnings” module help.

nas · March 11, 2025, 11:26pm

FYI, I’m looking for any additional feedback on my PR to change the warnings module. I’ve added some additional sections to the free-threaded howto docs that explain the behavioral differences with threading and warnings.

nas · April 1, 2025, 7:50pm

Just a heads up that I’m planning to merge this PR sometime before the first 3.14 beta. I’ve done some testing on 3rd party libraries and their unit tests pass with the change. Most encouraging is that the pytest context manager that uses catch_warnings() still works.

ngoldbaum · April 1, 2025, 8:13pm

Out of curiosity, is it also now thread-safe? Despite the warnings at Flaky tests - pytest documentation I’ve never actually seen a flaky test due to pytest.raises and AFAIK flakiness due to pytest.warns is because warnings.catch_warnings isn’t thread-safe. It’d be really nice if the pytest.warns issue was also as rare as any possible issues from pytest.raises.

nas · April 1, 2025, 9:28pm

That’s a good question but I don’t know the answer. I was looking specifically at pytest.warns() . It’s built on top of WarningsRecorder which is a subclass of warnings.catch_warnings. My testing consisted of verifying that unit tests that use pytest.warns() with the new context-aware warnings module still pass as before. I checked numpy and scipy, if I recall correctly.

Looking at the pytest source, I don’t see anything obviously non-thread-safe (e.g. manipulation of module globals, for example). Doing a proper code audit would take some time. I suppose it would be interesting to write some pytest cases that run inside concurrent threads and then see if those functions behave as expected.