Getting Rid of Daemon Threads

Daemon threads are a regular source of pain in core development and have been for years. I’d like to get rid of them. The impact on the Python community is the deciding factor; I suspect low impact but may be wrong. Your feedback will help.

Proposal


At the very least I’d like to do the following:

  1. [3.12+] (or earlier) clarify docs about daemon threads and add a warning (see gh-125857)
  2. [3.14] (or earlier) add a deprecation note to the docs (“soft deprecation”)

If the impact on users is not too much then I’d like to move toward getting rid of daemon threads entirely:

  1. [3.14] emit a deprecation warning when Thread(..., daemon=True) is called or thread.daemon is set to True (thread.setDaemon() is already deprecated; see the snippet after this list)
  2. [3.27] remove support for daemon threads
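
For concreteness, a sketch of what would start warning under item 1. This illustrates the proposal, not shipped behavior; only setDaemon() warns today:

    import threading

    def task(): ...

    t = threading.Thread(target=task, daemon=True)  # would emit DeprecationWarning
    t2 = threading.Thread(target=task)
    t2.daemon = True      # would emit DeprecationWarning
    t2.setDaemon(True)    # already emits DeprecationWarning (since 3.10)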

Motivation


  • daemon threads are a consistent cause of crashes in CPython
  • daemon threads add complexity to key parts of CPython’s runtime
  • the docs have not been clear about the use cases for daemon threads, about why they should otherwise be avoided, or about the alternatives

Impact


Community usage:

TBD

Existing users of daemon threads can generally update their code to get the same effect from non-daemon threads. See gh-125857 for examples.
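
As a hedged illustration of that sort of migration (the names here are made up; the pattern is the Event-based one the threading docs recommend):

    import threading

    stop = threading.Event()

    def ticker() -> None:
        # Formerly a daemon thread; now polls a flag so it can exit cleanly.
        while not stop.wait(timeout=1.0):
            print("tick")  # e.g. refresh a status bar

    t = threading.Thread(target=ticker)  # non-daemon: joined before exit
    t.start()
    # ... application runs ...
    stop.set()   # ask the ticker to finish
    t.join()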

Context



What Are Daemon Threads?

From the docs:

A thread can be flagged as a “daemon thread”.  The significance of
this flag is that the entire Python program exits when only
daemon threads are left.  The initial value is inherited from the
creating thread.  The flag can be set through the daemon property
or the daemon constructor argument.

| Note: Daemon threads are abruptly stopped at shutdown.  Their
| resources (such as open files, database transactions, etc.) may
| not be released properly. If you want your threads to stop
| gracefully, make them non-daemonic and use a suitable signalling
| mechanism such as an Event.

History

Daemon threads have been a part of the threading module since it was added in 1998 (Python 1.5.1). (The low-level “thread” module was added in 1992, without daemon threads.)

Why were daemon threads part of that initial API? Mostly because Java threads had them–the threading module was essentially copied from Java’s threading API. Unfortunately, in the case of daemon threads, we eventually learned that what made sense for Java did not make sense for Python.

Note that in Python 3.12, interpreters created using Py_NewInterpreterFromConfig() default to disallowing daemon threads. In 3.14, those created by InterpreterPoolExecutor never allow daemon threads. Likewise for the API in PEP 734.

The Pros and Cons of Daemon Threads

In theory, daemon threads help keep low-priority background tasks from blocking application exit. This could be especially useful when your background task calls some long-running third-party non-Python function that doesn’t offer any way to interrupt it or call it for short intervals.
Outside of that case, there aren’t real benefits over the alternatives.
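
A minimal sketch of that one legitimate case (time.sleep stands in for an uninterruptible third-party call):

    import threading
    import time

    def background_task() -> None:
        # Stand-in for a long-running call that offers no way to be interrupted:
        time.sleep(3600)

    # daemon=True: this thread cannot keep the process alive at exit
    threading.Thread(target=background_task, daemon=True).start()
    print("main exits immediately; the daemon thread is abandoned")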

However, in the last 26 years (since Python 1.5), we’ve learned some of the downsides of daemon threads:

  • harder to reason about what happens during finalization
  • adds significant complexity to CPython thread and finalization code
  • a consistent source of bugs, especially crashes, in CPython
  • conceptually, “daemon” threads are easy to confuse with “daemon” processes, which are in some ways their opposite

Ultimately, the cost to CPython maintenance and stability over the years has been substantial, and it continues to accrue.

One issue that has been a contributing factor is that the docs do not provide a clear message about what daemon threads are for, nor that they should be avoided otherwise. Thus, people end up using them when they shouldn’t.

Daemon Threads in Other Languages

language   daemon threads?
--------   ----------------------
Java       same as Python’s
C++        always (explicit join)
C          always (explicit join)
Rust       always (explicit join)
Go         goroutines only
C#         yes
Haskell    no?
Erlang     no?
Clojure    always (explicit join)
Ruby       always (explicit join)
3 Likes

It would be nice to get an idea of which packages have a non-trivial use of daemon threads (perhaps using a code search on GitHub?). For example dask.distributed has a number of them – some of which I plead guilty for: Code search results · GitHub

I’m +1 on the idea of emitting a warning, but perhaps we should make it either RuntimeWarning or ResourceWarning, to ensure that people see it.

As for the removal, I think it should be far enough in the future so that people have time to migrate code away (but 3.27 seems very far away, or did you mean 3.17?).

I’m not sure these examples are very helpful. Ok, if you can do with checking a boolean flag from time to time, then you’re set. But otherwise this leaves you the choice between the private threading._register_atexit function or the very dubious PyThreadState_SetAsyncExc function?
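
For readers who haven’t seen it: threading._register_atexit is a private hook (used internally by concurrent.futures) that runs callbacks during interpreter shutdown, before non-daemon threads are joined. A hedged sketch of using it to unblock a queue-consuming worker, with no stability guarantees precisely because the API is private:

    import queue
    import threading

    q: queue.Queue = queue.Queue()

    def worker() -> None:
        while True:
            item = q.get()
            if item is None:       # sentinel: time to exit
                return
            print("processing", item)

    threading.Thread(target=worker).start()   # non-daemon

    # Private API: runs at interpreter shutdown *before* non-daemon threads
    # are joined, so the blocked q.get() receives the sentinel and the
    # worker exits.
    threading._register_atexit(q.put, None)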

I would certainly try to avoid PyThreadState_SetAsyncExc in production code. It may induce misbehavior by interrupting code that does not expect to be interrupted. It may be silenced by an exception-catching clause, or because the exception is raised when running a destructor. Yes, it’s not necessarily worse than daemon threads, but it’s not obviously better either (depends on the exact use case?).
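
For context, reaching that function from pure Python today requires ctypes; this is roughly the widely-circulated recipe (the helper name is illustrative, and every caveat above applies):

    import ctypes

    def async_raise(thread_ident: int, exc_type: type) -> None:
        # Ask the interpreter to raise exc_type in the target thread at its
        # next bytecode boundary. Unreliable by design: the exception may
        # land inside a finally block, a destructor, or a broad except clause.
        res = ctypes.pythonapi.PyThreadState_SetAsyncExc(
            ctypes.c_ulong(thread_ident), ctypes.py_object(exc_type))
        if res == 0:
            raise ValueError("invalid thread ident")
        if res > 1:
            # Hit more than one thread state: undo and bail out.
            ctypes.pythonapi.PyThreadState_SetAsyncExc(
                ctypes.c_ulong(thread_ident), None)
            raise SystemError("PyThreadState_SetAsyncExc failed")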

3 Likes

When I’ve needed daemon threads, it’s usually been the case of “Long-running, uninterruptible, third-party task” in terms of the examples in the linked issue. Basically I’ve had something that I need running in the background, but I have no easy way to terminate it short of process termination. Unfortunately, I’m on Windows, so signal.pthread_kill isn’t an option. I guess I could use the Windows TerminateThread API, but it’s a lot of work to wrap it myself compared to just letting process termination handle things.

It’s not something I’ve needed to do often, but it is sometimes an expedient solution to a problem that’s difficult to handle any other way.

I’d be OK with removing daemon threads if we (1) documented threading._register_atexit, (2) exposed PyThreadState_SetAsyncExc at the Python level, and (3) provided a cross-platform version of signal.pthread_kill. That way we could document a reliable alternative for daemon threads, for people who either do need them or aren’t aware that there are better options.

5 Likes

Daemon threads are a regular source of pain in core development and
have been for years. I’d like to get rid of them. The impact on the
Python community is the deciding factor; I suspect low impact but may
be wrong. Your feedback will help.

Well, in my personal codebase I have quite a few.

Most of them are little tickers, like a “live” status bar. These can
all be modified to poll a flag and exit. I’d be entirely happy for that
flag to be a global such as threading.shutting_down or something.

However some of them are workers consuming a queue. As such, they’re
blocked until a queue item arrives - I’d be quite miffed to do some kind
of horrible timed-out-queue-next so that I could interleave some flag
check. Provided my queue shutdown is nice and orderly this won’t be
necessary. I’m pretty sure I have some things where this isn’t so easy,
though I’d need to go looking.
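
For what it’s worth, one way to make that queue shutdown orderly without any timed-out polling, assuming Python 3.13+ where queue.Queue.shutdown() and queue.ShutDown exist (on older versions a sentinel item serves the same purpose):

    import queue
    import threading

    q: queue.Queue = queue.Queue()

    def consumer() -> None:
        while True:
            try:
                item = q.get()        # blocks; no timeout needed
            except queue.ShutDown:    # raised once shut down and drained
                return
            print("got", item)

    t = threading.Thread(target=consumer)
    t.start()
    q.put("work")
    q.shutdown()    # wakes blocked q.get() calls once remaining items are consumed
    t.join()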

But the other side effect is that this is a breaking change. Suddenly a
million programmes with tickers and other unimportant tasks simply…
won’t exit. Because their daemon threads will just sit around, where
formerly they would quietly expire.

I’m assuming that such a change will actually prevent creating “daemon”
threads in the first place.

As an outsider to the internals, is there a reason that daemon threads
can’t be shut down by raising some ThreadingShutdown exception?

This would allow tidy shutdown of daemon threads with the opportunity
for clean up of resources they hold open, without foisting gratuitous
polling on an otherwise blocking workflow.

Motivation

  • daemon threads are a consistent cause of crashes in CPython

I remain quite surprised by this, but if threads are just… cut off,
then I suppose there may be a million tiny edge cases.

  • daemon threads add complexity to key parts of CPython’s runtime

This surprises me also. But again, I’m ignorant about the
implementation.

  • the docs have not been clear about the use cases for daemon threads, about why they should otherwise be avoided, or about the alternatives

This seems to me to stem from the rather rude “abruptly stopped at
shutdown” situation. Surely a polite exception would produce better
semantics. Followed by rudeness for those threads that decide to catch
the exception and not quit.

4 Likes

Trio relies heavily on daemon threads. I don’t remember the full details because thread pool design is always very intricate, but at the time I definitely came to the conclusion that daemon threads were the only reasonable option and realistically I’m not going to revisit it without some incredibly compelling reason. IIRC anyio tried very hard to avoid using them based on general “daemon threads are evil” vibes and eventually gave up and switched to using them too, though I wasn’t directly involved there and am not clear on the details.

PyThreadState_SetAsyncExc and pthread_kill are much worse than daemon threads, IMO. Daemon threads have well-defined semantics and work reliably today; PyThreadState_SetAsyncExc and pthread_kill don’t work and cannot work reliably. I’m not familiar with threading._register_atexit, but I don’t see how it would help, since the fundamental problem is that the only safe way to kill a thread is to kill the enclosing process, and that’s a fact about operating systems that is out of our control.

4 Likes

Except that daemon threads don’t actually work reliably. They attempt to run and use Python interpreter resources after the runtime has been shut down during finalization; that is, they hold pointers to the interpreter’s global state. Long ago, much of that state lived in static process globals, so they got away with it. But we’ve since turned the interpreter state into what it always should have been: an allocated, managed, per-interpreter set of memory.

Now daemon threads running means they attempt to access that after it has been freed. Who likes debugging that?

pthread_kill is 100% unusable and should be ignored. It is not possible to forcibly kill a thread safely.

6 Likes

From experience, you will find that lots of things rely on them (I’m not even including what Nathaniel mentioned) because they are temptingly convenient sounding… But the concept is at odds with being able to ever finalize an interpreter.

Realize however that even if we get rid of daemon threads, extension module code can and does spawn its own threads that are not tracked by Python. Those often do want to call into the Python runtime (at which point we dynamically create a Python thread state for them IIRC). Those are realistically an alternate form of daemon thread (I don’t believe we join them upon finalization) and those are never going to be forbidden.

So there isn’t a lot we can do. We’re already taking mitigation steps, like having threads that attempt to reenter a finalizing runtime just hang forever, because our C API surface for GIL acquisition doesn’t have a way to indicate any form of error.

3 Likes

Yeah, and IMO when these are irreconcilable the solution is to give up on finalization, not daemon threads. Finalization can’t be made reliable/deterministic in general – this is hardly the only issue with code that’s unprepared to run in an interpreter that’s half-way shut down, and finalization’s already full of weird hacks like clearing module namespaces in some heuristic order and running the GC “hopefully enough times”. I don’t mind making the attempt at supporting finalization as a best-effort you-get-to-keep-both-pieces kind of feature, but we shouldn’t give up fundamentally useful primitives like daemon threads for it.

10 Likes

And this is what surprises me. I’d have expected that shutdown of the
runtime would involve descheduling all the threads first. IMO, ideally
by raising some (imaginary) ThreadingShutdown exception in each Thread.
And then, a little later: rude termination. And then, after that,
shutdown of the runtime.

3 Likes

Should this change be a PEP?

I’ve seen them used a bunch for fire-and-forget tasks. Daemon threads are common enough that I figure a PEP should be written, with alternatives for the various cases, etc., as part of an official removal plan.

I guess having a way to easily stop threads on shutdown would work out similarly here.

4 Likes

A PEP? If we were going to do it… perhaps yes to record all of this.

Instead, we could wind up just doubling down on the “if threads the interpreter knows have touched it before still exist in the process at the OS level, disallow complete finalization.”

fwiw “just raise a nice exception” in the threads is a similar problem to why we can’t do anything other than hang them when they attempt to reenter the CPython runtime: there is no error-reporting mechanism at that level to let them see they got an exception in a GIL-enabled world, because there is no error handling, and thus no exception path, available in the GIL-acquire code path. We could inject a pending exception into some, but guaranteeing all doesn’t seem feasible. And this needs to be fundamentally uncatchable anyway, as we can’t let a daemon thread keep the process alive. By definition. So it isn’t really an “exception” kind of thing.

(i have no idea what the free threading implications are for daemon threads other than I presume “worse” as in “more likely to crash due to things having disappeared and there’s no common reentrancy point that can check for that”)

4 Likes

I’m very curious as to the historical why’s of that kind of conclusion from either trio or anyio. (I trust they’re real, I just want to page in more understanding)

3 Likes

Sidenote: pretty sure this is assuming the acceptance of PEP 2026, which would mean that 3.27 gets released in Oct. 2027, two years after the release of 3.14.

2 Likes

I know one issue is:

For various reasons, you have to do DNS lookups using the OS getaddrinfo (it’s the only thing that really knows how DNS is configured locally; cf. even Go eventually giving in and adding a getaddrinfo fallback)

But this might block for an arbitrary amount of time, and you want network connections and other things that do DNS lookups to be cancellable and responsive (obviously no one is writing web browsers in Python, but to get the intuition: you want hitting “stop” on your web page load to actually stop; that’s a thing that should be possible to implement with an IO library)

And once you have called getaddrinfo there is no way to stop it, except waiting for it to return or exiting the process.

Therefore: the way await trio.socket.getaddrinfo(...) works is to internally kick off a call to the real OS getaddrinfo on a daemon thread, then put the calling task to sleep until either the getaddrinfo completes or the Python-level operation gets cancelled. If it gets cancelled, then we cut the thread loose to continue running in the background (which is safe b/c getaddrinfo is side-effect-free and will complete eventually), and the await trio.socket.getaddrinfo(...) returns immediately.
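
The shape of that pattern, as a rough standalone sketch (this is not Trio’s implementation; a plain timeout stands in for cancellation):

    import concurrent.futures
    import socket
    import threading

    def getaddrinfo_abandonable(host, port, timeout):
        # Run the blocking OS call on a daemon thread; if the caller stops
        # waiting, the thread is simply abandoned and finishes (or dies with
        # the process) on its own.
        fut: concurrent.futures.Future = concurrent.futures.Future()

        def call():
            try:
                fut.set_result(socket.getaddrinfo(host, port))
            except OSError as exc:
                fut.set_exception(exc)

        threading.Thread(target=call, daemon=True).start()
        return fut.result(timeout=timeout)  # raises TimeoutError if we give up

    print(getaddrinfo_abandonable("python.org", 443, timeout=30))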

Example of why this matters: run a Python script. Realize you passed in the wrong arguments, hit control-C, expected behavior: script exits immediately. Currently with Trio this generally works even if the script did a DNS lookup, because we unwind the user’s Python code and exit, and then the lingering daemon thread in getaddrinfo gets forcibly terminated by the OS. If it was a regular thread, then the unwinding would get “stuck” and you’d be sitting there hammering control-C waiting for getaddrinfo to return. This will happen eventually – it has internal timeouts – but it might take, say, 5-10 seconds.

I think it’s actually impossible to get the desired behavior here without either using daemon threads, or else calling os._exit at the end of your script? And Trio can’t call os._exit because it’s a library.

(I assume this is part of why every OS and Java all provide something semantically equivalent to daemon threads as an option.)


I do get that the underlying issue here is that Eric has a larger project of making Python interpreters into things that can be reliably created and destroyed within a process. And conceptually I absolutely get how that’s the elegant thing, and it’s ugly and frustrating that Python wasn’t built that way from the beginning. I’m not happy to be pointing out that we’ve painted ourselves into a corner. But the reality is that for however many decades now in practice we’ve always had 1 Python interpreter = 1 process, and they terminate together. And daemon threads are a widely-used part of our public API that rely on that fact. So I think we have to cope somehow.

11 Likes

It’s probably possible by spawning your own getaddrinfo thread using C or C++ (or Rust!) code.

“Every OS and Java” made decisions about their threading APIs long before any of those issues were widely understood, IMHO.

1 Like

Possible improvements that seem feasible to me:

  • make allowing daemon threads in subinterpreters an off-by-default feature. Apps can then decide to turn them on (if they’re needed and the subinterpreter will live for the lifetime of the process), or leave them off and require that any operations that require them be handled in the main interpreter (or a dedicated subinterpreter with daemon threads enabled) rather than each subinterpreter handling the affected tasks directly.
  • outright refusing to finalize subinterpreters with running daemon threads outside the context of full Python runtime finalization
  • adding ResourceWarning when there are any living daemon threads when runtime finalization starts (similar to the existing warning for unawaited coroutines).

Actually dropping daemon threads entirely doesn’t seem feasible (for the various reasons given above).

8 Likes

That cannot be done reliably. Think of a thread where the Python code has called into a library written in C that blocks execution for a long time.

1 Like

Is there a reason to prefer getaddrinfo over getaddrinfo_a (GetAddrInfoExA on Windows), or a reason why these APIs aren’t enough for async name resolution?

I’m assuming that I must be missing a reason this isn’t sufficient given the amount of time you’ve put into trio, but I wonder if whatever reasons those are might be resolvable on the timeline for daemon thread removal.

I don’t think getting rid of daemon threads actually solves any problems and it will be a massive pain for our users.

Our interpreter shutdown process is complex and buggy, but the problems are not specific to daemon threads. We have the same issues with:

  • non-daemon threads when you press Ctrl-C (or otherwise have an exception during shutdown)
  • Threads created from C via PyGILState_Ensure() or similar

Subinterpreters disallow daemon threads, but I don’t think that improves the shutdown situation. The test_interpreters test suite often “core dumps” if you press Ctrl-C.

I think the comparison to other languages is backwards: threads in C++, C (pthreads), and Go (goroutines) behave like Python and Java’s daemon threads: non-main threads do not prevent process shutdown. They don’t have “non-daemon” threads, where the main thread implicitly calls join() on other threads after the main function/module returns.

17 Likes

trio is pure-Python. Adding C/C++/Rust code to spawn a thread would significantly increase the maintenance burden and decrease usability for users on niche platforms or on platforms the maintainers can’t easily build wheels on, as well as require writing threading code in those languages (which can be painful).


I can’t really see why daemon threads need to be removed, if they largely work and solve problems well enough in production. Finalization tends to be hard (cf. Java deprecating object finalizers 7 years ago), and sometimes patching the finalization error handler to print “Thank you for playing Wing Commander!” is better than trying to solve all the minor edge cases out there.

If this were accepted, there should be viable alternatives accessible to pure-Python code that solve typical problems currently solved with daemon threads. Also, a two-year deprecation timeline seems way too fast.

5 Likes