Memory management with threading

i came across this quote from @barry-scott in the PEP 703 post:

If you have the problem of a long running program growing in memory size then it is useful to kill off a worker process and spin up a new one.

doesn’t Python do its own memory management magically in the background somewhere? being new to threading as a now-i-need-to-manage-them concept, and working on a project that has seven continuously running threads (aka: from 2 months to 2 years non-stop), this quote has thrown me a bit of a wobbly: how do you kill a continuously running thread and spin up a new one without losing… uh… well, continuity? particularly in light of the other part of that same quote:

You cannot do that resource management with threads

so now i’m confused about whether or not i need memory management in my use-case (which sounds like i do given my project is also written in Python and runs for a loooong time) and whether or not it can even be done using the threaded architecture i just figured out how to use. can someone clarify this plz?

I think the first question is to answer the conditional in Barry’s quote: do you have that problem? If you have threads that are running for months to years, it sounds like you don’t have any problems with memory growth.

The problem he describes is usually one of leaking memory. Python can manage its own memory just fine [1], but extensions written in other languages are more common culprits (notably C, others too). If you aren’t able to fix the memory leak because it’s someone else’s code or just too hard, then restarting the process every once in a while is a way around it.

  1. as long as you don’t hold on to stuff forever ↩︎


I’m going to throw a curveball here. Memory management is still something you may need to do manually. It’s just that certain cases are handled for you. When something is no longer used, it gets cleaned up.

The key is the ‘no longer used’. I’ve seen many times where little things wind up slowly (more/less) leaking memory. This matters more in long-running processes since the little bits add up.

Consider this (contrived) example:

from functools import cache

@cache
def add(a: int, b: int) -> int:
    return a + b

It adds numbers and caches the result. As far as the interpreter is concerned, any cached entry might be requested again, so the cache has to be kept around indefinitely.

We can call add a few times, and it works great. The problem is my worker process calls it millions of times a minute with different value combinations. Let’s simulate that here:

import random

def get_random_number():
    return random.randint(0, 100_000_000)

# every new (a, b) combination adds another entry to add's cache
for _ in range(10_000_000):
    add(get_random_number(), get_random_number())

After running that, my Python process is using almost 2 gigs of memory!
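One mitigation, not mentioned in the thread but worth sketching: bound the cache instead of letting it grow forever. functools.lru_cache(maxsize=...) evicts the least recently used entry once the cache is full, so memory use stays capped no matter how many distinct argument pairs come through (the maxsize of 1024 here is an arbitrary illustration):

```python
import random
from functools import lru_cache

# Same contrived add(), but with a bounded cache: once 1024 entries
# exist, lru_cache evicts the least recently used one to make room.
@lru_cache(maxsize=1024)
def add(a: int, b: int) -> int:
    return a + b

for _ in range(100_000):
    add(random.randint(0, 100_000_000), random.randint(0, 100_000_000))

info = add.cache_info()
assert info.currsize <= 1024  # memory use is capped, unlike with @cache
```

The trade-off is a lower hit rate, but for a workload like the one above the unbounded cache was nearly useless anyway.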

Now let’s apply this same logic with a worker process (not a thread yet). We could have a slow leak (or a fast one like this). When it’s really slow, it’s sometimes easier to just say ‘restart once a day’ or at some other interval. Typically that restart is just a parent process killing the child and spawning a new one. That clears our leaks (for that process), and the new child starts out with less memory in use (for now at least).
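That kill-and-respawn pattern can be sketched with multiprocessing. Everything here (the worker body, the supervise() helper, the intervals) is hypothetical and only illustrates the shape of it; the point is that terminate() lets the OS reclaim all of the child’s memory, leaks included:

```python
import multiprocessing as mp
import time

def worker():
    """Stand-in for a long-running worker that slowly leaks memory."""
    while True:
        time.sleep(0.05)

def supervise(restart_every: float, restarts: int) -> list[int]:
    """Hypothetical supervisor: kill and respawn the worker on a schedule."""
    pids = []
    for _ in range(restarts):
        child = mp.Process(target=worker, daemon=True)
        child.start()
        pids.append(child.pid)
        time.sleep(restart_every)  # let the child run for a while
        child.terminate()          # OS reclaims *all* of its memory
        child.join()
    return pids

if __name__ == "__main__":
    # distinct pids -> fresh processes, each starting with a clean slate
    print(supervise(restart_every=0.2, restarts=3))
```

In a real app the interval would be hours or days, and the supervisor would hand any in-flight work back to a queue before terminating the child.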

There is a common misconception that memory leaks can’t happen in Python. They can and do… it’s just that they happen at a higher level than the textbook examples in C/C++.

Now onto the ‘resource management with threads’ thought: I’m going to say that you actually can do resource management with threads, but it isn’t exactly typical and can even be detrimental in some cases.

Here’s an interesting case for using a process worker instead of threads on Linux:

On Linux our app could be running in a cgroup or some other memory-limiting apparatus. If the single process holding all our worker threads runs out of memory, it can be killed by the OOM killer, taking down the entire app at once. If each worker is its own process, the OOM killer might kill just one worker process instead, and the rest of the app can carry on. After all, the OOM killer only kills whole processes, never individual threads.

In terms of killing a single thread: POSIX does provide pthread_cancel() and pthread_kill(), and Windows similarly has TerminateThread(). These aren’t exposed in Python because it’s generally considered bad practice to kill a single thread. What if the thread is midway through IO, or some other operation? What if it’s holding something like a threading.Lock? When a process is killed, at least the OS can clean up after it; there’s no comparable cleanup for a lone thread, so killing one can leave the program in a weird state.

A better way to have an exit-able long-running thread is to periodically check a threading.Event (or something similar) for a signal to finish up and exit. Then in the parent, set the event, call .join(), and move on. With long-running operations it can be tough to find a place to regularly check the event and clean up, which is another reason someone might reach for a Process instead, since a Process can always be terminate()'d.
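A minimal sketch of that Event pattern (the worker body and the timings here are made up):

```python
import threading
import time

def worker(stop: threading.Event, results: list):
    # do a small chunk of work, then check whether we've been asked to stop
    while not stop.is_set():
        results.append("tick")   # stand-in for real work
        stop.wait(timeout=0.05)  # sleeps, but wakes early if set() is called

stop = threading.Event()
results: list = []
t = threading.Thread(target=worker, args=(stop, results))
t.start()
time.sleep(0.2)

stop.set()  # ask the thread to finish...
t.join()    # ...and wait for it to actually exit
```

Unlike pthread_cancel(), the thread exits at a point of its own choosing, so any locks or in-flight work get wrapped up cleanly first.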

Now for memory usage as resource management with threads:

If a thread runs away wasting or leaking memory, the other threads share the same process space, so even after the bad thread exits we can still be stuck with lingering memory issues.

So how can we work around this?

In terms of the same memory space, we can partly work around this by using regular locals and thread-local storage. Anything local or thread-local to the thread is freed up when the thread exits, as opposed to global/shared state, which can be left behind.
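A small sketch of the thread-local idea (the names here are made up): each thread sees its own attributes on a threading.local() object, and those become garbage when the thread exits, whereas a module-level container would keep its contents alive:

```python
import threading

local = threading.local()
sizes = []  # shared state, used only to report back to the main thread

def worker(n: int):
    # this buffer is visible only to the current thread, and becomes
    # unreachable (collectable) when the thread exits, unlike a global
    local.buffer = list(range(n))
    sizes.append(len(local.buffer))

threads = [threading.Thread(target=worker, args=(n,)) for n in (10, 20)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert sorted(sizes) == [10, 20]
assert not hasattr(local, "buffer")  # the main thread never had one
```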

Of course these are just some general pieces of guidance, every app is different and has its own quirks and needs.


wow! this is gold! thanks guys. i am so glad i asked this question.