I’m going to throw a curveball here. Memory management is still something you may need to do manually in Python; it’s just that many cases are handled for you: when an object is no longer used, it gets cleaned up.
The key phrase is ‘no longer used’. I’ve seen it many times: little things wind up slowly (more or less) leaking memory. This matters most in long-running processes, since the little bits add up.
Consider this (contrived) example:
from functools import cache

@cache
def add(a: int, b: int):
    return a + b
It adds two numbers and caches the result. As far as the interpreter is concerned, every cached entry must be kept alive for as long as add exists, and @cache is unbounded, so the cache only ever grows.
We can call add a few times, and it works great. The problem is my worker process calls it millions of times a minute with different value combinations. Let’s simulate that here:
import random

def get_random_number():
    return random.randint(0, 100_000_000)

for _ in range(10_000_000):
    add(get_random_number(), get_random_number())
After running that, my Python process is using almost 2 gigs of memory!
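One way to keep this example from growing forever is to swap @cache for @lru_cache with a maxsize, which evicts the least-recently-used entries. A minimal sketch (the maxsize of 1024 is an arbitrary choice for illustration):

```python
from functools import lru_cache

@lru_cache(maxsize=1024)  # evicts least-recently-used entries beyond 1024
def add(a: int, b: int) -> int:
    return a + b

# 10_000 distinct argument pairs, but the cache never exceeds 1024 entries
for i in range(10_000):
    add(i, i)

print(add.cache_info())  # currsize is capped at 1024
```

The trade-off is a lower hit rate, but memory stays bounded no matter how many distinct value combinations the worker sees.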
Now let’s apply this same logic to a worker process (not a thread yet). We could have a slow leak (or a fast one like this). When it’s really slow, it’s sometimes easier to just say ‘restart once a day’ or at some other interval. Typically that restart is just the parent process killing the child and spawning a new one. That clears the leaks for that process, and the replacement starts fresh with less memory in use (for now at least).
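That restart pattern can be sketched with multiprocessing. This is a minimal illustration, not a production supervisor: worker, the lifetimes, and the cycle count are all made up, and the explicit “fork” context is a Unix-only assumption to keep the sketch self-contained:

```python
import multiprocessing as mp
import time

def worker():
    # Stand-in for a real worker loop that may slowly leak memory.
    while True:
        time.sleep(0.05)

def run_with_restarts(lifetime: float, cycles: int):
    # "fork" avoids re-importing this module in the child (Unix only).
    ctx = mp.get_context("fork")
    exit_codes = []
    for _ in range(cycles):
        p = ctx.Process(target=worker)
        p.start()
        time.sleep(lifetime)  # let the child run for its allotted lifetime
        p.terminate()         # the OS reclaims everything the child leaked
        p.join()
        exit_codes.append(p.exitcode)
    return exit_codes

if __name__ == "__main__":
    run_with_restarts(lifetime=0.2, cycles=2)
```

The key point is the terminate()/join() pair: whatever the child leaked, the OS gets it all back when the process dies.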
There is a common misconception that memory leaks can’t happen in Python. They can and do… it’s just that they occur at a higher level than the textbook examples in C/C++.
Now onto the ‘resource management with threads’ thought: you actually can do resource management with threads, but it isn’t exactly typical and can even be detrimental in some cases.
Here’s an interesting case for using a process worker instead of threads on Linux:
On Linux our app could be running in a cgroup or under some other memory-limiting apparatus. If a single process holds all our worker threads and runs out of memory, the OOM-killer can kill it, ending the entire app. If each worker is its own process, the OOM-killer might kill just one worker process, and the rest of the app continues on. After all, the OOM-killer kills whole processes, not individual threads.
In terms of killing a single thread: POSIX does provide pthread_cancel() and pthread_kill(), and Windows similarly has TerminateThread(). These aren’t exposed in Python because it’s generally considered bad practice to kill a single thread. What if the thread is midway through I/O or some other operation? What if it’s holding something like a threading.Lock? When a process is killed, the OS can clean up after it; there is no equivalent cleanup for a single thread, so killing one can leave the program in a weird state.
A better way to make a long-running thread exitable is to have it periodically check a threading.Event (or something similar) for a signal to finish and exit. Then in the parent, set the event, call .join(), and move on. With long-running operations, it can be tough to find a place to regularly check the event and clean up, which is another reason someone might reach for a Process instead, since it can always be terminate()’d.
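The Event pattern looks roughly like this. A minimal sketch: the “unit of work” is just appending to a list, and the sleep intervals are arbitrary:

```python
import threading
import time

stop_event = threading.Event()
results = []

def worker():
    # Do small units of work, checking the event between them.
    while not stop_event.is_set():
        results.append(1)            # stand-in for one unit of work
        # wait() doubles as a sleep that wakes up early on shutdown
        stop_event.wait(timeout=0.01)

t = threading.Thread(target=worker)
t.start()
time.sleep(0.1)

stop_event.set()  # signal the worker to finish
t.join()          # wait for it to exit cleanly
```

Using stop_event.wait() instead of time.sleep() between work units means the thread reacts to shutdown immediately rather than finishing its nap first.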
Now for memory usage as resource management with threads:
If a thread runs away wasting or leaking memory, the other threads share the same process space, so even if the bad thread exits, we may still be stuck with lingering memory issues.
So how can we work around this?
In terms of the shared memory space, we can partly work around this by using regular locals and thread-local storage. Anything local or thread-local to a thread is freed when the thread exits, as opposed to global/shared memory, which can be left behind.
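A small sketch of the thread-local side of that, using threading.local (the buffer contents here are made up for illustration):

```python
import threading

local = threading.local()
seen = []

def worker(n: int):
    # Each thread gets its own `local.buf`; it becomes garbage when the
    # thread exits, unlike a module-level list, which would outlive it.
    local.buf = list(range(n))
    seen.append(len(local.buf))

threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(seen)                   # each worker built its own 1000-item buffer
print(hasattr(local, "buf"))  # False: the main thread never set one
```

Once the threads have joined, their local.buf lists have no remaining references and can be collected, whereas anything they appended to the shared seen list stays around.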
Of course these are just some general pieces of guidance, every app is different and has its own quirks and needs.