I’m currently working on a project that requires threads. The whole system works as follows:
We can have N remote devices connected to a server that also works as a web server.
The remote devices need to send data through the network, so I’m using sockets.
When the server starts, a thread is launched to listen for connection requests from the devices.
When a connection is accepted, this thread creates a new thread (so up to N threads if all the devices connect).
These threads will receive the data from the devices.
Whenever there is a socket error or timeout, the thread should be killed and a new one should be created.
Everything works fine this way, except that every time a connection drops and a new thread is created, threading.active_count() keeps increasing.
This leads to huge memory usage, and eventually the whole process is killed for lack of memory.
One more thing to note: the devices are in remote places and we rely on a radio link, so from time to time the devices disconnect. The communication between the server and the devices follows a peer-to-peer approach, so both ends have a listener and a client.
The code is similar to this:
After searching the internet, I found that the “break” keyword exits the while loop. Also, as I understand it, a thread ends once all the instructions inside its function have finished. So the break in my code should exit the while loop and terminate the thread. For some reason that is not what happens.
But here is something I forgot to mention: if I shut down the script on the remote device and run it again, the threads actually die (only the most recently created ones). Otherwise, they stay alive.
I’m wondering whether it has anything to do with the fact that I’m creating a new thread inside another thread, or with the fact that I have a web server (WSGI) running.
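For reference, here is a minimal sketch of the behaviour described above. It is not the code from this project; `worker` and `stop_after` are made-up names. Once the loop hits `break`, the function returns and the thread is no longer alive.

```python
import threading
import time


def worker(stop_after):
    """Hypothetical thread function: loop, then break and return."""
    count = 0
    while True:
        count += 1
        if count >= stop_after:
            break  # leaves the loop; the function then returns
        time.sleep(0.01)


t = threading.Thread(target=worker, args=(3,))
t.start()
t.join()  # returns once worker() has returned
print(t.is_alive())  # False
```

This only holds if execution actually reaches the `break`; a thread blocked inside a call never gets there.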
I suggest you add logging so that you know whether your thread functions actually exit. Put a log call at the start and the end of each thread function, and include the thread ID so you can tell them apart.
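Something along these lines might do, as a rough sketch. The names `receive_from_device` and `work` are placeholders for your own thread function and its receive loop. The `%(threadName)s` and `%(thread)d` fields are standard `logging` record attributes, so every line carries the thread's name and ID without extra code.

```python
import logging
import threading

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(threadName)s (%(thread)d) %(message)s",
)
log = logging.getLogger(__name__)


def receive_from_device(work):
    """Placeholder thread function; `work` stands in for the receive loop."""
    log.info("thread started")
    try:
        work()
    except Exception:
        # Log the traceback instead of letting the thread die silently.
        log.exception("thread failed")
    finally:
        # Runs whether work() returned normally or raised.
        log.info("thread exiting")


if __name__ == "__main__":
    t = threading.Thread(
        target=receive_from_device, args=(lambda: None,), name="device-1"
    )
    t.start()
    t.join()
    log.info("active threads: %d", threading.active_count())
```

If a thread logs "thread started" but never "thread exiting", it is still blocked somewhere inside the function.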
OK, I found the problem! The goal was for the threads to exit whenever there is a timeout. For this to happen, the socket needs to have a timeout set. I misunderstood the socket documentation: I thought that when we set the timeout on the client side (on the remote device), the socket.connect function would, let’s say, create a duplicate on the server side. Basically, I thought socket.accept on the server side would return a socket with the same configuration as the one on the client side. That is not how it works. Once a timeout is set on the accepted socket on the server side, the timeout fires, the exception is caught, and the thread exits.
What was happening was that the thread was always stuck in socket.recv(). Every time the client side disconnected, the listener created a new thread, so the result was hundreds of threads stuck in socket.recv().
Now that I have set a timeout, the thread count no longer increases.
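In case it helps someone else, the fix looks roughly like the sketch below. This is not my exact code: `handle_device`, `serve`, and the 30-second `RECV_TIMEOUT` are illustrative names and values. The important part is calling `settimeout()` on the socket returned by `accept()`.

```python
import socket
import threading

RECV_TIMEOUT = 30.0  # seconds; assumed value, tune it for the radio link


def handle_device(conn, timeout=RECV_TIMEOUT):
    """Receive until the peer closes, an error occurs, or recv() times out."""
    # The timeout has to be set here: the socket returned by accept() does
    # not pick up whatever the client configured on its own end.
    conn.settimeout(timeout)
    received = bytearray()  # collected only so the sketch can return something
    try:
        while True:
            data = conn.recv(4096)
            if not data:  # peer closed the connection cleanly
                break
            received.extend(data)
    except socket.timeout:
        pass  # nothing arrived in time; fall through so the thread can end
    except OSError:
        pass  # connection reset or another socket error
    finally:
        conn.close()
    return bytes(received)


def serve(listener, timeout=RECV_TIMEOUT):
    """Accept devices in a loop, starting one receiver thread per connection."""
    while True:
        conn, _addr = listener.accept()
        threading.Thread(
            target=handle_device, args=(conn, timeout), daemon=True
        ).start()
```

With this in place, a device that silently drops off the link leaves its thread blocked for at most `timeout` seconds instead of forever.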
Anyway, thank you for helping me.
Regards