The note about atexit running after threads are joined (which is true, I double-checked the finalisation code) makes me wonder if we could improve things by switching that to a two-phase shutdown process.
Specifically, add a new atexit.register_early callback list that gets executed before the thread join. Callbacks registered that way would be able to tell non-daemon threads to shut down, unlike regular exit handlers.