Concerns regarding deprecation of "fork()" with live threads

For example, if a thread holds a lock and you fork, the new process has only a single thread; all the other threads are gone. What should the state of the lock be?
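A minimal POSIX-only sketch of the problem (timings and exit codes are arbitrary choices for illustration): a background thread holds a lock when the process forks, and the child, which has no such thread, finds the lock permanently held.

```python
import os
import threading
import time

lock = threading.Lock()

def hold_lock():
    # Background thread grabs the lock and keeps it for a while.
    lock.acquire()
    time.sleep(5)
    lock.release()

t = threading.Thread(target=hold_lock, daemon=True)
t.start()
time.sleep(0.2)  # make sure the thread actually holds the lock before forking

pid = os.fork()
if pid == 0:
    # Child: the thread holding the lock does not exist here, but the
    # lock's state was copied as "held", so a plain acquire() would
    # block forever. Use a timeout to observe this instead of hanging.
    acquired = lock.acquire(timeout=1)
    os._exit(0 if acquired else 42)
else:
    _, status = os.waitpid(pid, 0)
    code = os.WEXITSTATUS(status)
    print("child acquired lock:", code == 0)  # → child acquired lock: False
```

(On recent CPython versions this also triggers the very DeprecationWarning under discussion, since `fork()` is called while other threads exist.)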

The most logical answer is that the mistake was already made when "the new process has a single thread." All threads should be forked, including the one holding the lock, which is what Solaris implemented as forkall().

See Thorsten Ball's "Why threads can't fork" for a good explanation. Even forkall obviously can't fork things external to the process (the file system, the network, and so on), so forking will always be dangerous; but I do think the forkall approach draws a better line between what you can fork and what you can't than fork does. I'm surprised it hasn't been standardized anywhere.

This state of affairs is really, really unfortunate for speculative execution of Python processes. I’d love to be able to cheaply take a running Python process, clone it exactly into a throwaway “sandbox” process, do Python operations that would normally update mutable state, but still be able to go back to the original process and everything would be as if I never did my exploration. This would be especially great for reproducible science.
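For a single-threaded process, fork already gives exactly this kind of cheap copy-on-write sandbox; a minimal POSIX-only sketch (the `state` dict is just a stand-in for arbitrary mutable interpreter state):

```python
import os

state = {"numbers": [1, 2, 3]}

pid = os.fork()
if pid == 0:
    # Child "sandbox": mutate state freely; copy-on-write means the
    # parent's memory pages are untouched by anything done here.
    state["numbers"].append(999)
    os._exit(0)

# Parent: wait for the throwaway child, then carry on as if the
# exploration never happened.
os.waitpid(pid, 0)
print(state["numbers"])  # → [1, 2, 3]
```

The catch, of course, is the multithreaded case discussed below.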

However, because Python processes can have multiple threads (and often do when scientific packages are in use), the very cheap fork, which almost does exactly what we need, won't work. The cheapest alternative with respect to memory seems to be something like CRIU: dump the entire process into a tmpfs, then restore it twice with mmap and MAP_PRIVATE. Not only would this be incredibly slow, but I suspect it would also make the resulting Python processes slower, because an mmap'ed stack would not behave the way optimizations expect it to with respect to page faults and so on.
