Why can preexec_fn in subprocess.Popen lead to a deadlock?

I’m trying to create a new thread in preexec_fn and do some fancy stuff, but I only get unexpected behavior.

I noticed the following in the CPython source:

static PyObject *
subprocess_fork_exec(PyObject *module, PyObject *args)
{
.....
    /* We need to call gc.disable() when we'll be calling preexec_fn */
    if (preexec_fn != Py_None) {
        need_to_reenable_gc = PyGC_Disable();
    }
......
do_fork_exec(......)
......
}
Py_NO_INLINE static pid_t
do_fork_exec(......)
{
......
    if (preexec_fn != Py_None) {
        /* We'll be calling back into Python later so we need to do this.
         * This call may not be async-signal-safe but neither is calling
         * back into Python.  The user asked us to use hope as a strategy
         * to avoid deadlock... */
        PyOS_AfterFork_Child();
    }
......
child_exec(.....)
......
}
Py_NO_INLINE static void
child_exec(.....)
{
......
    reached_preexec = 1;
    if (preexec_fn != Py_None && preexec_fn_args_tuple) {
        /* This is where the user has asked us to deadlock their program. */
        result = PyObject_Call(preexec_fn, preexec_fn_args_tuple, NULL);
        if (result == NULL) {
            /* Stringifying the exception or traceback would involve
             * memory allocation and thus potential for deadlock.
             * We've already faced potential deadlock by calling back
             * into Python in the first place, so it probably doesn't
             * matter but we avoid it to minimize the possibility. */
            err_msg = "Exception occurred in preexec_fn.";
            errno = 0;  /* We don't want to report an OSError. */
            goto error;
        }
        /* Py_DECREF(result); - We're about to exec so why bother? */
    }
......

I’m new to the CPython source code, and I really don’t know which lock is involved here that could lead to a deadlock.
BTW, why do we need to disable GC before fork?

Any information is appreciated :smiley:

It is … complicated.

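On Unix, processes are spawned in two steps. First a program clones itself (fork, see man fork(2)), then it loads and runs an executable image (execv*, see man exec(3)). If the program calls any non-async-signal-safe function between fork and exec, it can lead to a deadlock. In a multi-threaded program, only the thread that called fork exists in the child, so a lock that some other thread held at that moment (for example inside malloc) is never released there. The preexec_fn is called between fork and exec (hence the name), and it can call unsafe functions such as malloc.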

https://man7.org/linux/man-pages/man7/signal-safety.7.html
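To make the failure mode concrete, here is a minimal sketch (not taken from CPython; the helper name child_can_take_lock is made up for illustration). It uses an ordinary threading.Lock as a stand-in for an internal lock such as the one inside malloc, and a timeout so the child gives up instead of hanging forever:

```python
import os
import threading


def child_can_take_lock(hold_in_other_thread: bool) -> bool:
    """Fork and report whether the child managed to acquire `lock`."""
    lock = threading.Lock()      # stand-in for an internal lock (e.g. in malloc)
    held = threading.Event()
    release = threading.Event()

    def holder():
        with lock:
            held.set()           # tell the main thread the lock is now held
            release.wait()       # keep holding it until told to stop

    thread = None
    if hold_in_other_thread:
        thread = threading.Thread(target=holder)
        thread.start()
        held.wait()

    pid = os.fork()
    if pid == 0:
        # Child: only the forking thread exists here, so nobody is left
        # to release a lock that another thread held at fork time.
        code = 1
        try:
            code = 0 if lock.acquire(timeout=0.5) else 1
        finally:
            os._exit(code)

    # Parent: collect the child's verdict, then let the holder thread finish.
    _, status = os.waitpid(pid, 0)
    if thread is not None:
        release.set()
        thread.join()
    return os.WEXITSTATUS(status) == 0
```

With no other thread involved the child takes the lock immediately; with another thread holding it at fork time the child can only time out. Without the timeout, that second case would be a deadlock.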

Thanks a lot.
BTW, creating a new thread in preexec_fn accomplishes nothing, because exec replaces the process image of the very process that ran preexec_fn, which destroys every thread in it.
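A small experiment suggests that exec does not start yet another process: the PID seen inside preexec_fn, the PID of the executed program, and Popen.pid should all be the same. This is only a sketch; the helper names are made up for illustration, and it assumes a POSIX system where sys.executable points at a working interpreter.

```python
import os
import subprocess
import sys


def pids_before_and_after_exec():
    """Return (Popen.pid, PID seen in preexec_fn, PID seen after exec)."""
    r, w = os.pipe()

    def report_pid():
        # Runs in the forked child, before exec replaces the process image.
        os.write(w, str(os.getpid()).encode())

    try:
        proc = subprocess.Popen(
            [sys.executable, "-c", "import os; print(os.getpid())"],
            stdout=subprocess.PIPE,
            preexec_fn=report_pid,
        )
        out, _ = proc.communicate()      # waits for the child to exit
        pre_exec_pid = int(os.read(r, 64))
    finally:
        os.close(w)
        os.close(r)
    return proc.pid, pre_exec_pid, int(out)
```

Since the process stays the same and only its image is replaced, anything preexec_fn set up inside the old image, threads included, does not carry over into the new program.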

It seems the ‘fork’ version of Popen in CPython’s multiprocessing library does not use the exec system call. Instead, it just runs Python code right after os.fork().

How does this avoid the deadlock issue you mentioned above?

class Popen(object):
    method = 'fork'

    def __init__(self, process_obj):
        util._flush_std_streams()
        self.returncode = None
        self.finalizer = None
        self._launch(process_obj)

    ......

    def _launch(self, process_obj):
        code = 1
        parent_r, child_w = os.pipe()
        child_r, parent_w = os.pipe()
        self.pid = os.fork()
        if self.pid == 0:
            try:
                os.close(parent_r)
                os.close(parent_w)
                code = process_obj._bootstrap(parent_sentinel=child_r)
            finally:
                os._exit(code)
        else:
            os.close(child_w)
            os.close(child_r)
            self.finalizer = util.Finalize(self, util.close_fds,
                                           (parent_r, parent_w,))
            self.sentinel = parent_r

    def close(self):
        if self.finalizer is not None:
            self.finalizer()

It doesn’t avoid it. The ‘fork’ method is not safe when combined with threading. You have to be very careful or use a different start method, such as ‘spawn’ or ‘forkserver’.
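One way to read "a different type of runner" is a different multiprocessing start method. The sketch below assumes that reading; the helper name pick_thread_safer_context is made up for illustration, while get_all_start_methods and get_context are the documented multiprocessing functions.

```python
import multiprocessing as mp


def pick_thread_safer_context():
    """Return a multiprocessing context that does not use plain 'fork'.

    Assumption: 'forkserver' is preferred where the platform offers it,
    with 'spawn' (available on every platform) as the fallback.
    """
    if "forkserver" in mp.get_all_start_methods():
        return mp.get_context("forkserver")
    return mp.get_context("spawn")
```

The returned context offers the same API as the multiprocessing module itself, so ctx.Process(target=...) or ctx.Pool(...) can be used in place of the module-level names. With these start methods the target has to be importable by the child, and the launching code should sit under an if __name__ == "__main__": guard.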

:thinking: Got it. Thanks a lot!