subprocess._USE_VFORK escape hatch broken, fix or remove?

I’m working to generalize the strace validation that vfork is used when expected (cpython/Lib/test/test_subprocess.py at main · python/cpython · GitHub), motivated by adding a test to gh-120754: Reduce system calls in full-file readall case by cmaloney · Pull Request #120755 · python/cpython · GitHub. Unfortunately, with recent strace binaries that test hasn’t been working: clone2 only exists on Linux ia64, and strace exits with an error when clone2 is passed to it on x86_64, which causes the test to be skipped. While fixing that, I’ve found that at some point the subprocess._USE_VFORK flag stopped working[0].

I’m wondering whether there is a strong preference to fix the flag or to remove it along with the documentation around it. The escape hatch was added in April 2022 and backported to Python 3.10 (gh-91401: Add a failsafe way to disable vfork. by gpshead · Pull Request #91490 · python/cpython · GitHub, to resolve potential undefined behavior with subprocess using vfork() on Linux? · Issue #91401 · python/cpython · GitHub).

cc @gpshead and @vstinner, who worked on the escape hatch and the strace test.

[0]
Sample code:

import subprocess
subprocess._USE_VFORK = False
try:
    subprocess.check_call(
            ['/bin/true'], **dict())
except PermissionError:
    if not False:
        raise

Command that can be run with subprocess (what the test runs), followed by its strace output:

['/usr/bin/strace', '--trace=%process', 'python', '-X', 'faulthandler', '-I', '-c', "import subprocess\nsubprocess._USE_VFORK = False\ntry:\n    subprocess.check_call(\n            ['/bin/true'], **dict())\nexcept PermissionError:\n    if not False:\n        raise"]
execve("<path>/python/build/python", ["<path>/pytho"..., "-X", "faulthandler", "-I", "-c", "import subprocess\nsubprocess._US"...], 0x7ffe3b1c5470 /* 65 vars */) = 0
clone3({flags=CLONE_VM|CLONE_VFORK|CLONE_CLEAR_SIGHAND, exit_signal=SIGCHLD, stack=0x79a0b6254000, stack_size=0x9000}, 88) = 260561
wait4(260561, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 260561
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=260561, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
exit_group(0)                           = ?
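
For anyone wanting to check a trace like the one above mechanically, here is a minimal sketch of a helper that scans strace output for vfork-style process creation. The function name `uses_vfork` and the regular expression are my own illustration, not what test_subprocess.py does; it assumes strace prints one syscall per line, optionally prefixed with `[pid N]`.

```python
import re

# Hypothetical helper, not part of CPython's test suite.
# Matches a vfork() call, or a clone()/clone3() call whose flags
# include CLONE_VFORK, at the start of a strace output line
# (allowing for the "[pid N]" prefix strace adds when following children).
_VFORK_CALL = re.compile(
    r"^(?:\[pid\s+\d+\]\s+)?(?:vfork\(|clone3?\(.*\bCLONE_VFORK\b)"
)


def uses_vfork(strace_output: str) -> bool:
    """Return True if any traced syscall line looks like a vfork."""
    return any(_VFORK_CALL.search(line) for line in strace_output.splitlines())
```

Applied to the trace above, the clone3 line carries CLONE_VFORK, so the helper reports that vfork was used even though `subprocess._USE_VFORK = False` was set.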

Off the top of my head (which is full… so does that mean much?) – I’m not aware of anyone actually using the escape hatch.

We had concerns that someone may need it during the initial implementation, but does anyone have references to people or projects that found a practical need and are willing to share those publicly?

If it isn’t behaving right, that’s a sign that we aren’t really testing it reliably…
