Is it time to remove the os module spawn* functions?

While translating the os module docs, I noticed some possible errors in the documentation of the spawn* functions.

The docs say:

os.spawnl(mode, path, ...)
os.spawnle(mode, path, ..., env)
os.spawnlp(mode, file, ...)
os.spawnlpe(mode, file, ..., env)
os.spawnv(mode, path, args)
os.spawnve(mode, path, args, env)
os.spawnvp(mode, file, args)
os.spawnvpe(mode, file, args, env)

    Execute the program path in a new process.

The docs mix up path and file here. However, the functions themselves all declare a file parameter.

In addition, the documentation suggests using the subprocess module instead of these functions. That module was introduced in Python 2.4 (more than twenty years ago) by PEP 324, and concerning backward compatibility the PEP says that the spawn* functions are “expected to be available in future Python versions for a long time, to preserve backwards compatibility”.

I think the time to remove them is coming.


To help you with your immediate translation confusion: The ones that say “path” in them (and don’t have a “p” in the name) take a /path/to/executable as their second argument. For example:

>>> os.spawnl(os.P_WAIT, "/bin/echo", "echo", "Hello, world")
Hello, world
0

But the ones that say “file” (and do have a “p” in the name) take, instead, the name of a command. Personally, I’d prefer to name it cmd rather than file as that more clearly says what’s going on. Significantly, this command will be searched for, and thus should NOT include a full path.

>>> os.spawnlp(os.P_WAIT, "echo", "echo", "Hello, world")
Hello, world
0

So that’s the difference, that’s what you’re trying to capture. There’s basically a matrix here. Every function is either l format (line up the arguments in the call) or v format (pass a single value with a collection of arguments); every function is either p or not-p (will search $PATH or won’t search); and every function is either e or not-e (will replace the environment, or won’t replace the environment). Thus there are eight functions.
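As a rough illustration of that matrix (a sketch, not something from the docs), the eight names can be generated from the three independent choices. The helper name spawn_names is made up for this example.

```python
import itertools
import os

def spawn_names():
    """Build the eight spawn* names from the three independent choices."""
    names = []
    # l/v: argument format; "p": search $PATH; "e": replace the environment.
    for fmt, search, env in itertools.product("lv", ("", "p"), ("", "e")):
        names.append("spawn" + fmt + search + env)
    return names

for name in spawn_names():
    # The "p" variants are documented as unavailable on Windows, so check first.
    status = "available" if hasattr(os, name) else "missing on this platform"
    print(name, status)
```

The order of the generated names happens to match the order in which the docs list them.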

The API isn’t great by any means (it’s closely imitating the C functions), but it works, and unless there’s a good reason to remove them, I’d be inclined to keep them.


(BTW, I moved this into “Ideas” since it’s not the discussion of an actual PEP document; it may also do better if moved into “Python Help” as more of a question than a proposal.)


One issue is that they don’t use the subprocess module, so they don’t restore signal handlers and don’t close file descriptors. The subprocess module handles many small corner cases.


See also the discussion “How to deal with unsafe/broken os.spawn* arg handling behavior on Windows”.

Thank you, @Rosuav.

At first this sounded weird to me. But then I realized that the “p” in the names refers to the PATH environment variable (which the function will search), not to the “path” parameter.

However, the real problem is that function definitions do not have a “path” parameter. They work differently, but they all define the “file” parameter.

My first idea was to fix the doc before suggesting removing these functions.

So, what about fixing the doc? Can I go ahead with this? This will align with the function definitions and, I think, will remove this little confusion about p and the “path” parameter.

I agree with you about cmd. How hard is renaming this?


I support deprecating these legacy os APIs, though before deciding we should do some research to find out whether anything still uses them.

We could also re-implement them on top of subprocess. But the potential for subtle behavior differences makes that even more painful. I’d rather reduce our maintenance burden and API surface and let people who do depend on subtle behaviors in their specific scenario discover that on their own during a deprecation period.

The os.spawn* API docstrings all use text of the form:

“Execute file with …”

“Execute file (which is looked for along $PATH) with …”

Those are accurate. The naming convention for the APIs with the ‘l’, ‘v’, ‘p’, and ‘e’ monikers all originate from the underlying C exec* APIs.

If you pass a fully specified path name to a p function, it should just behave the same as its non-p variant: a full path should cause the exec*p() variant under the hood to skip the $PATH check.
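A small sketch of that claim, assuming a POSIX system (the p variants don’t exist on Windows) and assuming sys.executable is an absolute path, which it normally is:

```python
import os
import sys

# Run the current interpreter with a no-op script; args[0] is the program name.
args = [sys.executable, "-c", "pass"]

# Non-p variant: the second argument is used as a path, as-is.
rc_plain = os.spawnv(os.P_WAIT, sys.executable, args)

# p variant: the name has a directory part, so no $PATH search should be needed.
rc_search = os.spawnvp(os.P_WAIT, sys.executable, args)

print(rc_plain, rc_search)
```

Both calls should report the same exit code for the same program.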

We cannot rename the arguments. Existing code may depend on them as keyword argument names. And changing the documentation to not match the argument name carries the burden of needing to explain in the documentation what the actual argument name is, while not using that name in the docs, so that people encountering a use of keyword arguments can make sense of things. A nice way to do that is a style question for docs folks to consider. This isn’t the only stdlib API with imperfect parameter names on things intended to be positional, and thus name unseen, but long predating positional-only arguments.

“cmd” is inaccurate, as a “command” can be read to imply the use of a shell and potential additional parsing of an arbitrary string, which these APIs do not do. The existing “file” feels accurate enough, as “file” is intentionally ambiguous as to whether or not it is fully specified, and fully specified paths always work. Hindsight could prefer the name to be “executable”, but that isn’t going to happen because we’re not going to change the API.


Yeah. I had seen these docstrings. The problem is in the docs at docs.python.org.

We don’t appear to even have internal consistency on the parameter names. Lib/os.py and Modules/posixmodule.c (which appears to contain the implementation for Windows?) differ on when they call the argument “file” vs “path”.


We could deprecate them, leave them in place for a couple of years before removing them, and in the meantime explain in the docs how to replace them (with a note that there may be subtle differences/improvements).

This is work I’d be interested in doing.
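For what such a replacement note might look like, here is a minimal sketch, not a complete mapping. The child command is made up for the example, and the “roughly” in the comments is deliberate, since the two APIs differ in corner cases.

```python
import subprocess
import sys

# Hypothetical child command: the current interpreter printing one line.
cmd = [sys.executable, "-c", "print('Hello, world')"]

# Roughly os.spawnv(os.P_WAIT, cmd[0], cmd): run, wait, get the exit code.
exit_code = subprocess.run(cmd).returncode

# Roughly os.spawnv(os.P_NOWAIT, cmd[0], cmd): start the child without waiting.
proc = subprocess.Popen(cmd)
pid = proc.pid   # on Unix, spawn* with P_NOWAIT returns the child's pid
proc.wait()      # reap the child once we are done with it

print(exit_code, proc.returncode)
```

Note that subprocess takes the argument list only once, whereas spawn* takes the program and then the full argv including argv[0].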


Correct.

Ohh, I see what you mean. Relevant if people pass file as a keyword argument, which would be highly unusual, but technically possible.

Technically it’s a backward incompatible change (in case anyone was passing it by keyword), but given that the docs don’t even match the function now, I would support renaming it, and deprecating the functions (not removing them any time soon though). I would also be inclined to make those into positional-only parameters, but that’s again a (technically) incompatible change, so that’s going to need some careful consideration.
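To make the incompatibility concrete, here is a sketch with hypothetical stand-in functions (these are not the real os functions, just the same shape of signature):

```python
# Hypothetical stand-ins for illustration only.
def spawn_current(mode, file, *args):
    return (mode, file, args)

def spawn_renamed(mode, path, *args):        # parameter renamed
    return (mode, path, args)

def spawn_posonly(mode, file, /, *args):     # parameters made positional-only
    return (mode, file, args)

def accepts_file_keyword(func):
    """Return True if func can still be called with file= as a keyword."""
    try:
        func(0, file="echo")
    except TypeError:
        return False
    return True

# Positional calls are unaffected by either change.
assert spawn_current(0, "echo") == spawn_renamed(0, "echo") == spawn_posonly(0, "echo")

for func in (spawn_current, spawn_renamed, spawn_posonly):
    print(func.__name__, accepts_file_keyword(func))
```

Only callers that pass file by keyword are affected, which is why both changes are “technically” incompatible rather than broadly breaking.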


We can soft deprecate these functions: Glossary — Python 3.14.0a0 documentation

A soft deprecation can be used when using an API which should no longer be used to write new code, but it remains safe to continue using it in existing code. The API remains documented and tested, but will not be developed further (no enhancement).

The main difference between a “soft” and a (regular) “hard” deprecation is that the soft deprecation does not imply scheduling the removal of the deprecated API.

Another difference is that a soft deprecation does not issue a warning.
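A sketch of that last difference, using two hypothetical functions rather than anything from the stdlib:

```python
import warnings

def hard_deprecated():
    """Hypothetical API under a regular deprecation: warns at call time."""
    warnings.warn("hard_deprecated() is deprecated", DeprecationWarning, stacklevel=2)
    return 0

def soft_deprecated():
    """Hypothetical API under a soft deprecation: only the docs say so."""
    return 0

def count_warnings(func):
    """Call func and count the warnings it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        func()
    return len(caught)

print(count_warnings(hard_deprecated), count_warnings(soft_deprecated))
```

Existing callers of a soft-deprecated function see no change in behavior at runtime.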


While we are talking about os functions, there is still os.popen(), which has a weird (“not regular”) API.

The close() method returns None if the subprocess exited successfully, or the subprocess’s return code if there was an error.
(…)
On Unix, waitstatus_to_exitcode() can be used to convert the close() method result (exit status) into an exit code if it is not None. On Windows, the close method result is directly the exit code (or None).

The result depends on the platform.
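A sketch of what portable handling of close() ends up looking like, based on the doc text quoted above. The helper name run_with_popen is made up for this example.

```python
import os

def run_with_popen(command):
    """Run a shell command via os.popen(); return (output, exit_code)."""
    pipe = os.popen(command)
    output = pipe.read()
    status = pipe.close()
    if status is None:
        # None means the subprocess exited successfully.
        return output, 0
    if os.name == "nt":
        # On Windows, close() already returns the exit code.
        return output, status
    # On Unix, close() returns a wait status that has to be decoded.
    return output, os.waitstatus_to_exitcode(status)

print(run_with_popen("exit 3"))
```

With subprocess.run() none of this branching is needed, since returncode means the same thing everywhere.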

This function is bad since it uses shell=True by default (and it’s not possible to use shell=False).

I would also suggest soft deprecating os.popen().


Could os.system() be given the same treatment? Part of me dies whenever I see it used (although that is partly because seemingly most users use it either for invoking pip or doing file manipulations that should have been done with os.{rename,remove,...}).


Just to get the stats clear, on grep.app, filtering for Python usage only:

os.system is much, much more common. Maybe we should discourage its use more heavily, but I don’t think it can simply be given the same treatment.


  1. I don’t have an opinion about os.popen; I just added it here as a useful data point.


I confess I use os.system a fair bit for quick’n’dirty debugging, a
bit like print(), eg:

 os.system("ls -ld %r" % (datafilename,))

This is at least slightly robust against odd filenames, particularly if my
code made the filename. Not an argument for encouraging its use.
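For comparison, %r quotes by Python’s rules rather than the shell’s, so it only happens to work for tame names. A sketch of the more robust spelling, assuming a POSIX shell (the helper name ls_command is made up):

```python
import shlex

def ls_command(filename):
    """Build an 'ls -ld' command line with the filename quoted for a POSIX shell."""
    return "ls -ld " + shlex.quote(filename)

# A name containing shell metacharacters stays a single, inert argument.
print(ls_command("odd name; echo oops"))
# It could then be run with os.system(ls_command(name)), or the shell could be
# avoided entirely with subprocess.run(["ls", "-ld", name]).
```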


People seem to be glossing over where these functions come from. They’re
POSIX defined interfaces, and the spawn* and spawnp* names use
path and file respectively. Example man pages from Linux and OpenBSD
illustrating this:

https://man.openbsd.org/posix_spawn.3

https://manpages.ubuntu.com/manpages/focal/man3/posix_spawn.3.html

Personally I’m against removing these functions - when one wants access
to the OS API, the os and/or posix modules are the available way to
do that.

By all means strongly encourage people to use subprocess for almost
everything in this space - it is flexible, portable and well behaved.
But removing the API to get at these OS interfaces? I’m quite -1 on
that.


Python r"os.spawn[elpv]*" APIs have nothing to do with posix_spawn.

On Linux, macOS, & BSD they use a not-safe-in-threaded-processes pure Python fork() + exec(). On Windows they use some windows specific spawn related C APIs.

These are really old. I encourage people to look at the code behind them before thinking they are something they are not.

https://github.com/python/cpython/blob/3.11/Lib/os.py#L835 - Linux et al.
cpython/Modules/posixmodule.c at 3.11 · python/cpython · GitHub - Windows
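For readers who don’t want to open the source, the pattern on the POSIX side looks roughly like the following. This is a simplified sketch of the fork() + exec() approach, not the actual code from Lib/os.py, and the function name spawn_and_wait is made up.

```python
import os
import sys

def spawn_and_wait(path, args):
    """POSIX-only sketch of the fork() + exec() pattern behind P_WAIT."""
    pid = os.fork()
    if pid == 0:
        # Child: replace the process image; execv only returns on failure.
        try:
            os.execv(path, args)
        except OSError:
            pass
        os._exit(127)
    # Parent: wait for the child and decode the wait status into an exit code.
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)

exit_code = spawn_and_wait(sys.executable, [sys.executable, "-c", "raise SystemExit(5)"])
print(exit_code)
```

Between fork() and exec() the child is running Python code in a copy of the parent, which is where the trouble in threaded processes comes from.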


A soft deprecation does NOT imply removal.


Indeed. We should definitely be discouraging their use for anyone who wouldn’t have used the equivalent C functions. Getting things right with them is an unnecessary struggle, when subprocess.run is right there.

os.system is separate, I think, as that has a convenience factor that spawn* does not, and there’s no problem using system() in your own scripts on your own system. I’d bet people are more likely to get a casual system() call right than spawn*(), too, even with the quoting required.

Overall, we’re getting to the point where probably nobody should be reaching for the os module for anything unless they’re deliberately seeking compatibility with libc. And I think that’s a perfectly good purpose for that module to have. Reimplementing os functions in terms of higher-level functions is terrible layering - the higher-level ones (subprocess, pathlib, etc.) should be implemented using os, thereby saving users from having to use os directly but still allowing them to access low-level functionality.

So I’m +1 on documenting that there are better alternatives (which I believe we already do), +0 on soft deprecating (basically, we won’t try too hard to “fix” issues in OS behaviour), and -1 on removing anything from os or reimplementing it with higher-level modules.
