Why is subprocess.list2cmdline not documented?

Hello!

I have been using subprocess.list2cmdline for quite some time now, as it’s very useful, but it does not show up in the Python docs. Is there a reason for that, or is this going to be my first Python PR :stuck_out_tongue: ?

It looks like this function is meant to be internal, as indicated by the comment at cpython/subprocess.py at 4bb1dd3c5c14338c9d9cea5988431c858b3b76e0 · python/cpython · GitHub.

Given that it gets used anyways we may as well document it properly. Feel free to loop me, gpshead, in on your issue/PR.

I’d be curious as to what use case there is for it outside of preparing the arguments in subprocess.Popen. I don’t think we should make it public (and hence supported) without a good understanding of why people want it.

I feel like we decided at some point that it’s not cross-platform enough to be worth documenting? shlex.join is documented (and also accurately documented as not-very-cross-platform).

subprocess __all__ is incomplete · Issue #55047 · python/cpython · GitHub is the issue where this was previously discussed

Thanks. That issue also simply says someone was using the API, but didn’t clarify why.

To note, this also came up in

and the related

I’m aware of the old issue about it being undocumented and not underscore-prefixed. The reason I suggest we just document it at this point is that realistically it has a well-defined purpose and behavior. It gets used. So it is public and not meaningfully changeable even if we wanted to change it (a deprecation period for a name change would be required). Its name is not what we’d choose if it had been intended to be a public API, given its platform-specific (Windows) nature today, but this is hardly the only instance of that in our standard library.

Docs-wise I’d just call out this oddity in its documentation: (1) its behavior is Windows-specific regardless of its name, and (2) users of subprocess should not call it themselves, as subprocess.Popen calls it for them. Potential real world uses: Crafting a command line to display to a user, and as CAM pointed out: use with the legacy os.spawn* APIs.

I would continue to leave it out of __all__ in subprocess.py, just adjusting the existing code comment about that to reflect the modern reasoning in this discussion.

Alvaro: we’re curious what you use it for. :slight_smile:

Gregory P. Smith says:

Potential real world uses: Crafting a command line to display to a user,
and as CAM pointed out: use with the legacy os.spawn* APIs.

I have used this API ever since converting code to use subprocess, for
debugging, and to show the user the actual Windows command line to be
executed, given their inputs (which impact, but do not directly
contribute to the command line).

I forget after so many years how I discovered it, to know it was
available. Probably from the python-dev mailing list, when the
subprocess module was being created.

I love this function; sometimes I reach for it more than shlex.join, but that’s on me, because that’s more muscle memory than a real argument. Now, in the real world, I’ve seen this function used at Meta quite a bit, and I think the usages can be summarized in two ways:

  • debugging: just as a way to print the command that’s about to get executed, something like logger.debug(f"about to execute: {list2cmdline(cmd)}").
  • constructing scripts: list2cmdline is useful for outputting individual lines because, unlike shlex.join, list2cmdline can take Path arguments and its output does not quote shell metacharacters such as $, so shell variables remain usable.
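
The two usages above can be sketched roughly as follows. This is only an illustration: the command lists and the logger name are made up, and PurePosixPath is used so the rendered path looks the same on every platform. It assumes a Python version in which list2cmdline accepts path-like arguments.

```python
import logging
import shlex
import subprocess
from pathlib import PurePosixPath

logger = logging.getLogger("demo")  # hypothetical logger name

# 1. Debugging: render the argument list as one readable line.
cmd = ["git", "commit", "-m", "fix the bug"]
debug_line = subprocess.list2cmdline(cmd)
logger.debug("about to execute: %s", debug_line)

# 2. Script generation: path-like arguments are accepted, and "$" is
#    left alone, so a shell reading the line would still expand $DEST.
script_line = subprocess.list2cmdline(
    ["cp", PurePosixPath("/tmp/in.txt"), "$DEST"]
)

# For comparison, shlex.join quotes "$DEST" and rejects non-string items.
shlex_line = shlex.join(["cp", "/tmp/in.txt", "$DEST"])
try:
    shlex.join(["cp", PurePosixPath("/tmp/in.txt"), "$DEST"])
    shlex_accepts_paths = True
except TypeError:
    shlex_accepts_paths = False

print(debug_line)
print(script_line)
print(shlex_line)
```

Note that leaving "$" unquoted is only helpful when the expansion is intended; for untrusted input it is exactly what shlex.join protects against.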

A search on GitHub shows it’s being used by projects, but I can’t attest whether that usage is sane or not :stuck_out_tongue: (I saw someone replacing subprocess.list2cmdline with their own private function… I take no responsibility for that)

I think I agree with you: if the function is there and people are using it, it’s better to have it documented. But it’s also fine not to document it and have the reason be part of the docstring :slight_smile:

It should be documented that list2cmdline() is not guaranteed to work with shell=True. The CMD shell has its own command-line parsing rules. For example, its escape character is “^”, except that the “%” character in a command line can only be escaped using %^, not ^%, and not even with the %% trick that works in batch files. The reliability of this trick depends on no environment variable names starting with “^”. Also, it doesn’t even work right in a quoted string, since “^” is a literal character in that case.

>>> import subprocess
>>> _ = subprocess.call('echo %%windir%%', shell=True)
%C:\Windows%
>>> _ = subprocess.call('echo ^%windir%', shell=True)
C:\Windows
>>> _ = subprocess.call('echo %^windir%', shell=True)
%windir%
>>> _ = subprocess.call('echo "%^windir%"', shell=True)
"%^windir%"

Using list2cmdline() with the os.spawn*() functions entails mapping list2cmdline() to each item separately. That’s potentially a problem. As documented, the first item in the argument list should never contain literal double quote characters; if it does, list2cmdline() should raise ValueError. A quote() function could be implemented to support os.spawn*().
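
The per-item mapping for os.spawn*() could look something like the sketch below. The helper name quote_args is hypothetical (it is not part of subprocess), the sample arguments are made up, and the os.spawnv() call is left as a comment because it is only meaningful on Windows with a real executable path.

```python
import subprocess


def quote_args(args):
    """Quote each argument on its own and return a list (hypothetical helper)."""
    return [subprocess.list2cmdline([arg]) for arg in args]


args = ["prog.exe", "hello world", 'say "hi"', "plain"]
quoted = quote_args(args)
print(quoted)

# On Windows, the per-item result could then be handed to a spawn call,
# for example (path_to_prog is a placeholder):
# os.spawnv(os.P_WAIT, path_to_prog, quoted)
```

This sketch does not add the check on the first item described above; it only shows the mapping step.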

It’s not clear to me why there are separate pieces of functionality here in the first place. Both are intended to solve the same underlying problem - taking a list of arguments for a command, and turning them into a string that would be a valid command line for a shell - right?

Why not have platform-agnostic tools to do this - functionality that can build a valid command line for any specified OS, and which defaults to formatting it for the current OS? You know, like how pathlib works?

And why duplicate the functionality in subprocess, if shlex.join will do the job? If it doesn’t, what use is it?

How a command line is interpreted doesn’t depend just on the OS. You can have multiple different shells installed, and the argument manipulation function can’t know which one you’ll use, or whether you’ll use one at all. Unless you tell it, and at that point you could have a different set of functions for each “supported” style. And that’s pretty much the status quo, with varying levels of “support”.

The shlex module is designed to support the syntax of Unix shells, not Windows.

On Windows, each process is responsible for parsing its command line into arguments. Most programs use the common C runtime argv parsing or WinAPI CommandLineToArgvW(). That’s what subprocess.list2cmdline() supports. In contrast to shlex.join(), subprocess.list2cmdline() currently ignores how the shell would handle the command line.
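
As a rough illustration of the difference, the snippet below renders the same made-up argument list both ways. list2cmdline() uses double quotes and backslash escaping in the style the C runtime expects, while shlex.join() uses single quotes, which the Windows C runtime argv parsing does not treat as quoting characters.

```python
import shlex
import subprocess

# Sample arguments: a path with a space and a trailing backslash,
# an argument containing double quotes, and an empty argument.
args = ["prog.exe", "C:\\Program Files\\App\\", 'say "hi"', ""]

windows_line = subprocess.list2cmdline(args)  # C runtime style quoting
posix_line = shlex.join(args)                 # Unix shell style quoting

print(windows_line)
print(posix_line)
```

In the list2cmdline() output the trailing backslash is doubled so that it does not escape the closing double quote, and embedded double quotes are written as \".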

A quote() function that supports the CMD shell’s parsing rules would still have to support the C runtime’s argv parsing rules. It’s not trivial due to the need to prevent expanding environment variables when arguments contain the % character. It requires one to make assumptions about environment variable names (e.g. no name starts with ^, or no name contains double quote characters) and toggle quoting around % characters.

I think a Windows cmd-prompt version of shlex join/split would be fair to add as a sibling to shlex. I saw someone tried this already: GitHub - jdjebi/winshlex … no idea how well it works.

I personally use list2cmdline in one direction and shlex in the other on Windows… as long as I don’t do something too weird, it tends to work.
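
A small sketch of that round trip, with made-up arguments, shows both a case that survives and one of the "weird" cases that does not. It assumes shlex.split() in its default POSIX mode.

```python
import shlex
import subprocess

# Simple arguments survive list2cmdline -> shlex.split unchanged.
simple = ["git", "commit", "-m", "fix the bug"]
round_trip = shlex.split(subprocess.list2cmdline(simple))

# Backslashes do not: POSIX-mode shlex.split treats an unquoted
# backslash as an escape character and drops it.
windows_path = ["dir", "C:\\temp"]
lossy = shlex.split(subprocess.list2cmdline(windows_path))

print(round_trip)
print(lossy)
```

So the combination is workable for arguments without backslashes or other characters that the two quoting styles treat differently, which matches the "tends to work" experience.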