Why is subprocess.list2cmdline not documented

aleivag · March 21, 2023, 4:57pm

Hello!

I have been using subprocess.list2cmdline for quite some time now, as it’s very useful, but it does not show up in the Python docs. Is there a reason for that, or is this going to be my first Python PR?

adang1345 · March 21, 2023, 5:10pm

It looks like this function is meant to be internal, as indicated by the comment at cpython/subprocess.py at 4bb1dd3c5c14338c9d9cea5988431c858b3b76e0 · python/cpython · GitHub.

gpshead · March 21, 2023, 5:16pm

Given that it gets used anyways we may as well document it properly. Feel free to loop me, gpshead, in on your issue/PR.

pf_moore · March 21, 2023, 5:26pm

I’d be curious as to what use case there is for it outside of preparing the arguments in subprocess.Popen. I don’t think we should make it public (and hence supported) without a good understanding of why people want it.

steve.dower · March 21, 2023, 5:26pm

I feel like we decided at some point that it’s not cross-platform enough to be worth documenting? shlex.join is documented (and also accurately documented as not-very-cross-platform).

AlexWaygood · March 21, 2023, 5:28pm

subprocess __all__ is incomplete · Issue #55047 · python/cpython · GitHub is the issue where this was previously discussed

pf_moore · March 21, 2023, 5:46pm

Thanks. That issue also simply says someone was using the API, but didn’t clarify why.

CAM-Gerlach · March 21, 2023, 6:44pm

To note, this also came up in

and the related

github.com/python/cpython

test_os fails on Windows if current directory contains spaces

opened 09:37PM - 21 Sep 17 UTC

serhiy-storchaka

type-bug tests OS-windows 3.11 3.12

BPO: 31548 (https://bugs.python.org/issue31548) · PRs: gh-99150, gh-103366

gpshead · March 21, 2023, 7:52pm

I’m aware of the old issue about it being undocumented and non underscore-prefixed. The reason I suggest we just document it at this point is that realistically it has a well-defined purpose and behavior. It gets used. So it is public and not meaningfully changeable even if we wanted to change it (a deprecation period for a name change would be required). Its name is not what we’d choose if it had been intended to be a public API, given its platform-specific (Windows) nature today, but this is hardly the only instance of that in our standard library.

Docs-wise I’d just call out this oddity in its documentation: (1) its behavior is Windows-specific regardless of name, and (2) users of subprocess should not call it themselves, as subprocess.Popen calls it for them. Potential real-world uses: crafting a command line to display to a user and, as CAM pointed out, use with the legacy os.spawn* APIs.
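Point (2) above can be illustrated with a minimal sketch. It assumes only that a Python interpreter is available as sys.executable; the child program and the name child_code are made up for the illustration. The argument list is handed to subprocess.run as-is, and any quoting needed on Windows is left to subprocess.

```python
import subprocess
import sys

# Hypothetical child program: it echoes back the single argument it received.
child_code = "import sys; print(sys.argv[1])"

# Pass the arguments as a list; no manual joining or quoting is needed,
# even though one argument contains a space.
result = subprocess.run(
    [sys.executable, "-c", child_code, "hello world"],
    capture_output=True,
    text=True,
    check=True,
)
received = result.stdout.strip()
print(received)
```

The child sees "hello world" as one argument, which is the reason callers of subprocess normally have no need to build the command-line string themselves.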

I would continue to leave it out of __all__ in subprocess.py, just adjusting the existing code comment about that to reflect the modern reasoning in this discussion.

Alvaro: we’re curious what you use it for.

Glenn · March 21, 2023, 8:20pm

Gregory P. Smith says:

Potential real world uses: Crafting a command line to display to a user,
and as CAM pointed out: use with the legacy os.spawn* APIs.

I have used this API ever since converting code to use subprocess, for
debugging, and to show the user the actual Windows command line to be
executed, given their inputs (which impact, but do not directly
contribute to, the command line).

I forget after so many years how I discovered it, to know it was
available. Probably from the python-dev mailing list, when the
subprocess module was being created.

aleivag · March 21, 2023, 9:01pm

I love this function. Sometimes I reach for it more than shlex.join, but that’s on me, because that’s more muscle memory than a real argument. Now, in the real world, I’ve seen this function used at Meta quite a bit, but I think the usages can be summarized in two ways:

debugging: just as a way to print the command that’s about to get executed… something like logger.debug(f"about to execute: {list2cmdline(cmd)}").
when constructing scripts, list2cmdline is useful for outputting individual lines because, unlike shlex.join, list2cmdline can take Path arguments and the output is not quoted, so shell variables are usable.

A search on GitHub shows it’s been used by projects, but I can’t attest whether that usage is sane or not (I saw someone overloading subprocess.list2cmdline with their own private function… I take no responsibility for that).

I think I agree with you: if the function is there and people are using it, it’s better to have it documented. But it’s also fine not to document it and to have the reason for that be part of the docstring.
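The two uses described in this post can be sketched roughly as follows. The command and file names are invented for the illustration, and the sketch assumes a recent Python 3 in which list2cmdline accepts path-like items.

```python
import logging
import shlex
import subprocess
from pathlib import Path

logger = logging.getLogger(__name__)

# Hypothetical command mixing str and Path items plus a shell variable.
cmd = ["tar", "-czf", Path("my backup.tgz"), "$HOME/project"]

# list2cmdline accepts path-like items and only adds double quotes
# around arguments that contain whitespace (or are empty).
line = subprocess.list2cmdline(cmd)
logger.debug("about to execute: %s", line)
print(line)  # tar -czf "my backup.tgz" $HOME/project

# shlex.join expects str items, so the Path is converted first; it also
# single-quotes "$HOME/project", which stops a POSIX shell expanding it.
posix_line = shlex.join(str(arg) for arg in cmd)
print(posix_line)  # tar -czf 'my backup.tgz' '$HOME/project'
```

Leaving $HOME unquoted is convenient when generating script lines, but it also means list2cmdline gives no protection against shell metacharacters.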

eryksun · March 21, 2023, 10:41pm

It should be documented that list2cmdline() is not guaranteed to work with shell=True. The CMD shell has its own command-line parsing rules. For example, its escape character is “^”, except the “%” character in a command line can only be escaped using %^, not ^%, and not even the %% trick that works in batch files. The reliability of this trick depends on no environment variable names starting with “^”. Also, it doesn’t even work right in a quoted string since “^” is a literal character in that case.

>>> _ = subprocess.call('echo %%windir%%', shell=True)
%C:\Windows%
>>> _ = subprocess.call('echo ^%windir%', shell=True)
C:\Windows
>>> _ = subprocess.call('echo %^windir%', shell=True)
%windir%
>>> _ = subprocess.call('echo "%^windir%"', shell=True)
"%^windir%"

Using it with os.spawn*() entails mapping list2cmdline() to each item separately. That’s potentially a problem. As documented, the first item in the argument list should never contain literal double quote characters, in which case list2cmdline() should raise ValueError. A quote() function could be implemented to support os.spawn*().
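A minimal sketch of what "mapping list2cmdline() to each item separately" might look like. The helper name quote_each and the sample arguments are hypothetical; this is not an existing API, and it does not address CMD shell rules.

```python
import subprocess

def quote_each(args):
    """Hypothetical helper: quote every argument on its own."""
    return [subprocess.list2cmdline([arg]) for arg in args]

args = ["prog.exe", "hello world", 'say "hi"', ""]
quoted = quote_each(args)
for item in quoted:
    print(item)

# Joining the individually quoted items with spaces gives the same
# string as quoting the whole list in one call.
joined = " ".join(quoted)
print(joined == subprocess.list2cmdline(args))

# Today list2cmdline escapes embedded double quotes in every item,
# including the first one, and does not raise.
first = subprocess.list2cmdline(['my "prog"'])
print(first)
```

The last call shows the behavior the post objects to: a program name containing a literal double quote is escaped silently rather than rejected.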

kknechtel · March 22, 2023, 11:57pm

It’s not clear to me why there are separate pieces of functionality here in the first place. Both are intended to solve the same underlying problem - taking a list of arguments for a command, and turning them into a string that would be a valid command line for a shell - right?

Why not have platform-agnostic tools to do this - functionality that can build a valid command line for any specified OS, and which defaults to formatting it for the current OS? You know, like how pathlib works?

And why duplicate the functionality in subprocess, if shlex.join will do the job? If it doesn’t, what use is it?

encukou · March 23, 2023, 1:34pm

How a command line is interpreted doesn’t depend just on the OS. You can have multiple different shells installed, and the argument manipulation function can’t know which one you’ll use, or whether you use one at all. Unless you tell it, and at that point you could have a different set of functions for each “supported” style. And that’s pretty much the status quo, with varying levels of “support”.
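As a rough illustration of "a different set of functions for each supported style", a caller could pick the quoting function explicitly. The function join_command and the style names "posix" and "msvcrt" are invented for this sketch; nothing like it exists in the standard library.

```python
import shlex
import subprocess

# Hypothetical mapping from a style name to an existing stdlib joiner.
_JOINERS = {
    "posix": shlex.join,                 # Unix shell quoting rules
    "msvcrt": subprocess.list2cmdline,   # MS C runtime argv parsing rules
}

def join_command(args, style):
    """Join args into one string using the explicitly named style."""
    try:
        joiner = _JOINERS[style]
    except KeyError:
        raise ValueError(f"unsupported quoting style: {style!r}") from None
    return joiner(args)

print(join_command(["echo", "a b"], "posix"))   # echo 'a b'
print(join_command(["echo", "a b"], "msvcrt"))  # echo "a b"
```

The caller has to name the style because, as noted above, the function cannot know which shell, if any, will parse the result.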

eryksun · March 23, 2023, 7:20pm

The shlex module is designed to support the syntax of Unix shells, not Windows.

On Windows, each process is responsible for parsing its command line into arguments. Most programs use the common C runtime argv parsing or WinAPI CommandLineToArgvW(). That’s what subprocess.list2cmdline() supports. In contrast to shlex.join(), subprocess.list2cmdline() currently ignores how the shell would handle the command line.
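A small sketch of the difference described above, using made-up arguments. list2cmdline only adds quotes for whitespace or empty arguments and escapes double quotes, so characters that are special to a shell pass through untouched.

```python
import shlex
import subprocess

# Hypothetical arguments containing characters the CMD shell treats specially.
args = ["echo", "%windir%", "a&b"]

# list2cmdline targets argv parsing only: no quoting is added here, so
# CMD would still expand %windir% and treat & as a command separator.
windows_line = subprocess.list2cmdline(args)
print(windows_line)  # echo %windir% a&b

# shlex.join targets Unix shells and quotes the & for that purpose.
posix_line = shlex.join(args)
print(posix_line)  # echo %windir% 'a&b'
```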

A quote() function that supports the CMD shell’s parsing rules would still have to support the C runtime’s argv parsing rules. It’s not trivial due to the need to prevent expanding environment variables when arguments contain the % character. It requires one to make assumptions about environment variable names (e.g. no name starts with ^, or no name contains double quote characters) and toggle quoting around % characters.

csm10495 · April 7, 2023, 2:55am

I think a windows-cmd-prompt version of shlex join/split would be fair to add as a sibling to shlex. I saw someone tried this already: GitHub - jdjebi/winshlex … no idea how well it works.

I personally use list2cmdline in one direction and shlex in the other on Windows… as long as I don’t do something too weird, it tends to work.
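A short sketch of that round trip, with invented arguments, shows both the case that works and one that does not. The helper round_trip is hypothetical, and shlex.split is used in its default POSIX mode.

```python
import shlex
import subprocess

def round_trip(args):
    """Hypothetical helper: join with list2cmdline, split with shlex."""
    return shlex.split(subprocess.list2cmdline(args))

# Plain arguments, including one with a space, survive the round trip.
simple = ["copy", "my file.txt", "dest"]
print(round_trip(simple))  # ['copy', 'my file.txt', 'dest']

# An unquoted backslash is an escape character for shlex in POSIX mode,
# so it is dropped from the Windows path.
tricky = ["dir", r"C:\Temp"]
print(round_trip(tricky))  # ['dir', 'C:Temp']
```

The mismatch arises because the two functions follow different quoting rules, which is consistent with the earlier point that shlex is designed for Unix shells.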