I don’t see any of those problems as being as big as trying to guess whether 42 is a number or a string in an environment where you cannot reliably use quotes. More precisely,
type hints can be added, and they also have other benefits,
unions can be handled by trying conversion with all types in the specified order (e.g. with int | float try integer conversion first and if it fails try float), and
ABCs can be handled by converting to concrete classes (e.g. with Mapping convert to dict).
Let’s imagine we come up with a practical, generalized approach that interprets functions and makes them usable from the command line. Then, sometime later, someone proposes a better heuristic. Now, do we know whether our shutil 1) works exactly the same as before, 2) breaks slightly and requires changes such as hints, or 3) breaks silently and changes behavior on end-users’ systems? Even if we convince ourselves by saying “we know,” sometime later another person proposes a better way to annotate shutil. Will we know whether the changes break its command-line interface as interpreted by our heuristics?
We don’t truly know unless we build a regression test for shutil, which means we are effectively mandating the outcome, not the underlying implementation.
To me, a heuristics-based solution is okay to play with when I sit in front of a desktop. But I will not use it in an automated system. My proposal here in this thread implies a stable contract between the command-line interface and the end-users, if that is not obvious.
Considering I removed the pending label on the gh issue, here’s my 2c:
The progress bar should be considered separately, honestly.
runpy is something I only discovered recently, and I think I’m probably not the only one, even though I’ve been programming in Python for years.
Personally, I would implement python -m shutil [...] to be consistent with other modules as well (e.g., dis, symtable, etc). I don’t know why we shouldn’t do this for shutil since it appears that people would like a CLI (see below for additional discussion). Since we only expose some functions, we can use subparsers for that and that would be sufficient (a bit like git). It would be python -m shutil copy F1 F2. Typing wouldn’t be a problem.
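A rough sketch of what the subparser approach could look like with argparse. This is illustrative only: the set of subcommands (copy, move, rmtree), the argument names, and the build_parser/main helpers are assumptions, not an agreed design.

```python
import argparse
import shutil


def build_parser():
    """Build a git-like parser with one subcommand per exposed function."""
    parser = argparse.ArgumentParser(prog="python -m shutil")
    commands = parser.add_subparsers(dest="command", required=True)

    copy = commands.add_parser("copy", help="copy a file")
    copy.add_argument("src")
    copy.add_argument("dst")
    copy.set_defaults(func=lambda ns: shutil.copy(ns.src, ns.dst))

    move = commands.add_parser("move", help="move a file or directory")
    move.add_argument("src")
    move.add_argument("dst")
    move.set_defaults(func=lambda ns: shutil.move(ns.src, ns.dst))

    rmtree = commands.add_parser("rmtree", help="remove a directory tree")
    rmtree.add_argument("path")
    rmtree.set_defaults(func=lambda ns: shutil.rmtree(ns.path))
    return parser


def main(argv=None):
    """Parse argv (defaults to sys.argv[1:]) and run the chosen subcommand."""
    args = build_parser().parse_args(argv)
    args.func(args)
```

Calling main(["copy", "F1", "F2"]) would then do what python -m shutil copy F1 F2 is meant to do. Every argument stays a plain string path, which is why typing is not an issue here.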
Concerning cross-platform compatibility, we could 1) disable flags that would not be supported everywhere, 2) make them silently ignored, or 3) something else (?). Note that using the platform-dependent commands may not necessarily be an alternative because Linux commands such as cp or mv do not exist on pure Windows shells (on PowerShell, those commands exist IIRC).
Now, for the runpy alternative, I think it should be something we discuss separately as it can affect any kind of module. I find passing arguments as JSON through the CLI really painful (for instance, jq is great but it’s hard to type inline jq commands because it’s barely readable). It wouldn’t be an issue if you write a script, but invoking it in the terminal would be less convenient.
So there is a question: should shutil be considered a complete platform-agnostic replacement of copy and co. functions or not? If yes, it’s probably preferable to strive towards a runpy solution while providing a minimal shutil CLI that can be invoked outside a script.
The minimal CLI would be used for inline commands. I don’t see why someone would use shutil in the terminal if they can actually use the platform command directly. Well, I can see some reasons, such as remembering only one syntax, but apart from that I think that the caller would be more comfortable with the platform commands. Nevertheless, since I cannot know whether this is the case or not, I would be willing to simply expose the compatible commands (with or without the platform-dependent flags, that can be discussed later as well).
The runpy solution would allow complex copy/move scripts to be platform-agnostic. This also means that passing arguments through JSON would be easier. So it’s no longer a blocker.
Concerning the issue itself, having a minimal implementation of shutil as a CLI, the same way as zipfile, could be a worthwhile addition. It would be possible to have a platform-agnostic solution without blocking future improvements of runpy. I am not saying that a PR can be opened and that the feature will be accepted as is, but I’d like to share my ideas and try to advance the discussion towards a “planned/not planned” for shutil, or “improve/do not improve runpy” conclusion (so that we don’t have yet-another-forgotten-idea-with-an-opened-issue).
I don’t think that a command line interface (CLI) for shutil is a good idea. The scope of this module is too broad. For example, copying a file, making an archive and getting the disk usage are too unrelated to be together in the same tool. On Linux, there are separate tools for that: cp, tar and df. Each tool has tons of options and its own manual page.
You can maintain your own CLI on PyPI if you consider it useful to you.
I did not ask for disk usage. The full list I asked for is copyfile, copystat, copy, copy2, copytree, rmtree, move, chown, which, make_archive, and unpack_archive. If you think make_archive and unpack_archive are premature for cli, they can also be excluded (and I never used them in CI so far).
As you may have noticed, using shutil from the cli does not require the user to be a Python user, and this has been the point of my post: a service to be provided to the external communities, to the users tortured by continuous integration systems. Asking them to manage using pip has been proven to be unsuccessful. I have been working for six employers so far, and every time, people have problems with pip. Wrong location of pip, bad version of pip, unsupported pip. And of course, people don’t read error messages. To them, they might rather be more willing to figure out writing .bat commands. If the goal is to provide the clean and cross-platform interface of shutil as a service, then putting PyPI in front of the potential users would be a disservice.
I don’t think it’s Python’s job to make up for shortcomings in these other tools and environments. I realize it sounds attractive and easy, but I’m still opposed.
We’re all developers, and few of us were born knowing how to program in Python without learning it. The earliest wave of Python adopters were users of Perl, a language “proud” of shell-style scripting. They were followed by C and C++ users, followed by the scientific computing community, and so on. Python addressed the shortcomings of other tools, was happy to be embedded in other environments, and gained a reputation for solving problems, regardless of whether the people looking for a solution are willing to write good Python.
I see Python today reaching a stage that Perl had not reached: every mainstream consumer platform has a good Python interpreter to work with, and every CI image must provide Python. It is also the only non-shell (the language doesn’t accept bare words) that made it into GitHub Actions’ shell list. Understand the fact in this way: CI users don’t want to do scripting; they are most likely satisfied with some simple, cross-platform file operations. If we do not satisfy them with python -m, their next candidate is shell: pwsh. If people start with PowerShell, their future solutions will lean towards it (no matter whether the “future” is a full program or a different CI, you want to replicate the semantics successfully deployed). That will be a good opportunity for PowerShell to grow a user base, just like the opportunity that Perl gave Python.
I’m not implying that we should compete against PowerShell or something. Be kind to users, and users will follow, simple as that, just like Python has always done. Python is a cornerstone in modern technology, not a new place for territorialism.
I don’t understand why you wouldn’t want their preferred candidate for shell scripting to be a shell language (shell: bash and shell: pwsh are both cross-platform). shell: python is useful for when a step involves more logic than subprocesses – a python -m shutil option provides no such benefit over purpose-built shell languages.
Unlike PowerShell subcommands, which are implemented with modules, the commands that are supplementary to bash are not truly cross-platform (not even from the same vendor on macOS). If you only use variable expansion + logic in Bash, that’s perfectly fine. And that is where python -m shutil shines - it replaces those subcommands in a predictable, cross-platform manner so that users can run them in the default sh, again in a cross-platform manner.
Any functions in particular? Yes, there are asymmetries between the various implementations of the core UNIX commands but shutil is so shallow that I don’t see that its functionality touches them.
copyfileobj(): Doesn’t make sense at the CLI level
copyfile(): Use cp (cross implementation)
copymode(): Kind of valid in that busybox’s chmod doesn’t support --reference, doesn’t make any sense on Windows
copystat(): This is so niche I don’t even know if it’s possible from the command line
copy(): Again same as cp
copy2(): Use cp -p (cross implementation)
ignore_patterns(): Doesn’t make sense at CLI level
copytree(): Use cp -r. This is actually easier than having to figure out how to expose the option to make copytree() use copy() instead of copy2()
rmtree(): Use rm -rf (cross implementation)
move(): Use mv
disk_usage(): Use du (although what script are you going to use this in?)
chown(): Not cross platform anyway
which(): Use which
A bunch of archiving methods: We already have python -m tarfile and python -m zipfile
get_terminal_size(): What script are you going to use this in?
For completeness’ sake, Python’s copystat isn’t portable[1] (actually, the shutil module itself has a giant warning about metadata copying portability at the top), and already has documented platform behavior differences. It’s possible from the command line of each operating system Python supports, but it’s rarely useful. In cases where it would be useful, the tools that might need the extended attributes usually have means of ignoring, consuming, modifying, and modifying using a reference (e.g. rsync).
[1] Using the definition of portable meaning the behavior is the same everywhere, which is the only useful definition for someone who wants to make shutil a runnable module so that they don’t need to branch on platform behavior in their scripting.
vcvarsall.bat doesn’t run in sh mode, so most of your suggestions are sometimes not applicable, not to mention that 1/3 of them are about functions I did not ask for.
Cross-platform ≠ I must support everything possible on each platform.
Cross-platform ⇒ I have a goal to achieve on those platforms, I interact with them in the same way to achieve the goal.
The goal, especially in CI, is unlikely to touch everything niche on all platforms. Copying the last modification time is quite possibly a legitimate request; copying ACLs is unlikely to be.
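As a small illustration of the modification-time case, the snippet below compares shutil.copy with shutil.copy2 on a temporary file. The file names and the chosen timestamp are arbitrary, and it only checks the modification time, not any other metadata.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as root:
    src = os.path.join(root, "src.txt")
    with open(src, "w") as f:
        f.write("data")

    # Give the source an old, easily recognizable modification time.
    old = 1_000_000_000  # seconds since the epoch (September 2001)
    os.utime(src, (old, old))

    plain = shutil.copy(src, os.path.join(root, "plain.txt"))
    kept = shutil.copy2(src, os.path.join(root, "kept.txt"))

    plain_mtime = os.stat(plain).st_mtime  # set when the copy was written
    kept_mtime = os.stat(kept).st_mtime    # carried over from the source
```

Here kept_mtime matches the source while plain_mtime reflects the time of the copy, which is the distinction a CLI would need to expose (for example as separate copy and copy2 subcommands).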
I still don’t understand what you want. Of the functions (or their options) you originally requested, which would you use that can’t already be expressed using cp/rm/…?
The standard UNIX tools are just executables so they work even if some activation script is forcing you to use cmd provided that you have them in PATH (which most CI providers already set up).
Almost all I originally requested could be expressed using cp/rm/… except 1) I don’t always have cp and rm available; 2) when they are available, BSD and GNU utils can still have small differences, such as handling of the trailing slash. shutil has no such issue, because the behavior depends on what the target is, not on how the path looks.
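A quick check of that last claim for shutil.copy, using a temporary directory. The names a.txt, b.txt, and out are arbitrary, and this covers only the case of an existing directory versus a not-yet-existing file name.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as root:
    src = os.path.join(root, "a.txt")
    with open(src, "w") as f:
        f.write("hello")

    dest = os.path.join(root, "out")
    os.mkdir(dest)

    # Destination is an existing directory: the file is copied into it,
    # whether or not the path is written with a trailing separator.
    first = shutil.copy(src, dest)
    second = shutil.copy(src, dest + os.sep)
    same_target = os.path.samefile(first, second)

    # Destination does not exist yet: it is taken as the new file's name.
    renamed = shutil.copy(src, os.path.join(root, "b.txt"))
    renamed_is_file = os.path.isfile(renamed)
```

Both copies into out land on the same file, so the spelling of the directory path does not change the outcome in this case.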
People do all kinds of things to make CI “work,” ranging from renaming a python to landing another pipenv. However, the point of CI has always been replicating what developers are most likely to do after cloning the project, not the opposite, forcing developers to learn what CI turns into. If an ordinary user comes and reads the workflow and leaves with an impression, “That environment is impossible!” that CI is a failure. So direction-wise, playing with PATH isn’t a good starting point.
W.r.t. the specific idea of mutating PATH, its consequences are hard to predict. Between 2006 and 2019, CMake rejected MinGW builds when sh.exe was present in PATH, resulting in enormous wasted hours and tears. Ironically, that blocker may save more wasted hours if people follow your proposal, because the initial motivation was to prevent mingw32-make.exe from picking up another environment’s sh.exe. Forcing developers to replicate the proposed, atypical environment can cause more things to go out of control.