Deprecate extensionless script files to allow (future) subcommands?

steve.dower · March 2, 2023, 11:01am

Most of you are probably aware that there are a lot of packaging discussions going on around tooling and interfaces. A very common expectation/assumption/hope from users is to merge all the functionality into a single tool.

I’m not proposing that (yet), so keep reading before you react.

Most importantly to this topic, such a tool would likely have some kind of run command to launch Python. For example conda recently added a conda run <command> command that allows them to set up various state before launching the real command (typically, though not exclusively, Python).

If such a tool became wildly popular, or even integrated into our standard distributions, it would likely replace the direct use of python/python3[.Y] by users, and we’d be getting into a vastly more confusing state of commands than we already have. I’d like to avoid that.^[1]

Top candidates for tools right now seem to be hatch (which already has the functionality) and py (which already has the reach, and is under core-ish control). But I foresee one big compatibility risk that I think we can start dealing with now to make a potential future smoother.

Extensionless script files “look” like subcommands. Which means someone can today create a run file and then python run <args>. I have no idea how widespread this is, but I think the possibility would prevent us from ever adding subcommands to the main entry point. py run <args> would behave exactly the same, which would also prevent us from adding subcommands.

I know we can bikeshed endless alternatives (“hey first time terminal user, just type py --command=RUN <args> instead”), but the reality is virtually every other tool like this and virtually every mock up that people create for us uses plain subcommands. I’ve already had to explain to people that we couldn’t possibly have python install <package> because someone somewhere might have called their script install on purpose. That doesn’t land well, because it seems intuitively wrong to the listener, and it’s only going to get harder.

So, the idea/proposal: deprecate and warn when the script file argument provided on the command line does not contain any slashes or dots. Slashes and dots either indicate a file extension, so anything.py doesn’t get a warning, or a relative/absolute path, so ./script is also okay. I don’t want to predict which specific subcommands would be important to preserve, because there’s a 100% chance of being wrong, but I think it’s safer and easier to explain that the script file needs to have an extension or a path, rather than just a name.
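A minimal sketch of the check described above, assuming the rule is exactly "no slash and no dot anywhere in the argument". The function names and the warning text are hypothetical and are not part of CPython; the real check would live in the interpreter's startup code rather than in Python.

```python
import os
import warnings


def looks_like_subcommand(script_arg: str) -> bool:
    """Return True when the argument has neither a path separator nor a dot."""
    # "/" is always treated as a separator; on Windows os.sep adds "\\".
    separators = {"/", os.sep}
    if os.altsep:
        separators.add(os.altsep)
    has_separator = any(sep in script_arg for sep in separators)
    return "." not in script_arg and not has_separator


def warn_if_bare_name(script_arg: str) -> None:
    """Emit a DeprecationWarning for a bare script name (hypothetical wording)."""
    if looks_like_subcommand(script_arg):
        warnings.warn(
            f"running {script_arg!r} without a path or extension is deprecated; "
            f"use './{script_arg}' instead",
            DeprecationWarning,
            stacklevel=2,
        )
```

Under this rule, run and 2fa would warn, while anything.py and ./script would not.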

There’s no need to set an end date, and it can remain deprecated but still working forever if we never come up with a use for actual subcommands, but if we deprecate now then by the time we want to use them we’ll be able to do it.

Again, the premise here is that eventually users will demand subcommands, and that we want them to be able to keep running python or py^[2] rather than some completely different tool, for the sake of the existing knowledge and material.

What do people think?

[1] Aside: PEP 582 was my proposal to let us get the most important part of this functionality without needing a new command or necessarily breaking existing commands.
[2] I think it’s fairly obvious that py shouldn’t be radically more restrictive than python here, but happy to spell out why in more detail if that’s contested.

pf_moore · March 2, 2023, 11:27am

In general, I think this is reasonable. But I come from a Windows background, where Python files nearly always have a .py extension anyway. It’s possible that Unix users might have different views. But my suspicion is that most extensionless files are scripts with Python in a shebang line, so you wouldn’t be invoking the interpreter directly anyway.

I know you said you have detailed arguments on this, but one way we could avoid this being an interpreter matter is by having py run ... be the equivalent of python ..., and py ... being a shortcut for py run ... whenever the shortcut doesn’t clash with a subcommand. In other words, bite the bullet and add a py run subcommand now, but leave the existing form as a convenience.

Then, people can use the short form for interactive and one-off use, but the py run form for reliability. Much like Powershell recommends using full commands in scripts rather than aliases like ls.
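A rough sketch of that dispatch rule, assuming a hypothetical launcher with a fixed set of subcommand names. SUBCOMMANDS and resolve are illustrative only; py has no such subcommands today.

```python
# Hypothetical subcommand names; the real launcher defines none of these.
SUBCOMMANDS = {"run", "install"}


def resolve(args: list[str]) -> tuple[str, list[str]]:
    """Map launcher arguments to (subcommand, remaining arguments).

    A first argument that names a known subcommand is consumed as such.
    Anything else is treated as shorthand for "run".
    """
    if args and args[0] in SUBCOMMANDS:
        return args[0], args[1:]
    return "run", args
```

With this rule, py foo and py run foo resolve identically, and a script that happens to be called install can still be reached unambiguously as py run install.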

Rosuav · March 2, 2023, 11:52am

That’s mostly true, but sometimes I need to run a script using a different interpreter. So if ./2fa runs the script in the default interpreter, python3.9 2fa should run that script in a specific interpreter. So I’d be -0.25, since python3.9 ./2fa would still work, but it’d be occasionally annoying.

steve.dower · March 2, 2023, 12:16pm

It wouldn’t just be run - it would be anything that looks like a subcommand (see my comment about definitely guessing wrong what commands we’ll need). And the default behaviour wouldn’t change for anyone - py <./X.py> and python <./X.py> would still do their thing.

Ah, shebangs… presumably just running 2fa would invoke a #! /usr/bin/python line that would turn into /usr/bin/python 2fa and cause it to warn as well.

Assuming a distro-scale transition time (4+ years), is there some other shebang line we could get to? Better to change something that’s generated and inserted rather than typed regularly by humans, and the point of raising the question now is to make that transition time long enough for something like this.

pf_moore · March 2, 2023, 12:22pm

My point was that you could allow py foo to run the script foo, but it’s up to the user to ensure that the script name foo doesn’t clash with a subcommand. If the user doesn’t want to check, or knows that the script name does clash, they can use py run foo and that’s 100% safe.

But I guess that’s a breaking change, and therefore we should warn before making it. Which is the point of your proposal… So yeah, point taken.

I’d really hope that the shebang mechanism would inject the full path to the script. But given the inconsistencies between shebang implementations, maybe that’s naive? Although, if the script xxx was in $PATH somewhere, and had a shebang #!/usr/bin/python, then running xxx would invoke /usr/bin/python xxx which would break because Python doesn’t search $PATH. So I think the shebang machinery has to supply an absolute path, which won’t trigger the warning.

steve.dower · March 2, 2023, 12:26pm

Thank you, that makes sense.

So it really just is Chris’s case of “I used to run python3.9 2fa directly but now I have to add the ./ like when I run the script itself”, which is legitimate, but at least it’s interactive/muscle memory. Unless we think that kind of thing exists buried deep inside scripts that will never be fixed… in which case I’d be more concerned about any change.

pf_moore · March 2, 2023, 12:30pm

I think the risk is low.

It’s hard coding the (non-default) interpreter, so it would typically specify an older interpreter, which wouldn’t issue the warning anyway.
It’s “only” a warning, and the fix is as easy as adding ./, so it’s not onerous.

Yes, it’s a breaking change, but far from the most disruptive we’ve ever done.

malemburg · March 2, 2023, 1:06pm

There are a few more use cases you’d have to take into account:

On Unix, you typically put:

#!/usr/bin/env python3

into the shebang line. The env helper puts the name of the script into sys.argv (not the absolute path) and it’s not uncommon to create shell binaries without .py extension (a popular example is docker-compose).

It’s also not uncommon to put code into package directories with a __main__.py file. These are then run with just the package name:

python3 myapp

Again, sys.argv only has the package directory name, without slashes or trailing .py when called from the parent dir.
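One way to check this locally is a small experiment like the one below. It builds a throwaway myapp directory containing a __main__.py, runs it from the parent directory, and returns whatever the child process saw as sys.argv[0]. The helper name is made up for this sketch, and the exact value may depend on the Python version and platform.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def argv0_for_package_dir(name: str = "myapp") -> str:
    """Run `python <name>` from the parent directory; return the child's sys.argv[0]."""
    with tempfile.TemporaryDirectory() as parent:
        package = Path(parent) / name
        package.mkdir()
        # The directory is runnable because it contains a __main__.py.
        (package / "__main__.py").write_text("import sys\nprint(sys.argv[0])\n")
        result = subprocess.run(
            [sys.executable, name],  # a bare name: no slash, no dot
            cwd=parent,
            capture_output=True,
            text=True,
            check=True,
        )
    return result.stdout.strip()


if __name__ == "__main__":
    print(argv0_for_package_dir())
```

The point for the proposal is the command line itself: the argument handed to the interpreter is the bare name myapp, so a "no slash, no dot" check would warn here too.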

Wouldn’t it be easier to move such sub-command functionality to the py helper, instead of trying to add this to python itself (and then make py a first class citizen on Unix as well) ?

steve.dower · March 2, 2023, 2:09pm

Much easier, but then we also need to explain why I can python myapp but not py myapp, which seems an unusual edge case to have to cover (and will further push people away from the python command to avoid the inconsistencies).

In any case, developers will look to our consensus for these separate apps. If we say “it’s more important that users don’t have to type python ./myapp”, then they’ll want to follow us, so we’d have to say “it’s important for us, but everyone else should do something different”.

We don’t get to make that decision, though we could certainly recommend it. One of the common requests is for a tool to do the download and install of Python itself, which I suspect goes so strongly against how most Linux distros work that they wouldn’t want it unless they can provide the builds that it gets (which goes so strongly against what the people asking for it want that they wouldn’t use it, and everyone loses).

encukou · March 2, 2023, 2:37pm

Sorry if this is going off-topic, but…

Whoa. What do the people want here? Who should provide the builds for a tool that’s part of the system? I’m assuming “first class citizen” means it is part of the system.
(Asking as a distro packager who’s trying to keep users’ best interest in mind – and hoping that’s not a contradiction ;)

barry-scott · March 2, 2023, 2:56pm

A typical unix system can have lots of commands that are implemented in python, perl, ruby etc. that do not have a file type extension.

Requiring the .py will be a problem for the UX of these commands.

steve.dower · March 2, 2023, 3:02pm

There’s a few hundred posts of discussion in the Packaging category, which often touches this area. But the answer is “we’re not sure yet”. I’m pretty sure you’re keeping up with a lot of that discussion already. But yeah, since there’s so much discussion about it already, let’s leave it out of this thread.

Is requiring a partial or full path equally a problem? It seems that anything invoked via a shell-handled shebang must be okay, because Python isn’t going to search PATH for the command and so shells must be passing a full path already.

It would only impact the case where someone types python <name of command> instead of simply <name of command>, which would imply they know what they’re doing (i.e. they know it’s really a Python script and not Perl).

mwichmann · March 2, 2023, 3:44pm

On Linux (and relatives) this is the exec machinery - the argument given to the invoked interpreter is where the script was found - which depends on the contents of $PATH. Thus, it is often an absolute path, but may be a relative path. I believe it will never be a bare filename - even if PATH says to look in the current dir, and script is found there, it should come out as ./script. Not quite prepared to promise that, though…

guido · March 2, 2023, 4:01pm

I am fine if py grows subcommands. But I don’t like the idea that python (or python3, etc.) grew them, and I certainly don’t want to start deprecating the ability to pass any valid filename, regardless of form, as the argument. The python command should be thought of as similar to sh, not to git. It has been this way from the start 33 years ago, and this design was not an accident.

I know you discouraged responses of this form, but I can’t help pointing out that if you want subcommands, you can use python -m <subcommand>.

steven.daprano · March 2, 2023, 4:11pm

Can we please not break things that aren’t broken just on the odd chance that, some years from now, we might add subcommands to the python command? You don’t even have a concrete proposal for these subcommands yet, which makes this proposal a case of YAGNI.

There is nothing wrong with naming Python scripts without a .py extension, or as a relative name without a dot or slash. On Linux systems, there are significant numbers of extension-less Python scripts which are treated as system commands, e.g. hg, dnf, and probably others. That’s a common pattern and Linux/Unix users expect to be able to do the same with their own scripts.

I can’t say I’m a big fan of all-encompassing, Do Everything Including The Kitchen Sink commands that need many subcommands. I find that in practice, those commands tend to suffer from severe feature creep and end up being a disguised God Class or function, and the UI and docs end up being complex and hard to use.

But putting that aside, Python already supports a kind of “subcommand” pseudo-syntax:

python -m <module> [options] [arguments]

where -m <module> is a de facto subcommand of sorts. So that gives us some alternative APIs that won’t break existing usage:

Use a “subcommand” option, say python -r command [options] [arguments]. (I think -r is currently unused.)
Leave the python command as it is, and add a py launcher with subcommands and whatever restrictions you want. py command ....

(Perhaps py command ... just ends up calling pythonX.Y -r command ..., or something.)

Deprecation warnings that last forever make a pretty awful user experience.

Especially for naive users who aren’t going to know that this warning is harmless, and for apps that then have to filter the warning out before processing the script output. Yuck.

Agreed, but I don’t think that this would be radically more restrictive:

The “legacy” command python filename continues to work with any valid filename, with or without dots and slashes, like all(?) Unix/Linux commands (and presumably likewise for most Windows commands);
the fancy py subcommand filename case will likewise not require dots or slashes;
and it is only the otherwise ambiguous case of py filename with no subcommand that requires a disambiguating dot or slash in the filename.

That’s not a complicated or radical difference.

barry-scott · March 2, 2023, 4:42pm

What I see is this on ubuntu:

$ cat sleepy
#!/usr/bin/python3
import time
import sys

print(sys.argv)
time.sleep(1000000)

$ sleepy
['/home/barry.scott/bin/sleepy']

$ bin/sleepy
['bin/sleepy']

$ cd bin
$ ./sleepy
['./sleepy']

pf_moore · March 2, 2023, 4:52pm

That was my point above. The one fly in the ointment is that py is not a new command, it’s a well-known launcher (backed by a PEP, even) on Windows, and there’s a Unix equivalent that’s been around for a while (although I don’t know how popular it is). So the new behaviour would still be a breaking change for py script. I don’t think it’s a major breakage, nor do I think it’s a radical discrepancy between py and python, though, so it’s not a showstopper for the idea in my view.

I share your reservations about the “one big command with subcommands” model, but like it or not it’s a very common pattern these days, and it wouldn’t be as common as it is if it didn’t provide a good UX for people.

I think the argument would be stronger with some actual ideas of what sort of subcommands we might consider adding, though. I’m not convinced we should be doing this just because we might think of something in the future. And the example of py install <foo> isn’t at all compelling to me - we’re not going to replicate the whole of pip^[1] as py install, and we have py -m pip, so what’s the point?

[1] py install install --upgrade foo, anyone?

steve.dower · March 2, 2023, 5:15pm

Like I said, I don’t have my own proposal (nor did I want this discussion to get completely derailed by that, since we’re at least a few years from having to decide it), but they exist. Here’s one that was put up the other week, just to prove I’m not creating straw men:

I pointed out at the time that this looks very much like pdm’s CLI (I may have said hatch by mistake), so again, the only hypothetical is how users learn about it and start to use it. We’re always going to be at least two releases away from being able to offer anything in our existing entry points, because each one of these commands could be someone’s script that they’re running. So it doesn’t really matter when we come up with something - chances are, any funding or support for it will die out before it ships.

I think it’s fair to say that py install wouldn’t directly translate to py -m pip (without pip’s own install subcommand) under any circumstances. We’re not that lazy.

steve.dower · March 2, 2023, 5:25pm

This is totally reasonable, and not-unexpected context. I guess the follow up question is how do we feel if “everyone” starts saying “never launch python, use magic-python-tool instead”?

Do we care? Do we want to own that magic tool? What if that magic tool is owned by someone we don’t want to own it? What if our users insist on complaining about functionality gaps and won’t use the tools that fill them without our endorsement/ownership?

(I imagine here the answer is simply “we don’t care”, and all the caring is happening over in Packaging. But I don’t really know what the consensus looks like, so I’m asking)

Rosuav · March 2, 2023, 5:28pm

Very small point: A lot of language interpreters (and other commands) will guess a default extension if none is given. Python is one of the few where python spam will ONLY look for a file named spam, and won’t find spam.py. I’m not sure how many people are typing the wrong version, but it’s better to have a simple “file not found” than something like “spam is not a Python subcommand, if you meant to run a file called spam, enter it as ./spam”.