PEP 722: Dependency specification for single-file scripts

My point is not really that the format is the same; it’s that people will see them as the same and be confused that they are actually different, because people will want to use them to do similar things. So in a way, the fact that the differences you mentioned exist makes me support the proposal even less.

I don’t want to seem too nitpicky here, but the above three quotes illustrate to me what I mentioned in my earlier comment about the difficulty of balancing quick-and-easy with specced-and-clean.

The simple way to state my position would be: The proposed behavior must be specced in such a way as to reduce potential ambiguities and surprises as much as humanly possible. Then after that we look at whether the result is still simple enough to be convenient, and if it’s not, then reject the PEP. :slight_smile:

In other words, in my view, it’s not worth it to even allow the possibility of confusion over things like multiline strings or accidental collisions with block headers like “Requires:”, or even “why can’t I do this thing in an in-script requirements block that I can do in requirements.txt”. We need to choose in each case the alternative that most fully locks out any potential complications (e.g., force the block to come before docstrings). If that means the in-script requirements block becomes too restrictive or cumbersome and is no longer convenient to use, the solution is to drop the idea, rather than try to bend it a bit and make it convenient at the expense of allowing potential confusion.

The reason I think this is that I’m trying to envision the future if some version of this proposal gets approved. People will start to use it and feel good about doing so, since it’s a standard. Then there will be all kinds of scripts out there with requirements blocks. Some of them may be big and complicated. Some may interact confusingly with other things (like docstrings, or other comment annotations like linter hints). It will increase the burden on everyone reading any such script.

That might be worth it if the confusion is limited as much as possible. But to me it’s definitely not worth it if any of these edge cases are going to remain possible. The bottom line is that I see this entire proposal as providing just a convenience, and that convenience isn’t worth more than a tiny increase in the average amount of reading-knowledge required to understand what a script is doing.

2 Likes

I’m still confused about the shebang part.

If I want to use a requirements-specifying script as a standalone executable that I can simply drop into my local bin folder, I will want to specify the actual runner that will run the script, like pip-run, in the shebang line.

I understand that such a thing is not interesting on Windows (although Git Bash?), but for other OSes it would be quite nice.

Is such usage envisioned as “normal”? It seems to conflict with suggestions to use the shebang part to specify the python version by indicating a versioned python interpreter.

From my perspective, specifying that a script should be executed by python3.12 when it most certainly needs to be executed by some other executor seems troublesome (even if that other executor will eventually launch the script with some Python interpreter).

1 Like

I think my biggest gripe with something that isn’t distinctive and reads like “prose” is that it feels very easy for a beginner to be surprised by the behavior. I think this is particularly important since we’re already overloading comments to effectively be executable, so that’s already surprising behavior, and the use of something that isn’t distinctive and obvious makes it even more likely for there to be surprising behavior.

Obviously this is mitigated to some degree by the fact that you have to run a command that interprets these comments, but I think we shouldn’t assume it will be obvious to the user that the command they’re invoking is a “parse the magic comments” command. Yes, for something like pip-run that is likely the case, but only because it’s a command that does one obvious thing. What if something like this gets baked into, say, VS Code, or into a tool like posy or rye or anything else like this? Will it be as obvious that “run this Python script” also means “read the comments and look for this magic block”?

IOW, I think that we’re currently looking at having a declaration where it is ambiguous whether the user meant to be making a PEP 722 declaration block or not, and we’re assuming that we can rely entirely on the invocation of “the tool” to make it unambiguous. I think that only holds true for the current crop of single-purpose tools/commands that happen to implement this, but it is not likely to be true as more general-purpose tools start to implement it.

10 Likes

Very much agree with the sentiment that making it look like prose and not a machine-readable “code” is going to cause issues. Any beginner, student, etc. that isn’t already aware of what they are looking at could be tempted to “helpfully” amend the requirements block like this:

# Requirements:
#     matplotlib
#     a valid data file
3 Likes

I misunderstood what you meant to be about which tools and their involvement, so ignore what I said. :slightly_smiling_face:

Nope, it’s not a problem. It just means we will make an opinionated decision on what we consider reasonable (e.g., we might put the block before the docstring instead of after).

UTF-8 is the default encoding anyway, so I don’t think that’s necessary. If tools don’t want to put in the effort to handle the encoding cookie then I think that’s up to the tools and not the concern of the PEP.

Sure, happy to! I can also get @courtneywebster involved to see if we can run the idea past some beginners in a user study to help validate the proposal. I can also talk it over with folks on our team to make sure I’m not missing some potential implementation headache.

Nah, the SC delegates to you to delegate to whomever. Plus I have been delegated to in the past already.

I think this is an argument to not use “Requirements” and switch to something like “Dependencies” to fully disassociate from requirements files.

It’s doable, but I don’t know whether we can declare that this PEP will make it “normal”.

It’s entirely up to the tool doing the execution as to whether you could embed some Python constraint into that shebang, e.g. /usr/bin/env -S pipx run --requires-python='>=3.8'. Regardless, specifying the Python version requirement is out of scope of this PEP.
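
To make that concrete, a runnable script along these lines might look as follows. This is just a sketch: it assumes a GNU env with -S support and a pipx release that actually understands both that flag and the dependency block.

#!/usr/bin/env -S pipx run --requires-python='>=3.8'
# Requirements:
#     requests

import requests

print(requests.get("https://example.com").status_code)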

Is that concern because of the prose-style marker that’s being proposed? If something more “technical” like “Requires-Dist” was used instead would that alleviate your concern?

1 Like

This is a great point and I agree. I see the gist of the arguments others made earlier that “well, it’s okay if it’s got some unexpected sharp corners because it’s opt-in”, but that only goes so far. Even for pip-run, there is already the possibility to use it by explicitly specifying a requirements file, so it may not be obvious to users that if you don’t specify one, it will start reading requirements from inside the file. (Or is the intent that it would be mandatory to explicitly activate the behavior with a --parse-inline-deps option to pip-run and similar tools? That seems safer.)

Also as you mention, there’s a difference between the existing tools that do something like what this PEP proposes, and the possible proliferation of tools using it if it gets standardized.

I guess one way to look at it is that, although the intended use case for the PEP seems fairly restricted and simple, “overloading comments as installer instructions” is a fairly significant change with potentially major ramifications. I think it warrants a generous marination time to think more generally about that, not just about how well the PEP serves its intended use case.

2 Likes

I work on a home automation rule engine in Python, and there I use a YAML document in the first comment block of the Python file to define metadata about the file. There the user can specify dependencies, when to reload the file, etc.
The main benefit is that it’s easily extendable, since it de-serializes to a dict and provides a flexible syntax (multi-line vs. single-line lists). Most users seem to be very happy with it, and with an example it’s very intuitive to use.

How about

# Python:
#   requirements:
#     - matplotlib
#     # a valid data file

That way it could also be easily extended to include the Python version:

# Python:
#   min_version: '3.8'
#   requirements:
#     - matplotlib
#     # a valid data file
1 Like

There are a couple of comments here which suggest that there might be some kind of sea-change if this PEP moves forward. With the PEP, users will need to read and understand these dependency comment blocks, whereas without the PEP there will be no such imposition.

I don’t think that view is accurate, or at least it is incomplete. Implementations already exist – and I’d call pipx pretty mainstream at this point.

I think a better take is that this is already happening because it’s a useful thing for tools to do, and now is a good time to try to formalize the existing behavior with a spec. If the behavior needs to be tuned a little to meet the spec, that’s not a huge deal (yet!).


I want to check one angle of the PEP with respect to the requirements.txt vs dependency comment differentiation which has been cited.

I wrote a script this morning which reads dependency comment blocks, creates a venv, then writes the dependency comment data into $VENV/requirements.txt and installs with pip install -r.
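
Roughly, the script does something like this (a simplified sketch: the block-parsing rules here are a loose approximation of the PEP’s, and the bin/ layout assumes a POSIX system):

import subprocess
import sys
import venv
from pathlib import Path

def read_dependency_block(script_path):
    # Collect the comment lines following a "# Requirements:" header,
    # stopping at the first non-comment or empty comment line.
    deps = []
    in_block = False
    for line in Path(script_path).read_text(encoding="utf-8").splitlines():
        if not line.startswith("#"):
            if in_block:
                break
            continue
        content = line.lstrip("#").strip()
        if in_block:
            if not content:
                break
            deps.append(content)
        elif content.lower() == "requirements:":
            in_block = True
    return deps

deps = read_dependency_block(sys.argv[1])
venv_dir = Path(".venv")
venv.create(venv_dir, with_pip=True)
reqs = venv_dir / "requirements.txt"
reqs.write_text("\n".join(deps) + "\n", encoding="utf-8")
subprocess.run([str(venv_dir / "bin" / "python"), "-m", "pip",
                "install", "-r", str(reqs)], check=True)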

Is my script compliant with the proposed PEP? It doesn’t validate that the data are PEP 508 dependency specifiers.
My read is that such a script is okay because it accepts something broader than the PEP. Is that reading correct? Do we need to clarify whether or not dependency format validation is required?

2 Likes

See the PEP, which covers the reasoning for not using a more complex format. Basically YAGNI.

Others have already covered this, but this is not something I plan on including. If you support running your script on a range of Python versions, you’re getting into the area of publishing, and you’re better off making it a “proper” Python project with pyproject.toml and full metadata.

To a large extent, I agree with this. Although I will note that the pipx implementation of this hasn’t been released yet, and pip-run is not particularly mainstream.

However, I think the main point stands, which is: if this is such a crucial feature that it’s going to suddenly start popping up in scripts all over the place, to the extent that significant numbers of Python programmers get confused as to what it means, why has no one been working on better solutions before now?

After all, pip-run has been around since 2018. And yet, I don’t recall ever having seen a script containing a __requires__ = ["requests"] statement (the original form pip-run used for declaring requirements). So why will this PEP be any different?

(Side note - let’s not get into the debate over the __requires__ format. The PEP contains a section discussing this, plus links to the discussions which resulted in the comment-block format being adopted, so go and read them if you want to know why the current form was introduced).

Currently, yes. There’s no requirement on consumers to validate anything.

However, I am uncomfortable about the fact that it’s awfully likely that consumers will just pass the requirements to pip, meaning that “whatever pip accepts” could become a de facto extension to the standard. It also allows for abominations like

# Requirements:
#     --index-url
#     https://my.index/simple
#     private-package
#     --editable
#     ../dev-library

This latter case makes me think that I should update the PEP (and the pipx implementation!) to require validation. If I do that, I’ll probably also include a reference implementation of a client function to read a dependency block from a script - it’s not hard, and it allows me to be explicit about the validation.
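
A validating consumer might look roughly like this. It’s only a sketch (not the reference implementation the PEP would ship), and it assumes the block’s lines have already been extracted; it leans on the packaging library’s Requirement parser rather than a hand-rolled grammar:

from packaging.requirements import InvalidRequirement, Requirement

def validate_dependency_block(lines):
    # Reject anything in the block that is not a PEP 508 specifier,
    # such as pip options like --index-url or --editable.
    requirements = []
    for line in lines:
        try:
            requirements.append(Requirement(line))
        except InvalidRequirement as exc:
            raise ValueError(f"Not a PEP 508 dependency specifier: {line!r}") from exc
    return requirements

Fed the block above, this raises on --index-url immediately; nothing in that example other than private-package parses as a valid specifier.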

I don’t think the loss of the ability to specify non-PEP 508 requirements is fatal. Yes, it will probably exclude some use cases that might otherwise benefit from this feature, but they can simply carry on doing whatever they do right now.

1 Like

I work on a home automation rule engine in Python, and there I
use a YAML document in the first comment block of the Python file
to define metadata about the file. There the user can specify
dependencies, when to reload the file, etc. The main benefit is
that it’s easily extendable, since it de-serializes to a dict and
provides a flexible syntax (multi-line vs. single-line lists).
Most users seem to be very happy with it, and with an example
it’s very intuitive to use.

We’ve already been through this debate with pyproject.toml, and
while I personally like YAML far far more than TOML, the problem is
that there’s no YAML parser in the stdlib and the good YAML parsing
libraries for Python are fairly heavy, so odds are whatever we end
up with is going to be something that can be easily parsed with
what’s in the Python standard library now.

3 Likes

I’m +1 on this. I find myself with these “script with dependencies” that @pf_moore has been talking about. Formalizing what pipx and pip-run look for, rather than inventing a new format, seems to be the right approach.

1 Like

In general I think an idea along the lines of this PEP is a good thing… but I’m not sure that arguing that many people aren’t going to use it is a great answer to the question of whether or not it’s confusing. If it’s a niche thing that we don’t expect a decent number of people to use, then presumably it doesn’t need a PEP; it can just continue to be a niche thing with a limited audience.

To be clear, I don’t think that this is going to be super niche (or rather, I think whether it’s niche or not depends entirely on if big tools that people are already using start to support it or not, which is far more likely with a PEP). I think it’s useful and something we should do, I’m just lightly pushing back on the specific implementation out of a worry that it’s the kind of thing that can become too magical, and cause more harm than good, if we’re not careful.

3 Likes

I finally found the time to read this PEP. Unsurprisingly, it’s well-written. :slight_smile:

IMO, that (= requiring/recommending validation) is a good idea.

I agree, and I do prefer that the keyword/marker be named “Dependencies”, rather than “Requirements” – primarily because that’s better in line with [project.dependencies] as well, and I’d like for the user-facing standards to use a consistent vocabulary. Looking at the implementations, it shouldn’t be too difficult in pipx or pip-run to change the keyword/marker either (or to allow one-or-the-other).

This is a naming question, so you could argue that I’m asking for this bikeshed to be painted differently. IMO, though, this is not a bikeshed-style concern, but rather about having a consistent UX.


PS: I wouldn’t be opposed to having a single shared implementation of this PEP live in packaging (assuming other packaging maintainers are on board, of course). :slight_smile:

3 Likes

This connects user scripts directly to a specific package manager, causing trouble for those using a different package manager. Single-file scripts may still be used by multiple users.

3 Likes

Let me worry about VS Code and communicating this out. :wink:

Yep, I’m on board. :grin:

I think that’s a good idea, although we may want to construct a regular expression for validation purposes if one doesn’t already exist for PEP 508, else we are asking everyone to use a parser generator just to check that what they will potentially pass to pip is a proper dependency specification.

2 Likes

Couldn’t we allow an embedded PEP 517/518 file?

# :pyproject.toml
<Abbreviated version of what could have been in a separate file>

That way there can be all the info we want and we could even make them installable via pip if we made pip install <file.py> possible.

At a minimum we could allow the dependencies block.

Doing this allows us to reuse the TOML DSL and as much of 517/518 as we deem necessary.

Throw in the ability to run/install directly to/from a venv and it would be quite swell.

Like I could see even having multiple console scripts get installed from one file if we allow more of 517/518. (I get that normally it’s good to go full project at that point but still would be a nice feature).
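
As a purely illustrative sketch (no such embedded syntax is standardized, and no tool would recognize it today), such a block might look like:

# :pyproject.toml
# [project]
# name = "my-script"
# version = "0.1"
# dependencies = ["matplotlib"]
# requires-python = ">=3.8"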

2 Likes

While I wish everyone did, not only do most Python users not even know it exists, but several installers don’t provide it.

Of course it will. If you use /usr/bin/env python3.10, and python3.10 is not installed but 3.11 is, the script will fail with "/usr/bin/env: 'python3.10': No such file or directory", despite the fact that the script would run fine.

However, I understand it’s out of scope, and I would rather get the proposal without it than not at all. It would solve a lot of problems to have pip be able to read this.

1 Like

To be clear, pip will not read this itself. Tools like pip-run and pipx (which run scripts) will. And you could write your own utility that reads the dependency block out of a script and passes it to pip. But I don’t intend to make this part of pip itself - we already have enough ways to read requirements.

1 Like

Does that mean one would not be able to quickly install the requirements for a single-file script into an existing environment? For me, that is still a more commonplace thing to do than running scripts with pipx.

1 Like

It means that I personally consider that to be a good use for a simple wrapper. Assuming packaging gets a function for this, it would be as simple as:

import sys
import subprocess
# Hypothetical helper -- this function does not exist in packaging today
from packaging.utils import get_script_dependencies

# Read the dependency block from the script named on the command line,
# then install those dependencies into the current environment.
script = sys.argv[1]
deps = get_script_dependencies(script)
subprocess.run([sys.executable, "-m", "pip", "install"] + [str(dep) for dep in deps])

I don’t think the cost of working out a design for a pip invocation to do this, writing a PR for pip including docs and tests, etc., is worth it. If someone else does, then fine; they can create a PR. We can discuss details there.