PEP 722: Dependency specification for single-file scripts

There are a couple of comments here which suggest that there might be some kind of sea-change if this PEP moves forward. With the PEP, users will need to read and understand these dependency comment blocks, whereas without the PEP there will be no such imposition.

I don’t think that view is accurate, or at least it is incomplete. Implementations already exist – and I’d call pipx pretty mainstream at this point.

I think a better take is that this is already happening because it’s a useful thing for tools to do, and now is a good time to try to formalize the existing behavior with a spec. If the behavior needs to be tuned a little to meet the spec, that’s not a huge deal (yet!).


I want to check one angle of the PEP with respect to the requirements.txt vs dependency comment differentiation which has been cited.

I wrote a script this morning which reads dependency comment blocks, creates a venv, then writes the dependency comment data into $VENV/requirements.txt and installs with pip install -r.
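In outline, it does something like this (a simplified sketch of the idea; note that the parsing is deliberately naive and accepts any comment line in the block):

import re
import subprocess
import sys
import venv
from pathlib import Path

script = Path(sys.argv[1])
venv_dir = Path(sys.argv[2]) if len(sys.argv) > 2 else Path(".venv")

# Collect the lines of the "Requirements:" comment block, if any.
deps = []
in_block = False
for line in script.read_text().splitlines():
    if re.match(r"#\s*Requirements:\s*$", line, re.IGNORECASE):
        in_block = True
    elif in_block:
        m = re.match(r"#\s*(\S.*)", line)
        if not m:
            break  # the first non-comment line ends the block
        deps.append(m.group(1).strip())

# Create the venv, write the block into $VENV/requirements.txt, and install.
venv.create(venv_dir, with_pip=True)
req_file = venv_dir / "requirements.txt"
req_file.write_text("\n".join(deps) + "\n")
pip = venv_dir / ("Scripts" if sys.platform == "win32" else "bin") / "pip"
subprocess.run([str(pip), "install", "-r", str(req_file)], check=True)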

Is my script compliant with the proposed PEP? It doesn’t validate that the data are PEP 508 dependency specifiers.
My read is that such a script is okay because it accepts something broader than the PEP. Is that reading correct? Do we need to clarify whether or not dependency format validation is required?

2 Likes

See the PEP, which covers the reasoning for not using a more complex format. Basically YAGNI.

Others have already covered this, but this is not something I plan on including. If you support running your script on a range of Python versions, you’re getting into the area of publishing, and you’re better off making it a “proper” Python project with pyproject.toml and full metadata.

To a large extent, I agree with this. Although I will note that the pipx implementation of this hasn’t been released yet, and pip-run is not particularly mainstream.

However, I think the main point stands, which is that if this is such a crucial feature that it’s going to suddenly start popping up in scripts all over the place, to the extent that significant numbers of Python programmers get confused as to what it means, why has no-one been working on better solutions before now?

After all, pip-run has been around since 2018. And yet, I don’t recall ever having seen a script containing a __requires__ = ["requests"] statement (the original form pip-run used for declaring requirements). So why will this PEP be any different?

(Side note - let’s not get into the debate over the __requires__ format. The PEP contains a section discussing this, plus links to the discussions which resulted in the comment-block format being adopted, so go and read them if you want to know why the current form was introduced).

Currently, yes. There’s no requirement on consumers to validate anything.

However, I am uncomfortable about the fact that it’s awfully likely that consumers will just pass the requirements to pip, meaning that “whatever pip accepts” could become a de facto extension to the standard. It also allows for abominations like

# Requirements:
#     --index-url
#     https://my.index/simple
#     private-package
#     --editable
#     ../dev-library

This latter case makes me think that I should update the PEP (and the pipx implementation!) to require validation. If I do that, I’ll probably also include a reference implementation of a client function to read a dependency block from a script - it’s not hard, and it allows me to be explicit about the validation.
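Something along these lines, perhaps (just a sketch, using the third-party packaging library for the PEP 508 validation):

import re

from packaging.requirements import InvalidRequirement, Requirement

def read_dependency_block(script_text):
    """Extract and validate the dependency block of a script."""
    deps = []
    in_block = False
    for line in script_text.splitlines():
        if re.match(r"#\s*Requirements:\s*$", line, re.IGNORECASE):
            in_block = True
        elif in_block:
            m = re.match(r"#\s*(\S.*)", line)
            if not m:
                break  # end of the comment block ends the dependency list
            text = m.group(1).strip()
            try:
                deps.append(Requirement(text))
            except InvalidRequirement:
                # Rejects pip options like --index-url or --editable, as well
                # as anything else that isn't a valid PEP 508 specifier.
                raise ValueError(f"Invalid dependency specifier: {text!r}")
    return deps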

I don’t think the loss of the ability to specify non-PEP 508 requirements is fatal. Yes, it will probably exclude some use cases that might otherwise benefit from this feature, but they can simply carry on doing whatever they do right now.

1 Like

I work on a home automation rule engine in Python, where I use YAML in the first comment block of the Python file to define metadata about the file. There the user can specify dependencies, when to reload the file, etc. The main benefit is that it’s easily extendable, since it de-serializes to a dict, and it provides a flexible syntax (multi-line lists vs single-line lists). Most users seem to be very happy with it, and with an example it’s very intuitive to use.
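For illustration, such a first comment block looks roughly like this (the field names are invented for the example):

# name: motion_lights
# dependencies:
#   - requests
#   - astral>=2.2
# reload: [on_file_change, daily]

Both list styles work, since it’s plain YAML once the leading # markers are stripped.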

We’ve already been through this debate with pyproject.toml, and while I personally like YAML far, far more than TOML, the problem is that there’s no YAML parser in the stdlib and the good YAML parsing libraries for Python are fairly heavy. So odds are whatever we end up with is going to be something that can be easily parsed with what’s in the Python standard library now.

3 Likes

I’m +1 on this. I find myself with these “script with dependencies” that @pf_moore has been talking about. Formalizing what pipx and pip-run look for, rather than inventing a new format, seems to be the right approach.

1 Like

In general I think an idea along the lines of this PEP is a good thing… but I’m not sure that arguing that many people aren’t going to use it is a great answer to the question of whether or not it’s confusing. If it’s a niche thing that we don’t expect a decent number of people to use, then presumably it doesn’t need a PEP; it can just continue to be a niche thing with a limited audience.

To be clear, I don’t think that this is going to be super niche (or rather, I think whether it’s niche or not depends entirely on if big tools that people are already using start to support it or not, which is far more likely with a PEP). I think it’s useful and something we should do, I’m just lightly pushing back on the specific implementation out of a worry that it’s the kind of thing that can become too magical, and cause more harm than good, if we’re not careful.

3 Likes

I finally found the time to read this PEP. Unsurprisingly, it’s well-written. :slight_smile:

IMO, that (= requiring/recommending validation) is a good idea.

I agree, and I do prefer that the keyword/marker be named “Dependencies”, rather than “Requirements” – primarily because that’s better in line with [project.dependencies] as well, and I’d like for the user-facing standards to use a consistent vocabulary. Looking at the implementations, it shouldn’t be too difficult in pipx or pip-run to change the keyword/marker either (or to allow one-or-the-other).

This is a naming question, so you could argue that I’m asking for this bikeshed to be painted differently. IMO, though, this is not a bikeshed-style concern but rather one about having a consistent UX.


PS: I wouldn’t be opposed to having a single shared implementation of this PEP live in packaging (assuming other packaging maintainers are on board, of course). :slight_smile:

3 Likes

This ties user scripts directly to a specific package manager, causing trouble for those who use a different one. Single-file scripts may still be shared among multiple users.

3 Likes

Let me worry about VS Code and communicating this out. :wink:

Yep, I’m on board. :grin:

I think that’s a good idea, although we may want to construct a regular expression for validation purposes if one doesn’t already exist for PEP 508, else we are asking everyone to use a parser generator just to check that what they will potentially pass to pip is a proper dependency specification.
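Absent such a regex, the fallback today (assuming consumers can depend on the third-party packaging library) is a try/except around its specifier parser:

from packaging.requirements import InvalidRequirement, Requirement

def is_valid_dependency(line):
    """True if line parses as a PEP 508 dependency specifier."""
    try:
        Requirement(line)
    except InvalidRequirement:
        return False
    return True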

2 Likes

Couldn’t we allow an embedded PEP 517/518 file?

# :pyproject.toml
<Abbreviated version of what could have been in a separate file>

That way there can be all the info we want and we could even make them installable via pip if we made pip install <file.py> possible.

At a minimum we could allow the dependencies block.
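For instance, something shaped like this (hypothetical syntax; nothing here is standardized):

# :pyproject.toml
# [project]
# dependencies = [
#     "requests",
#     "rich",
# ]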

Doing this allows us to reuse the TOML DSL and as much of 517/518 as we deem necessary.

Throw in the ability to run/install directly to/from a venv and it would be quite swell.

Like I could see even having multiple console scripts get installed from one file if we allow more of 517/518. (I get that normally it’s good to go full project at that point but still would be a nice feature).

2 Likes

While I wish everyone did, not only do most Python users not even know it exists, but several installers don’t provide it.

Of course it will. If you use /usr/bin/env python3.10, and python3.10 is not installed but 3.11 is, the script will fail with "/usr/bin/env: ‘python3.10’: No such file or directory", despite the fact that the script would run fine on 3.11.

However, I understand it’s out of scope, and I would rather get the proposal without it than not at all. It would solve a lot of problems to have pip be able to read this.

1 Like

To be clear, pip will not read this itself. Tools like pip-run and pipx (which run scripts) will. And you could write your own utility that reads the dependency block out of a script and passes it to pip. But I don’t intend to make this part of pip itself - we already have enough ways to read requirements.

1 Like

Does that mean one would not be able to quickly install the requirements for a single-file script into an existing environment? That still is a more commonplace thing to do than running scripts with pipx for me.

1 Like

It means that I personally consider that to be a good use for a simple wrapper. Assuming packaging gets a function for this, it would be as simple as:

import sys
import subprocess

# Hypothetical helper - the point above assumes packaging gains a function for this.
from packaging.utils import get_script_dependencies

script = sys.argv[1]
deps = get_script_dependencies(script)
# Install the script's declared dependencies into the current environment.
subprocess.run([sys.executable, "-m", "pip", "install"] + [str(dep) for dep in deps], check=True)

I don’t think the cost of working out a design for a pip invocation to do this, writing a PR for pip including docs and tests, etc, etc, is worth it. If someone else does, then fine, they can create a PR. We can discuss details there.

I’m not in favor of this PEP because it simply adds yet another way to specify Python dependencies without taking user feedback into account. Particularly, I find this a big red flag:

…it’s intended to be for single-file scripts only, and is not intended in any way to replace or compete with project metadata stored in pyproject.toml

We’ve heard from many community members (e.g. the packaging survey) that they are tired of having to know the many ways to interact with Python packaging tooling and metadata.

While I believe your intent is, of course, benign, I don’t think end users will understand the subtle use-case differences; instead they will be left to work out on their own when this additional option applies. That’s especially unfortunate since I believe the proposed format doesn’t actually cover best practices like lock files, and it might have a chilling effect on the great work of standardizing on one file (pyproject.toml).

Essentially, I don’t think end users will know or understand the need for such a PEP, and will again have to expand their understanding of when and how to use which format for dependency specification.

As a reminder, respondents of the survey primarily

  • found Python packaging too complex,
  • did NOT prefer to use several Python packaging tools,
  • preferred a clearly defined, official workflow.

But they also said PyPA should

  • focus on supporting a wider range of use cases,
  • support more interoperability between Python packaging and packaging tools for other languages.

Given these key takeaways, I don’t see how this PEP would help users, as it would just add one more way to specify dependencies, without removing other options or integrating better in real-world scenarios.

13 Likes

Oh, then the proposal doesn’t solve the problem for the people who need it most. We already have tools for this, so experienced Python devs don’t need the proposal. If it’s not officially adopted, the people who are not savvy will not know it exists, or will mess up installing the tool in the first place.

2 Likes

Thanks for the sanity check on this. I will point out that my interest in this is very much as a user, so to that extent at least I am taking user feedback into account :slightly_smiling_face:

My intent here is very much focused on the “supporting a wider range of use cases” point. Specifically, Python packaging currently has very bad support for the common use case of a set of single file scripts, often stored in a “utilities” directory on $PATH, which rely on dependencies from PyPI.

I honestly don’t see how we can improve the situation here without something like this PEP. If you have a suggestion for an alternative way of addressing this use case, I’d be very interested in hearing of it. But please understand that solutions involving “make a project directory” or “put the dependencies in a separate file” directly contradict the key requirement here, which is having a way of writing a single runnable file that can use packages from PyPI.

Like it or not, this is a common requirement for many Python users, and telling them that they “shouldn’t work like that” is not realistic[1] - they’ve been “working like that” for many years now, and either complaining that Python environment management is hard, or dumping all their 3rd party requirements into their system Python (something that the packaging community chose to discourage without really ensuring that all the reasons people do this were considered).

With regard to your other points from the survey:

These are all related to the tool that allows single-file scripts to be run. I’m absolutely in favour of simplifying the landscape here. The pip-run tool was proposed as a pip subcommand a long time ago (hence the name). It hasn’t happened yet because there are a lot of UI and organisational issues that we haven’t been able to resolve. There’s also a question of whether this is in scope for pip, but the user survey results suggest (to me, at least) that users would be OK with pip gaining this functionality as part of becoming the core of the “unified PyPA workflow”[2], so I consider that issue solved at this point.

But none of that is relevant to this PEP. All I’m trying to do here is define where any tool would find dependency data when faced with a single-file script. That problem will have to be solved no matter what the official workflow for running such a script ends up being, and I don’t see the disadvantage in building on existing, working solutions.

I’ve covered the “wider range of use cases” point above. As far as interoperability is concerned, surely, by standardising the means of getting data that is currently only available in tool-dependent ways, this proposal has to improve interoperability? And by using the existing PEP 508 standard for dependency specifiers, I’m ensuring that it builds on existing interoperability work. I don’t want to make assumptions here, but if you’re concerned that conda (for example) can’t use this data, surely that’s about how well PEP 508 maps to conda packages rather than about this proposal?

So while I wouldn’t want to try to present this PEP as some sort of massive step forward in addressing the user concerns expressed in the survey, I don’t see how it’s harming whatever work we do in that area. And I absolutely don’t think that the correct response to the survey is to stop making any sort of progress out of fear that we’ll make things (temporarily) more complex in the process of working on long-term simplification.

PS Apologies if my frustration is showing through here. I’ve been arguing for literally years that by ignoring the “run my script with some dependencies” use case, we’re failing to consider an important user requirement. It’s difficult for me to know how to address a complaint that when I finally try to make some progress in this area, I’m not considering user feedback…


  1. Again, putting this in the context of the survey, there was a strong flavour of users feeling that Python packaging does not listen to what users say, and IMO saying “that’s not the way you should do things” is a strong contributing factor to giving that impression. ↩︎

  2. Although within the packaging community, there’s no consensus yet on whether pip, specifically, should be the core tool, rather than something else :slightly_frowning_face: ↩︎

9 Likes

So your view is that to be “officially adopted” something like pip-run must be part of pip? Or are you referring to the idea mentioned by @ntessore of being able to extract a script’s dependencies and install them into an existing environment? (Sorry, the way your response appears on the web interface doesn’t make it clear which comment you were responding to).

If it’s the former, there’s a proposal to add pip-run functionality to pip. That’s independent of (but linked to) this proposal - but I’d encourage you to read the full discussion there before commenting in support of the idea, as there are some non-trivial issues that need some work before this can happen, and I don’t think anyone currently has the bandwidth to work on them.

2 Likes

I would also question whether the results of a Python packaging survey are applicable to this case, which is basically a sort of un-package approach. People choosing to take the packaging survey likely have a selection bias towards package-oriented solutions, which this proposal isn’t really (though it is still relevant to related topics like environment management).

Put another way, are the users this solution is trying to satisfy, i.e. those who don’t want to package their scripts, likely to bother filling out a survey about packaging?

5 Likes

Well, you could argue that people who write scripts whose dependencies are Python distribution packages are users of the Python packaging ecosystem. After all, they download the packages for their dependencies from PyPI, and IIRC there was a banner on the PyPI web interface linking to the survey (I believe that was what led me to it). You don’t need to write Python packages (instead of standalone scripts) to be a user who regularly downloads and installs packages, and eventually even visits the PyPI website, if only because it appeared in search engine results.

1 Like

Sorry for the late reply, but yes I dump all of the shared dependencies in the global environment.

Is there no common functionality in all of your scripts? How do you share functions between them? I’m surprised that you haven’t collected them into libraries with multiple entry points rather than have loose scripts.

Also, just from a maintenance point of view, do you really prefer one script with an inline pyproject.toml versus a folder with a Python script and a pyproject.toml? In the latter case, at least you’ll have syntax highlighting, and access to any tools that check and update pyproject.toml. It just seems easier to me.

It would be cool if there were a tool to generate a folder/pyproject.toml given a Python script.

1 Like