PEP 722: Dependency specification for single-file scripts

Sure, but that’s as opposed to looking up “Requirements:” :sweat_smile: Some kind of breadcrumb would be a huge lift.

2 Likes

Perhaps something akin to # Script Requirements: (# Script Requires:, # Embedded Requirements:, etc.) could help here – the additional word lowers the likelihood that a rogue comment matches the format, and provides extra context for a Google search.

As an illustrative example, there are currently no recorded uses of # Script Requirements: on grep.app, and only about 10,000 search results for the query.

3 Likes

I honestly have no vested interest in any particular header text, so I’ll just wait for consensus here. But there may be a compatibility issue with any change, as Requirements: is currently implemented by pip-run (the pipx support isn’t released, so I don’t count that). So a different header would need some form of transition.

I’ve no idea how much use this feature of pip-run has. Maybe @jaraco has a view?

2 Likes

I like this idea, and given that tools are already implementing some form of it, I think it’s worth standardising.

As for the multi-line strings question, I don’t think it’s a showstopper. I could live with either of two compromises:

  1. Tools have to know enough Python syntax to look only at comments. This makes non-Python implementations harder, but they can run a bit of Python helper code to do it, or you can write a Python tokenizer in another language. It’s a limitation, but not the end of the world.
  2. Alternatively, the requirements block must come before any code, so you stop looking for it at the first non-empty, non-comment line. Simpler to implement tools for, more limiting for the user. :person_shrugging:
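To make the trade-off between the two options concrete, here is a minimal sketch of both. It assumes the block is a `# Requirements:` comment followed by one requirement per comment line and ended by a blank or non-comment line; that reading of the format, and all the function names, are my own for illustration rather than anything the PEP or an existing tool defines.

```python
import io
import re
import tokenize

# Assumed header spelling; the thread is still debating the exact text.
HEADER = re.compile(r"#\s*requirements:\s*", re.IGNORECASE)


def real_comment_lines(source):
    """Option 1: keep only whole-line comments that the tokenizer reports.

    Every other line (code, blank lines, string contents) becomes "".
    """
    src_lines = source.splitlines()
    kept = [""] * len(src_lines)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            row = tok.start[0] - 1
            # Skip trailing comments such as "x = 1  # note".
            if src_lines[row].strip() == tok.string.strip():
                kept[row] = tok.string.strip()
    return kept


def leading_comment_lines(source):
    """Option 2: stop at the first line that is neither blank nor a comment."""
    kept = []
    for line in source.splitlines():
        text = line.strip()
        if text and not text.startswith("#"):
            break
        kept.append(text)
    return kept


def read_block(lines):
    """Return the requirements listed under the first header in `lines`."""
    block, in_block = [], False
    for text in lines:
        body = text[1:].strip() if text.startswith("#") else ""
        if in_block:
            if not body:
                break
            block.append(body)
        elif HEADER.fullmatch(text):
            in_block = True
    return block
```

With option 1 a lookalike block inside a docstring is ignored because it is not a comment token. With option 2 a script that opens with a docstring yields nothing at all, which is the "more limiting for the user" part.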

PEP 508 does not allow local directories or files as dependency specifiers.

I think it actually might allow local files. :confused: It allows URLs (as in pkg @ https://...), and as far as I can see, it doesn’t specify what schemes are allowed, so I think foo @ file:///home/takluyver/foo-0.1-py3-none-any.whl is valid according to the spec. I haven’t checked what implementations actually do with this.
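As a rough illustration of the point about schemes, the sketch below splits a `name @ url` specifier and reports the URL scheme using only the standard library. It is not a PEP 508 parser (it ignores extras and environment markers), the helper name is made up, and it says nothing about what pip or any other tool does with a `file:` URL.

```python
from urllib.parse import urlsplit


def split_direct_reference(spec):
    """Split 'name @ url' into (name, url, scheme).

    Hypothetical helper: returns (name, None, None) when there is no '@'.
    """
    name, sep, url = spec.partition("@")
    if not sep:
        return spec.strip(), None, None
    url = url.strip()
    return name.strip(), url, urlsplit(url).scheme


name, url, scheme = split_direct_reference(
    "foo @ file:///home/takluyver/foo-0.1-py3-none-any.whl"
)
print(name, scheme)  # foo file
```

Syntactically, the `file` scheme comes out of the URL part the same way `https` does, so any restriction would have to come from the tool rather than from the shape of the specifier.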

2 Likes

I agree. My initial reaction was a bit too panicky. I actually think we should apply the “consenting adults” principle here. The rules on how the dependency block is identified and parsed must be clear and unambiguous (I believe they are, but we can fix them if not). But they do not have to prevent users from putting them in dumb places (such as in multi-line strings).

If a developer wants to put something that looks like a dependency block (but isn’t) in a script’s docstring, they have to put the actual dependency block before the docstring, so it gets recognised first. And if they don’t have dependencies, but want to use a tool that looks for a dependency block, they can put in an empty dependency block.
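Here is a small sketch of that "first block wins" behaviour, for a scanner that knows nothing about Python syntax. The `# Requirements:` header and the rule that a blank or non-comment line ends the block are assumptions taken from this thread, and `first_dependency_block` is a hypothetical name.

```python
import re

# Assumed header spelling, matched case-insensitively.
HEADER = re.compile(r"#\s*requirements:\s*", re.IGNORECASE)


def first_dependency_block(source):
    """Return the first block found by a plain line scan, or None.

    The scan does not know about strings, so a lookalike block inside a
    docstring counts unless a real block appears before it.
    """
    lines = source.splitlines()
    for index, line in enumerate(lines):
        if HEADER.fullmatch(line.strip()):
            block = []
            for follower in lines[index + 1:]:
                text = follower.strip()
                if not text.startswith("#") or not text[1:].strip():
                    break
                block.append(text[1:].strip())
            return block
    return None


DOCSTRING_ONLY = '''"""Usage notes.

# Requirements:
# not-a-real-dependency
"""
print("hello")
'''

# An empty block placed first shadows the lookalike in the docstring.
WITH_EMPTY_BLOCK = "# Requirements:\n\n" + DOCSTRING_ONLY

print(first_dependency_block(DOCSTRING_ONLY))    # ['not-a-real-dependency']
print(first_dependency_block(WITH_EMPTY_BLOCK))  # []
```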

If a developer wants to put a real dependency block in a multi-line string, fine. Let them do it. What’s the harm?

Rules don’t stop people doing stupid things[1]. Common sense (and other people) stops people doing stupid things.


  1. no matter how much we try to make them

2 Likes

I don’t like the PEP, as I said above, but assuming that it is going to be done, why not borrow the same syntax as encoding comments? Namely #-*- requirements: numpy, scipy, pandas, rich -*-.

I can also imagine

__requirements__ = """
numpy
scipy
pandas
rich
"""

or even

__metadata__ = """
[project.dependencies]
...
"""

(the latter making it an “inline pyproject.toml”).
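For comparison, reading the encoding-comment style suggested above could look roughly like this. The pattern and function name are mine, and splitting on commas is deliberately naive: a specifier such as `foo>=1,<2` contains a comma itself, so a real design would need a different separator or proper parsing.

```python
import re

# Matches a whole-line marker such as:
#   #-*- requirements: numpy, scipy, pandas, rich -*-
MARKER = re.compile(
    r"^#\s*-\*-\s*requirements:\s*(.*?)\s*-\*-\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def requirements_from_marker(source):
    """Return the comma-separated names from the first marker line, if any."""
    match = MARKER.search(source)
    if match is None:
        return []
    # Naive split: breaks on specifiers with several comma-separated clauses.
    return [item.strip() for item in match.group(1).split(",") if item.strip()]


script = "#-*- requirements: numpy, scipy, pandas, rich -*-\nimport numpy\n"
print(requirements_from_marker(script))  # ['numpy', 'scipy', 'pandas', 'rich']
```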

3 Likes

I was over-simplifying. It may do, but the usability sucks (no relative paths, and don’t get me started on Windows drive letters). My point here was that tools may reasonably want to allow extended forms of dependency specifier, and I don’t want to disallow that, even though it’s not something I want to try to define in the PEP. The obvious extension is “anything that pip can use as a requirement” (and that’s what pipx and, I believe, pip-run allow).

1 Like

These ideas are addressed in the PEP (“Why not include other metadata?” and “Why not make the dependencies visible at runtime?”). I imagine you’ll disagree with those sections as well, of course. And it’s perfectly fine for you to disagree - this isn’t going to be something everyone will like. But if you have suggestions or questions that aren’t already covered in the PEP, please do ask them - I’ll be updating the PEP based on the feedback here, so I want to make sure I hear everyone’s views.

(The idea of an encoding comment style is one I’ll think about. I don’t like it, but I need to articulate why if I want to represent it fairly in the PEP).

1 Like

D’oh! Sorry, I did read the PEP but too quickly, and I failed to remember that these were in it. My bad.

I can agree with “Why not use a more standard data format (e.g., TOML)?”.

Regarding __requires__, note that there is a precedent in Hatch for reading a __version__ attribute by not using a full Python tokenizer/parser but a simple regex (Versioning - Hatch).
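To show what a regex-only reader might look like, here is a sketch that pulls a triple-quoted `__requirements__` string (the spelling used in the example earlier in the thread) out of a file without tokenizing it. This is not Hatch's regex and not any tool's actual behaviour; it only handles the simple `"""` form at the start of a line.

```python
import re

# Only recognises:  __requirements__ = """ ... """  starting in column 0.
DUNDER = re.compile(
    r'^__requirements__\s*=\s*"""(.*?)"""',
    re.DOTALL | re.MULTILINE,
)


def read_dunder_requirements(source):
    """Return the non-empty lines of the first matching string, else []."""
    match = DUNDER.search(source)
    if match is None:
        return []
    return [line.strip() for line in match.group(1).splitlines() if line.strip()]


example = '''import sys

__requirements__ = """
numpy
scipy
pandas
rich
"""
'''
print(read_dunder_requirements(example))  # ['numpy', 'scipy', 'pandas', 'rich']
```

The obvious limitation is that the regex has no idea whether the assignment is real code or sits inside another string, which is the same ambiguity discussed for the comment-based format.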

1 Like

I see. I wonder if this kind of defeats the point of the spec, though? If both the existing implementations allow specifying any requirements pip can install, people are going to use that, and then to build a compatible tool you need to allow those things too. So I’m not sure what the purpose of a spec saying something more restrictive is.

Maybe this PEP should be pragmatic and just allow anything that can be passed to pip? We might not want pip to be special like that, but it already is, so maybe we’re just pretending that it’s not. The broader requirement syntax could be made into a proper spec later. Or maybe if the only practical way to do it is implementation defined, it’s just something the respective projects should document rather than a subject for a PEP. :person_shrugging:

2 Likes

I propose # py:requires:, which surely won’t conflict with reasonable comments. Alternatively, #pragma: requires:, which has precedent in coverage.py.


Does pip-run support multi-line requirements by escaping the newline? Is that necessary in the proposal?

#   foo ~= \
#    2.0

Why not simply say “tools may support a format to specify (potentially relative) paths to project directories, but this PEP won’t specify a format”? That’s a lot more assertive on path support.

I also can’t imagine there are too many distinct formats for specifying a path, so I’d imagine the community would converge on the same format.

1 Like

I know this proposal wants to keep the scope small. But if you want to list deps, the dependency on a particular version of Python should be part of it.

4 Likes

I really think this will be very useful and could even further increase the utility of Python as a glue language. I use nix-shell scripts extensively at work and the ability to ship self-contained scripts without having to worry much about docker images or installation instructions or distribution archives cannot be overstated.

Instead of a new format of dependencies, what about inline requirements.txt? It lets us punt the standardisation question even further, avoids adding a new format for people to learn and if things become standardised in future, it should still work.

Ruby (bundler) has a mechanism for inline Gemfile dependencies. Bundler: How to use Bundler in a single-file Ruby script

require 'bundler/inline'

gemfile do
  source 'https://rubygems.org'
  gem 'benchmark-ips', require: 'benchmark/ips'
end

Benchmark.ips do |x|
  x.report('original') { naive_implementation() }
  x.report('optimized') { fast_implementation() }

  x.compare!
end

1 Like

The problem here is that requirements.txt isn’t standardised, and this is, of necessity, an interoperability specification. You can’t assume that the tool which runs the script will just pass the requirements onto pip. Even ignoring the possibility of an installer that isn’t pip, the tool might want to do some pre-processing of the requirements (pipx caches environments based on the list of requirements being installed, for example). Do we want to require pipx to parse nested requirements files (using the ability to include an -r option in a requirements file)?
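To illustrate why a runner needs the requirements as data rather than as opaque text to hand to pip, here is one way an environment cache key could be derived from the list. This is only a sketch of the idea, not how pipx actually computes its cache, and `environment_cache_key` is a hypothetical name.

```python
import hashlib


def environment_cache_key(requirements):
    """Derive a short, order-independent key from a list of requirement strings.

    Assumption: two lists that differ only in order, duplicates, or
    whitespace should share one cached environment.
    """
    normalised = sorted({"".join(req.split()) for req in requirements})
    digest = hashlib.sha256("\n".join(normalised).encode("utf-8"))
    return digest.hexdigest()[:16]


# The same set of requirements maps to the same key.
assert environment_cache_key(["rich", "requests"]) == environment_cache_key(
    ["requests", " rich "]
)
```

A nested `-r other.txt` line would defeat this: the key would depend on the text of the option rather than on what the referenced file contains, so the runner would have to read and parse that file too.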

Also, that Ruby example, if I understand it correctly, doesn’t use an isolated environment. So it’s not equivalent to what we (or at least I) want to do in Python. If you expect to run the script in an isolated environment with just the script’s requirements installed, you probably have to process the requirements before the script runs. You could create a temporary, empty environment, and then run a script that auto-installs its own requirements in that. But then you can’t do things like cache environments.

But this is getting off-topic. The point here is to standardise a way for a single-file script to declare its dependencies in a way that lets tools do whatever they like with that information. My primary use case is to run scripts, but others may want to do audit scans on a directory of utilities, or process a set of scripts to determine if they can share an environment without dependency conflicts, or freeze the script and its dependencies into a zipapp. Once we have the data, people can use it in many ways.
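As one example of "using the data in other ways", an audit over a directory of utilities could start from an index of which scripts mention which project. The sketch below assumes the per-script requirement lists have already been extracted by some reader; the function and file names are invented. Names are normalised the way PEP 503 describes (runs of `-`, `_` and `.` become `-`, then lowercased).

```python
import re
from collections import defaultdict

# Leading project name of a requirement string; everything after is ignored.
NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def index_by_project(requirements_by_script):
    """Map each normalised project name to the scripts that require it."""
    index = defaultdict(set)
    for script, requirements in requirements_by_script.items():
        for requirement in requirements:
            match = NAME.match(requirement.strip())
            if match is None:
                continue  # not something this sketch understands
            name = re.sub(r"[-_.]+", "-", match.group(0)).lower()
            index[name].add(script)
    return {name: sorted(scripts) for name, scripts in sorted(index.items())}


found = index_by_project(
    {
        "fetch.py": ["requests>=2", "rich"],
        "report.py": ["Rich ~= 13.0", "pandas"],
    }
)
print(found["rich"])  # ['fetch.py', 'report.py']
```

Deciding whether two scripts can actually share an environment needs a real resolver; this only shows the kind of cross-script question that becomes easy to ask once the format is common.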

As a reminder - we don’t need this standard to just write standalone scripts. If you’re happy with using an implementation-defined format, then pip-run already exists, and the next release of pipx is an alternative with different trade-offs, if you prefer. But if you want your IDE to offer to add dependency data when you type in a 3rd-party import, you stand a much better chance of that if the format is standardised, rather than being tool-specific.

5 Likes

At least for executable scripts, you can specify the version by choosing a version-specific executable:

#!/usr/bin/env python3.10

...

1 Like

I don’t like when people call pip from inside scripts since it can pollute an environment by adding unexpected packages.

4 Likes

I am hesitating to write an alternative PEP (proposing inline pyproject.toml). Suppose I did; would any core dev be willing to sponsor it?

2 Likes

This will not work on Windows. It will also force the exact executable to be present on Unix, while the script could run with 3.11 even if 3.10 is not here.

4 Likes

Maybe. It was more to point out to Donald that it isn’t a complicated thing to parse.

Definitely! I could very easily see implementing something in Rust and have it work with the Python Launcher for Unix.

I had been going on the assumption that the tools doing the execution would handle reading and writing any necessary data. Do you think it’s worth coordinating all of the details so the tool handling the execution is entirely interchangeable to the point that they will use the same temp virtual environment?

If we want to avoid ambiguity, this is my preference since it’s the simplest and fastest.

I would prefer not to do this unless it’s backed by a spec. We are trying to explicitly get away from conventions driving things.

I personally would want to avoid that for simplicity.

You can somewhat do that via the shebang line today. And that’s a separate ask since that’s expanding what could be defined as a dependency in any form, let alone within a .py file (i.e. you can’t do that in a pyproject.toml via a project.dependencies array, so it’s out of scope for what this PEP is trying to accomplish).

The Python Launcher for Windows actually reads the shebang line (as does the Python Launcher for Unix).

Not if you use /usr/bin/env for the shebang.
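For reference, pulling the requested version out of a shebang line is not much code. The sketch below is my own simplification and does not reproduce the rules either launcher really applies: it only looks at the first line and only recognises `python`, `pythonX` and `pythonX.Y`, optionally via `env`.

```python
import re


def requested_python(source):
    """Return the version named in the shebang.

    "" means an unversioned "python"; None means no Python shebang at all.
    """
    first_line = source.split("\n", 1)[0]
    if not first_line.startswith("#!"):
        return None
    tokens = first_line[2:].split()
    if not tokens:
        return None
    command = tokens[0].rsplit("/", 1)[-1]
    # "#!/usr/bin/env python3.10": the interpreter is env's first argument.
    if command == "env" and len(tokens) > 1:
        command = tokens[1]
    match = re.fullmatch(r"python(\d+(?:\.\d+)?)?", command)
    if match is None:
        return None
    return match.group(1) or ""


print(requested_python("#!/usr/bin/env python3.10\nprint('hi')\n"))  # 3.10
```

Note that this only says which interpreter was asked for. It cannot express a range like "3.10 or later", which is what the request for a Python version dependency is really about.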

4 Likes

Maybe this is just me, but the way the discussion is going is making me feel it’s going to be a bit dicey to balance the desire for a quick and easy format with the desire for something more clean and standardized for widespread tool interop. In particular as I mentioned before, this seems like an inline requirements.txt in practice, and I don’t see how it’s a good idea to jump to standardize this rather than making a single standard for that type of dependency list, and then saying “you can also specify this kind of information within a python file like so”.

3 Likes