PEP 722: Dependency specification for single-file scripts

This PEP adds literally no new tools, and no new data formats. All it does is make one existing format (used by two existing tools) into a standard, so that if we (for example) later replace those two tools with a single new one (reducing the number of tools?) then users don’t have to change their code (reducing churn for users).

That’s a fair criticism. I’m open to other suggestions. But many other languages use the “structured comments” approach, so it seems like it isn’t so bad in practice.

… and we’re back here again. How many people stating on this thread that they have a requirement for being able to declare dependencies in a single-file Python script are needed to demonstrate that this is a real-world use case?

OK. Maybe that would work. My gut instinct is that it would be something I’d use reluctantly, and be frustrated by various “papercut-level” annoyances. But I don’t want to reject a reasonable proposal just because it’s not my favourite. Also, none of the other languages mentioned in the survey of languages linked above use a separate file[1], so it feels like it’s going against common practice. Do you have examples of other languages using this approach that you can point to?

If you’re serious about this suggestion, are you willing to get it added to pip-run and pipx? What’s the transition plan from the existing behaviour to this proposal? There’s a whole “backward compatibility” section of the PEP that will need writing if we go down this route.


  1. Yes, I concede that’s at least partly because the survey is of single-file solutions. ↩︎

4 Likes

You snipped my statement in a somewhat unflattering way; I do accept the use case. Luckily, your PEP is named “dependency specification for single-file scripts”, which I have no problems with as a requirement. My point was that the dependencies do not have to be in the same file to achieve that.

My response was aimed at pointing out the potential solution space between “single-file script” and “single-file script+requirements”, and that it’s possible to support the former in a way that doesn’t (a priori) create yet more UX & teachability problems.

I do care about Python packaging (and not increasing divergence further), but between 2 jobs, my FOSS “responsibilities”, and a sliver of social life, I don’t have time to write, much less implement, a PEP, sorry.

1 Like

I know this is addressed and currently rejected in the PEP, but something like __dependencies__ with a restricted syntax (only string literals, for instance) could be a simple solution that doesn’t require a complete parser.
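For instance (purely as an illustration, not a worked-out proposal), the restricted form might be a single assignment of a list of string literals:

```python
# Illustrative only: one assignment, string literals only, so a tool can
# read the value statically (e.g. ast.literal_eval on the assigned
# expression) without executing the script.
__dependencies__ = [
    "numpy",
    "Pillow>=10.0",
]
```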

1 Like

Sorry, you’re right. My posts are getting so long that I’m trying to keep my quoting limited; I went too far in this case.

No worries. I’m not trying to say “put up or shut up” or anything like that. But equally, I don’t have the energy to take your suggestion further (I foresee a number of problematic areas that will trigger even more rounds of debate, such as “we can’t standardise the requirements format, and yet we can’t call it a requirements file if it’s not one”). So unless someone wants to pick this up, I’ll put it in the “rejected ideas” with my concerns recorded. I hope that’s OK.

I’m not sure what you want me to say here. Unless you address the issues mentioned in the PEP, I don’t see what you’re suggesting… Even though it’s not stated explicitly, the example syntax in the PEP is restricted, because it has to be something that can be evaluated statically. That’s the point of the 4th problem in the list given. If you want to pursue this, please give a specific proposal.

3 Likes

I do not have anything to add towards a resolution, but this does not really address the existing user frustration, which for me boils down to:

There are N different official ways to do K different things in the Python packaging space.

Your proposal still raises that to:

There are N+1 different official ways to do K+1 different things in the Python packaging space.

1 Like

Thinking outside the box a bit, just to see what else we could have (or why nothing else quite fits).

In https://www.pantsbuild.org/ we handle these things by mapping imports back to requirements (Record the top-level names of a wheel in `METADATA`? is kinda relevant in a way… but in the other direction). Most package names map to their module names, and then for those that don’t, one big mapping (which users can extend) is the backup.

So what if these tools tried a similar approach? Scrape the imports[1], which gives you module names, then ask for those packages (probably asking some server for the module → package mapping). That should work for many cases. I think we’d miss out on optional dependencies and other, less-scrapable dependencies (like using strings with __import__).
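As a rough sketch of the scraping half (the module → package lookup is the part that would need a server or a mapping table), something like this stdlib-only helper would do:

```python
# Minimal sketch: collect the top-level imported module names from a
# script using the stdlib ast module. Mapping those names to package
# names still needs the external lookup described above.
import ast

def top_level_imports(source: str) -> set[str]:
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.add(node.module.split(".")[0])
    return names

# top_level_imports("import numpy\nfrom PIL import Image") == {"numpy", "PIL"}
```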

Optional dependencies could be handled with a PEP to allow imports with brackets, e.g. import requests["toml"]. I very much expect that to be rejected, however. Alternatively, import the extra and just don’t use it (meh, but ick).

So, if you wanted to solve it for everyone, at some point you need to parse extra info that isn’t just imports (a la __requires__ from pip-run).


… So, it’s a shame that the 80% case (imports and packages align 1:1) is poisoned by the 20% case, and we can’t get this in a nice, structured way. Parsing imports has some nice benefits (remove an import, and you don’t need to remember to remove it from the Requirements block; no new thing to muddy the packaging waters).


  1. And import parsing can be done easily through ast or efficiently through tree-sitter+Rust (what we do in Pants). ↩︎

1 Like

I don’t mind comment-based configuration; it’s done everywhere already, like documentation. Another option, clumsy but one that would please purists, is to have an embedded toml data record up top in the single file.

1 Like

I don’t know if this is a fair assessment in this particular discussion.

The first thing is that I would not be so confident in saying that the problem the PEP is trying to solve lies in the portion of the “Python packaging space” that people have been complaining about. Sure, it involves installing distributions, but it is not related to the process of “packaging” a project into a distribution format that can be shared (which seems to be the point that troubles most people).

As stated previously in this discussion, the PEP focuses on solving the problem of executing domestic/bespoke/personal scripts and alleviating the pain of manually managing virtual environments.
Would we make Python better if we simply refused to solve this problem? For me the answer is no, and since different problems require different solutions, it is also natural that we have different ways of specifying different things (it is not like you can use an automatic can seamer to open a can).

The second thing is that the PEP is informational and only documents practices that are already implemented and available in the ecosystem. If anything, the existence of the PEP will be an incentive not to “reinvent the wheel” (unintended pun) the next time a tool developer decides to tackle this particular pain point (which is a real pain point for many devs who chipped in on this thread).

8 Likes

As far as I understand, the expected notation is the one from the “Dependency specifiers” standard specification (first defined in PEP 508). So yes, something like numpy==1.25.1 should be allowed.
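For example, using the comment-block style shown elsewhere in this thread, both bare names and full specifiers would be fine:

```python
# Requirements:
#     numpy==1.25.1
#     requests>=2.31
```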

4 Likes

See also: Sketchy and maybe crazy alternative to PEP 722

I’m wondering if the differing view/understanding around this for @jezdez stems from this “better bash” scenario, as compared to the “single file to distribution” scenario that has also been discussed? In the “bash script, but better” scenario, having to take a simple script and compile it into an executable at the end becomes development overhead for something you were probably hoping wouldn’t take more than 10 minutes in total.

The distribution scenario, though, I don’t view as necessarily the key motivator. I would imagine the sharing aspect of this is between machines you control (e.g., I’m setting up a new machine and I have a couple of helper scripts I use on occasion), or sharing something with a friend (e.g., someone asked how to accomplish something and it’s faster for me to write them a script than explain what they are after). I personally don’t see this as a solution for anything where multiple files would have made sense to begin with (e.g., some kid wrote a game that had graphics stored in some image files).

The papercut that comes to my mind with this suggestion is leaving out the accompanying *.requirements.toml file by accident when you move the .py file. Right now your only option for moving a project is to move an entire directory, which implicitly captures everything. The PEP allows for the simple case of moving a single file. This proposal requires remembering to either move two files or use some * globbing. Either way, you can’t just go with a tab-completed command in your terminal to move files.

I will say I used to do that back in the day, but then I got bit too many times by projects which had clashing dependency requirements. It also inherently ties the script to your machine. This also assumes your Python install is not your system Python install and you won’t accidentally break your OS with your dependencies.

This is similar to the “N+1 ways to do things” argument, with a similar answer: this isn’t introducing any new tools, just either standardizing what tools are already doing or empowering tools not to reinvent some solution for a use case that appears to exist for folks using those tools.

As someone who will probably have to implement this PEP, my answer is “no way”. Anything that looks like Python means people will inevitably treat it as such and expect Python’s syntax to work, no matter how restrictive you meant to make it. Add to that the fact that, unless you define the fully supported grammar (and thus require a parser), people will implement it differently, which will lead to incompatibility.

But this really can’t be a “many cases” thing; it needs to be an “all cases” thing. This also doesn’t cover the version or marker restrictions you may want to put on your requirements. It also requires something that can parse Python import statements to get the top-level package names (which isn’t too bad; I have written such a regex, although it eats into perf a bit if you have to run it over a very large file, which this use case is not exactly aimed at). I think what would need to be seen to consider this is examples of:

  1. The simple case; the package name maps to the project name and there are no restrictions.
  2. The Pillow case; how do you map import PIL to installing Pillow?
  3. Restricted install; e.g., I only want to install packaging>=23.1.
  4. The worst single case; a project name that doesn’t match the import name and has a restriction, e.g. pillow>=10.0.0.
  5. Namespace packages; a single import that requires multiple dependencies to resolve.

I’m assuming the 2nd case also handles the situation of multiple projects installing the same name. And then there’s the question of what the expected algorithm is for resolving all of this to get the actual list of dependencies to install.

I’m not suggesting this couldn’t somehow work, but you do need to solve all of these situations and I don’t see how you don’t end up needing some special comment marker to go along with the imports to resolve these situations. E.g. a strawman that covers all of this is:

```python
# ... stdlib imports

# Dependencies:
import trove_classifiers
import PIL  # requires: pillow
import packaging  # requires: >=23.1
from azure import identity, synapse  # require: azure-identity, azure-synapse-artifacts

# ... local imports
```

But, for instance, how do you handle multi-line imports? Can that # require: show up on any line, only the first line, or only the last line? Is the lack of a # requires: for the simple case too cute and not worth it? Is the opening # Dependencies: marker useful for simpler, faster parsing as well as for making the simple case work as shown, or would requiring # requires: on every line, so the initial marker could be dropped, be better? Is that multiple-requirements bit not worth it, so you should just have to write out your imports on separate lines? Is leaving the name off in that packaging example too cute/fancy? Do you support local imports as well if you drop the opening marker (which increases the parsing cost even more)?

I think the real question is what do people find more readable: this or what the PEP proposes?

8 Likes

FWIW, I agree – what you’ve described, better than I could, is why I don’t think UX research work should be a blocker, but it is certainly useful for guiding effort[1]. :slight_smile:


  1. To the extent we can guide efforts for a group of volunteers today anyway. ↩︎

1 Like

To me this argument (“not introducing any new tools”) does not hold water. The new format needs to be parsed and handled correctly, which – more likely than not – will come with a reference library or tool to do so.

But even if there’s no reference implementation, introducing a new format (that ~everyone needs to implement) is more impactful than a new tool (that no-one is forced to use). Has anyone asked IDE and editor authors how easy it would be for them to support PEP 722? Their users surely will be asking for it. I know of IDEs that still don’t have syntax highlighting for f-strings, for example.

Is that papercut a good enough reason not to reuse existing infrastructure / formats / concepts, and drastically cut down on the implementation complexity of this PEP? I find that a hard sell.

4 Likes

I like the idea that requirements used by packaging tools are always in the same kind of place (a suitably named .toml file) with the same sort of format. I also like what @jeanas suggested, which I understood as basically “expand this to do what pyproject.toml can do”. I dislike the idea that every toolchain now potentially has to be aware of another specification (even if it’s just describing something that a couple tools already came up with) and parse magic comments. However, I like the idea of being able to keep the information in one file purely for distribution, for a single-file project.

I think I have a way to harmonize all of that.

  1. Come to a consensus that we do, in fact, want pyproject.toml to be used for projects that shouldn’t generate a wheel.
  2. Provide, with Python, one simple standardized script in the Tools dir that parses a source file for a single comment block containing text in the pyproject.toml format (it can afford to use quite naive/simple detection, I think), populates any missing required keys with sensible defaults (e.g. taking a project name from the file name and setting version to 0.0.1), and writes that file.

Now, when end users receive a single-file script, they can “install” it and its dependencies by simply running the toml-splitting script and then using their favourite tooling to install dependencies based on the now-existing pyproject.toml. When developers start a one-file project, they can just start typing pyproject.toml contents into the .py source. If it remains a one-file project, they can just distribute that file through GitHub, file sharing networks, social media etc. If the project later becomes more complex, the developer can use the toml-splitting script to create an initial pyproject.toml and go from there. Nobody has to be aware of a new standard or do any implementation work; it’s just a matter of documenting the toml-splitting script.
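A naive sketch of what such a toml-splitting script could look like (the marker comment is just an illustration, and filling in defaults for missing keys is omitted for brevity):

```python
# Illustrative sketch of the toml-splitting helper described above: pull
# out a leading block of comment lines that follows a (hypothetical)
# marker and write it to pyproject.toml.
import sys
from pathlib import Path

MARKER = "# --- pyproject.toml ---"  # hypothetical, not a standard

def main(script: str) -> None:
    toml_lines = []
    in_block = False
    for line in Path(script).read_text().splitlines():
        if line.strip() == MARKER:
            in_block = True
            continue
        if in_block:
            if not line.startswith("#"):
                break  # the comment block has ended
            toml_lines.append(line.removeprefix("#").removeprefix(" "))
    Path("pyproject.toml").write_text("\n".join(toml_lines) + "\n")

if __name__ == "__main__":
    main(sys.argv[1])
```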

1 Like

As I’ve been following this discussion, I have been baffled and shocked at how controversial this has turned out to be.

  1. This isn’t a packaging standard, nothing is being packaged. There is no build process (inherent to the use case!) and no separate distributable artifact other than the file itself (again, inherent to the use case!)
  2. It is addressing something that existing tools already do because there is a demand for it.
  3. It is addressing something tools for other languages already do, because there is a demand for it.
  4. Having a format means different runners will be more compatible, rather than the existing situation in which they are not interchangeable.
  5. Having tools like this become more common makes this common single-file scripting use case simpler, specifically because it means you don’t have to make an entire project out of what amounts to a shell script in order to follow best practice.

All objections I have seen thus far have either already been addressed or ignore the use case. I’m wildly impressed with the patience and diligence of @pf_moore in answering everyone’s concerns, even when those voicing the concerns obviously did not read the PEP or the rest of the thread, because he has had to answer the same objections multiple times.

People keep referencing the packaging survey and how respondents indicated that they felt packaging was too complex, or best practice was unclear, or that there were too many options. I am one of those respondents. And watching this has been incredibly frustrating, because I’m witnessing the community argue that a simpler solution to a problem that I have all the time is inferior to the more complicated existing solutions that I can’t use because they just add to the problem - while nominally doing so because packaging is too complicated. And it’s absurd.

28 Likes

I too am surprised. I think this single-file script with dependencies is a fantastic idea. As I mentioned before, I think it will further improve Python as a glue language and make things a lot easier for beginners and experts alike.

In fact, I think this may even improve the situation and perceptions in relation to the so-called “packaging problem”. I’ll bet that there are a lot of beginners who fight with packaging to distribute scripts, but if this PEP existed they wouldn’t need packaging and therefore would complain less.

It also would help with packaging in other very important ways. I can see a path where this even helps remove obstacles in the way of a packaging lockfile spec. There are always circular debates about apps vs libraries vs scripts and their relationship to lockfiles. If we remove scripts from the equation, the solution is far, far, far more tenable.

If this PEP existed I can think of at least 10 different tools at my workplace that could be distributed as scripts instead of complicated zip files or docker images.

11 Likes

There seems to be a misunderstanding in some recent posts, where the posts’ authors think that this will affect most or all of Python packaging. As far as I can tell, this proposal only impacts high-level Python script executors, such as pipx, pip-run, and potentially conda run, and IDEs via a feature request.

This proposal would be simple to implement in IDEs, with maybe a check-box in run configuration (or a dialogue box on run) and some very simple text parsing; I know I would enjoy implementing it.
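To give a sense of how simple that parsing could be, here is a naive sketch, assuming the “# Requirements:” block style shown later in this thread:

```python
# Naive sketch of the text parsing a runner or IDE might do, assuming a
# "# Requirements:" comment block with one dependency specifier per line.
def read_script_dependencies(source: str) -> list[str]:
    deps = []
    in_block = False
    for line in source.splitlines():
        if not line.startswith("#"):
            in_block = False  # a non-comment line ends the block
            continue
        text = line.lstrip("#").strip()
        if text.lower() == "requirements:":
            in_block = True
        elif in_block and text:
            deps.append(text)  # each line is one dependency specifier
    return deps
```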


I think the N apps for K features concern is partially misleading, as in most cases users want to increase K: my take-away from the packaging survey is only that (a majority of?) users want N to be 1. I don’t think this proposal either hinders or furthers that goal (a hypothetical pip run would likely implement this).


My main issue with this proposal, which has been somewhat addressed above, is that dependencies could be installed inadvertently: an unknowing user happens to specify the dependencies in the valid format, then uses a script executor without knowing it has the capability to install those dependencies.

2 Likes

Your emotions are yours, but staying a bit less emotional in your writing is usually more productive.

I don’t think that’s true of what I wrote in Sketchy and maybe crazy alternative to PEP 722.

Packaging is too complex → This will make it more complex, not for those who will always stick with this solution, but overall, and for those who have to understand both systems.
Best practice is unclear → This adds a new practice, making “best practice” less clear.
There are too many options → There will be one more option.

So it’s not clear to me how you immediately jump to the conclusion that the criticism towards this proposal is absurd.

Yes, from the point of view of someone who will never write anything but quick single-file scripts (which is a lot of people), this will simplify things, though at the cost of introducing yet another option (in any event, it will take a lot of time before the topmost Google search hits for queries like “run Python script with dependencies” point to resources with the new method, and in the meantime there will inevitably be some confusion).

For those who are both writing quick scripts and more significant projects (which is also a lot of people), this will increase the packaging fragmentation, which everybody agrees is confusing.

That’s why I think it is worth reflecting on how to make single-file scripts at the same time convenient and similar to what already exists.

1 Like

Jupyter notebooks already have a popular simple magic inline command for this, as mentioned above. (Examples)

Hopefully Jupyter could instead / also support this PEP?

(A separate toml file would probably not be a viable alternative.)


How would the PEP work for e.g. torch? Typically you have to look up and copy this:

```
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
```

from https://pytorch.org/

Should scripts be able to specify a package index? It would be possible to implement “some” features, for example being able to add extra index locations. However, it is difficult to know where to draw the line.
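For instance, supporting extra index locations might hypothetically look like this (emphatically not part of the PEP):

```python
# Hypothetical, not part of the PEP: a script declaring an index
# location alongside its requirements.
# Requirements:
#     --index-url https://download.pytorch.org/whl/cu117
#     torch
#     torchvision
#     torchaudio
```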

Maybe it is better to keep it simple. But it is unfortunate that ML use cases would be out of scope then.
Here is a random simple example of a single-file script / Jupyter notebook using torch.


From the Python Packaging User Survey:

What should the PSF and PyPA focus on? → #1 by far: Making Python packaging better serve common use cases and workflows

What should the packaging community do to be “an ecosystem for all”? → #1 by far: Support more interoperability between Python packaging tools

This is it. Simple scripts are a very common use case. Currently there’s no good workflow for them, and no interoperability between tools. This PEP addresses both. :+1:

What Python packaging tools do you use? → #1 by far: pip

I prefer to use several Python packaging tools, rather than a single tool → Most disagree

I would love to see support for this PEP integrated in pip as pip run. But the PEP has to be accepted first of course.

What do other packaging managers do better than Python Packaging?

  • Better deployment of dependencies when installing projects
  • Better systems for managing dependencies within a project

For example, they provide simple one-liner dependency requirement syntax for simple script “projects”.
For example, F# has a popular directive #r "nuget: FSharp.Data", roughly equivalent to e.g. #r "pip: numpy". (Not an embedded .fsproj file.)

Python packaging is too complex

“Simple things should be simple, complex things should be possible.”

A very simple solution is needed. A simple script can be written in seconds to minutes. Specifying its dependencies should not break the flow. Writing a simple declaration with trivial / no syntax should be easy from muscle memory.

There are many viable ways this could be done with minimal syntax complexity:

```python
# Requirements:
# numpy
# Pillow
```

```python
__dependencies__ = "numpy Pillow"
```

```python
import numpy  # requires: numpy
import PIL    # requires: Pillow
```

```python
"""
%pip install numpy
%pip install Pillow
"""
```

There are very few matches on GitHub for #Requirements or #Dependencies, so more complex syntax for disambiguation is not necessary.

Keeping the syntax similar to the existing pyproject.toml would be nice (for experts who already know it), I guess.
But if it’s not possible to keep it simple, it doesn’t serve the common use case and becomes pointless.
Learning the simple one-liner syntax doesn’t seem like a real issue for experts.

This would be way too complex:

```python
__pyproject_toml__ = """
[project]
dependencies = [
  'numpy',
  'Pillow',
]
"""

...
```

Can proponents of a more pyproject.toml-like approach show an example of similar simplicity to the PEP’s?

Transforming a script containing simple PEP declarations into a full pyproject.toml project would be trivial if it’s ever desired.

1 Like

conda used to have support for creating an environment from reading an environment spec embedded in the notebook metadata. Support for that was deprecated a long time ago and removed this year in version 23.3.0. Maybe there’s some lesson to be learnt from that experiment.

2 Likes