PEP 722: Dependency specification for single-file scripts

The way I solve this problem (and perhaps many other users, too) is simple: I don’t use virtual environments. So I pip install everything that I need for such scripts in my main (not system) python and then everything… just works. (I agree that knowing the dependencies is useful even in my case, but using something like pip-run seems heavyweight.)

This is a case where venvs are a classic non-solution to a non-problem, except in the edge case of conflicting dependencies (see also my rant here). But those are not really solved by this proposal, since it doesn’t specify version numbers.

1 Like

That might be a part of the problem. :wink:

I’m only half-joking there, and I’ll also caution that we shouldn’t really take an all-or-nothing approach – doing a round of UX research isn’t “free”, and we ought to be cognizant of that. At the same time, doing a round of UX research isn’t a panacea either; there are definitely ways to do it that are meaningfully worse than not having done it in the first place.

While I agree that having a better sense of what users want/need is definitely something we want & need – and we have some of that through the survey conducted – we also know that there’s knowledge above and beyond what’s in the surveys, which we’ve collected through interacting with people and projects directly. Having a round of user interviews done would be a benefit to a spec like this and make it more likely to land well (assuming positive feedback), but not doing that shouldn’t be a blocking concern.

1 Like

Thanks for the extensive response, I appreciate it and will have to chew a bit on our misalignment regarding the problem space.

Oh, actually, let me write down my thoughts on this conda/PyPA compatibility issue quickly:

I don’t worry about the PEP in relation to conda specs any more than before, since it’s essentially the same mapping issue between PEP 508 specifiers and conda MatchSpecs that we have elsewhere. Tools like conda-lock have been able to work on rectifying that, and I believe the mapping issue could also be solved for conda users if there is a need (which is a big if, as has been discussed elsewhere).
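For illustration (the exact MatchSpec syntax here is from memory, so treat it as approximate), the same constraint expressed both ways:

    requests>=2.28,<3                  # PEP 508 requirement specifier
    requests[version='>=2.28,<3']      # roughly equivalent conda MatchSpec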

The only wrinkle is a UX issue around the already-existing conda run, which behaves differently from the PoCs (pip-run and pipx run) mentioned in the PEP: conda run requires specifying a conda environment by name or path, and it only works with executable scripts/binaries installed in that environment. It might be possible to retrofit it with a --file option and default to a temporary environment if no env name or path is given. Let’s see how the PEP goes before we dive deeper into that.

I haven’t looked at the existing pip-run implementation, but it’s possible it could be extended with the option of specifying the name of a persistent environment (venv) to reuse directly, or to create if it doesn’t exist, which sounds like it would bring it at least superficially closer to the conda run behavior.

[…]

The way I solve this problem (and perhaps many other users, too) is simple: I don’t use virtual environments. So I pip install everything that I need for such scripts in my main (not system) python and then everything… just works. (I agree that knowing the dependencies is useful even in my case, but using something like pip-run seems heavyweight.)

This is a case where venvs are a classic non-solution to a non-problem, except in the edge case of conflicting dependencies (see also my rant here). But those are not really solved by this proposal, since it doesn’t specify version numbers.

[…]

It seems orthogonal to the use of venvs, at least to me. I personally would actually use it to determine what needs to be in the venvs I create on the fly for my random one-off scripts. As I mentioned earlier, this idea is something I (and many others, I think?) already do, but it offers a standardized format for listing the Python package dependencies of a script within the script itself, so that creating the environment the script needs (either manually in advance, or in an automated fashion on the fly) is a task we can collaborate on interoperable solutions for.

Well, the proposed PEP explicitly says

Of course, declaring your dependencies isn’t sufficient by itself. You need to install them (probably in some sort of virtual environment) so that they are available when the script runs.

And, indeed, that is the way both pip-run and pipx would handle the situation as far as I understand…


Sure, but it doesn’t mandate the use of venvs, so I don’t understand the point of your rant about venvs. Just because the author of the proposal, and the authors of the tools already doing something along these lines, use venvs doesn’t mean you have to use a venv to get some value out of the proposal. I get that you don’t like venvs, so… just don’t use them? And accept that there are others who do find them useful as actual solutions to broader problems than you’ve encountered (or perhaps ever will encounter), rather than derisively declaring them a “non-solution” or pretending the problems some of us deal with are nonexistent.

I could see, for example, using a tool which checks your preferred environment for the presence of the packages specified in a PEP 722 compliant comment block in a script before running that script, and either warns you that you’re missing dependencies (telling you exactly which packages you need) or simply installs them for you, whatever your comfort level. That could work equally well in a non-isolated shared environment or with a persistent venv (and the latter is how I would use it, because I need the additional isolation in many cases). It’s something I find useful and already do, because I need to rebuild my environments semi-regularly, so I welcome the opportunity for an interoperable specification around it.
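To make that concrete, here is a minimal sketch of such a check – my own illustration, not an existing tool – assuming a simplified reading of the PEP’s “Script Dependencies” block rules and using the packaging library:

    import sys
    from importlib.metadata import PackageNotFoundError, version
    from packaging.requirements import Requirement

    def read_dependency_block(path):
        """Yield requirement strings from a '# Script Dependencies:' comment
        block (simplified: the block ends at the first non-comment line or
        blank comment line)."""
        in_block = False
        with open(path, encoding="utf-8") as f:
            for line in f:
                if not line.startswith("#"):
                    if in_block:
                        break
                    continue
                text = line[1:].strip()
                if not in_block:
                    in_block = text.lower() == "script dependencies:"
                elif text:
                    yield text
                else:
                    break

    def missing(script):
        """Yield a message for each declared requirement that the currently
        active environment (venv or not) does not satisfy."""
        for raw in read_dependency_block(script):
            req = Requirement(raw)
            try:
                installed = version(req.name)
            except PackageNotFoundError:
                yield f"{req}: not installed"
                continue
            if installed not in req.specifier:
                yield f"{req}: found {installed}"

    if __name__ == "__main__":
        problems = list(missing(sys.argv[1]))
        if problems:
            print("Unsatisfied script dependencies:")
            print(*problems, sep="\n")

Point it at a script and it reports whatever is missing from the environment it was run in; installing instead of warning would just be a pip invocation away.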

1 Like

Of course, and if my “rant” (which I admit is my own word for what I wrote!) implied otherwise I certainly should not have done so. And indeed a tool which just ensured that my environment (virtual or otherwise!) had the correct packages installed would indeed be useful!

I could say more about virtual environments but that is not appropriate for this thread.

1 Like

To be clear, those are not PoC implementations of the PEP. They are existing tools that have implemented a “run-with-dependencies” operation for scripts, because there was demand for that feature. They will continue to exist whether or not this PEP is accepted.

All the PEP does is document, in a common place, the format that both tools use for extracting dependencies. It also tries to make the format useful should people write other tools that need this data, and it will almost certainly make slight changes to details of the format based on feedback here (which pip-run and pipx will probably implement, because following standards is a good thing, but they don’t have to, of course).
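For reference, the block in question looks like this (lightly adapted from the PEP’s own example):

    # In order to run, this script needs the following 3rd party libraries:
    #
    # Script Dependencies:
    #     requests
    #     rich

    import requests
    from rich.pretty import pprint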

Think of pip-run and pipx more as existing use cases for this PEP, and it might make more sense.

3 Likes

But this expansion of covered use-cases (great!) should not make the situation worse for everything else, and the packaging survey could not have been clearer about reducing the number of divergent tools in packaging, so adding yet another way to specify dependencies understandably meets resistance – and it’s on the PEP to prove this necessity.

It’s also a really ugly way: magic comments. They break syntax highlighting and much automated tooling (two different syntaxes have to be parsed in the same file), and they are prone to diverging in semantics from requirements.txt / pyproject.toml / poetry lock files, etc.

So I do not buy the “single file” requirement, or at least not that it trumps all of the above. This does not mean that I’m dismissing your use-case, but I do believe the same result (having a script without too much structure or ceremony & a reasonable way to specify its dependencies) could be achieved differently, for example by:

scratch/
  - my_fancy_script.py
  - my_fancy_script.requirements.toml
  - ye_old_workhorse.py
  - ye_old_workhorse.requirements.toml
  - [...]

That would still give a clear approach, without much overhead: start hacking away in xyz.py, and once you need third-party dependencies, add xyz.requirements.toml (a sketch follows below). The actual suffix for that is bikeshed central, but that way we could:

  • reuse some existing infrastructure (e.g. a reduced form of pyproject.toml), rather than having yet another way to specify dependencies; and
  • let users who want to “graduate” their script for some reason just rename that file and add some extra metadata to make it a full-fledged project.
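For concreteness, such a sidecar file might contain little more than a dependencies list; the suffix, key name, and schema below are all hypothetical, per the bikeshedding caveat above:

    # my_fancy_script.requirements.toml (hypothetical; nothing here
    # is standardized)
    dependencies = [
        "requests>=2.28",
        "rich",
    ]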
7 Likes

True. My point was more that, as you say, UX research is costly, and it’s not always easy either to ensure the data is representative and unbiased, or to interpret the results accurately. And while the participants in this discussion are certainly self-selected, I don’t imagine that the sort of user research we could reasonably undertake would avoid at least some level of self-selection. Even the “big” user survey that @jezdez has referred to involves a certain amount of self-selection, even if it’s only selecting “people willing to take a survey” (which is likely to be biased towards “people who have a point they want to make”).

Eliminating such bias is a complex, specialist task. I have some background in statistics, so I know enough to know I don’t know how to do it properly, but that’s all :wink:

User research is absolutely a good thing, and we should do more of it. But it’s not a way of avoiding having to make choices based on our experience and knowledge. And sometimes choosing what (in our view) is right over what’s popular.

2 Likes

This PEP adds literally no new tools, and no new data formats. All it does is make one existing format (used by two existing tools) into a standard, so that if we (for example) later replace those two tools with a single new one (reducing the number of tools?) then users don’t have to change their code (reducing churn for users).

That’s a fair criticism. I’m open to other suggestions. But many other languages use the “structured comments” approach, so it seems like it isn’t so bad in practice.

… and we’re back here again. How many people stating in this thread that they need the ability to declare dependencies in a single-file Python script does it take to demonstrate that this is a real-world use case?

OK. Maybe that would work. My gut instinct is that it would be something I’d use reluctantly, and be frustrated by various “papercut-level” annoyances. But I don’t want to reject a reasonable proposal just because it’s not my favourite. Also, none of the other languages mentioned in the survey of languages linked above uses a separate file[1], so this feels like it’s going against common practice. Do you have examples of other languages using this approach that you can point to?

If you’re serious about this suggestion, are you willing to get it added to pip-run and pipx? What’s the transition plan from the existing behaviour to this proposal? There’s a whole “backward compatibility” section of the PEP that will need writing if we go down this route.


  1. Yes, I concede that’s at least partly because the survey is of single-file solutions. ↩︎

4 Likes

You snipped my statement in a somewhat unflattering way; I do accept the use-case. Luckily, your PEP is named “dependency specification for single-file scripts”, which I have no problem with as a requirement. My point was that the dependencies do not have to be in the same file to achieve that.

My response was aimed at pointing out the potential solution space between “single-file script” and “single-file script+requirements”, and that it’s possible to support the former in a way that doesn’t (a priori) create yet more UX & teachability problems.

I do care about python packaging (and not increasing divergence further), but between 2 jobs, my FOSS “responsibilities”, and a sliver of social life, I don’t have time to write, much less implement, a PEP, sorry.

1 Like

I know this is addressed and currently rejected in the PEP, but something like __dependencies__ with a restricted syntax (only string literals, for instance) could be a simple solution that doesn’t require a complete parser.
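As a strawman of what that restricted syntax might mean in practice (the dunder name and rules here are just this suggestion, nothing standardized), a tool could extract the value statically – without executing the script – and refuse anything that isn’t a literal list of strings:

    import ast

    def read_dunder_dependencies(source: str):
        """Return the value of a top-level `__dependencies__ = [...]`
        assignment, accepting only a literal list of strings."""
        for node in ast.parse(source).body:
            if (isinstance(node, ast.Assign)
                    and len(node.targets) == 1
                    and isinstance(node.targets[0], ast.Name)
                    and node.targets[0].id == "__dependencies__"):
                value = ast.literal_eval(node.value)  # raises on non-literals
                if (isinstance(value, list)
                        and all(isinstance(item, str) for item in value)):
                    return value
                raise ValueError("__dependencies__ must be a list of strings")
        return None  # no declaration found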

1 Like

Sorry, you’re right. My posts are getting so long that I’m trying to keep my quoting limited; I went too far in this case.

No worries. I’m not trying to say “put up or shut up” or anything like that. But equally, I don’t have the energy to take your suggestion further (I foresee a number of problematic areas that will trigger even more rounds of debate, such as “we can’t standardise the requirements format, and yet we can’t call it a requirements file if it’s not one”). So unless someone wants to pick this up, I’ll put it in the “rejected ideas” with my concerns recorded. I hope that’s OK.

I’m not sure what you want me to say here. Unless you address the issues mentioned in the PEP, I don’t see what you’re suggesting… Even though it’s not stated explicitly, the example syntax in the PEP is restricted, because it has to be something that can be evaluated statically. That’s the point of the 4th problem in the list given. If you want to pursue this, please give a specific proposal.

3 Likes

I do not have anything to add towards a resolution, but this does not really address the existing user frustration, which for me boils down to:

There are N different official ways to do K different things in the Python packaging space.

Your proposal still raises that to:

There are N+1 different official ways to do K+1 different things in the Python packaging space.

1 Like

Thinking outside the box a bit, just to see what else we could have (or why nothing else quite fits):

In Pants (https://www.pantsbuild.org/) we handle these things by mapping imports back to requirements (the thread “Record the top-level names of a wheel in `METADATA`?” is kinda relevant, in a way… but in the other direction). Most package names map to their module names, and then for those that don’t, one big mapping (which users can extend) is the backup.

So what if these tools tried a similar approach? Scrape the imports [1], which gives you module names, then ask for the corresponding packages (probably by asking some server for the module → package mapping). That should work for many cases. I think we’d miss out on optional dependencies and other, less-scrapable dependencies (like using strings with __import__).
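A rough sketch of the scraping half using ast (the module → package table here is a tiny hand-written stand-in for the large, user-extensible mapping described above):

    import ast
    import sys

    # Illustrative stand-in for the big module -> package mapping.
    MODULE_TO_PACKAGE = {
        "yaml": "PyYAML",
        "PIL": "Pillow",
        "bs4": "beautifulsoup4",
    }

    def guess_requirements(path):
        """Collect top-level imported module names and map them to probable
        package names; misses __import__ strings, extras, and the like."""
        with open(path, encoding="utf-8") as f:
            tree = ast.parse(f.read())
        modules = set()
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                modules.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                modules.add(node.module.split(".")[0])
        modules -= set(sys.stdlib_module_names)  # drop the standard library
        return sorted(MODULE_TO_PACKAGE.get(m, m) for m in modules)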

Optional dependencies could be handled with a PEP to allow imports with brackets, e.g. import requests["toml"]. I very much expect that would be rejected, however. Alternatively, import the extra and just don’t use it (meh, but ick).

So, if you wanted to solve it for everyone, at some point you need to parse extra info that isn’t just imports (a la __requires__ from pip-run).


… So, it’s a shame that the 80% case (imports and packages align 1:1) is poisoned by the 20% case, and we can’t get this in a nice, structured way. Parsing imports has some nice benefits (remove an import, and you don’t need to remember to remove it from the Requirements block; no new thing to muddy the packaging waters).


  1. And import parsing can be done easily through ast or efficiently through tree-sitter+Rust (what we do in Pants). ↩︎

1 Like

I don’t mind comment-based configuration; it’s done everywhere already (documentation, for example). Another option, clumsy but one that would please purists, is to have an embedded TOML data record at the top of the single file.
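Purely as an illustration of that second option (nothing like this is standardized, and the variable name is invented here):

    # Hypothetical: an embedded TOML record at the top of the script.
    __requirements_toml__ = """
    dependencies = ["requests>=2.28", "rich"]
    """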

1 Like

I don’t know if this is a fair assessment in this particular discussion.

The first thing is that I would not be so confident in saying that the problem the PEP is trying to solve lies in the portion of the “Python packaging space” that people have been complaining about. Sure, it involves installing distributions, but it is not related to the process of “packaging” a project into a distribution format that can be shared (which seems to be the point that troubles most people).

As stated previously in this discussion, the PEP focuses on solving the problem of executing domestic/bespoke/personal scripts and alleviating the pain of manually managing virtual environments. Would we make Python better if we simply refused to solve this problem? For me the answer is no, and since different problems require different solutions, it is also natural that we have different ways of specifying different things (it is not like you can use an automatic can seamer to open a can).

The second thing is that the PEP is informational and only documents practices that are already implemented and available in the ecosystem. If anything, the existence of the PEP will be an incentive not to “reinvent the wheel” (unintended pun) the next time a tool developer decides to tackle this particular pain point (which is a real pain point for many devs who chimed in on this thread).

8 Likes