You already know more than I do! The only thing I can add is that I’ve found this presentation from 2015 where they explain how this was meant to be used:
Working with notebooks
$ conda create -n project
$ conda install -y bokeh pandas jupyter
$ ipython notebook iris.ipynb
$ conda env attach -n iris iris.ipynb
$ anaconda notebook upload iris.ipynb
Reusing your notebook
$ anaconda notebook download malev/iris
$ conda env create iris.ipynb
$ source activate iris
$ ipython notebook iris.
For the record, you already can do the following with fades:
- Define which import needs a package install by commenting it
#!/usr/bin/env fades
import math
import requests  # fades
...
- Specify it in the script’s docstring:
#!/usr/bin/env fades
"""Super fun script.

The following deps will be handled by fades:
    requests
"""
import math
import requests
...
- Or just specify it at run time, but it’s not that fun:
$ fades -d requests myscript.py
In any of these cases fades will create a virtualenv with just requests installed (or re-use an existing one if it is already there) and run the script in the context of that virtualenv.
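As a rough illustration of the first option (marking an import with a comment), here is a minimal sketch of how a runner could pick out the tagged imports before building the virtualenv. The regular expression and the `marked_dependencies` helper are assumptions made for this example, not how fades is actually implemented, and real fades accepts more forms than this pattern covers.

```python
import re

# Hypothetical, simplified pattern: an import statement whose trailing
# comment starts with "fades". Real fades accepts more forms than this.
MARKED_IMPORT = re.compile(
    r"^\s*(?:import|from)\s+([A-Za-z_]\w*)[^#]*#\s*fades\b"
)


def marked_dependencies(source: str) -> list[str]:
    """Return top-level module names of imports tagged with '# fades'."""
    found = []
    for line in source.splitlines():
        match = MARKED_IMPORT.match(line)
        if match and match.group(1) not in found:
            found.append(match.group(1))
    return found


SCRIPT = """\
#!/usr/bin/env fades
import math
import requests  # fades
"""

print(marked_dependencies(SCRIPT))
```

Note that this returns the imported module name, which is not always the name of the package to install; a real tool needs some way to handle that difference.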
Do you know whether fades would be willing to support this PEP if it were accepted? If not, is there a particular reason why?
I don’t have much time to comment on this but I both acknowledge this as a valid use case and also am a soft -1
I think the solutions I see are as follows, in order of preference:
- Create the concept of a “script directory” that would require a single pyproject.toml where the stem of every script corresponds to a key in optional-dependencies and tools would manage the dependencies for a given script’s environment based on the value of that key
- Wait for this to be standardized in which case package indices can serve an API for a reverse lookup (sorted by most downloaded)
This discussion has grown quite a bit! Several people have said things I agree with and some have said things I half-agree with so I’m just going to try to navigate a few of those. I guess the short version though is this: I think maybe what we need is less a discussion of this PEP and more a discussion of “what do people overall want in terms of running single-file scripts and what is the best way to achieve that”?
I suggested something like this earlier in the thread, although I can understand if people didn’t see that because I buried it in a long rambly post about my own approach to this problem. But yes. . .
. . . I do think it means we should be careful not to assume that because the script is one file, the script and dependency information must also be one file. In other words, there’s a difference between “a single file” and “a single runnable file (with an accompanying non-runnable dependency file)”.
That’s appreciated. Like I said a bit earlier in the thread, I agree that single-file scripts are a valid Python use case and I agree they’re not well-supported by current packaging/dependency/environment standards. But I just think we should be a bit careful. Just because that is a need that could usefully be met doesn’t automatically mean this proposal is the best way to meet it.
In the PEP you have several rejected alternatives. You have good rationales for rejecting them, but they’re essentially judgment calls. On some of those points I would make those calls differently or at least want to consider them more before making them. And the way I read some of the comments in this thread is basically other people saying they would make some calls differently too. So I think for some of these it’s not just a matter of saying “see the rejected alternatives section”; the question is whether you rejected an alternative that maybe should actually be accepted instead.
As I see it this is kind of trying to have your cake and eat it too. The PEP is just abstractly about a format for in-script dependencies, and in and of itself has (or should have) nothing to say about backwards compatibility concerns of pip-run or pipx or any other third-party tools. Those tools or any others can use this format, or can use some nonstandard format (just as they’re doing now). If something like the alternative proposal @h-vetinari suggested were to be approved, well, pipx and pip-run and maybe some other third-party tools would not be compliant with it, and they could fix that, or not, but that’s neither here nor there with regard to what the standard is. Maybe the ideal path (or at least a good path) is that some discussion happens and people go “okay yeah actually a slightly different version of this would be better”, and then pipx and pip-run implement that alternative, and it works great, and then we can take another stab at codifying that in a PEP.
With regard to the packaging survey and users’ thoughts on the profusion of tools, I’m a bit ambivalent. As was discussed at length on other packaging threads, part of the problem there is that talking about standards doesn’t magically cause work to be done or tools to be improved. A lot of what users want is tools to do things; standards about how things should be done impact the typical user only indirectly, insofar as someone actually implements those standards.
So, on the one hand, that means this proposal will probably have limited effect on users’ confusion, because it’s just codifying behavior that already exists. The main risk is that having this mechanism blessed with a PEP will increase the number of tools in this space, but they will start to diverge in various ways and users will have a tough time choosing between them. That said, I do think the PEP amplifies this concern by alluding to the possibility that other tools could deviate from the given format (e.g., “accept anything that pip accepts”). That seems to really open the door to potential confusion. If we’re going to specify a format, let’s specify a format.
On the other hand, though, that’s sort of why I see less upside for this PEP, or even some downside. As I understand it, pipx and pip-run already do this (or soon will). So this PEP doesn’t give users anything they don’t already have in those tools. Is there a worry that, without this PEP, other tools will start to do it in slightly different ways, and that will lead to an increase in user confusion? But if, as I mentioned above, some of the rejected alternatives might actually be better, wouldn’t it actually be bad if we proscribed those other ways of doing things?
In my view (and I realize that not everyone agrees with me on this) the real way that interoperability standards can reduce user confusion is when there is a thicket of alternative ways of doing things, but it becomes clear that some of those are just differently colored bikesheds, or some are better than others, and then a standard can come in and say “a bunch of people did a bunch of stuff, but we’ve now decided this is the official way”. I’m a little leery about approving a PEP like this where I see stuff in the rejected alternatives that sounds (at least potentially) better to me.
I agree! I’m not saying we need a full-court-press user survey for every single thing. In fact I think by the time a PEP is proposed, broad-based UX research may not be the right thing; there can be a tendency for people to support a proposal because it claims to meet a certain need, and only later realize they don’t like the way it tries to do so.
But rather, like I said at the beginning of this post, what I think is beneficial is more discussion at an earlier stage to find out what users’ goals are and what the obstacles are to those goals, and then that can inform a more specific proposal. Sort of seeking “feed forward” rather than feedback.
More generally, though, like I said repeatedly on some of the other threads, my own take on the sentiment expressed in the user survey is not so much “there are too many confusing standards” but “Python does not come with an included battery that does everything I want packaging to do in a coherent manner”. It is because of that that users are cast adrift and must navigate through a sea of alternatives. This PEP is really neither here nor there with respect to that user sentiment (if it is a common user sentiment, as I believe) — because users who believe that won’t care if there are two or three or a thousand tools implementing this, they will just say “if this behavior is so useful and great, why doesn’t it come with Python?”
To summarize this long post:
- Being able to run single-file scripts without having to engage the full Python packaging mechanism is a real need. But the PEP makes judgment calls about how to do that, and I’m not sure all of those are the right ones, and at the least I think we need a fuller discussion of them.
- More user input is good, but I think it’s better to get that input before getting to the point of making a specific proposal like a PEP. This makes it more likely that the proposal will actually make users feel like they got what they wanted.
- In my view, a big part of the problem users have with Python packaging is not with standards nor even with tools, but specifically with the default tools that come with Python. So I don’t see the packaging survey as saying too terribly much about this specific proposal; the question is whether the proposal will make it into official included Python batteries.
(Sorry by the way for my earlier empty post. I accidentally hit reply way too early on this post, then deleted it, but then couldn’t post the real one because the thread is in “slow mode”.)
Likewise, I don’t think I’m the only one who thought there was some genuine need that PEP 582 was trying to meet, and I hope I’m not the only one who agreed that that particular proposal wasn’t the best way to meet it. ↩︎
As a concrete example, it’s not clear to me that it’s better to have this ad-hoc format instead of something like TOML-in-comments ↩︎
although that’s not saying they’d make the same alternative calls I would ↩︎
That makes it all the more important that we specify the right format. ↩︎
and I acknowledge nothing in the survey results said this in so many words ↩︎
To me the single file requirement is the entire point. There are already ways to define dependencies once we have a multiple file project, but there is no standard way to define them inside a single file script.
If you come up with another method to define dependencies outside the .py file then anyone who actually needed a solution for the single file script use case will continue to come up with other ways to define requirements, or avoid using third party packages entirely because that ‘solution’ ignored the issue it was supposed to solve.
Separately, I’m a little disappointed that specifying the Python version is considered out of scope. Moving to another machine and finding out, after wasting time installing dependencies, that the installed version is missing something I’ve used has definitely bitten me before. At the very least I’d like to see a stronger, specific argument as to why it’s considered ‘out of scope’.
Put another way, people already will (and do) track packaged
dependencies of their scripts inside their scripts. Currently there
is no standardized/blessed syntax for doing it, so everyone who does
it does so differently. Saying the solution to that is not to put
dependency tracking in the script won’t stop people from doing that,
it will just mean that they continue to do so in a fragmented and
uncoordinated way because nobody wants to provide an interoperable
specification for tools to gravitate toward.
If the resulting state of PEP 722 is “script dependencies go in
another file that’s not the script” then I and others will just
ignore its existence and continue to track our dependencies inside
the scripts that need them, so the single file requirement really is
central to the use case, full stop.
It can probably be added via a later PEP.
My take on this is that the tools in the PyPA-centric packaging ecosystem are not really ready for this kind of feature yet, although there are some new things appearing in this domain (pip --python, posy, the py launchers, and so on). On the other hand it is a feature central to the conda ecosystem.
This part has been nagging at me for a while now but I wasn’t sure how to put it. To me, the current proposal doesn’t totally solve the “single file” requirement for distributing scripts because the user still needs the correct python version installed. They also need to install a third-party tool, but this could change in the future with e.g.
The colleagues I’m most likely to share a single-file script with are the sort who don’t know the difference between their MacOS system Python and a conda env. So this removes one step from the setup process but there are still a few remaining, and I’m not sure this one is our biggest problem.
That said, it’s clear that other people have quite different workflows and this might work in other environments (like, a place where everyone has a consistent user-space python installed on their machine)
It doesn’t need to be. However, as I understand it, the motivating idea here is that distributing a single file is convenient in ways that distributing more than one file at a time isn’t; and the goal is therefore to provide a way to distribute single files (and have them “just work” on the receiving end, as long as there is a compatible runner in place) without an explicit packaging step.
FWIW, I proposed a TOML-in-comments approach earlier. It has the advantage that it doesn’t really require any discussion or standardization; it only needs the existence of a tool (or integration into an existing script runner) to detect that comment and make a separate file from it; then all existing tooling works normally. However, it’s noticeably less convenient to write, and implementing the “name and version are optional/inferred” functionality requires some design.
(I definitely don’t want this to be an extended side-track – if someone wants to talk about what PEPs are and how subjective they should be, let’s split that off into a separate thread)
PEPs are ~always subjective + full of “judgement calls” made by the authors. There are ~always tradeoffs in the designs worth putting into PEPs, and the PEP is a proposal for how to deal with the problem at hand.
The question is whether we think it’s overall beneficial given the tradeoffs (with the specific design being proposed based on what the PEP author has written). Figuring things out and discussing the design details is why we write these PEP-style documents and discuss them.
If someone wants different choices/tradeoffs picked here, they can either make the case for them here (as I did earlier in this thread) or write a competing PEP to cover the usecase. Usually, explicitly voicing your concern to the PEP author is sufficient, given that it’s done publicly (and isn’t unnecessarily repetitive, as many replies in this thread have been).
How various people feel about this is taken into consideration by the person(s) responsible for making a decision on the PEP (SC or, in our case, the delegates). It’s definitely happened that a decision is deferred because a PEP doesn’t cover the “rejected ideas” sufficiently, though that isn’t a problem here.
Some notes about that option that I couldn’t edit because the thread is in slow mode:
- now that I think about it if that were a thing I would definitely use it
- if projects want to check in random scripts like this, then existing dependency tooling, such as version upgrade automation and security scanners, would work out-of-the-box
- tooling that can add/remove dependencies via CLI would have a clear path to supporting script management also
Sure, and I didn’t mean to suggest otherwise. Just saying that I think some of the perceived pushback or unexpected disagreement has to do with the details of how the PEP proposes to do things, and not a rejection of the idea of single-file scripts or of running/distributing them.
If there is a standard to specify in-script dependencies, fades most probably would follow it. The current specification looks simple and flexible enough.
Consider the FHS and wanting to put a script in */bin: such directories should contain only executables, not non-executable files.
No, it’s not? I’m very curious what makes you say this.
The PEP, at no point, implies that it is meant to be about distributing any script file. Searching for “dist” in the PEP gives me 0 results. The Rationale section has the following sentence…
Having to consider “uses 3rd party libraries” as the break point for moving to a “full scale project” is impractical, so this PEP is designed to allow a project to use external libraries while still remaining as a simple, standalone script.
… which clearly indicates that it’s not about distribution but about usage; and about having a place to put supporting information in a discoverable manner so that running a script becomes easier. It’s even been stated by the author that this isn’t trying to address script file distribution.
As this post is in slow mode, it’s hard for me to respond to individual points here in a manner that keeps individual topics separated, so what I’m going to do is try to pick out some key comments and respond in a single message. However, I will say that I’m currently doing a major overhaul of the PEP to try to incorporate the various comments made here. Just to be clear, the proposal itself is essentially unchanged, but I want to make sure the rationale and motivation are as clear as I can make them, to avoid the misunderstandings that have come up repeatedly in this thread. I also want to make sure the “rejected alternatives” section takes into account the questions that have been raised.
I don’t expect we’re done with discussion yet (although hopefully “slow mode” will allow people to take more time to think before posting - I know I’m taking advantage of that) so I doubt the next revision of the PEP will be the final form, but hopefully it will be a lot closer.
OK, so onto specific points.
This is very much the case, and honestly, the confusion is my fault, because my motivation has always been the “better batch files” scenario. I only think of distribution in the sense that I’d email a .bat file to someone (or post it in a gist) and I’d like to email a .py file in the same way. But I didn’t make that at all clear in the PEP. I’m still struggling to explain that motivation well, as I also want to emphasize the fact that this is simply standardising existing practice, and not proposing any new functionality. The two explanations are difficult to combine without then giving the impression that I expect to see a huge explosion in the use of tools like pip-run (something else that people have misread into the existing PEP…) So yeah, this is a work in progress right now.
This comment (and variations of it) has come up a lot, and it seems like a huge exaggeration. I’m genuinely baffled as to why people think this PEP is going to have such a massive impact on the ecosystem. To me, it’s a relatively minor tidy-up, and the only way it would have a big impact is if it suddenly triggers a lot of interest in an important use case that we’ve so far ignored. So the people arguing against the PEP on the basis that it’s going to be disruptive almost look to me like they are saying “please let us continue ignoring this important use case because we don’t want to deal with the consequences”
While I’d like it if we, as a community, chose to work on improving the “python as a better batch file” workflow, I’m also perfectly happy if we continue to leave it to tools to experiment and innovate for now. This PEP is assuming that the latter is what will happen. I have no personal interest in championing something as big and controversial as a PEP to do the former, but I’ll support anyone who wants to try.
Thank you for this perspective. I’ve hesitated to say this (because I’m not a typical end user) but I also responded to the survey, and I also agree with the “too complicated” idea, but not with the way it’s being used at times to hinder progress. Interpreting results from a survey like the packaging one is a complicated, specialised skill (I’ve witnessed UX specialists doing precisely that, and I know I couldn’t have done what they did) and while I think it’s important that we heed the results of the survey, I also think it’s critical that we don’t simply use the survey to reinforce our own prejudices - I’m sure I could find arguments in support of this PEP by (selectively) quoting the survey, but I don’t think it’s a useful way of making my points.
We’ve had a number of people present reasons why having everything in a single file is a key requirement. I’ll hunt out as many as I can find, and add them to the PEP. But for now, let’s just say I’ll be adding a rejected option to the PEP of “Store the dependency data in a separate file”. If you (or anyone else) want to propose a solution that uses a separate file, I suggest writing it up as an alternative PEP, and in particular going through the points made here and addressing them.
Nobody’s assuming that. There have been specific reasons given. The most obvious one is people moving (or sharing) a script and forgetting about the extra file (and no, “just don’t forget” isn’t a reasonable solution for this). But there’s also the case of directories like /usr/bin where (by convention or design) everything in that directory is treated as an executable. And pipx allows running a file from a URL (such as a gist), where you can’t even necessarily work out where the associated dependency file would be.
I will add one to the PEP. But to a certain extent it boils down to “no existing tools provide this, so the arguments that it’s essential are weak, and it can be added later in a different PEP if it turns out that there is a critical need that existing tools don’t address”. That, plus I want to keep the PEP focused on “formalising things that already exist” and not “defining new functionality”.
Projects with “random scripts” can already use nox, tox, hatch, or any one of a plethora of environment managers. Or pipx/pip-run. What extra does this add to that mix? You say “existing dependency tooling would work out of the box” - but they’d still need to support all of the existing approaches. And in any case, tooling could support this approach, but they’d still get asked to support single-file scripts.
I’m not the one arguing that the survey demands fewer solutions, but I’d expect a bunch of people here to raise the same objections to this proposal that they do with PEP 722 in terms of “adding more ways of doing things”. At least I’d hope they do - otherwise I don’t see how they can argue that they are being consistent when objecting to the PEP…
Possibly with a lot of feedback on problems with their proposal, but hey, constructive criticism is good too ↩︎
I think I’ve been misrepresented here, because in fact I agree that your use case has merit.
What I’m concerned about here is that people will see script runners as part of toolchains, and then everyone’s script runner will be expected to do the thing, because the PEP exists. And the point here is - yes, the use case has so far been ignored, and we could solve problems for people by paying attention to it. But if we do it this way, we lose reuse: since the requirements are now not being specified by the “write a pyproject.toml that includes a requirements specification” method, now there needs to be a separate process to parse those requirements.
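For a sense of scale, the “separate process” in question might look roughly like the sketch below. The header line and the inline-comment rule are assumptions modelled on my reading of the PEP 722 draft, and `read_dependency_block` is an illustrative helper rather than any tool's actual code; it returns raw requirement strings and does not validate them.

```python
import re

# Assumed header line, modelled on the PEP 722 draft; treat it as illustrative.
BLOCK_HEADER = re.compile(r"^#\s+script\s+dependencies:\s*$", re.IGNORECASE)


def read_dependency_block(source: str) -> list[str]:
    """Collect requirement strings from the comment block after the header."""
    lines = iter(source.splitlines())
    for line in lines:
        if BLOCK_HEADER.match(line):
            break
    requirements = []
    # Continues from the line after the header; empty if no header was found.
    for line in lines:
        if not line.startswith("#"):
            break
        # Strip the leading "#" and any inline " # comment".
        text = line[1:].split(" # ", maxsplit=1)[0].strip()
        if text:
            requirements.append(text)
    return requirements


SCRIPT = """\
#!/usr/bin/env python
# Script Dependencies:
#     requests
#     rich  # pretty output

import requests
"""

print(read_dependency_block(SCRIPT))
```

The parser is small, but it is still a second code path next to whatever already reads pyproject.toml, which is the loss of reuse being described.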
But then, I need to consider this in the context of what @pradyunsg pointed out to me:
See, this is the part where I get confused. Because if it isn’t about distribution, then I don’t understand what the “important use case” actually is. It sounds like what you’re worried about is that someone writes a single-file script that needs, say, requests, and then doesn’t want to have to switch to “a full project” because of the need for that requirement - as described:
…But if the point isn’t to keep everything in a single file for the sake of keeping everything in a single file (and the only reason for that which makes sense to me is what I said earlier:)
in this case, I don’t see why there is a problem with using a separate file. I think that “moving to a full scale project” is a canard here, because there isn’t a demand to go from one file to an entire project template; there’s a demand to go from one file to two (the source and a skeleton pyproject.toml that includes a project.dependencies entry along with the other minimum requirements).
Unless, perhaps, this is purely meant to work around “I don’t think it should be necessary to fill in anything else just to list dependencies, but previous PEPs mean I can’t have that”?