Pre-PEP: Locking a PEP 723 single-file script

I still really like this idea and the flexibility and utility it provides. It's a very simple workflow that many non-experts could probably achieve with sufficient support from workflow tools or even an IDE.

Imagine an IDE had a button that said “Lock script”. Yes, people could say we can already do this with shiv or zipapp or others, but those produce an additional file. This provides the ability to lock in place for simple cases.

I also think that if this is standardised, IDEs can help with syntax highlighting, collapsing, warnings, etc.

On the topic of security: yes, it is very important! But executing any script (or binary or pyz or sh) carries some level of risk. If we’re going to block this proposal purely because malicious people can hide sneaky code, then I think that concern applies to all of software already.

Anyone who is pasting code into a terminal or double clicking on a binary is consciously or unconsciously taking a “risk”. Their hard drive might be wiped, their computer might crash, they might run out of memory, they might get hacked.

Or, what happens most of the time, the desired functionality that they expect occurs and they are happy with how easy the experience was.

Don’t get me wrong: I deeply care about security, and these are extremely good points that need addressing, or mechanisms to reduce the risk. I’m just trying to share the point of view that “we’re all consenting adults” (and yes, some younger ones use Python too), and that if we take the position that the most secure thing is to do nothing, then we’re not in a great spot. I know nobody is suggesting we do nothing, btw!

The points on security are valid, but how do we get to yes?

While I agree in principle, I think the signal-to-noise ratio here is inherently bad for security. While the block contains more than enough data to verify that you’re only getting the dependencies you want, people are going to fatigue reviewing blocks like this.

Lockfiles aren’t really for human consumption; they are for tooling. The pylock format isn’t special here, and it isn’t particularly easy to tell when something that shouldn’t be included is present.

With a lockfile next to a pyproject.toml file, it’s much easier to verify that the only things in the lockfile came from what was declared.

If we need tooling to easily read the actual dependencies in a script file, it’s sort of stopped being a simple script file.

2 Likes

Is there a technical reason why a tool (or IDE) cannot be written to support auditing both pylock.toml or a Python script with the appropriate metadata?

Could the lockfile block be formatted without newlines on a single comment line? Then it’s out of the way and no script can hide between the lines.
Nobody wants to read the lockfile content anyway.
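
For illustration only, assuming the same # /// comment-block markers as PEP 723: all of the lock data could sit on one comment line, so nothing executable can hide inside the block (pins invented; the real pylock TOML would need a defined single-line encoding, and required top-level keys are omitted for brevity):

# /// pylock
# packages = [{name = "requests", version = "2.32.3"}, {name = "charset-normalizer", version = "3.4.0"}]
# ///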

Can lockfile content be dangerous? The actual dependencies are separate. Isn’t the lockfile itself just a constraint? Even if it contains “evil-lib==1.0”, as long as evil-lib doesn’t appear in the dependencies (outside the lockfile) wouldn’t that just be a superfluous constraint that doesn’t matter?

The actual dependencies are not in the lockfile, right? Lockfiles already need tooling anyway, even when they are separate files, due to their enormous length.

1 Like

I guess I am in the camp “don’t run untrusted scripts”. But… is there a way to make sure the interpreter does not execute anything past the start of the lock file block?

Is there a way to mark the end of a Python script, something like ... for YAML documents? I am quite sure there isn’t because I guess I would have heard of it by now.

Probably we do not want the runners to send a modified version of the script to the interpreter, where the lock file block and anything past it is removed.

Otherwise maybe the specification could mandate the presence of import sys; sys.exit() at the start of the lock file block, and runners must refuse to send this script to the interpreter if this line isn’t there.

#!/usr/bin/env python
# /// script
# dependencies = ["requests"]
# ///
import requests
print(requests.get("cool-url").json())

# /// pylock
import sys; sys.exit()
# ... long blob here
# ///
# fmt: off
import os  # noqa # nosec
os.system("rm -rf /")  # noqa # nosec
# ... long blob here
# ///

But yeah… I think that unless there is an easy, low-cost way to handle this, we should probably do nothing about it, because bad stuff could be anywhere anyway, in any script, in any dependency.

1 Like

I’d like to reiterate here that I hope that locked scripts will remain the exception rather than the rule. PEP 723 is a simple and effective way of making scripts easy to share. Expecting huge lockfile blocks is the exact opposite - I’d only want to see lockfile sections in very rare, specialised situations. And no, “run this Python script to install the app or do this analysis” does not count.

If locking scripts is expected to be common, I’m -1 on the proposal.

As an example of prior art, PowerShell supports “signature blocks” on scripts. As far as I know, these are used very infrequently in public projects, because they are intrusive. Clearly they have an important role to play in secure environments, but they are not the norm, and I think lock sections should be viewed in a similar light.

My feeling is that before we standardise something here, it would be nice to have an implementation that uses [tool.whatever] to implement the semantics we’re discussing here.

I’m thinking of this as being relevant for both (a) demonstrating how this would work and (b) having a solution with relevant trade-offs for the relevant users. I’m guessing people still want the one-command experience of whatever run [path-or-URL].

The experience from building that can also help inform what the standard looks like as well as solving the immediate user problem of wanting a solution for the problem at hand (which, if I’m reading this thread correctly, we don’t have complete agreement on the shape of either).


My first thought was that if we want this for a uv/pipx run ..., then it’d be OK to have a separate lock file that’s referenced from the script (or expected to be found at a script_path + ".lock" location, mirroring how .metadata files are exposed in the index) and the runner would utilise that. I’m still unsure why that’s not feasible, although I do see multiple responses stating that it’s a non-starter without outlining why [1].
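
For what it’s worth, a runner’s lookup of such a sidecar could be a few lines (the naming convention here is hypothetical):

from pathlib import Path

def find_sidecar_lock(script: Path) -> Path | None:
    # Hypothetical convention: "script.py" -> "script.py.lock" alongside it,
    # mirroring how ".metadata" files sit next to distributions in an index.
    candidate = script.with_name(script.name + ".lock")
    return candidate if candidate.is_file() else None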


  1. If you did, I apologise - I missed it. ↩︎

2 Likes

+1, having a demo implementation in some of the launcher(s) is a great idea.

It’s touched on a little here: Pre-PEP: Locking a PEP 723 single-file script - #3 by thejcannon

I agree with the statement that distributing 2 files is oddly more than twice the work of distributing a single file.

Almost immediately it means one needs to consider:

  • should the files be shared in an archive/zip to keep them together?
  • what’s the name for my sidecar lock file? A new implicit naming convention that adds cognitive load
  • a separate file can live in multiple places, should metadata in the script point at the location of the file?
  • a separate lockfile creates temptation to share it between scripts
  • are symlinks supported?
  • what if the referenced file is missing?
  • how to keep both files in sync?
  • even just the effort involved in uploading or sharing 2 files is extra work

In isolation, none of these things are huge barriers. But all of these papercuts really add up.

If I’m looking after 2 files, I may as well look after 3 files (+pyproject.toml) instead.

And that’s exactly the opposite of what I want when building:

  • single file MCP servers
  • single file marimo notebooks
  • single file gradio demos
  • single file sysadmin scripts

3 Likes

I’d expect it to be about as common as locking with pip-compile was before the current suite of workflow tools, unless those tools actively steer people towards it.

Here’s a guess at what would happen:

Some people, a small minority of total users, would start locking everything, and another large fraction of users would not even know the feature exists. The middle-ground of people who would only sometimes lock script dependencies would be the smallest fraction of users, and I think three of those four people are already in this thread. :wink:

My expectations change dramatically if a popular workflow tool starts encouraging people to “always lock”.


I don’t really like trying to guess at the frequency of future usage to figure out if a feature is a good idea. I think there’s some other requirement or concern hidden in there. Like “users are going to get themselves into trouble by misusing this”.

Can we characterize how they’ll get into or cause trouble? Or if there’s a different concern, let’s try to state that.

I do think the aesthetics of a script which ends in a bazillion lines of “machine junk” are really bad, but I’m not confident that the aesthetic argument is strong on its own.

2 Likes

For me, the biggest problem is that lock files need maintenance. You need to update the lock data as new versions of your dependencies come out, to incorporate bug fixes (and security fixes in particular). Generally, I suspect that people distributing scripts won’t expect to do that sort of maintenance, and so we’ll end up with scripts with out of date dependencies. Which might not be that bad of a problem, but it’s not good, either.

I’ll be honest, though - the aesthetic problems of a 20-line script with an appended 300-line lockfile are actually significant to me. I can argue technical issues as much as you like, but realistically, a huge file, 90% of which is machine generated junk, just makes me go “yuk” :nauseated_face:

On reflection, with the right sort of tool support, running a script with a separate lockfile doesn’t feel like it would be too bad. Imagine a command uv run --lockfile pylock.script.toml script.py which runs script.py in an environment defined by the lockfile, checking that the script dependencies are satisfied by the lockfile. You still have the “distributing 2 files” issue, but users not interested in reproducibility can just use uv run script.py, so only people who need reproducibility have to concern themselves with the lockfile.

2 Likes

No, there isn’t a technical reason preventing this, but I’m in the camp of “YAGNI”, with a hint of “security tools work best when there is exactly one way to use them correctly, and that’s also the only way that works” (only a hint, because unless this becomes pervasively used, my concerns are unlikely to apply to more than a handful of users to begin with, and I think as a community we’re already very far from making what I would consider the right approach the common one).

This depends on your definition of dangerous. I would say yes, even with the set of dependencies more easily reviewable, because you could intentionally construct a pylock block that pins a known-vulnerable version of some transitive dependency that has been yanked but not deleted. By explicitly depending on the exact version, you get it despite it having been yanked, by design.
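
For illustration, a block along these lines (sketched loosely after the pylock.toml format, with an invented package name and version) would be honoured by any installer that respects exact pins:

# /// pylock
# lock-version = "1.0"
# created-by = "example-tool"
# [[packages]]
# name = "some-transitive-dep"
# version = "1.4.2"  # suppose this exact release was yanked for a security fix
# ///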

People have brought up that defining support for zipapps could be useful for inline script metadata, although I think defining it with pyproject.toml is just as effective.

I think it’s definitely possible.

Lock files are independent from the dependencies that were used as input (by design), so they are not constraints but their own thing.

Is the check just to make sure you didn’t choose an incorrect file? The pylock file doesn’t require checking a pyproject.toml, so verifying things isn’t inherent to lock files as specified.

If you did this then I think you might as well just take the next step and support zip files.

I think the question is: are you distributing the single-file script because it’s the easiest distribution mechanism for users, or because it’s the easiest way to keep packaging details with your code? The former is for the situation where you’re sending your code to others to run, while the latter is where you, and possibly others, are expected to be editing the code regularly, so it’s more for personal scripting use.

If you’re talking about distribution, then it almost doesn’t matter how it all looks. Whether it’s a huge comment of TOML at the end of a script or a zip file containing a TOML file makes minimal difference; the user just wants stuff to work. But if you’re planning to edit the script, then you probably want the lock details at the end of the file, out of your way.

But a separate file seems to make inline script metadata way less useful. If you’re sending around a separate lock file (which you may do in a zip file anyway), then why can’t you use pyproject.toml instead of inline script metadata? Only if you expect locking to be a rarity does a separate lock file, but not a separate file for direct dependencies, make sense to me.

I think we need to decide whether single-file script distribution is meant for general distribution or is more of a convenience for personal scripts. If the answer is “yes, they are for distribution”, then we should support including the lock file details in the file. But I also think that, regardless, we should consider supporting locking from the lowest/simplest level up, which would mean zipapps if we do single files (executable directories fall under the same solution/purview as single files in my view).

2 Likes

If we are deciding that single-file scripts are worth supporting as generally distributed artifacts, and so pylock blocks should be supported, I would want the specification to require certain behaviour of tools that produce and consume these blocks, to help with some obvious problems.

I think there needs to be an easy way for someone who is handed one of these scripts to regenerate the pylock block with updated versions. This might seem like obvious table stakes, but it proves slightly more problematic if we assume a 1:1 correspondence with lockfiles, because the consumer of a lockfile isn’t guaranteed to have tools that can create a lockfile, even if they have tools that can consume one. (This was an intentional split, based on assumptions of a project-based workflow.)

This would ameliorate some, but not all, concerns about maintenance and security. However, especially if these are being distributed as things like CLI scripts, I personally worry that the recipients won’t even know the implications of locking if everything behaves exactly the same for them as it did prior to locking.

The below in particular is mostly covered with this requirement.

I also think there needs to be a way for a user to just query what would be installed, and if any of what would be installed are locked to versions that have been yanked. This must be done without executing any of the file.
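
As a rough sketch of the yanked-version half of that query (assuming the name/version pins have already been parsed out of the block, and that the index is PyPI, whose JSON API exposes a yanked flag on each file of a release):

import json
import urllib.request

def release_is_yanked(name: str, version: str) -> bool:
    # A release is effectively yanked when every one of its files is yanked.
    url = f"https://pypi.org/pypi/{name}/{version}/json"
    with urllib.request.urlopen(url) as resp:
        files = json.load(resp).get("urls", [])
    return bool(files) and all(f.get("yanked", False) for f in files)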

This would help with some of the security view, but I’m not sure where the line between “enough” and “too much” is. We could go further and require a confirmation when installing a yanked version, but that gets into nagging territory. At the same time, I have reason to believe that “loose scripts passed around” are going to become a problem.

1 Like

Why is that? You can always delete the lock file portion if it doesn’t work for you, and then just use whatever normal inline script metadata support your tool provides. I don’t see why all tools that can consume the lock file portion must also be able to produce it.

This is of course “simple” to do. Parsing out script metadata is simple enough (there’s a reference implementation in the PEP) and analyzing lockfiles is certainly possible. So I’m not sure what extra is needed in terms of standards. Obviously whether tooling exists to do these things will be a matter of demand - if no-one wants this enough to implement it, then it won’t exist…
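
As a minimal sketch of that parsing, building on the PEP 723 reference regex and assuming a pylock block reuses the same comment-block syntax and the pylock.toml [[packages]] array (the audit function and its output format are invented for illustration):

import re
import tomllib

# Regex from the PEP 723 reference implementation; the block type
# ("script", or hypothetically "pylock") is captured by name.
BLOCK_RE = r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"

def read_block(source: str, name: str) -> dict | None:
    matches = [m for m in re.finditer(BLOCK_RE, source) if m.group("type") == name]
    if len(matches) > 1:
        raise ValueError(f"Multiple {name} blocks found")
    if not matches:
        return None
    content = "".join(
        line[2:] if line.startswith("# ") else line[1:]
        for line in matches[0].group("content").splitlines(keepends=True)
    )
    return tomllib.loads(content)

def audit(path: str) -> None:
    # Report the pinned packages without executing any of the script.
    with open(path, encoding="utf-8") as f:
        lock = read_block(f.read(), "pylock")
    if lock is None:
        print("no pylock block found")
        return
    for pkg in lock.get("packages", []):
        print(f"{pkg.get('name')}=={pkg.get('version')}")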

Can you share those reasons? I can imagine there being problems, but it’s just speculation. And people’s real use cases beat speculation.

1 Like

My experience with people passing around single-file scripts is that they become internal tools people rely upon without the same level of scrutiny applied to them, and that over time they do become problematic, especially when the original author is no longer available to support the person the script was passed to, or it is passed to a non-development team. While I can’t guarantee it will be more of a problem with locking, I strongly believe that prior experience of packaging with hard-pinned versions will interact with this in unpleasant ways, by nature of how people treat tools like this.

I also think it may interact poorly with people’s existing expectations for Python scripts.[1] On Linux, for example, many people are used to many of their Python dependencies being updated automatically when not explicitly using a venv isolated from the system Python.[2] With unpinned versions and single-file script dependencies, this was sort of still happening transparently for users, just with the tool managing per-app venvs.

I think in many ways, the problem is less the tools themselves and more the way people are already using these tools, intersecting with concerns about who has to be aware of what for security features to work. Even if people don’t have any of these intersecting concerns, I think reviewing blocks like this in scripts is going to significantly contribute to security fatigue.

I’ll retract that. My thoughts here were along the lines of the above, but the same person who isn’t going to know why this is a problem also isn’t going to know to run something to update the block, or to delete it. The problem is more inherent to the idea of treating loose scripts this way in the first place, so this doesn’t even help with what I would want to avoid.


I’m trying to think about this from the perspective of “how do we make the easiest way of using this also the best way?”, and I’m having trouble envisioning this as better than distributing packages with a console-scripts entry point, which are already supported by tools like pipx and uvx and also come with a mechanism for distributing updates.

I’m also skeptical of single-file deployments as a use case here. It appears to be the only one that benefits from actual locking in this manner. I have questions as to why other tools and options are insufficient, but this particular workflow is not something I have personal experience with.


  1. I don’t think all expectations here are good expectations, but that doesn’t make them not expectations I’ve seen people have before. ↩︎

  2. I’m not looking to reignite a war on distribution packaging vs per-app venvs; there are tradeoffs and differing use cases, and neither is perfect. ↩︎

4 Likes

Thank you. I agree, this is my concern as well, that people will not realise that by locking and not maintaining those locks, they are getting increasingly out of date dependencies. The idea that packaging tools use the latest version of dependencies is pretty ingrained, and having to explain that this case is different will be hard (although “pin your application dependencies” is well known, people don’t think of scripts as applications).

This is why I am looking at locked scripts as being a rare, specialised case. If we can’t present them in that way, they’ll get used in situations where people don’t understand the implications and the downsides will outweigh the benefits.

The biggest advantages of distributing single file scripts (or zipapps) IMO is that you don’t need to use a central distribution location like PyPI, or an organisation level index. I don’t think people should be using PyPI to share applications that are of interest to maybe only a few people, and many groups don’t have the infrastructure to set up their own index. More generally, I see scripts as something that’s distributed by sending a file to someone - maybe by email, or by posting it on a file sharing site or webpage. Not something that’s published to a well-known repository for people to download.

The original post talked about how distributing “single file” applications is becoming more popular, mentioning cases like Go and Rust, as well as Python tools like pex, shiv, PyApp and PyOxidizer. But the important distinction in those cases is that the source for the application might be multiple files, while the final distributable application is a single file. This is also true for zipapps.

The distinguishing feature of scripts, to me at least, is that the single file is both source and final application. I think that’s an important use case (shell scripts, batch files, etc., are a very important target here - Python offers significant advantages over shell scripts in many cases). But it’s a use case where needing a “build step” is uncommon and awkward, and being able to edit the script and just rerun it is important (certainly in development, maybe less so in deployed situations).

Locking a script feels to me like a build step, and as such, it fits uncomfortably in my head alongside the concept of a “script”. That’s why I expect people to overlook the maintenance requirement of keeping the lock up to date, for example.

3 Likes

To be clear, I see value in the informal sharing of scripts online. I don’t understand the perceived value of locking in this case. It seems like in the situations where people want locking, they are also doing things that (to my mind) indicate other paths would be better: zipapps, packages, or even building a container all seem better suited to deployment cases (in different ways) because of access to build steps and other tooling, even without a central or public source.

I’d be surprised if a lockfile on a script can really live long enough to be helpful. Depending on how recent the Python version used to lock the script is, and what dependencies you have, you’ll probably only get a couple of years before someone runs the script on the latest Python and it ends up trying, and failing, to build a version of numpy from source that didn’t support that Python version.

1 Like

Yes, that’s my question. I see the quoted use cases for it, but I’m more or less taking people’s word that locking a script is the right approach in those cases. I don’t know enough about the use cases to know why the other approaches aren’t sufficient, and I’m hoping someone (@thejcannon?) will clarify.

2 Likes