Pre-PEP: Locking a PEP 723 single-file script

(Y’all can thank @brettcannon (no relation))

So, are you going to propose a “lock file” comment for single-file scripts? :wink:

I’m fishing for thoughts here, then will collate them into a PEP.


Background

  • Most of the same motivations for “why lockfiles” apply just as much to a PEP 723 single-file script (is that the official nomenclature?)
  • But currently, single-file scripts aren’t easily “lockable”
    • The blog post that spurred this uses cog to “pin” the deps inline
    • uv lock --script <path> creates a (separate) <path>.lock file (not a PEP 751 lockfile, but is prior art)

This is primarily motivated by wanting to distribute single-file scripts. Distribution of a single-file runnable is something the broader ecosystem likes to focus on because of its simplicity: Rust and Go compile down to a single binary, and Pex, shiv, PyApp, and PyOxidizer all attempt to do the same for Python.

  • Pex: Single-file executable (zipapp) with transitive dependencies and the bootstrapping script baked-in
  • Shiv: Single-file executable (zipapp) with transitive dependencies baked-in
  • PyApp: Single-file executable (binary) with transitive dependencies (and the bootstrapping logic) baked-in
  • PyOxidizer: Single-file executable (binary) with transitive dependencies (and the bootstrapping logic) and Python baked-in

Taking an “outside-in” approach to the same set of problems these tools attempt to solve:

uv run --script <script with locked deps>
  • :white_check_mark: Single file “executable” (especially if you use #! /usr/bin/env uv run --script)
  • :cross_mark: (With this PEP: :white_check_mark:) Fully pinned transitive dependencies AKA “fully deterministic/reproducible environment”
  • :white_check_mark: Brings-its-own-Python (via uv’s Python binaries support)
  • BONUS: :white_check_mark: Supports “remote” executables (uv run --script <url>) - (it don’t get much easier than that :joy: )

As an example of this kind of simplicity in distribution, all this is missing is guaranteed reproducibility:

uv run -q --refresh https://raw.githubusercontent.com/thejcannon/joshcannon.me/refs/heads/main/scripts/claudesay.py 'Certainly!'

Things to bikeshed over

Now that that’s covered, let’s get arguin’!

First and foremost, should we try and shove this into the PEP 723 metadata block?

Right now the # /// script block supports:

  • requires-python (as does pylock.toml with the same semantics)
  • dependencies (as doesn’t pylock.toml)
  • tool (as does pylock.toml with the same semantics)

That means right now we could completely overlay the existing keys and all of the pylock.toml keys and have no conflicts.

This would look something like:

# /// script
#
# # Optional. Doesn't conflict with `[[ packages ]]` because it could live as the "input" set of dependencies
# dependencies = [...]
#
# # and/or
#
# lock-version = "..." 
# [[packages]]
# <a bunch o' lockfile metadata>
# ///

We could probably scope down the possible keys, which would also help this (see “What’s in it?” below)

This is honestly my vote, but keep in mind I’m biased. I don’t maintain any Python packaging tool or standards or official docs. :smiley:

Should we use a new format?

What’s the “tag”?

E.g. # /// pylock? # /// scriptlock?

My vote: pylock

Can a file have both?

My vote: yes (similar to how a project has a pyproject.toml and a pylock.toml), and we should require tools to read the locked metadata over script if both exist.

Where does it go?

  • Does its location relate to the # /// script block if it exists? (E.g. is it required to go beneath a potential # /// script block?)
    • My vote: (I don’t care. I have preferences but not requirements)
  • Can it go anywhere in file (a la PEP 723)? (E.g. is it required to be at the bottom?)
    • My vote: (I don’t care. I have preferences but not requirements)

What’s in it?

For simplicity, we could let it just have the same schema/spec as pylock.toml.

That being said, keys like environments and dependency-groups don’t make as much sense here and so might cause confusion/abuse.

At a minimum I’d argue we’d need:

  • lock-version
  • requires-python
  • [[packages]] (and all of the keys under it)
  • [tool]
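If the key set were scoped down like this, a tool could check an embedded block with something as small as the sketch below. The allowed set is just the list above; treating lock-version and packages as required is my own assumption for illustration, and check_keys is a hypothetical helper.

```python
# Keys from the "at a minimum" list above.
ALLOWED_KEYS = {"lock-version", "requires-python", "packages", "tool"}
# Assumption for this sketch only: these two must always be present.
REQUIRED_KEYS = {"lock-version", "packages"}


def check_keys(lock: dict) -> list[str]:
    """Return human-readable problems with the top-level keys of a lock block."""
    present = set(lock)
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - present)]
    problems += [f"unexpected key: {key}" for key in sorted(present - ALLOWED_KEYS)]
    return problems


print(check_keys({"lock-version": "1.0", "packages": []}))
print(check_keys({"packages": [], "environments": []}))
```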

Let’s get crackin’


I’d find expanding on the background/motivation to be helpful here.

PEP 751’s primary motivations[1] don’t seem very fitting to me for PEP 723 inline-script lock metadata since, AFAIK, per-script locking isn’t a popular solution[2], and having one shared environment resolved for a set/folder of scripts is fine for deployment and ad-hoc usage.


I’m wary of inline-script metadata becoming too verbose, and it will be if any of the [[packages]] table makes its way into this spec.

In that sense, it leaves me supporting a co-located lock file rather than an inline solution. It’s painful to distribute an entire package for a single script, but it is not that painful to distribute two files (script and lock).


  1. there exist many 3rd party tools to solve the lock file problem

  2. Yes, tools are lacking, but workarounds aren’t plentiful either.


Thanks for the thoughtful feedback!

I agree that OP was missing a stronger motivation (or really any motivation that wasn’t just a single pointer to PEP 751). I edited it to include more concrete/nuanced motivation(s).

I’ll spoil it here: A big one is remote distribution. Copying one file is easy. Copying two is oddly more than twice as hard.


I’m wary of inline-script metadata becoming too verbose

I’ll call a spade a spade, it definitely will become extremely verbose. Instead of arguing that, I’d rather discuss: “Is the metadata being verbose harmful enough that it wouldn’t be worth the addition?” I.e. “The pros don’t outweigh the cons” or “I’d rather work around the challenges of not having the locked dependency information in the script itself than deal with them in the metadata”.


One concrete motivating use-case is the sudden rise of “MCP Servers”.

The first example in the official Python SDK is a single file script (in the spiritual sense, not in the PEP 723 sense). Below the example is an instruction on how to run it: uv run mcp install server.py.

This is a stone’s throw away from having inline-metadata and an if __name__ == "__main__": block. Which then “unlocks” uv run <path> or better yet uv run <URL>. This isn’t a contrived use-case. This is how I’m distributing MCP servers at $dayJob. It’s just so damn simple.

However, these scripts (of which now there are PLENTY) can’t make guarantees of reproducibility or of supply-chain security (something that should be alarming, considering the rapid popularity of these things and the broad permissions and data they are given access to).


My thoughts:

  1. Make it a separate script block, # /// pylock. Bikeshed the name if you want, I don’t care that much.
  2. Suggest that it goes at the end of the file, but (as per PEP 723) allow it to be anywhere in the file. I prefer “at the end” because it will be verbose, and “after the code” is easier to ignore.
  3. Make it an exact copy of the pylock.toml schema. Any confusion caused by the existence of (optional!) keys that don’t make sense for scripts will be minimal, compared to the confusion caused by having two similar, but not identical, formats.

The use case you describe sounds sufficiently compelling to me, although I’d caution against triggering some sort of “every script should be locked” policy/movement. IMO, most scripts should be just fine with unlocked requirements and a relatively loose interpreter version requirement. But being able to lock when appropriate seems like a good capability to have. And being able to embed the lockfile rather than have to manage two files also feels useful.

Although if you’re sufficiently concerned about supply chain security to want a locked script, maybe running it direct from a URL (as opposed to downloading and reviewing it before running) isn’t such a good idea…? :person_shrugging:


I completely agree!

But also, trust is a chain (or maybe it’s a ladder? analogies are hard). Some URLs can be sufficiently trusted. I trust that https://raw.githubusercontent.com/python/cpython/490eea02819ad303a5042529af7507b7b1fdabdc/Tools/clinic/clinic.py is going to consistently give me the same bytes each and every time because all I have to do is trust normal HTTPS cert stuff (really I just trust other smarter people who understand it and they themselves trust it) and trust that GitHub isn’t going to violate git’s immutability/principles or change its path schema semantics.


I’m not a huge fan of embedding a potentially huge amount of automatically generated lock data inside scripts, even if it’s at the end. It seems like in many (most?) cases this lock data would become the majority of the script file. I think PEP-723 metadata gets away with being embedded by virtue of being short and also editable by hand.

My own non-standard approach to this issue is somewhat like some of the others mentioned. It will bundle a PEP-723 script and a lockfile[1] into a zipapp along with the script runner and a bootstrapping script[2]. The script runner will pick (or possibly install) an appropriate Python version to use to run the script, and is not restricted to the runtime that launched the zipapp. Additionally by virtue of being a zipapp this can include any arbitrary data files you might need for your script inside the archive.

I feel like once you’re including a lockfile you’re already including some kind of packaging step (synchronising the lockfile) and you might be better off with a method of merging the script and lockfile as a single file product that can be launched directly as opposed to putting the lockfile data in the script itself.


  1. A ‘universal’ requirements.txt format file with hashes in this case - this predates PEP-751.

  2. Along with a copy of pip, in case that’s missing for some reason.


I think if it’s placed at the end there should at least be an indication, prior to any imports (preferably at the top of the file), that such a block exists. (I also wish I had thought about this more during the 723 discussion, as the block being allowed anywhere in the file has similar issues there, IMO.) This could just be # ///pylock block below as long as it’s indicative and standard. I’m not concerned about this from a security standpoint, since the script itself could be malicious just as easily, but from a level of obviousness about where any import statements are coming from.

This sounds like a reasonable ad hoc approach, but I think if we want a standard (which feels plausible to me, although not yet proven) then it would need formalising, with tool support rather than a bootstrap. Maybe by extending the zipapp format to include an embedded pylock.toml file. Tools like uv could then support running such a zipapp in an environment constructed based on the lockfile.

@thejcannon I think at a minimum, your proposal would need to cover why it’s better than combining the script and a lockfile in a zipapp. Both approaches need tools to add support for a new feature, both offer single-file distribution.

As a side note, what’s the current state of tool support for zipapps? Do pipx/uv support <tool> run https://url/for/zipapp.pyz, for example? I suspect it’s somewhat “accidental”, just because python script.py and python zipapp.pyz both work. But if we start adding metadata to zipapps, something more deliberate might be needed.


FWIW, my main use case for this would be sending such scripts to “semi-technical” users, who are capable of running and understanding parts of the code without being full time developers or software engineers. For me it would be a big advantage that they could read and edit the parts of the code they wanted in plain text, and not have to manage some zip structure.


At the time of creation the assumption was that most users didn’t already have a script runner, hence the bootstrap. The primary goal was how can I share a PEP-723 script with someone who has Python but doesn’t also have a script runner. Being able to also include a lockfile was a bonus.

I don’t really believe that the need to extract the script from a zipfile if you wish to edit it is overly onerous. Tools to run such scripts could even include an ‘unbundle’ feature to make this easy.

For me it would be trivial, but it is an additional technical barrier for my target users, and may make it a non starter for some.

FWIW I’ve previously outlined this scenario in a uv-specific feature request: Allow embedding of script lock files · Issue #11064 · astral-sh/uv · GitHub


I agree with @pf_moore on the spelling for locking true single file pure Python scripts, but I’d also like to see the proposal cover standardising a separated lock file for zipapp archives.

Specifically:

  • the script metadata block goes in __main__.py
  • pylock.toml is placed adjacent to __main__.py
  • only the lock file being present would trigger env creation when running a .pyz (as that would indicate the script metadata hasn’t been used to populate the archive itself with the required dependencies)

Edit: while this does give us multiple ways to do things, I think the trade-offs between those ways are genuine, so different situations will favour the two different approaches.

Just wanting to show support for the proposal for single file, locked scripts.

Another motivating use case (in addition to single file MCP) is single file notebooks / single file data science via marimo notebooks. They already support PEP 723 metadata, but locking would be a step up for reproducible data science. Serializing package requirements in marimo notebooks | marimo


It’s worth pointing out that zipapps are largely a core Python concept, not a packaging one, so we need to be careful about how we cover them in packaging standards. Currently, I don’t believe any packaging standards cover zipapps (it’s notable that PEP 723 doesn’t cover them, for example). Also, executable directories work in exactly the same way as zipapps, so would a standard covering zipapps also cover executable directories?

I think we should restrict this discussion to the simple, easily scoped, proposal to extend the PEP 723 “script metadata” concept to include a way to record locked dependencies. Any extension for zipapps and executable directories should be a separate proposal.


While I can see it being useful in this context, I feel that by being a notebook format it’s somewhat avoiding the issue of there being a huge block of additional text by simply hiding all the comments in its rendered form.


I’m not sure the embedded script block makes as much sense once the files are in an archive - it could be in a separate file at that point.

That said, I think actually specifying this would be a separate topic if there’s enough interest. The only thing I think this PEP would need to convincingly cover is why a bundled form, such as a zipapp, isn’t satisfactory.


If I can boil down my major issue with this, it’s that I use a fair number of scripts with inline dependencies. I’d be somewhat interested in standard locking behaviour, but I really don’t want bundling all of the lockfile data inside the script to be the only standardized solution.

Perhaps there’s a compromise position here?

Whether the lockfile is a separate file or included as an inline metadata block, I think there needs to be an entry in the # /// script block to indicate that lockfile data is going to be used instead of the dependencies declared in the script block itself.

Something like:

# /// script
# requires-python = ">=3.12"
# dependencies = [ ... ]
# lockfile.embed = true
# ///

To indicate embedded lock data, or:

# /// script
# requires-python = ">=3.12"
# dependencies = [ ... ]
# lockfile.path = "pylock.scriptname.toml"
# ///

To indicate that the tool should look for a lockfile at that location.

I like the idea, but I’ll leave this for someone to scoop up in a separate proposal, to keep this one (and its discussion) tightly focused.


A fair point, so I withdraw the suggestion.


My lock files are typically a few thousand lines long…
At least 400~800 lines for tiny projects.

In fact, something like httpx[cli], which would be a great target for this spec, generates a 162-line uv.lock for the newest Python or 252 lines for Py3.8+.

httpie comes in with 582 lines of lockfile goodness.

So… pretty please, can we not have the lock file at the top of the script?

P.S. the discussion about multiple files is a non-starter for me.


I feel like every script is going to turn into get-pip.py with the “congratulations for being security conscious and trying to read this to make sure it’s not sneaky!” :wink:

I’m a little concerned about the developer ergonomics here, and how that interplays with secure usage. Suppose the lock goes at the end, so it’s

#!/usr/bin/env python
# /// script
# dependencies = ["requests"]
# ///
import requests

print(requests.get("cool-url").json())

# /// pylock
# ... long blob here
# ///

Doesn’t this expose users to this kind of script smuggling?

#!/usr/bin/env python
# /// script
# dependencies = ["requests"]
# ///
import requests

print(requests.get("cool-url").json())

# /// pylock
# ... long blob here
# ///
# fmt: off
import os  # noqa # nosec
os.system("rm -rf /")  # noqa # nosec
# ... long blob here
# ///

I’m open to the idea that the solution is “don’t run untrusted scripts”, but I think it deserves our direct attention as a topic. A PEP would have to address this, IMO, even if only to say that it’s out of scope.
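One possible (and very partial) mitigation, sketched under the assumption that a hypothetical # /// pylock block uses PEP 723 block syntax and that a tool chooses to insist it be the last thing in the file: flag any non-comment lines that appear after the block closes. The code_after_lock helper is made up, and the smuggled command is swapped for a harmless echo.

```python
import re

# PEP 723-style block syntax, restricted to the hypothetical "pylock" type.
PYLOCK_BLOCK = re.compile(
    r"(?m)^# /// pylock$\s(?P<content>(^#(| .*)$\s)+)^# ///$"
)


def code_after_lock(script: str) -> list[str]:
    """Return non-blank, non-comment lines found after the pylock block ends."""
    match = PYLOCK_BLOCK.search(script)
    if match is None:
        return []
    trailing = script[match.end():].splitlines()
    return [
        line
        for line in trailing
        if line.strip() and not line.lstrip().startswith("#")
    ]


SMUGGLED = '''\
import requests

# /// pylock
# ... long blob here
# ///
# fmt: off
import os  # noqa # nosec
os.system("echo pwned")  # noqa # nosec
# ... long blob here
# ///
'''

for line in code_after_lock(SMUGGLED):
    print("suspicious:", line)
```

The block ends at the first # /// because the import line in between is not a comment, so everything after it is ordinary code that merely looks like part of the blob. This doesn’t help if the code is hidden before the block or the script is malicious to begin with.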
