Adding a hash fragment to a direct URL reference

I have several direct URL references which currently lack hash fragments, so I need to generate the hashes and add them to the URLs.

I had hoped this might be as simple as doing pip hash "dist_name@https://github.com/org_name/project_name/archive/refs/tags/vX.Y.Z.zip", but pip hash only accepts local filenames, not remote URLs, and it emits the CLI hash syntax rather than the URL hash syntax.

The simplest process I have so far is:

$ pip3 download --no-deps "dist_name@https://github.com/org_name/project_name/archive/refs/tags/vX.Y.Z.zip"
Collecting dist_name@https://github.com/org_name/project_name/archive/refs/tags/vX.Y.Z.zip
... snip ... # Surprisingly, `--no-deps` doesn't skip metadata extraction
Saved ./vX.Y.Z.zip
Successfully downloaded vX.Y.Z
$ pip3 hash vX.Y.Z.zip
vX.Y.Z.zip:
--hash=sha256:1f98f2...etc...

and then manually adding #sha256=1f98f2...etc... to the direct URL reference before checking the result is still a valid reference:

$ pip3 download --no-deps "dist_name@https://github.com/org_name/project_name/archive/refs/tags/vX.Y.Z.zip#sha256=1f98f2...etc..."
Collecting dist_name@https://github.com/org_name/project_name/archive/refs/tags/vX.Y.Z.zip#sha256=1f98f2...etc...
  File was already downloaded /home/acoghlan/vX.Y.Z.zip
  ... snip ...
Successfully downloaded dist_name

Have I missed a simpler way of doing this?

If not, what do folks think of the idea of enhancing pip hash <direct url reference> to handle this dance automatically? (presumably with some appropriate CLI options so you could do things like generate direct references for a whole directory of previously downloaded files at once)

Sorry if I’ve misunderstood something Alyssa. But isn’t pip hash for the object’s owner to run locally, before the first upload, to produce a reference value? And afterwards, for users to verify their downloads against that value. Locally, after the download, not before it.

Or is the hash fragment in URLs used by web servers for something else, or by other tools to produce lock files?

Automatic verification would be great, if pip download --require-hashes doesn’t do this already.

Yeah, pip hash is currently a purely local operation.

The pipeline I needed was “direct URL without a hash fragment → download artifact → hash artifact → emit direct URL with a hash fragment”.

At the moment, it seems there isn’t any readily available way to automate that, even though both intermediate steps are available (as pip download --no-deps and pip hash respectively). The last step seems to require that you just know the correct format for adding a hash fragment to a direct URL reference.

It doesn’t have to be pip hash that offers that end-to-end functionality, it was just the first place I thought to look for it (and it still seems like a natural place to offer it to me if anyone else thinks it is worth adding).
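The "hash artifact → emit direct URL with a hash fragment" step of the pipeline above, as an added example that is not code from the thread or from pip itself; it assumes the artifact has already been fetched (e.g. by pip download --no-deps) and the hypothetical names dist_name and project_name stand in for real ones:

```python
import hashlib
import hmac
import re
from pathlib import Path
from urllib.parse import urlsplit, urlunsplit

# Algorithms pip treats as strong for --require-hashes (assumption: sha256/384/512).
HASH_FRAGMENT_RE = re.compile(r"^(sha256|sha384|sha512)=([0-9a-fA-F]+)$")


def split_direct_ref(ref: str) -> tuple[str, str]:
    """Split a PEP 440 direct reference 'name@url' (or 'name @ url') into (name, url).
    The name cannot contain '@', so the first '@' is always the separator."""
    name, sep, url = ref.partition("@")
    if not sep or not name.strip() or not url.strip():
        raise ValueError(f"not a direct URL reference: {ref!r}")
    return name.strip(), url.strip()


def hash_artifact(data: bytes, algorithm: str = "sha256") -> str:
    """Hex digest of the downloaded artifact bytes, as pip hash would print it."""
    return hashlib.new(algorithm, data).hexdigest()


def with_hash_fragment(ref: str, data: bytes, algorithm: str = "sha256") -> str:
    """Emit 'name@url#<algorithm>=<hexdigest>', replacing any fragment already present."""
    name, url = split_direct_ref(ref)
    parts = urlsplit(url)._replace(fragment=f"{algorithm}={hash_artifact(data, algorithm)}")
    return f"{name}@{urlunsplit(parts)}"


def verify_direct_ref(ref: str, data: bytes) -> bool:
    """Check that the artifact bytes match the hash fragment carried by the reference.
    Raises if the reference has no (recognised) hash fragment, so a missing pin is never
    silently treated as a pass."""
    _, url = split_direct_ref(ref)
    match = HASH_FRAGMENT_RE.match(urlsplit(url).fragment)
    if match is None:
        raise ValueError(f"reference carries no strong hash fragment: {ref!r}")
    algorithm, expected = match.groups()
    return hmac.compare_digest(hash_artifact(data, algorithm), expected.lower())


def hash_ref_for_file(ref: str, path: str) -> str:
    """Convenience wrapper for a file saved by 'pip download', e.g. ./vX.Y.Z.zip."""
    return with_hash_fragment(ref, Path(path).read_bytes())


if __name__ == "__main__":
    # Constructed stand-in for the downloaded archive; a real run would read ./vX.Y.Z.zip.
    ref = "dist_name@https://github.com/org_name/project_name/archive/refs/tags/vX.Y.Z.zip"
    pinned = with_hash_fragment(ref, b"constructed artifact bytes")
    print(pinned)
    print("verified:", verify_direct_ref(pinned, b"constructed artifact bytes"))
```
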

Thanks. I might be confused thinking about this from old fashioned file checksums. Can a PyPI API be queried, to add a hash to a URL?

pip-tools et al offer ways of generating hashes automatically, and don’t seem to rely on a download.

Repositories can publish hashes for the artifacts they host, and PyPI does so: Simple repository API - Python Packaging User Guide

The case here isn’t about packages in a repository (those are better specified by name than via direct URL references), it’s about creating a more robust specifier for a source artifact hosted somewhere else (e.g. GitHub tag artifacts for a repo that doesn’t publish to PyPI).
