PyPA spec: Installers record package index source into dist-info like `direct_url.json`?

I see that the PyPA specs describe a file called direct_url.json:
https://packaging.python.org/en/latest/specifications/direct-url/#direct-url
Apparently installers MUST record the URL that a package came from if it was installed like:

pip install https://example.com/app-1.0.whl
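For context, here is a rough sketch of how a tool could read that record back using only the standard library. The helper name `read_direct_url` is made up for illustration. It relies on `importlib.metadata`, where `read_text` returns `None` when the file is absent, which is what I would expect for anything installed by name from an index.

```python
import json
from importlib.metadata import PackageNotFoundError, distribution


def read_direct_url(dist_name):
    """Return the parsed direct_url.json of an installed distribution, or None.

    `read_direct_url` is a hypothetical helper, not part of any PyPA tool.
    """
    try:
        dist = distribution(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None
    raw = dist.read_text("direct_url.json")
    if raw is None:
        # No direct URL was recorded for this distribution.
        return None
    return json.loads(raw)
```

Calling `read_direct_url("app")` after the command above should give back a dict with at least a `url` key, assuming the installer followed the spec.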

What if the package was installed from some package index that isn’t PyPI? I want my installer to store information about which package index the package came from, but the PyPA specs don’t seem to say anything about where I should store this information.

The relevant PR where I would use this info:

Also, it’s not clear to me what the spec has to say about installers adding additional files into dist-info. Is this supposed to be allowed? Prohibited? Is there a designated place for installer-specific information?

For .dist-info directories in installed projects, per the relevant section of the installed projects spec,

> Additional installer-specific files may be present.

This is also implied to be true for wheels, since the specified files are only the “minimum” required to be present.

In practice, at the very least, current tools dump the project’s license files, which can have arbitrary names, straight into that directory (PEP 639 looks to change this by placing them into a license subdir instead).

Thanks! I guess in the ideal world the spec would require the wheel and the installer to put any additional information they want to include into distinct namespaces. But if we make a best effort by either prefixing any additional files with the installer name or putting them into some installer-specific subfolder then hopefully the extra stuff we add won’t clash with the extra stuff wheels add.
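A minimal sketch of the prefixing idea, assuming an installer called `myinstaller` and a file name of `<installer>-metadata.json`. Both names are invented here; nothing in the spec defines them.

```python
import json
from pathlib import Path

INSTALLER_NAME = "myinstaller"  # hypothetical installer name, used as a prefix


def write_installer_metadata(dist_info_dir, data, installer=INSTALLER_NAME):
    """Write installer-private data to <dist-info>/<installer>-metadata.json.

    Prefixing the file name with the installer name is only a best-effort
    way to avoid clashing with files shipped by the wheel itself.
    """
    target = Path(dist_info_dir) / f"{installer}-metadata.json"
    target.write_text(json.dumps(data, indent=2, sort_keys=True), encoding="utf-8")
    return target
```

A real installer would presumably also want to list the new file in RECORD so that it gets removed on uninstall; that part is left out of the sketch.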

Agreed - at some point I’d like to update the dist-info spec to require some form of namespacing, but that’s not trivial (due to backward compatibility concerns) and there’s no pressing need.

By the way, I should note that you need to be very careful regarding the whole idea of recording the index source. Installers are completely permitted to choose any compatible source when downloading packages, based solely on the project name and version. If it matters to you precisely which distribution file gets installed, you’re fighting the design, and will likely hit issues. Without a better understanding of your use case, it’s hard to suggest what you should be doing instead (I don’t understand the linked PR, I’m afraid).

> I should note that you need to be very careful regarding the whole idea of recording the index source.

I’m interested in producing a lock file for the installation so that I can “replay” the installation process later.

It seems to me that recording the chosen index is in the same realm as the direct_url.json file. But maybe direct_url.json and REQUESTED are the more similar pair, since both indicate what the user actually asked for, while which index a package came from should be private info for the installer?

That’s a good summary IMO. In practice REQUESTED is not currently designed and implemented well enough to actually record user intention, so comparing it to direct_url.json feels weird, but conceptually they are indeed more similar.

Potentially more important is the hash of the file that was used to do the install, more than the index. If you have the hash you can get the file from anywhere, and at that point the index is simply a hint of where you can go looking. Same goes for the file name of what you installed.
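To make that concrete, here is a small sketch of what a lock entry built around the hash might look like. The function name `lock_entry` and the dict keys are assumptions for illustration, not any standard lock file format; the index URL is stored purely as a hint.

```python
import hashlib
from pathlib import Path


def lock_entry(wheel_path, index_url=None):
    """Build a hypothetical lock entry for the file that was installed.

    The sha256 digest identifies the file; the index URL is only a hint
    about where to look for it again.
    """
    path = Path(wheel_path)
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # Read in chunks so large wheels are not loaded into memory at once.
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return {
        "filename": path.name,
        "sha256": digest.hexdigest(),
        "index_hint": index_url,
    }
```

On replay, any source that serves a file with the same name and digest would do, whichever index it comes from.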
