PEP 627: Updating PEP 376; making RECORD optional in installed .dist-info

PEP 376, the current standard, specifies that the RECORD file is a mandatory part of the *.dist-info/ directory.

In Fedora, we pack Python packages in RPM, and we have trouble generating correct RECORD files. When a file changes (e.g. a shebang is adjusted), the hash needs to change. When a file is removed or added, that needs to be reflected in the RECORD. We don’t really have the tooling to reflect file-set changes in a file that’s also part of the set.
It could be possible to build Python-specific tooling for this. It seems the solution would be quite fragile. (Tell people to use python-mv instead of rm to remove files? Run a tool, ensuring it runs as the very last build step and hoping Python can be privileged enough to “own” the last step?) But it could probably be done.
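To make the shebang problem concrete, here is a minimal sketch of how one RECORD row depends on a file's contents. It assumes the `sha256=` plus urlsafe-base64-without-padding form that current tools write; `record_entry` is a name made up for this illustration, not something from pip or RPM tooling.

```python
import base64
import csv
import hashlib
import io


def record_entry(relative_path: str, data: bytes) -> str:
    """Return one RECORD row (path, hash, size) for a file's contents."""
    digest = hashlib.sha256(data).digest()
    # Assumption: hashes are urlsafe base64 with the trailing "=" padding removed.
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    buffer = io.StringIO()
    # RECORD is a CSV file, so let the csv module handle any quoting.
    csv.writer(buffer, lineterminator="\n").writerow(
        [relative_path, f"sha256={encoded}", len(data)]
    )
    return buffer.getvalue().rstrip("\n")


# Adjusting the shebang changes both the hash and the size of the row.
original = record_entry("bin/tool", b"#!/usr/bin/env python\n")
patched = record_entry("bin/tool", b"#!/usr/bin/python3\n")
print(original)
print(patched)
```

Any build step that rewrites, adds, or removes a file after this row is written leaves RECORD stale, which is the tooling gap described above.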

But before we go there, I’d like to ask if RECORD is actually necessary at all.
I know it’s used to uninstall packages. But using pip (or other PyPA-standard-based tools) to uninstall/update system-installed packages always results in a giant mess. RPM already has its own tooling and file database for this.
I assume it can also be used to verify the integrity of the installation. Again, RPM has its own tools.
Is it useful for something else?
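For the integrity-checking use mentioned above, a rough sketch of what such a check could look like follows. `verify_record` is a hypothetical helper, and treating an empty hash or size as "not recorded, skip it" is an assumption on my part, not something PEP 376 spells out.

```python
import base64
import csv
import hashlib
import tempfile
from pathlib import Path


def verify_record(site_dir: Path, record_path: Path) -> list[str]:
    """Return human-readable mismatches between RECORD and the files on disk."""
    problems = []
    with record_path.open(newline="", encoding="utf-8") as stream:
        for row in csv.reader(stream):
            if not row:
                continue
            path, hash_field, size_field = (row + ["", ""])[:3]
            target = site_dir / path
            if not target.is_file():
                problems.append(f"missing: {path}")
                continue
            data = target.read_bytes()
            # Assumption: an empty size or hash means "not recorded", so skip it.
            if size_field and int(size_field) != len(data):
                problems.append(f"size mismatch: {path}")
            if hash_field:
                algorithm, _, expected = hash_field.partition("=")
                digest = hashlib.new(algorithm, data).digest()
                actual = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
                if actual != expected:
                    problems.append(f"hash mismatch: {path}")
    return problems


with tempfile.TemporaryDirectory() as tmp:
    site_dir = Path(tmp)
    script = site_dir / "tool.py"
    script.write_bytes(b"#!/usr/bin/env python\n")
    data = script.read_bytes()
    encoded = base64.urlsafe_b64encode(hashlib.sha256(data).digest()).rstrip(b"=")
    record = site_dir / "RECORD"
    record.write_text(
        f"tool.py,sha256={encoded.decode('ascii')},{len(data)}\nRECORD,,\n",
        encoding="utf-8",
    )
    before = verify_record(site_dir, record)
    # A packager rewrites the shebang after RECORD was generated.
    script.write_bytes(b"#!/usr/bin/python3\n")
    after = verify_record(site_dir, record)
    print(before, after)
```

In the demo the first check is clean and the second reports the edited script, which is roughly what RPM's own verification already covers for system packages.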

Could the PEP be updated to say e.g.:

The METADATA and INSTALLER files are mandatory. The REQUESTED and RECORD files may be missing.
If the RECORD file is missing, it will not be possible to use tooling to uninstall the package. An alternative way to uninstall the package should be provided.
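A small sketch of how a Python-level tool might behave under that wording: list the files to remove when RECORD exists, and refuse otherwise. `NoRecordError` and `files_to_remove` are hypothetical names for illustration; no existing installer is claimed to work exactly this way.

```python
import csv
import tempfile
from pathlib import Path


class NoRecordError(RuntimeError):
    """Raised when a .dist-info directory has no RECORD file (hypothetical name)."""


def files_to_remove(dist_info: Path) -> list[str]:
    """List the paths an uninstaller would delete, or refuse if RECORD is absent."""
    record = dist_info / "RECORD"
    if not record.is_file():
        raise NoRecordError(
            f"{dist_info.name} has no RECORD; uninstall it with the tool that installed it"
        )
    with record.open(newline="", encoding="utf-8") as stream:
        return [row[0] for row in csv.reader(stream) if row]


with tempfile.TemporaryDirectory() as tmp:
    # A system-packaged project: METADATA and INSTALLER only, no RECORD.
    dist_info = Path(tmp) / "example-1.0.dist-info"
    dist_info.mkdir()
    (dist_info / "METADATA").write_text(
        "Metadata-Version: 2.1\nName: example\nVersion: 1.0\n", encoding="utf-8"
    )
    (dist_info / "INSTALLER").write_text("rpm\n", encoding="utf-8")
    try:
        files_to_remove(dist_info)
        refused = False
    except NoRecordError:
        refused = True

    # The same project with a RECORD, as a Python-level installer would leave it.
    (dist_info / "RECORD").write_text(
        "example.py,,\nexample-1.0.dist-info/RECORD,,\n", encoding="utf-8"
    )
    listed = files_to_remove(dist_info)
    print(refused, listed)
```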

Also, the PEP includes some scattered info on when the hash of a file can be left out (for .pyc files and RECORD itself), but doesn’t clearly specify when this can/should be the case. It also doesn’t mention that the file size can be left empty, but from the examples it looks like it should be left empty whenever the hash is.
What are the rules here?

The PEP doesn’t read like a spec. Would it be a good idea to distill the actual spec out of the document, clarify it and put it under ?


Probably, yes.

We’ve been moving things to the specifications page whenever we “touch” that area, like writing a PEP or something else.

I don’t think there’s any reason we can’t move things proactively (although idk if we want a PEP saying “hey, this moved” - that’s something for @pf_moore to rule on). :stuck_out_tongue:

As per this section of the PyPA specs page,

The preferred approach to handling corrections and clarifications for all recent interoperability specifications is to designate in the PEP that the actively maintained version of the specification is hosted in the PyPA Specifications section of the user guide

PEP 566 is an example of formally moving the “master location” for the spec to

I think that we probably should do something similar for the wheel spec - write a revised version of the spec that clarifies the formal specification of a wheel file, and support it with a PEP stating that the wheel specification is now maintained at such-and-such a location on, and future changes to the spec would need to be handled as a PEP proposing updates to that document. If nothing else, having a PEP cycle that moves the spec over will also give people a chance to debate any potentially controversial "interpretations" [1].

Note: I definitely don’t think we should allow changes to a spec as fundamental as the wheel one without those changes going through a PEP, so we’d need to be careful to make it clear that any future PR changing the spec must be backed by a PEP. That’s covered in the process, but I just want to make the point explicitly.

[1] For example, mandating a particular encoding for the RECORD file…


I’m +1 on no RECORD on the filesystem for RPM. That is not the same as the RECORD in a wheel, and it comes from an older PEP.

You could probably always omit the hashes on disk as well. Haven’t seen a tool that checks.

My proposed update to “Recording installed distributions”, moving from PEP 376 to a specification under PyPA, is now proposed as PEP 627 and a PR for

See the proposed spec rendered by GitHub. PEP 627 has rationales of changes.

I might have gone too far with some changes when “distilling” a spec out of PEP 376; I’ll be happy to limit the scope of the changes if something is controversial.


The newest pip writes the REQUESTED file (congrats!), so I’ve added REQUESTED back to the proposal in peps#1549. (Until the PEPs page is updated, see the GitHub render.)

What do you think?

I don’t think it’s necessary. REQUESTED is already an optional file right now, and any installer can already choose to not write it. pip does not always write it either (whether it writes this file has implications, but those only matter when INSTALLER is pip).

I don’t understand the purpose of the file, then. If some tools write it but some don’t, it’s impossible to know when a project can be automatically cleaned up.
But, this should be discussed separately; I call it out of scope of PEP 627?

Note that PEP 627 now preserves the status quo from PEP 376, and lists the issue in its deferred ideas section.

Installers should only clean up a distribution installed by themselves, so pip’s clean-up logic [1] associated with the REQUESTED information only matters if the distribution also writes INSTALLER and sets it to the same installer. Since non-pip installers presumably wouldn’t write the INSTALLER value as pip, whether the installed distribution contains REQUESTED does not matter to pip.

[1] Which pip does not actually implement right now; REQUESTED is only the first step toward this feature.

Where is this discussed/specified?

Hmm, I thought PEP 376 says that (it does not); I guess I was over-interpreting. pip does indeed only clean up packages installed by pip though, so I should have said this instead:

pip only cleans up a distribution installed by pip, so its clean-up logic associated with the REQUESTED information only matters if the distribution also writes INSTALLER and sets it to pip.
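The rule in that statement can be sketched as a small predicate. This is purely illustrative: as noted earlier in the thread, pip does not implement this clean-up, and `may_auto_remove` and `make_dist_info` are hypothetical names. The only semantics assumed are that INSTALLER holds the installing tool's name and that the presence of REQUESTED marks a project the user asked for directly.

```python
import tempfile
from pathlib import Path


def may_auto_remove(dist_info: Path, tool: str = "pip") -> bool:
    """Hypothetical policy: only auto-remove unrequested projects this tool installed."""
    installer = dist_info / "INSTALLER"
    if not installer.is_file():
        return False
    if installer.read_text(encoding="utf-8").strip() != tool:
        # Installed by something else, so REQUESTED is irrelevant to this tool.
        return False
    # The presence of REQUESTED marks a project the user asked for directly.
    return not (dist_info / "REQUESTED").exists()


def make_dist_info(root: Path, name: str, installer: str, requested: bool) -> Path:
    """Create a bare .dist-info directory for the demo below."""
    dist_info = root / f"{name}-1.0.dist-info"
    dist_info.mkdir()
    (dist_info / "INSTALLER").write_text(f"{installer}\n", encoding="utf-8")
    if requested:
        (dist_info / "REQUESTED").touch()
    return dist_info


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    results = {
        "pip dependency": may_auto_remove(make_dist_info(root, "dep", "pip", False)),
        "pip requested": may_auto_remove(make_dist_info(root, "app", "pip", True)),
        "rpm dependency": may_auto_remove(make_dist_info(root, "sys", "rpm", False)),
    }
    print(results)
```

Only the first case comes out as removable; for the RPM-installed project the absence of REQUESTED tells this tool nothing.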

Trying to give some kind of semantics to INSTALLER has mostly happened on various GitHub issues and at the in-person packaging summits rather than being part of the original PEP 376.

Trawling around a bit brought me back to Playing nice with external package managers, which reminded me that there are good reasons pip isn’t entirely strict when it comes to respecting INSTALLER. Anyway, I think we can declare resolving that mess out of scope for Petr’s PEP, and just go for the simple change of making RECORD optional (which I suspect will mostly solve the problem the MANAGED-BY idea was aimed at solving anyway - with RECORD missing, no Python level utility is going to be able to uninstall the project)

Edit to make my view on the proposal more explicit: big +1 from me. It makes life easier for system package managers, and provides a clear and obvious “this is managed externally, so leave it alone” signal to Python level tooling.

Alright. As I said, I don’t want to open the can of worms around REQUESTED with this PEP, so it’s back to status quo from PEP 376. If there is more discussion and PEPs around this, I’d love to take part (though I won’t have much to say, except to ensure things work well with system package managers – and PEP 627 allows those to say “don’t touch this” by leaving out the RECORD).
I would like to explicitly point out one related thing in PEP 627 which might be controversial: the INSTALLER file explicitly has no semantics for tools. If the goal is interoperability, it does not make sense to default to not touching packages installed by other tools. (Though I think it’s perfectly OK if pip does look at it now, before the ecosystem catches up and REQUESTED files are installed by default.)

Now, how do I move PEP 627 forward?

@pf_moore, as you’re the standing PEP delegate: can you take a look? What would you like to see discussed/announced?
PEP 1 says it must be posted to python-dev, but I assume that’s a formality for PyPA PEPs and so I’d like to do that after PyPA discussion, which, AFAIK, happens here on Discourse.

If you’re happy it’s ready for pronouncement, I’ll review the PEP and the discussion, and come back with a decision (or send it back for further discussion if I feel that’s needed). If you feel it still needs discussion, let me know and I’ll wait. I won’t have time to look at this before the weekend anyway.

As a core developer, you don’t need a sponsor for the PEP, and python-dev doesn’t need to be involved - this is the relevant comment in PEP 1:

With the approval of the Steering Council, PEP review and resolution may also occur on a list other than python-dev (for example, distutils-sig for packaging related PEPs that don’t immediately affect the standard library). In these cases, the “Discussions-To” heading in the PEP will identify the appropriate alternative list where discussion, review and pronouncement on the PEP will occur.

Thank you!
I’m happy with the PEP and I believe I’ve addressed all the discussion points. So, please pronounce when you get to it!

I don’t think it’s controversial, but are there semantics that you think would help? What if the file contained an uninstall shell command so that other tools could at least remove (and perhaps replace) installations? (With the lack of a command implying “cannot remove”)

Perhaps that would enable too much fighting between tools if (e.g.) pip and conda both insist on uninstalling each other’s numpy whenever you touch them…

There’s actually a significant number of Conda packages out there right now packaged improperly to have their INSTALLER files say pip. I wouldn’t worry about this too much since very few people bother to complain when pip “incorrectly” uninstalls those Conda-installed packages.


I’ve responded/re-asked in a new thread: Which tools can uninstall an installed project, and how?

I’m sorry - I dropped the ball on this (mostly my own fault, but I’ll also blame Discourse for having no way to mark messages as unread or for later review :slightly_frowning_face:)

I’ll get to this tonight. Please ping me if I forget…