PEP 627: Updating PEP 376; making RECORD optional in installed .dist-info

(You can set a bookmark with a reminder.)

OK. My notes on the PEP and on the proposed wording of the packaging spec.

  1. You say you don't want to get sucked into REQUESTED, and I agree. But the current wording feels like it implies more than you want to. I'd suggest adding a note at the top of the section on the REQUESTED file saying something like "Installers are not required to maintain REQUESTED, and so consumers must not assume that the lack of a REQUESTED file means anything in practice." (Or you can take the same view as PEP 376, that tools will maintain this data - in which case you need to amend the comment that "Almost all information is optional" to reflect that REQUESTED data is not optional).
  2. I don't see much discussion of the removal of the requirement for hashes/sizes to be present. I get your argument that no tools currently use that information, but once we remove the requirement for it, it's going to be more or less impossible to re-introduce it. Which means future tools won't be able to say things like "refusing to uninstall FOO as local changes have been made". Is there any concrete reason for relaxing this requirement? (I don't buy "simplifies the spec" and I don't see the issue with expecting shebang rewriters to update the hash - RECORD explicitly doesn't need a hash to make that process straightforward).
  3. I'd like to see an explicit statement that tools which rely on Python's package database¹ MUST refuse to uninstall projects that have no RECORD. See point (5) below, as well. (Tools like system package managers that rely on other data can do what they want - there's no particular need to make this point, though).
  4. If INSTALLER is intended for use in messages, the content should be usable in that context - so I'd suggest that the spec should say something like "the file must contain a single line containing the name of the installer - for example 'pip' or 'conda' or 'Mega-Corp Super Installer'".
  5. Question - if an installer omits RECORD, should it be required to write INSTALLER, so the user at least knows who to blame for the package that "normal" tools can't uninstall? (That can be a future revision, I'm not going to block acceptance on it, but I think it's a reasonable requirement).
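
To make points (3) and (5) concrete, here is a minimal sketch of how an uninstaller might behave under that suggestion. It is only an illustration: the names `files_to_remove` and `UninstallRefused` are hypothetical, and it assumes the tool has already located the project's `.dist-info` directory.

```python
import csv
from pathlib import Path


class UninstallRefused(Exception):
    """Hypothetical error raised when a project cannot be safely uninstalled."""


def files_to_remove(dist_info: Path) -> list:
    """Return the paths listed in RECORD, or refuse if RECORD is missing."""
    record = dist_info / "RECORD"
    if not record.is_file():
        # Without RECORD we do not know which files belong to the project.
        # Use INSTALLER (if present) to tell the user which tool to turn to.
        installer = "an unknown tool"
        installer_file = dist_info / "INSTALLER"
        if installer_file.is_file():
            lines = installer_file.read_text(encoding="utf-8").splitlines()
            if lines and lines[0].strip():
                installer = lines[0].strip()
        raise UninstallRefused(
            f"{dist_info.name} has no RECORD (installed by {installer}); "
            "refusing to uninstall"
        )
    # RECORD is a CSV file; the first column of each row is the file path.
    with record.open(newline="", encoding="utf-8") as f:
        return [row[0] for row in csv.reader(f) if row]
```

With this shape, the error message for a RECORD-less project names the installer whenever INSTALLER was written, which is the "who to blame" case from point (5).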

Overall, though, this looks pretty good. No-one has raised any fundamental issues, so it looks like we have a reasonable consensus. If you can address the above points, I'm OK with accepting the PEP.

¹ A minor cosmetic problem with this rewrite is there's no longer a good noun phrase for "the database of installed packages" :man_shrugging:

Yes, but then I have to choose when to be reminded. What I want is for the message to remain in my list of unread (and therefore "for my attention") messages until I deal with it.

Another thing that sucks is that you can only have one reply "on the go" at any one time :slightly_frowning_face:

Very much no. That would require installers to be able to write future-proof uninstall commands that work with all operating systems they support. I'm pretty sure we'd end up with all sorts of broken corner cases if we tried to do this.

(I think I somehow managed to configure my account so that all this is feasible from my email client. I wanted to answer by private message, so as not to pollute this topic, but it won't let me. I will stop here.)

Are you sure about this? This definitely didn't use to be the case, and from some quick searching of the code I don't see any checks done by pip. Conceptually speaking, pip should not be restricted to uninstalling only things installed by pip; it should be barred from uninstalling things whose source of truth is not the Python metadata.

I'll update.
I'd like to take the same view as PEP 376 (even though I don't agree with it). That means the REQUESTED file is optional, but handling it is mandatory: leaving it out must be a conscious choice.

Here I'm actually trying to keep the status quo. PEP 376 says "The hash is either the empty string or [the data]". It also gives some examples and notes about when the hash and file size are left out; specifically, .pyc, .pyo and RECORD are mentioned.
For the file size, it just says "the file's size in bytes", but later examples/notes contradict that by leaving it out, so I read that as shorthand for "either the empty string, or the file's size in bytes, as a base 10 integer."

I believe that's a valid interpretation, given how non-rigorous PEP 376 is generally. But I worded my PEP so that it works even for those who read PEP 376 as "file size is mandatory".

I don't want to get bogged down in specifying which files can have the details left out.
I believe individual tools can make good choices, and it's in their interest, in serving their users, to allow messages like "refusing to uninstall FOO as local changes have been made" where possible.
But anyway, the proposed spec does say:

For other files, leaving the information out is not recommended, as it prevents verifying the integrity of the installed project.
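
As a rough illustration of what "verifying the integrity" could look like for a consumer, here is a sketch that checks RECORD rows against the files on disk. It assumes the usual `algorithm=urlsafe-base64-nopad-digest` hash format and treats an empty hash and size as "cannot verify"; the function name `check_record` and the status strings are made up for the example.

```python
import base64
import csv
import hashlib
from pathlib import Path


def check_record(record_text: str, root: Path) -> dict:
    """Classify each RECORD row as 'ok', 'modified', 'missing' or 'unverifiable'."""
    results = {}
    for row in csv.reader(record_text.splitlines()):
        if not row:
            continue
        # Tolerate rows that leave out the hash and size columns entirely.
        path, hash_field, size_field = (row + ["", ""])[:3]
        target = root / path
        if not target.is_file():
            results[path] = "missing"
            continue
        if not hash_field and not size_field:
            # Nothing recorded (e.g. .pyc files or RECORD itself).
            results[path] = "unverifiable"
            continue
        data = target.read_bytes()
        ok = True
        if size_field and int(size_field) != len(data):
            ok = False
        if hash_field:
            algorithm, _, expected = hash_field.partition("=")
            digest = hashlib.new(algorithm, data).digest()
            actual = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
            ok = ok and actual == expected
        results[path] = "ok" if ok else "modified"
    return results
```

A tool could then refuse to uninstall when any row comes back as "modified", and simply skip the check for "unverifiable" rows.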

OK, I'll add that.

The proposed spec already says: "If present, INSTALLER is a single-line text file naming the tool used to install the project."

I didn't say it this way in the PEP, since the "single-line" and "contains name of installer" requirements aren't changed from PEP 376. Do you think it's necessary?

Uninstallers will need to handle missing INSTALLER anyway, so it's better if the spec acknowledges that the file can be missing.
For installers, again I think it's fine to rely on the tool to do what's best for the users. Who knows what reasons to leave INSTALLER out might come up. IMO a spec should ensure interoperability, not good tools :-)

Drat, I looked at this message and now I can't "mark it as unread". I've bookmarked it for tomorrow, but I've no idea if alerts go anywhere I'll notice. If I haven't responded to this by Monday, ping me as it means I've forgotten.

(This is one place Discourse really sucks IMO).

OK!
I took the opportunity to sneak some edits into my earlier post. I also updated the PEP and spec proposal.

I found the comment "If the installer is executable from the command line, INSTALLER should contain the command name" unclear, as I originally thought it allowed for tools to insert a full command line in there. But on reflection, that's not true, and I think this is fine.

I think the rewording is worse, as it's no longer backward compatible. As worded, installers must write a REQUESTED file if there's any chance that the user requested the install. But that would imply that consumers are allowed to assume that the lack of a REQUESTED file guarantees that the install wasn't user-requested - which isn't true, because existing installers don't reliably add REQUESTED.

I think that, like it or not, we need to acknowledge existing reality, which means that the presence of REQUESTED has to mean "the installer is sure that the user requested this install", and consumers can rely on that, leaving "no REQUESTED file" as meaning "not user requested, or we're not sure, or the installer didn't record that information". Yes, that makes the data mostly useless, but we don't have any good use cases for needing it anyway, so I think that's acceptable.
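
In code terms, the semantics described here boil down to a check that can answer "yes" or "don't know", but never "no". A minimal sketch, with a hypothetical helper name and assuming the caller already has the `.dist-info` path:

```python
from pathlib import Path
from typing import Optional


def user_requested(dist_info: Path) -> Optional[bool]:
    """Return True if REQUESTED exists, otherwise None (unknown).

    Under this reading the function never returns False: a missing file may
    mean "installed as a dependency", "not sure", or "the installer does not
    write REQUESTED at all".
    """
    if (dist_info / "REQUESTED").is_file():
        return True
    return None
```

Any consumer that wants to auto-remove packages would have to treat `None` as "do not touch without asking", which is why the data ends up being of limited use.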

Apart from the one issue around REQUESTED, and a typo that I noted against the PEP commit, I think this is OK now.

The intention was to give REQUESTED some useful semantics, with the caveat that tools don't follow them, so the data is unreliable until the ecosystem catches up.

But it's becoming clear that I can't really salvage REQUESTED. I too can't see any use case where it can work reliably today. So, I'd rather remove it from the spec altogether, which would mean REQUESTED becomes a tool-specific file that pip adds for its own use.
A different solution, which could encode all three values of yes/no/don't know, can be standardized later.

Would that work?

I think it's clear at this point that there's a question over REQUESTED. Let's get some feedback from others. Once we have some sort of consensus, you can update the PEP.

See https://github.com/pypa/pip/issues/7811 for the issue where adding REQUESTED to pip was discussed. That issue also notes that flit implements REQUESTED as well. But I don't know of any code that uses REQUESTED, just tools that write it… I think the only reason it got added to pip was because it was in the standard… Maybe @sbidoul has a view here?

On a personal note, I've never seen an implementation of the sort of semantics that REQUESTED is trying to capture that worked. So frankly, I'd rather see it removed altogether. It's one of those things where, if it's in the standards, tools must implement it (or risk devaluing the standard, as happened with PEP 376). But if there are no consumers of the data, implementing it is pointless.

I'm definitely -1 on having it remain in pip (and flit) if it's removed from the specification.

My vision is to have REQUESTED behave similarly to what apt-mark auto/manual does in Debian. In my mind this is a prerequisite for automatic removal scenarios.

Mid term, I see REQUESTED evolving to hold a version specifier, so if the user installed "foo<3.0", the pip resolver could use that information to keep the dependencies in the requested state when installing additional packages. (I think that's how I responded to the upgrade strategies survey)

When working with disposable virtualenvs, or with virtualenvs managed via a lockfile/requirements.txt, this is not very useful, I agree. When managing the --user environment, however, having a feature similar to what Linux package managers have sounds useful to me.

I'm not aware of any tool that makes use of it today, but the REQUESTED spec makes sense to me and I figured implementing it in pip was a useful first enabling step.

I understand the discussion here mostly revolves around a transition period in which some tools implement it and others do not? In that respect I would be strict in the spec: "installers MUST create REQUESTED for packages that they know are user supplied, as opposed to dependencies". The transition period could be managed by asking the user for confirmation when auto-removing packages.

That means neither PEP 376 nor my current proposal (permalink to current state) is adequate for the mid term. So this will need an update to the spec, and we lose nothing by leaving REQUESTED out of the spec right now.

Unfortunately, there is no distinction between "the file was installed automatically" and "the file was installed with an older version of pip". So, the information is pretty useless to consumers that would like to uninstall unneeded projects.

If you can design a spec that can express all three values, yes/no/don't know, the transition will be much easier to manage.

So there are two possibilities?

  1. remove REQUESTED from the PEP and specification and develop a new standard
  2. keep REQUESTED with MUST language for installers, and mention that tools must ask for user confirmation before taking destructive action based on the absence of REQUESTED, since its absence may mean that installers do not implement it correctly

It may be better to leave it out of the spec until fully implemented. Perhaps such specs could have a short section pointing to elements of the PEPs that have not been translated to specs for lack of sufficient implementation feedback (as opposed to PEP elements that have been abandoned)?

Short-term, any tool in its right mind will want to be compatible with pip, which now writes REQUESTED. And the PEP explicitly allows tool-specific files. So, strictly speaking, I don't think we need any extra text in the spec. But I did add a note.

I've also updated the PEP so we have concrete text to discuss.

Pip is now free to experiment. Hopefully we get an update that standardizes an existing, proven implementation. Which answers:

Flit, sure. But I think pip should be free to drive this feature and then standardize it (probably as an optional extension which would take care of the backcompat troubles).

:man_shrugging: I'm speaking personally, not as PEP delegate here, but I'm not particularly interested in pip being the innovator over this. I'd personally argue that we remove it from pip until someone comes up with a usable spec for a replacement feature that can be standardised and addresses backward compatibility issues. An obvious approach would be to have a mandatory USER_REQUESTED file (we can't use REQUESTED, as that's been taken by PEP 376 :slightly_frowning_face:) that contained the values "yes" or "no", with a missing file meaning "installed by a tool that doesn't respect this standard".
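
For anyone wondering how a consumer would read such a file, here is a sketch of the three-valued lookup. To be clear, USER_REQUESTED is only the hypothetical file floated above, not part of any spec, and both the function name and the decision to reject unexpected content are assumptions made for the example.

```python
from pathlib import Path
from typing import Optional


def read_user_requested(dist_info: Path) -> Optional[bool]:
    """Read the hypothetical USER_REQUESTED file.

    Returns True for "yes", False for "no", and None when the file is
    missing (installed by a tool that does not implement the idea).
    """
    try:
        text = (dist_info / "USER_REQUESTED").read_text(encoding="utf-8")
    except FileNotFoundError:
        return None
    value = text.strip().lower()
    if value == "yes":
        return True
    if value == "no":
        return False
    # Assumption: anything else is treated as an error rather than guessed at.
    raise ValueError(f"unexpected USER_REQUESTED content: {value!r}")
```

Unlike the presence-only REQUESTED check, this gives consumers a reliable "no", so auto-removal only needs to ask the user when the answer is `None`.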

I would expect @sbidoul to argue with me, though, and honestly I don't care enough to block him working on this.

Nope, I'm not gonna argue :slight_smile: If people think the backward compatibility issues are going to be a show stopper for REQUESTED, feel free to dump it.

Is there anything more the PEP needs?

If you feel the REQUESTED issue has been addressed, let me know and I'll take another look.