I’ve just submitted PEP 592, which will implement the ability to mark a file as “yanked” in the simple repository API. Because this adds a new attribute, it will not affect how any current version of pip interprets the simple repository API. It will, however, let us solve the problem of marking files as “don’t actually use this” going forward.
I’ve included the PEP body below, and it can be viewed online once the PEP pages sync the latest version.
I’ve also had this PEP move the canonical location of the simple repository API specification to the packaging guide.
Abstract
This PEP proposes adding the ability to mark a particular file download on a simple repository as “yanked”. Yanking a file allows authors to effectively delete a file, without breaking things for people who have pinned to exactly a specific version.
It also changes the canonical source for the simple repository API to the Simple Repository API reference document.
Motivation
Whenever a project detects that a particular release on PyPI might be broken, they will often want to prevent further users from inadvertently using that version. However, the obvious solution of deleting the existing file from a repository will break users who have followed best practices and pinned to a specific version of the project.
This leaves projects in a catch-22 situation where new projects may be pulling down this known broken version, but if they do anything to prevent that they’ll break projects that are already using it.
Allowing a file to be “yanked”, while still making it available to users who explicitly ask for it, lets projects mitigate the worst of the breakage while still keeping things working for projects that have otherwise worked around or didn’t hit the underlying issues.
One of the main scenarios where this may happen is when dropping support for a particular version of Python. The python-requires metadata allows for dropping support for a version of Python in a way that is not disruptive to users who are still using that Python. However, a common mistake is to either omit or forget to update that bit of metadata. When that mistake has been made, a project really only has three options:
- Prevent that version from being installed through some mechanism (currently, the only mechanism is by deleting the release entirely).
- Re-release the version that worked as a higher version number, and then re-release the version that dropped support as an even higher version number with the correct metadata.
- Do nothing, and document that people using that older Python have to manually exclude that release.
With this PEP, projects can choose the first option, but with a mechanism that is less likely to break the world for people who are currently successfully using said project.
Specification
Links in the simple repository MAY have a data-yanked attribute which may have no value, or may have an arbitrary string as a value. The presence of a data-yanked attribute SHOULD be interpreted as indicating that the file pointed to by this particular link has been “Yanked”, and should not generally be selected by an installer, except under specific scenarios.
The value of the data-yanked attribute, if present, is an arbitrary string that represents the reason for why the file has been yanked. Tools that process the simple repository API MAY surface this string to end users.
The yanked attribute is not immutable once set, and may be rescinded in the future (and once rescinded, may be reset as well). Thus API users MUST be able to cope with a yanked file being “unyanked” (and even yanked again).
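As a rough illustration of the attribute described above, the following sketch reads a project page with Python’s standard html.parser module and records whether each link is yanked and why. The class name SimpleLinkParser, the sample page, and the demo file names are made up for this example; this is not how pip or any other tool actually implements it.

```python
from html.parser import HTMLParser


class SimpleLinkParser(HTMLParser):
    """Collect (href, yanked, reason) for each anchor on a project page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # The presence of the attribute is what marks the file as yanked.
        # HTMLParser reports a valueless attribute with a value of None.
        yanked = "data-yanked" in attr_map
        # An empty or missing value means no reason was given.
        reason = attr_map.get("data-yanked") or None
        self.links.append((attr_map.get("href"), yanked, reason))


# Hypothetical project page with an active, a yanked, and a
# yanked-with-reason file.
page = """
<a href="demo-1.0.tar.gz">demo-1.0.tar.gz</a>
<a href="demo-1.1.tar.gz" data-yanked>demo-1.1.tar.gz</a>
<a href="demo-1.2.tar.gz" data-yanked="Broken on Python 2.7">demo-1.2.tar.gz</a>
"""

parser = SimpleLinkParser()
parser.feed(page)
for href, yanked, reason in parser.links:
    print(href, yanked, reason)
```

Because the attribute can be removed and added again over time, a tool built along these lines would presumably re-read the page on each run rather than remember a file’s yanked state.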
Installers
The desirable experience for users is that once a file is yanked, when a human being is currently trying to directly install a yanked file, it fails as if that file had been deleted. However, when a human did that a while ago, and now a computer is just continuing to mechanically follow the original order to install the now yanked file, it acts as if it had not been yanked.
An installer MUST ignore yanked releases if the selection constraints can be satisfied with a non-yanked version, and MAY refuse to use a yanked release even if it means that the request cannot be satisfied at all. An implementation SHOULD choose a policy that follows the spirit of the intention above, and that prevents “new” dependencies on yanked releases/files.
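A minimal sketch of that rule, assuming the candidates passed in have already been filtered down to those that satisfy the user’s constraints. The Candidate type, the select function, and the allow_yanked_fallback flag are hypothetical names for illustration only, and versions are simplified to plain tuples rather than full PEP 440 versions.

```python
import warnings
from typing import NamedTuple, Optional


class Candidate(NamedTuple):
    version: tuple                 # simplified version, e.g. (1, 2, 0)
    yanked: bool
    reason: Optional[str] = None   # value of data-yanked, if any


def select(candidates, allow_yanked_fallback=False):
    """Return the newest acceptable candidate, or None if there is none."""
    active = [c for c in candidates if not c.yanked]
    if active:
        # A non-yanked match exists, so yanked files are ignored entirely.
        return max(active, key=lambda c: c.version)
    if allow_yanked_fallback and candidates:
        # Only yanked files match; use one, but tell the user.
        chosen = max(candidates, key=lambda c: c.version)
        warnings.warn(
            f"using yanked version {chosen.version}: "
            f"{chosen.reason or 'no reason given'}"
        )
        return chosen
    # The installer may also refuse yanked files outright.
    return None
```

Here select([Candidate((1, 0), False), Candidate((1, 1), True, "broken")]) picks (1, 0), while a list containing only the yanked (1, 1) yields None unless allow_yanked_fallback is set.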
What this means is left up to the specific installer, to decide how to best fit into the overall usage of their installer. However, there are two suggested approaches to take:
- Yanked files are always ignored, unless they are the only file that matches a version specifier that “pins” to an exact version using either == (without any modifiers that make it a range, such as .*) or ===. Matching this version specifier should otherwise be done as per PEP 440 for things like local versions, zero padding, etc.
- Yanked files are always ignored, unless they are the only file that matches what a lock file (such as Pipfile.lock or poetry.lock) specifies to be installed. In this case, a yanked file SHOULD NOT be used when creating or updating a lock file from some input file or command.
Regardless of the specific strategy that an installer chooses for deciding when to install yanked files, an installer SHOULD emit a warning when it does decide to install a yanked file. That warning MAY utilize the value of the data-yanked attribute (if it has a value) to provide more specific feedback to the user about why that file had been yanked.
Mirrors
Mirrors can generally treat yanked files one of two ways:
- They may choose to omit them from their simple repository API completely, providing a view over the repository that shows only “active”, unyanked files.
- They may choose to include yanked files, and additionally mirror the data-yanked attribute as well.
Mirrors MUST NOT mirror a yanked file without also mirroring the data-yanked attribute for it.
Rejected Ideas
A previous, undocumented, version of the simple repository API had version specific pages, like /simple/<project>/<version>/. If we were to add those back, the yanked files could only appear on those pages and not on the version-less page at all. However this would drastically reduce the cache-ability of the simple API and would directly impact our ability to scale it out to handle all of the incoming traffic.
A previous iteration of this PEP had the data-yanked attribute act as a boolean value. However it was decided that allowing a string both simplified the implementation, and provided additional generalized functionality to allow projects to provide a mechanism to indicate why they were yanking a release.
Another suggestion was to reserve some syntax in the arbitrary string to allow us to evolve the standard in the future if we ever need to. However, given we can add additional attributes in the future, this idea has been rejected, favoring instead to use additional attributes if the need ever arose.
Warehouse/PyPI Implementation Notes
While this PEP implements yanking at the file level, that is largely due to the shape the simple repository API takes, not a specific decision made by this PEP.
In Warehouse, the user experience will be implemented in terms of yanking or unyanking an entire release, rather than as an operation on individual files, which will then be exposed via the API as individual files being yanked.
Other repository implementations may choose to expose this capability in a different way, or not expose it at all.
Journal Handling
Whenever a release has been yanked, an entry will be recorded in the journal using one of the following string patterns:
yank release
unyank release
In both cases, the standard journal structure will indicate which release of which project has been yanked or unyanked.
Copyright
This document has been placed in the public domain.