Feature Proposal for PyPI: Draft Releases

What follows is a proposal to implement “Draft Releases” in PyPI.

Context

Since at least 2015, there has been a need for “Unpublished Releases”. The following Issue has a history of comments along with a link to the original discussion:


After a reasonable discussion with @dustin, we came up with a viable implementation.
This proposal has been posted here per the recommendation of @sumanah, and any suggestions regarding forum etiquette can be directed at them.

Summary of the Feature Request

Maintainers of packages would like to be able to separate the Upload and the Publish stages of their releases, in order to double-check and test their package as if it were already released, before publishing the release for all others to use.
Since users want to be able to fix last-minute details in either the package or its metadata, these draft releases need to be mutable, contrary to normal releases on PyPI and Test PyPI.
Because these draft releases are only intended for maintainers and their workflows, they should not be accidentally installed by the average user. This means they will be neither visible nor discoverable on PyPI or through any of its clients.
All other aspects of the release should stay the same, in order to provide the most accurate representation of how the published release will look and behave.

Solution

In order to fulfill the maintainer’s needs, while leveraging our existing architecture and making as few changes as possible, the following solution is proposed:

  • Add one field to the Release model:
    • published - a timestamp, which will be taken from the created value for all pre-existing releases. This will tell apart draft releases from published releases.
  • Add one field to the Project model:
    • auto_publish_releases - boolean, default to true
    • UI for setting this value in the project management UI
  • Filter on whether published is set for a release in public UIs, likewise for the simple API.
  • In management UI, separate the draft release from published releases at the top
    • with a button to publish the release
  • URL routes for testing and viewing draft releases in the form of something like:
    https://pypi.org/project/pip/20.0.2-draft-a1b2c3/
    • Where a1b2c3 is a hash of the release id
    • Installation command example in release description is updated to use this URL. e.g.
      pip install pip==20.0.2-draft-a1b2c3
    • Update the traversal logic here to consider these draft versions
    • Special banner / indicator / badge in release description that this release is a draft
    • Draft URL can redirect to actual URL after publication
  • Only one draft release per project
  • On publishing, the published timestamp is set
  • Unpublished releases do not show up in XML-RPC for mirrors until they are published.
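
The data-model part of the list above could be sketched roughly as follows. This is only an illustration using plain dataclasses, not Warehouse's actual models: the field names published and auto_publish_releases come from the proposal, while the class layout and the upload, publish and public_releases helpers are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Release:
    version: str
    created: datetime
    # None means "draft"; a timestamp means "published".
    published: Optional[datetime] = None

    @property
    def is_draft(self) -> bool:
        return self.published is None


@dataclass
class Project:
    name: str
    # Proposed project setting; True keeps the current behavior.
    auto_publish_releases: bool = True
    releases: List[Release] = field(default_factory=list)

    def upload(self, version: str) -> Release:
        """Hypothetical upload hook: publish at once or keep as a draft."""
        now = datetime.now(timezone.utc)
        if self.auto_publish_releases:
            release = Release(version, created=now, published=now)
        else:
            # The proposal allows only one draft release per project.
            if any(r.is_draft for r in self.releases):
                raise ValueError("only one draft release per project")
            release = Release(version, created=now)
        self.releases.append(release)
        return release

    def publish(self, release: Release) -> None:
        """On publishing, the published timestamp is set."""
        if not release.is_draft:
            raise ValueError("release is already published")
        release.published = datetime.now(timezone.utc)

    def public_releases(self) -> List[Release]:
        """What public UIs and the simple API would be allowed to show."""
        return [r for r in self.releases if not r.is_draft]
```

With auto_publish_releases set to False, an uploaded release stays out of public_releases() until publish() is called on it.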

Long term, we’ll also want the following:

  • An API to indicate whether a project/release should be auto-published from the upload client
  • An API to publish a draft release from an upload client

It’s also important to note that this change would not require any work on the upload/installer clients, as it would be managed entirely by Warehouse, the PyPI backend.

Any feedback on this is welcome, especially thoughts about:

  • how to generate the hashed draft link
  • how installing a draft release should work with pip (maybe using an obfuscated simple index and --index-url ?)
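
As a starting point for the first question, one possible way to derive the hashed part of the draft link is a keyed hash of the release id. This is a sketch under assumptions, not a decided design: the function names, the use of HMAC-SHA256, the server-side secret and the 6-character length (chosen only to match the a1b2c3 example) are all hypothetical.

```python
import hashlib
import hmac


def draft_token(release_id: str, secret: bytes, length: int = 6) -> str:
    """Derive a short, stable token from a release id.

    A keyed hash means the token cannot be recomputed from the release id
    alone. Six hex characters are short enough to be guessable by brute
    force, so a real deployment would probably want a longer token.
    """
    digest = hmac.new(secret, release_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def draft_url(project: str, version: str, release_id: str, secret: bytes) -> str:
    """Build a draft URL in the shape used in the proposal."""
    token = draft_token(release_id, secret)
    return f"https://pypi.org/project/{project}/{version}-draft-{token}/"
```

The same release id and secret always give the same token, so the link stays stable for as long as the draft exists.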

I’d like to hear your opinions on this, especially if you’re a package maintainer.

I’ll need your feedback by April 30th 2020, at which point I’ll proceed on the basis of what I know.

Can you expand on how this would differ from the alpha/beta releases we already support (usable via pip install --pre) and why those do not suffice?

After you test your alpha/beta, then you have to make a whole new package, and maybe you’ll mess things up at that step. The idea here is that you can test the exact thing you’re going to ship.

Another use case for projects that rely on volunteers to produce wheels: you can upload a draft sdist, then all your volunteers can download it, build the wheels, and upload them, and then finally you publish the sdist and wheels all together as a single atomic operation. This avoids awkward problems that happen now when an sdist is uploaded before the wheels, and there’s a window where users running pip install suddenly start trying to build from source.

Does a Release here refer to a single file, or to the collection of all files with the same (project, version) tuple?

Does pip actually tell pypi the version requirement that it’s trying to satisfy when it fetches the list of available versions?

Alternative suggestion for the UI for installing these: pip install --extra-index-url https://pypi.org/draft/pip pip

The nice thing about treating these as just another index is that you can automatically use this in requirements files, other PyPI API consumers automatically have a way to see them, etc.

I’m pretty sure this refers to warehouse’s database model.

Just to add on to the benefits Nathaniel mentioned, I think the main difference between these draft releases and the pre-releases we already support is that the former is a tool for maintainers who want to test their libraries thoroughly before publishing, while the latter is a tool for downstream developers who might want to test their libraries against future versions (like testing against Python release candidates).

The latter.
For example, pip is currently at version 20.0.2, and you might want to create a draft release for version 20.1.0. After you’re sure everything works fine, that draft then becomes version 20.1.0 by simply setting its published attribute to the current timestamp.

Correct!

Yes, I’m also in favor of using --index-url or --extra-index-url. That pip command was just to deliver the point that the UI will need to change to tell how to install the draft release, much like when looking at an old release of a library right now:

I’m pretty sure we’d want version numbers to not change in the draft -> published transition, since that information is sourced from the files themselves and we’d likely not start modifying these files on PyPI. Is there an error in my understanding, or in this example?

(on mobile, sorry for typos/weird phrasing)

I’d like this to be a separate index URL, possibly a per-project-unique index URL, where the only difference from the “normal” index on PyPI, is that one draft release (the project’s own draft release) is available with no additional information / indicators on the simple page.

My main concern with this is the caching of the content, since we’d want to be serving the exact contents of the rest of the index and not hitting the user’s/CDN’s simple index cache, which might not be ideal. I’m optimistic that we could have CDN-edge logic for automatic redirects here?

I think @ewdurbin and other PyPI admins are in a much better position than me to figure out the answers to “Is this a relevant concern? What can we do for it if it is?”.

I’m not sure I understand how projects can publish draft releases, if they can’t do it through the API, which is the only way to do uploads to PyPI.

Or is this referring to the button to convert a draft release into a regular release? Can someone please bikeshed a better verb than “publish” here, since “publish a draft release” seems very ambiguous to me - did a new draft release get uploaded or did an existing draft release get “promoted” to a normal release?

Maybe the word promoted works. :stuck_out_tongue:

What does “auto-published” mean here?

No, and we should not change this IMO.

Which means that we can finally deprecate the --prefer-binary flag we added to pip for helping folks deal with this. /cc @techalchemy since this is a better description of why the option exists than I could use-words-that-my-brain-stitched-together to explain.

Is this necessary - including “draft-<hash>”?

I’m pretty sure we can use the regular URL for the release in the UI, while clearly stating “draft release” and not presenting the pip install command to anyone except the logged in project maintainers / owners. That avoids the entire need for creating these hashes, or draft-only URLs, which feels like a good simplification to me, and reduces the number of things that change when you click publish, which is a good thing IMO.

As a consumer of Warehouse’s XML-RPC changelog, how would that be affected? If the changelog is fetched while a release is between creation and publication, would its release & file creation actions be listed in the changelog or not? If not, then once the release is published, would these actions be listed at the moment that they originally occurred (necessitating clients to go back some amount in the changelog to see if any holes were filled in) or at the moment of publication? Would there be a new “release published” action?

What are the downsides if draft releases are served from the same endpoint as the published releases, but only when a GET parameter is passed, e.g. https://pypi.org/simple/pip/?draft=1? Clients can grow a flag to add that parameter, and old client versions not supporting this flag will never see those releases, so we don’t risk showing not-yet-published artifacts.

IIUC, this would need to go through the PEP process for an interoperability standard, to document and standardize this mechanism. I don’t see that as a downside per se, and there is precedent for PyPI-related PEPs that were fairly well scoped and well thought out making it through the design process quickly (PEP 592).

Big fan of this, and I would certainly use it!

One other idea to consider might be a more generic “listed/unlisted” flag (like on nuget.org), that can be toggled at any point to change whether the version is returned in queries or shown in the UI. This would also satisfy “yanking” (deleting without breaking dependents), and you could test the package by providing the specific version rather than a range.

I haven’t heard anyone suggest that security is a concern here (actively preventing people getting the package early - most app stores treat this seriously), so I doubt there’s a need to make it too hard to discover. And there’s certainly value in being able to share the install command broadly (including with CI) before listing.

You’re absolutely right, it was a typo that I’ve now fixed. Thanks for spotting it!

That’s the idea, yes, although I’m still not sure if it should be per-project or per-project-release.

I’m also willing to bet that PyPI admins and others have more experience than me regarding how to make this work with our CDN and caching systems. I think that having these resources under a different URL scheme might help with that.

This hasn’t been explained, as it is not part of the scope of this proposal. A maintainer can toggle the auto_publish_releases value to specify whether every new release that gets uploaded should be instantly published (current behavior) or uploaded but not published, as a draft release. The upload client itself would not be aware of this functionality with this change.
That said, it would be nice if after this gets implemented, future work could make it so upload clients could specify if a release is to be uploaded as a draft release or as a published release.

Current behavior. Every time you upload a release from one of the clients, it’s published and visible to all. This “long-term goal” refers to the ability to toggle this configuration from the upload clients themselves, rather than from the project settings on pypi.org.

I’m not sure if what you’re proposing would make it possible to differentiate between published releases and draft releases? How would you specify that you want to install a draft release while using the same URL?

As much as I’d like to submit a PEP one day, I think it’s a bit out of scope for this matter, and would rather make this implementation in a way that one is not required :sweat_smile:

Since a draft release is not yet published, it should not show up in the changelog.
The actions would be listed at the moment of publication, in order to maintain our current behaviour.
In other words, mirrors shouldn’t be aware that drafts exist, as they are downstream users, and drafts are intended as a tool for maintainers.
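
To make that changelog behaviour concrete, here is a rough sketch of how actions on a draft could be held back and only appear in the mirror-facing changelog at publication time. The DraftAwareJournal class, its methods and the action strings are all hypothetical and only illustrate the idea; this is not how Warehouse's journal is implemented.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class DraftAwareJournal:
    # Entries visible to mirrors: (timestamp, version, action).
    changelog: List[Tuple[int, str, str]] = field(default_factory=list)
    # Actions on drafts, held back until publication, keyed by version.
    pending: Dict[str, List[str]] = field(default_factory=dict)

    def record(self, version: str, action: str, timestamp: int, is_draft: bool) -> None:
        if is_draft:
            # Mirrors should not learn that the draft exists.
            self.pending.setdefault(version, []).append(action)
        else:
            self.changelog.append((timestamp, version, action))

    def publish(self, version: str, timestamp: int) -> None:
        # Held-back actions are listed at the moment of publication,
        # so consumers never have to look back for "holes".
        for action in self.pending.pop(version, []):
            self.changelog.append((timestamp, version, action))
```

In this sketch no new “release published” action is needed, because the original actions simply carry the publication timestamp.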

Maybe @dustin or someone else can expand on this :sweat_smile:

I don’t think unpublishing or yanking a release is inside the scope of this proposal, since we’re trying to maintain as much of the current behaviour around releases as possible, and that functionality is not yet provided by our current implementation.
However, I do think this work opens up the discussion for such a feature in future works :slight_smile:
Again, @dustin might have something to say about this, as yanking was mentioned during our discussion.

The fact that it’s not currently implemented is a great reason to see whether it’s actually the same thing as what’s being proposed here.

If the goal of this proposal is not satisfied by a release that doesn’t appear on the (unauthenticated) UI and can only be installed by direct version reference (the definition of a yanked package, if I understand the PyPI plans for it correctly - it’s certainly the definition used elsewhere), that’s fine. But it deserves to be explicitly dismissed, rather than simply overlooked.

Well, since each project can have only one draft release at a time, I’d say the “complexity” of per-project-release might not be needed. :slight_smile:

Okay, this wasn’t immediately clear to me from the way it’s written and sounds like exactly what I was thinking we should do. Basically, the functionality for “override whatever is the default mode for publishing” has been deemed out of scope.

Okay… we need a better phrase for describing/referencing this behavior. :stuck_out_tongue:

I’m not talking about the simple index API that pip would look at (like https://pypi.org/simple/pip), but rather the UI that’s meant for humans (like https://pypi.org/project/pip/20.0.2).

My point: for the user-facing pages, we shouldn’t distinguish between draft and normal releases. We shouldn’t use URLs like https://pypi.org/project/pip/20.0.2-draft-a1b2c3/ but rather https://pypi.org/project/pip/20.0.2/. These won’t affect tools like pip, which use the simple index API. :slight_smile:

:+1:

I felt this intent is conveyed clearly by how carefully this proposal is working around the limits of the current upload API + simple index API with smartly-crafted custom PyPI-specific logic. :slight_smile:


FWIW, Discourse is telling me “this topic is clearly important to you” and suggested that I “invite others to speak”. Which is a subtle way of telling me to stop being so noisy, so I’ll go shush and let others chime in.

Implementation of yanking is underway here, and is indeed out of scope of this proposal. Yanking is specified by PEP 592 and is primarily an installer-side concern. This proposal addresses publisher-side concerns, and as proposed doesn’t require any PEP.

I couldn’t quite find the best message to reply to or quote…

As far as how the draft release is exposed for client testing, I’m +1 to suggesting the use of --extra-index-url with an obfuscated simple index that is generated per release, that would only contain the project and files for its draft release. Something like https://pypi.org/draft/{SOME_HASH}/simple, served with the appropriate Surrogate-Key header to be purged along with the project (in other words when the release is published this index disappears).

If implementing a full copy of the index with just the addition of the draft release is preferable (which I can imagine it would be for some installers that don’t support --extra-index-url), we could somehow encode the right information in the draft index URL so that our CDN configuration would know to use its current cache for all simple URLs except the project that contains the draft release.

No matter what, we should ensure that these draft indexes only live as long as the release draft and are 404 thereafter to discourage misuse.
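
A minimal sketch of what such a per-release draft index could return, assuming a plain function rather than a real Warehouse view. The function name, the "project/<name>" format of the Surrogate-Key value and the relative file links are assumptions made for illustration; the status code used once the draft is gone is left as a parameter.

```python
import html
from typing import Dict, List, Optional, Tuple


def draft_index_response(
    project: str,
    draft_files: Optional[List[str]],
    gone_status: int = 404,
) -> Tuple[int, Dict[str, str], str]:
    """Return (status, headers, body) for a draft-only simple index page.

    draft_files is None once the draft has been published or deleted,
    at which point the index should stop existing.
    """
    if draft_files is None:
        return gone_status, {}, ""

    # The page lists only the files of the project's draft release.
    links = "\n".join(
        f'<a href="{html.escape(name, quote=True)}">{html.escape(name)}</a>'
        for name in draft_files
    )
    body = f"<!DOCTYPE html>\n<html><body>\n{links}\n</body></html>"
    headers = {
        "Content-Type": "text/html",
        # Assumed key format; purging this key with the project would
        # also drop the cached draft index.
        "Surrogate-Key": f"project/{project}",
    }
    return 200, headers, body
```

Calling it with draft_files=None models the state after publication, when the index no longer resolves.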

410 perhaps?

Sure, but if I can/could publish my release as “pre-yanked”, I can/could still verify the files by installing with an exact version number and then unyank the whole release in one go (at least according to the PEP). That seems to give me the exact behaviour we’re after here, and the only thing needed is for the release to start yanked.

I know the names are different, but why design a new feature when it can work with one that’s already under development?

I think the major difference between a draft release and a pre-yanked release would be that if you discover a problem with your draft release, you can delete it and re-upload with the same version number. But with a pre-yanked release, it would be considered published, start propagating to mirrors, etc., so you wouldn’t be allowed to delete and re-upload.
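
The difference described above can be shown with a toy model in which a filename only becomes permanently taken once its release is published. The FilenameRegistry class and its methods are hypothetical and exist only to illustrate the delete-and-re-upload behaviour of drafts.

```python
class FilenameRegistry:
    """Toy model: draft files are replaceable, published files are not."""

    def __init__(self) -> None:
        self._published = set()  # filenames that can never be reused
        self._draft = set()      # filenames in the current draft

    def upload_draft(self, filename: str) -> None:
        if filename in self._published:
            raise ValueError(f"{filename} was already published")
        self._draft.add(filename)

    def delete_draft(self, filename: str) -> None:
        # Deleting from a draft frees the filename for a corrected upload.
        self._draft.discard(filename)

    def publish(self) -> None:
        # From here on the files count as released and may reach mirrors.
        self._published |= self._draft
        self._draft.clear()
```

A pre-yanked release would correspond to calling publish() straight away, after which a broken file could no longer be replaced under the same version.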
