PEP 792: Project status markers in the simple index

This is the discussion thread for PEP 792: Project status markers in the simple index.

Draft PEP: PEP 792 – Project status markers in the simple index | peps.python.org

Previous thread: Pre-PEP discussion: Project status markers in the Index APIs

Summary

This PEP proposes a standardized set of index-supplied project status markers, as well as a mechanism for communicating those markers in the HTML and JSON simple indices.

The underlying idea behind this PEP is to give Python packaging a “status” mechanism that (1) operates at the project level, rather than per-release or per-file, and (2) allows for currently PyPI-only features like quarantine and archival to be exposed for downstream installer/tool consumption in a standard and reusable (by other index implementations) way.

I’m excited to hear what the community thinks about this! I thank you all in advance for your consideration and feedback :slightly_smiling_face:

CC @dustin @dstufft @miketheman @facutuesca @sethmlarson

4 Likes

LGTM!

My one question is about the HTML API. Is it worth exposing the project state on the project list page (called the “Base HTML API” in the spec)?

1 Like

I see no reason not to, but I’m curious if you have a use case in mind for it – I’ll be honest and say I almost never look at the “Base HTML API” at all :sweat_smile:

(One really - really tiny? - downside is that it would make that response bigger, but given how big it already is I don’t think it’s an overriding concern. But others perhaps know better than I do.)

I’m also curious what the use case would be. I suspect that it might not be served very well by including status markers there – at least in the case of PyPI, the underlying data for this page changes so rapidly that we intentionally only update the cache for it every 24 hours, which means this page is almost always out of date and status markers would be as well, and we generally recommend using other APIs instead.

Nope, but I noticed it wasn’t addressed in the PEP one way or the other.

1 Like

Per @dustin’s point I’m inclined to not add markers to it (at least until there’s another response asking for it), but not mentioning it at all was an oversight on my part! I’ll update the PEP’s index specification part to clarify that :slightly_smiling_face:

2 Likes

This looks pretty good to me! Here’s my feedback:

  • Quarantined is meant to be a potentially temporary state, as it can be “rolled back” to active. Can we mention that in the state description? As written, it could be read as saying that a project that becomes quarantined is to be considered malware.
  • Do we want to try adding the free-text user message to statuses in this PEP? It feels especially relevant for archived and deprecated, and is something that installers will want to forward to the user if they’re implementing warnings.
  • Security implication of project statuses: adding a classification for a negative status (quarantined) might further drive the idea that “anything on PyPI is safe”. We don’t want users to believe that; we want them to continue evaluating the releases they choose to use. User-facing descriptions of project statuses should reflect this.
  • On the line “should be considered in the active state”: do we want to clarify the semantics for when an index doesn’t implement statuses? Does the active state still apply then (so that installers / users should treat projects in that case as “active”)?
  • Nit: capitalize all SHOULD and MUST (some already are, but not all), SHALL → MUST, etc.

1 Like

Thanks @sethmlarson!

That seems reasonable to me to mention! Thinking generally as well: technically there really aren’t any permanent states with this PEP, since the PEP (intentionally) doesn’t define any state transitions or a state machine.

Given that, perhaps it makes sense for me to add some overarching language that explains that a project’s state is whatever state it happens to be in, and that states are not guaranteed to be static over time (much like yanking)? That would cover quarantine transitively, but perhaps it would also be confusing to state so generally.

I like this idea and have no opposition to it :slightly_smiling_face: – the reason I left it out initially is because it wasn’t clear where the free-text should live in the index responses, but here’s one idea:

     <meta name="pypi:project-status" content="quarantined">
     <meta name="pypi:project-status-reason" content="the project is haunted">

and:

{
  "project-status": {
    "state": "quarantined",
    "reason": "the project is haunted"
  }
}

Thoughts on that?

(There are probably some additional considerations that come with that as well, e.g. how long reasons can be, support for links, etc. Curious if anybody has thoughts on those!)
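For anyone who wants to see what consuming these two shapes might look like, here is a minimal sketch using only the Python standard library. It assumes the `pypi:project-status` / `pypi:project-status-reason` meta names and the `project-status` JSON key exactly as floated above (an idea under discussion, not necessarily the final spec); the helper names `status_from_html` and `status_from_json` are made up for illustration.

```python
import json
from html.parser import HTMLParser


class StatusMetaParser(HTMLParser):
    """Collects <meta name="pypi:project-status..."> tags from an index page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = attr_map.get("name") or ""
        # Assumed naming scheme, taken from the snippet in this thread.
        if name.startswith("pypi:project-status"):
            self.meta[name] = attr_map.get("content")


def status_from_html(page):
    """Hypothetical helper: read state and reason from the HTML meta tags."""
    parser = StatusMetaParser()
    parser.feed(page)
    return {
        "state": parser.meta.get("pypi:project-status"),
        "reason": parser.meta.get("pypi:project-status-reason"),
    }


def status_from_json(body):
    """Hypothetical helper: read state and reason from the JSON response."""
    status = json.loads(body).get("project-status", {})
    return {"state": status.get("state"), "reason": status.get("reason")}
```

Both helpers return the same `{"state": ..., "reason": ...}` dictionary, so a tool could treat the HTML and JSON representations interchangeably.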

Agreed; I think there’s probably a separate UI/UX component to further surfacing these states on PyPI that’s probably distinct from this PEP itself. But I suppose the PEP could also make it clear that project statuses aren’t contrapositive, i.e. a project that isn’t in the “archived” state isn’t inherently active, not quarantined doesn’t mean safe, etc.

Yeah, that was meant to be clarified by this line at the top of the Project status markers section:

A project always has exactly one status. If no status is explicitly noted, then the project is considered to be in the active state.

i.e. the implication is that an index that doesn’t implement this only has the “active” state, and all installers should treat all packages from that index as “active.”

Curious if you think it could use further clarification :slightly_smiling_face:
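As a rough illustration of that default, a client could treat a missing marker as “active”. This is only a sketch: it assumes the response has already been parsed into a dict and that the status lives under a `project-status` key with a `state` field, as in the JSON example earlier in the thread. The function name `effective_state` is hypothetical.

```python
def effective_state(response):
    """Return the project's state, falling back to "active" when none is given.

    An index that does not implement status markers simply never sends the
    key, so every project from such an index comes out as "active".
    """
    status = response.get("project-status") or {}
    return status.get("state") or "active"
```

With this shape, installers would not need a separate code path for indices that predate the PEP.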

Thanks, will fix!

1 Like

Okay, the next round of updates is in: I’ve added free-form reasons to the PEP’s proposed changes, and have replaced SHALL with MUST where present to use the more canonical RFC 2119 labels.

(h/t @sethmlarson)

2 Likes

Thanks for pulling this all together, @woodruffw - I’m generally supportive of this approach, with some questions:

Curious why the JSON project-status is under the root namespace instead of under the meta key, which would be a little more similar to the HTML approach? Re-reading PEP 691, it says this about the meta key:

  • All JSON responses will have a meta key, which contains information related to the response itself, rather than the content of the response.

That feels to be the right place to express Index-only metadata, similar to the mirror protocol’s _last-serial key - which isn’t in the PEP, but was added during implementation.

I’d also advise deferring statuses that don’t have a clear target today like deprecated - unless there are other Indexes that are looking to express this already.

2 Likes

Thanks @miketheman!

This came out of the original discussion in Pre-PEP discussion: Project status markers in the Index APIs. @ncoghlan observed in Pre-PEP discussion: Project status markers in the Index APIs - #11 by ncoghlan that meta is intended for metadata about the response itself, rather than for the metadata the response carries (since the entire response is itself arguably a form of metadata about a project).

In this case, I agree with her rationale (and @pf_moore’s rationale in the following comment) – meta is currently used mostly as a signaling layer for things like mirror clients (hence _last-serial), not as a boundary between index- and package-controlled metadata itself. yanked and provenance are examples of this, since both are conceptually index-mediated but appear side-by-side with release-file-mediated metadata :slightly_smiling_face:
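To make the layout concrete, here is a small sketch of a PEP 691-style project response with the status at the top level. The project name, file entry, version string and serial number are made-up placeholders, and the exact field set is an assumption for illustration rather than a copy of any real index response.

```python
import json

# Illustrative response only; all values are placeholders.
response = json.loads("""
{
  "meta": {"api-version": "1.0", "_last-serial": 12345},
  "name": "example-project",
  "files": [
    {
      "filename": "example_project-1.0.tar.gz",
      "url": "https://example.invalid/example_project-1.0.tar.gz",
      "hashes": {},
      "yanked": false
    }
  ],
  "project-status": {"state": "archived"}
}
""")

# "meta" carries information about the response itself (used by mirror clients).
last_serial = response["meta"]["_last-serial"]

# Index-mediated data about the project and its files sits outside "meta":
# "yanked" on each file entry, and the project status at the top level.
yanked_flags = [f["yanked"] for f in response["files"]]
state = response["project-status"]["state"]
```

Seen this way, `project-status` sits next to `files` for the same reason `yanked` sits inside each file entry: it describes the project, not the response.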

No strong opinion from me, but out of curiosity: do you see a procedural advantage to deferral versus the existing MAY language in the PEP? For reference:

Indices MAY implement any subset of the status markers specified in this PEP, as applicable to their needs.

I think my main argument for keeping deprecated in the PEP is that it probably is something that PyPI (and other indices) will eventually want to implement, so standardizing it upfront means not having to do another PEP cycle (plus metadata version bump?) when the time comes.

(FTR, I would like to push a deprecation marker forwards on PyPI – IMO it’s a useful discrete signal/state beyond archived and gives maintainers the ability to express “no new features coming, but you might see the occasional bugfix/security release.”)

2 Likes

Thanks for these responses; that’s helpful for me to understand, since I didn’t read the pre-PEP thread as closely.
If yanked and provenance are already acceptable top-level use cases, then following that pattern might be best here.

The MAY allowance does satisfy my thoughts on deferring statuses, so any index can implement them later as needed.
However, it may be wise to drop the “subset” part and reframe it so that any index can implement a specific, semantically applicable status even if it’s not in the PEP. That is, unless your intent is that changes to what lives under the project-status key must go through a PEP. I think it might be better to allow indexes some flexibility here, but I’m happy either way.

I’m generally :+1:

1 Like

Thanks @miketheman!

Yeah, this was the intent – another part of the pre-PEP discussion was whether the PEP here should be normative about which statuses are allowed and their corresponding semantics. @dustin pointed out that the PEP’s value proposition is a lot weaker without specifying those normative requirements, so I’m inclined to keep them in :slightly_smiling_face:

Ref: Pre-PEP discussion: Project status markers in the Index APIs - #32 by dustin

2 Likes

Thanks for integrating the changes, this all looks great to me! I’m excited to start seeing this feature being used by projects.

1 Like

Thanks all! There haven’t been any substantive changes to the PEP’s language over the last week or so, so I’m officially asking for pronouncement on this from @dstufft :slightly_smiling_face:

1 Like

After reading through the discussion and the PEP itself, I’m happy to approve this PEP. Congratulations!

7 Likes

Thanks @dstufft!