PEP 691: JSON-based Simple API for Python Package Indexes

There are two potential use cases for the data from this PEP. One is an installer that knows which package it wants and needs to find the appropriate file. The other is browsing functionality, e.g. an editor that wants to provide an interface in front of the package server for users to peruse.

If all you care about is the former case, then you may want a list of normalized package names for the index to simply verify the package is there, and no URL since the PEP tells you how to construct it. That suggests (list, normalized, no).

But if you’re providing a browser, you want that display name (which makes providing it a SHOULD and not a MUST sort of thing). Now, you still don’t need the URL since the PEP specifies how the URL to the package’s details must be constructed. So for me the question is whether the index should contain the display name and canonical name, and to me that comes down to what will burn less energy: the bandwidth to ship the canonical name over the internet on every fetch, or calculating it on a case-by-case basis by every browser. I have no good feel on this since PyPI’s bandwidth is so massive it may actually outweigh the cost of calculating the canonical name when someone happens to pick a package name from some list of things. This suggests either (list, non-normalized, no) or (map, both, no).

But I personally don’t love the discrepancy between display names and canonical names to begin with. As such, I would still be quite happy with (list, normalized, no) for the index to save on the bandwidth and processing since you can never rely on a display name for anything. Besides, people can figure out that oslo.concurrency and oslo-concurrency are the same thing pretty easily. But I also don’t know if other people agree with me on this.

On further reflection, I’m going to go out on a very small limb and say that (1) is basically the least interesting question. No matter what we pick, it’s trivial to go from one to the other with list(dict(v, name=k) for (k,v) in projects.items()) or {normalize(v["name"]): v for v in projects}.
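As a rough sketch of how cheap that round trip is, assuming the map form is keyed by normalized name and each entry carries a "name" field (normalize here is a stand-in that applies the PEP 503 normalization rule, and the sample data is made up):

```python
import re


def normalize(name: str) -> str:
    # PEP 503 normalization: collapse runs of "-", "_" and "." into a
    # single "-" and lowercase the result.
    return re.sub(r"[-_.]+", "-", name).lower()


def list_to_map(projects: list[dict]) -> dict[str, dict]:
    # Key every entry by its normalized name.
    return {normalize(p["name"]): p for p in projects}


def map_to_list(projects: dict[str, dict]) -> list[dict]:
    # Drop the keys; each entry still carries its display name.
    return [dict(entry) for entry in projects.values()]


as_list = [{"name": "Frob"}, {"name": "oslo.concurrency"}]
as_map = list_to_map(as_list)
round_tripped = map_to_list(as_map)
```

Either direction is a single comprehension, which is why the list-versus-map choice matters less than what the entries contain.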

I think in general (1) is only actually a useful question depending on whether the answer to (2) includes normalized names in some fashion or not. Without the normalized names, we don’t have anything particularly useful to key the dictionary with, so the dict gets awkward to use, and it might as well be a list.

The first use case is the historical use of this response for the simple api, but afaik no installer is using this page anymore. It was primarily used when the /simple/$PROJECT/ urls were not normalized, which meant that when running an index on a static server, you couldn’t easily rely on redirects, so installers would fall back to fetching /simple/ to find the URL they needed.

At some point (I don’t really remember when anymore), we switched everything over to using normalized URLs for the API, which meant installers no longer needed to hit /simple/, so afaik no modern installer does. We kept the /simple/ response for backwards compatibility, not because we had a strong use case for it.

Thus, the entire reason the URL is included at all is because in the historical use case, the point of the API was to look up the URL because there was no out of band way to know what the URL should be… but that obviously doesn’t apply anymore.

Technically removing the URL is a breaking change between the two serialization formats, but it’s probably not a meaningful one? The URL is easily computable out of band now, and that behavior is widely depended on, so it’s probably fine to remove it. It just makes me a tad nervous, but I think we can probably say that, given any sort of JSON deserialization can re-add it if it’s needed, it should be safe to remove it.
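For what it’s worth, a minimal sketch of re-adding the URL after deserialization, assuming the list-shaped index and the usual /simple/<normalized-name>/ layout (add_urls and the example.invalid host are made up for illustration):

```python
import json
import re
from urllib.parse import urljoin


def normalize(name: str) -> str:
    # PEP 503 normalization rule.
    return re.sub(r"[-_.]+", "-", name).lower()


def add_urls(index_url: str, payload: str) -> list[dict]:
    # Hypothetical helper: compute each project's URL out of band from its
    # name instead of reading it from the response.
    data = json.loads(payload)
    return [
        dict(project, url=urljoin(index_url, normalize(project["name"]) + "/"))
        for project in data["projects"]
    ]


payload = '{"meta": {"api-version": "1.0"}, "projects": [{"name": "Frob"}]}'
projects = add_urls("https://example.invalid/simple/", payload)
```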

Which leaves the big question of whether it should be normalized names or not. I don’t have a strong preference here that is informed by a specific use case, since all the use cases I’ve worked with no longer use the index at all. My default is that the API should use normalized names, so that’s what I ended up with.

Which… after I typed all that, is basically what you said, we should either do:

{
  "meta": {
    "api-version": "1.0"
  },
  "projects": {
    "frob": {"name": "Frob"},
    "spamspamspam": {"name": "spamspamspam"}
  }
}

or

{
  "meta": {
    "api-version": "1.0"
  },
  "projects": [
    {"name": "Frob"},
    {"name": "spamspamspam"}
  ]
}

There’s the bonus option of:

{
  "meta": {
    "api-version": "1.0"
  },
  "projects": {
    "frob": {},
    "spamspamspam": {}
  }
}

But I really don’t like that third option.

Bandwidth wise the second option is the smaller of the two: 1.9M compressed / 9.5M uncompressed vs 3M compressed / 15M uncompressed.

Implementation wise, in either case we’d have to leave the “name” field ambiguous as it is in the current PEP. We could tighten it up to a SHOULD, but we couldn’t mandate that it be the normalized name.

I don’t honestly care a lot either way between these options, the second option uses the least amount of bandwidth, so absent any other signal, I guess I have a preference for that.

Thanks both for the useful insight. There are a number of threads to pull on here, but I’ll focus on the versioning theme first, as I think it is this which is coloring the subsequent discussion:

And:

In practice, I agree that this is a valid perspective, but believe it is a mistake to see it in this way for certain use cases. The three scenarios that I see for a different serialization being added in the future:

  1. A more efficient serialization of the data is to be made available (e.g. binary JSON, protobuf).
    Motivation: performance and reduction of server traffic
  2. The world moves on from JSON, and a new serialization is to be provided.
    Motivation: Standardization and/or simplification for consumers
  3. A more versatile serialization is to be used to allow a new feature in the data model (e.g. sorted maps/objects/dict).
    Motivation: New functionality

In the first case the motivation is to provide a more efficient wire format - there is no drive to add/change the data model. No need for a version change. (Though if you see version & serialization as bound then the version would be v1 for this serialization, no matter if other serializations have incremented the version for other reasons).

In the second case it is extremely likely that a replacement serialization will be a superset of the JSON data model (array, object, string, number, boolean, null) and therefore could be introduced as “just another serialization” without introducing any data-model change (so again, no need for version change).

The third case is when you want to enhance the data-model for the sake of adding new functionality, or indeed removing/changing the behaviour. IMO this is pretty much the only case where version changes are needed (be they SemVer or not :stuck_out_tongue_winking_eye:). Having said that, in practice there wasn’t a version bump of the Simple API with PEP 592 (yank support).

Given the fact that PEP 592 didn’t invoke a version increment, one might ask what would elicit one? Certainly a removal of existing concepts in the data model - in which case, it would presumably be called v2. Suppose after such a change we then choose to add a new serialization type. By the perspective you have provided, this would logically be called application/vnd.pypi.simple.v1+{new_serialization} (note the v1), even if the new serialization exposes the changes to the data model that were previously in the v2 of another serialization. To me this is quite confusing.
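To make the two components concrete, here is a small sketch that splits such a media type into its version and serialization parts. The regular expression and the split_media_type name are my own illustration, not something taken from the PEP:

```python
import re

# Assumed shape: application/vnd.pypi.simple.<version>+<serialization>,
# where <version> is "v" followed by digits, or the "latest" alias.
_MEDIA_TYPE = re.compile(
    r"application/vnd\.pypi\.simple\.(?P<version>v\d+|latest)"
    r"\+(?P<serialization>[a-z0-9]+)"
)


def split_media_type(media_type: str) -> tuple[str, str]:
    # Return (version, serialization) or raise ValueError if the string
    # does not look like a Simple API media type.
    match = _MEDIA_TYPE.fullmatch(media_type.strip().lower())
    if match is None:
        raise ValueError(f"not a Simple API media type: {media_type!r}")
    return match["version"], match["serialization"]


version, serialization = split_media_type("application/vnd.pypi.simple.v1+json")
```

The open question in this discussion is whether the first element of that tuple means the same thing regardless of the second.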

To put this another way, and bring it back to a real-world case that I care about: Suppose I wanted to write a service which blacklists projects for a private/corporate package index. Furthermore, suppose there existed a library which represents the concept of SimpleIndex which I can use to implement such a service. If the versioning is not coupled to the serialization format, I could conceive that the core implementation ends up looking like:

def filter_project(project: simple_index.v1.Project) -> bool:
    # Return True if the project is to be removed from the index.
    return project.name in blacklisted_projects

However if we follow the approach that the version and serialization are coupled, then the implementation goes:

def filter_project(project: typing.Union[simple_index.v1_json.Project, simple_index.v1_html.Project]) -> bool:
    # Return True if the project is to be removed from the index.
    return project.name in blacklisted_projects

Furthermore, if a new serialization gets added in the future, one has to adapt the code to handle the new type (which may or may not have a name attribute, since the data model is serialization specific). The ultimate effect is that adoption of the newer serialization will be more effort / slower for consumers. (note: the code is a fairly simple illustration, one could imagine having much more complex logic beyond just a couple of type annotations).

I would add that it seems like an unusual decision to bind the version to the serialization (at least: I can’t think of a similar case in the APIs that I’ve interacted with) - it effectively means that we should be calling the APIs the SimpleHtmlIndex and SimpleJsonIndex, since they have entirely independent lifecycles (except that the 2 endpoint names are in lock-step - this is the Simple part).

In effect I am advocating that:

The consumer of Simple Index data shouldn’t care about the over-the-wire serialization. It should be possible to add new serializations without consumers of the underlying data needing to know of the change (though of course the client implementation itself will have to handle the deserialization).

In other words: By binding the version to the serialization we are effectively losing the ability to say “I can handle Simple Index v1 data without needing to know how you sent the data to the client (thanks to a client library handling the deserialization for me)”, and instead the serialization format becomes a detail that all downstream consumers of Simple Index data must care about. I can’t see the advantages to having such a version<>serialization coupling, so if there are such advantages, I think it would make sense to document them.

I realise how long my posts get, so I’ll try to keep this one short :smile:

This is a fundamental mistake on the data-model level - JSON objects are not sorted according to the spec. The order information can only be preserved (according to standards) if the data is an array/list. I accept the previous point you made that order is not specified in PEP-503 wrt. the project list, though - just wanted to make sure that this JSON specific detail wasn’t overlooked.
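A quick illustration with Python’s json module. CPython happens to keep the key order it read, but RFC 8259 describes an object as an unordered collection, so a consumer can’t count on that across parsers; the sample documents are made up:

```python
import json

# Two objects that differ only in key order compare equal once parsed,
# so the order carries no information at the data-model level.
a = json.loads('{"projects": {"frob": {}, "spam": {}}}')
b = json.loads('{"projects": {"spam": {}, "frob": {}}}')
objects_equal = a == b

# Two arrays that differ only in element order do not compare equal,
# so an array is the only place order can be carried reliably.
c = json.loads('{"projects": [{"name": "frob"}, {"name": "spam"}]}')
d = json.loads('{"projects": [{"name": "spam"}, {"name": "frob"}]}')
arrays_equal = c == d
```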

Whether or not it was the smallest, I think the second option is more consistent with the natural interpretation of the HTML Simple data-model, and would :+1: that. If the spec can only go as far as a SHOULD for name, then everybody needs to normalize the name anyway - to me it seems that the name may as well be non-normalized (since it is lossy to go from non-normed to normed). If the spec is pushed to a MUST for the name to be normalized, this argument evaporates.

Why are you coupling your application data model and the serialisation data model in the first place? I would have expected you to have an internal Project class, and serialise/deserialise from the wire transport forms at the boundary of the application.

I understand the problem you’re describing, but it seems like it’s one of your own making. Remember that an index provider like Warehouse is only going to have one internal Project class, so there’s a practical constraint here that ensures that serialisation formats will always be convertible to a common class…

I don’t think the version number and the wire formats are tied together. I just don’t think that (de)serialization has to be exactly the same between formats.

To use your later example, if the projects key was a map in the JSON API, you could still write something like:

def filter_project(project: simple_index.v1.Project) -> bool:
    # Return True if the project is to be removed from the index.
    return project.name in blacklisted_projects

You would just do something like this:

import email.message

from dataclasses import dataclass
from urllib.parse import urljoin

import json
import html5lib
import requests

from packaging.utils import canonicalize_name


@dataclass(slots=True, frozen=True)
class Meta:
    api_version: str


@dataclass(slots=True, frozen=True)
class Project:
    name: str
    normalized: str
    url: str


@dataclass(slots=True, frozen=True)
class SimpleIndex:
    meta: Meta
    projects_set: set[Project]
    projects_dict: dict[str, Project]


def from_v1_html(base_url: str, content: bytes) -> SimpleIndex:
    html = html5lib.parse(content, namespaceHTMLElements=False)
    meta = Meta(
        api_version=html.find(
            ".//meta[@name='pypi:repository-version']"
        ).attrib["content"]
    )
    projects_set = set()
    projects_dict = {}

    for link in html.findall(".//a"):
        normalized = canonicalize_name(link.text)
        project = Project(
            name=link.text,
            normalized=normalized,
            url=urljoin(base_url, link.attrib["href"]),  # resolve relative hrefs
        )
        projects_set.add(project)
        projects_dict[project.normalized] = project
    return SimpleIndex(
        meta=meta, projects_set=projects_set, projects_dict=projects_dict
    )


def from_v1_json(base_url: str, content: bytes) -> SimpleIndex:
    data = json.loads(content)
    meta = Meta(api_version=data["meta"]["api-version"])
    projects_set = {
        Project(name=v["name"], normalized=k, url=urljoin(base_url, f"{k}/"))
        for (k, v) in data["projects"].items()
    }
    projects_dict = {
        k: Project(
            name=v["name"], normalized=k, url=urljoin(base_url, f"{k}/")
        )
        for (k, v) in data["projects"].items()
    }

    return SimpleIndex(
        meta=meta, projects_set=projects_set, projects_dict=projects_dict
    )


def _parse_content_type(header: str) -> str:
    m = email.message.Message()
    m["content-type"] = header
    return m.get_content_type()


def get_simple_index(url: str) -> SimpleIndex:
    content_types = [
        "application/vnd.pypi.simple.v1+json",
        "application/vnd.pypi.simple.v1+html;q=0.2",
        "text/html;q=0.01",  # For legacy compatibility
    ]
    accept = ", ".join(content_types)

    resp = requests.get(url, headers={"Accept": accept})
    resp.raise_for_status()

    content_type = _parse_content_type(resp.headers.get("content-type", ""))
    match content_type:
        case "application/vnd.pypi.simple.v1+json":
            return from_v1_json(url, resp.content)
        case "application/vnd.pypi.simple.v1+html" | "text/html":
            return from_v1_html(url, resp.content)
        case _:
            raise Exception(f"Unknown content type: {content_type}")

Of course you wouldn’t have both projects_set and projects_dict, you’d just have projects and you’d pick one of the two types (or maybe you’d pick a list, or something else). I just included both to show that either is possible.

This is what I mean when I say that serialization formats don’t need to match 1:1 on the wire. As long as the underlying data model has all the same semantics, how that is mapped to a specific format kind of doesn’t matter.

PEP 592 technically should have increased the minor version of the API, it just didn’t really matter because nothing was really using the minor version yet. The minor version is largely advisory so that something like pip can warn users that maybe they need a newer version of pip to fully understand the index they’re using.
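Roughly, the advisory check a client could do looks like the sketch below. It assumes the client understands 1.0; check_api_version, the exception type and the messages are illustrative, not pip’s actual code:

```python
import warnings

SUPPORTED = (1, 0)  # assumption: the newest API version this client understands


def check_api_version(api_version: str, supported: tuple[int, int] = SUPPORTED) -> None:
    major, minor = (int(part) for part in api_version.split("."))
    if major > supported[0]:
        # A newer major version may have changed things incompatibly.
        raise RuntimeError(f"unsupported Simple API version {api_version}")
    if major == supported[0] and minor > supported[1]:
        # A newer minor version is only advisory: keep going, but tell the user.
        warnings.warn(
            f"index uses API version {api_version}; "
            "a newer client may be needed to understand all of it"
        )
```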

I agree. For the HTML case it can still be extracted from the data, but in the JSON case it can be calculated.

I have a use-case: users in VS Code want to install something via a UI and we want completions of project names as they type. That requires a complete list of project names.
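A minimal sketch of that kind of completion, assuming the full list of names has already been fetched and sorted once (complete and the sample names are made up; a real UI would probably also normalize what the user typed):

```python
from bisect import bisect_left


def complete(prefix: str, sorted_names: list[str], limit: int = 10) -> list[str]:
    # Binary-search to the first candidate, then walk forward while the
    # names still share the prefix.
    start = bisect_left(sorted_names, prefix)
    matches = []
    for name in sorted_names[start:start + limit]:
        if not name.startswith(prefix):
            break
        matches.append(name)
    return matches


names = sorted(["spamspamspam", "frob", "spam", "frobnicate"])
suggestions = complete("spam", names)
```

This only works with the complete list in hand, which is the reason the index response is still useful for this case.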

The spec could require the key identify whether it is a display name or canonical name. It could also allow for both or either keys to be provided as long as at least one of the keys is present. That would effectively make the display name optional but still allow for the canonical name to always be something you could calculate if you were not given it.

Otherwise we know what can go into v2. :wink:

And I don’t think they will because …

I will personally write that code for mousebender. As a user you have to make the networking call and you let the library handle the interpretation of the bytes based on the returned content-type.

I think we are all agreeing that an underlying Simple Index data-model is a good idea (at least on the client side). My whole point is that neither PEP-503 nor PEP-691 actually document the underlying data-model, so any implementation you have of this is brittle / subject to future change (as will be seen by those who assumed there would only be one hash type based on PEP 503, for example).

So to reiterate my proposal:

  • PEP-691 to document the underlying data-model of a Simple Index (using array, object, number, string, booleans and null, or some other primitive form (e.g. dataclasses, UML, etc.)), and not the serialization format. For JSON it is trivial to go from the model to the serialization. Even JSON itself is an OK way to define the underlying data-model, if this is less work / more palatable - the key thing is that the PEP provides a guarantee that no matter what serialization comes afterwards (under the same version), it can fit inside the underlying data-model losslessly (incl. order :wink:).
  • Version the Simple Index API based on the endpoints + data-model. New serializations do not require a new version. Changes to the underlying data-model may introduce changes to any/all of the serializations, this is when we change the version.
  • Ideal, but not essential: In recognition that Simple Index HTML serialization was ambiguous with regard to the underlying data-model, and that there are already interpretations of the underlying data-model which were different to the one being discussed here (see Brett’s example regarding hashes), call the data-model defined in PEP-691 v2. (I see very little cost to this, but also recognise there is very little value beyond setting ourselves up to do versioning correctly)

I’ve updated the PEP with the feedback in this thread, the main change being switching to a format like:

{
  "meta": {
    "api-version": "1.0"
  },
  "projects": [
    {"name": "Frob"},
    {"name": "spamspamspam"}
  ]
}

for the index.

You can see the changes on GitHub, or rendered shortly once I’m able to merge.


Some specific replies:

I’ve mostly added this, but I’ve explicitly disclaimed things like order by using set. Since none of the PEPs have said anything about order, I don’t believe the order to be part of the API model, just a natural consequence of HTML and JSON not having a set type.

This was already the case in the PEP. Brett’s comment was just providing a different way to think about why it’s v1.0, but the versioning is independent of serialization (and the PEP allows different serializations to emit the same data in different ways, including excluding the data completely).

The PEP now leaves things at v1.0, but provides a FAQ about why.

I touched up the PEP via PEP 691: touch up (#2668) · python/peps@f1af4a7 · GitHub, but it was mostly grammar and formatting stuff.

After reading the PEP, I’m happy to pronounce my acceptance of the PEP as its delegate! While I expect some things will get added to the spec in the future (e.g. application.json is going to get debated pretty fast once this rolls out, an optional url key for each project in the index object), the PEP covers all the details in PEP 503 in a JSON encoding and in a way that code can gracefully handle the transition.

@brettcannon You missed a couple spots. Aside from two instances of “concpetual” and “it’s presence” and “it’s inclusion” instead of “its …”, the data models code shows dist_info_metadata as an attribute of page details when it should be an attribute of files. (Is this the right way to report these things?)

Thanks Brett!

I went ahead and finished up my Warehouse PR, and got it deployed.

The existing cached responses in our CDN are for pre PEP 691, so you have to wait for them to fall out of the cache before PEP 691 is fully available, however:

$ curl -Is https://pypi.org/simple/barbican/ | grep content-type:
content-type: text/html

$ curl -Is https://pypi.org/simple/barbican/ -H 'Accept: application/vnd.pypi.simple.v1+html' | grep content-type:
content-type: application/vnd.pypi.simple.v1+html

$ curl -Is https://pypi.org/simple/barbican/ -H 'Accept: application/vnd.pypi.simple.v1+json' | grep content-type:
content-type: application/vnd.pypi.simple.v1+json

$ curl -Is https://pypi.org/simple/barbican/ -H 'Accept: application/vnd.pypi.simple.v1+html;q=0.2, application/vnd.pypi.simple.v1+json, text/html;q=0.01' | grep content-type:
content-type: application/vnd.pypi.simple.v1+json

$ curl -Is https://pypi.org/simple/barbican/ -H 'Accept: application/vnd.pypi.simple.latest+json' | grep content-type:
content-type: application/vnd.pypi.simple.v1+json

$ curl -Is 'https://pypi.org/simple/barbican/?format=application/vnd.pypi.simple.latest+json' -H 'Accept: text/html' | grep content-type:
content-type: application/vnd.pypi.simple.v1+json

Data looks like:

{
  "files": [
    {
      "filename": "barbican-6.0.0.0b1-py2.py3-none-any.whl",
      "hashes": {
        "sha256": "bbf547f3b624714d9f7e94316cfea14aefe9e71472d198f0b104149c7aeb19f6"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/14/47/5295ed8dee1104ee2919b453e687e47c4b2ec6f5e1ee8cffff42c95c3759/barbican-6.0.0.0b1-py2.py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-6.0.0.0b1.tar.gz",
      "hashes": {
        "sha256": "4af39b8559de5640a11af6df9391718962eaad94dfdf79c24962ad83c9abc678"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/d4/33/b17816f19d213dd4cf0d92292c8f0841f2cfd8b3e780f418913036f0446d/barbican-6.0.0.0b1.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-8.0.0.0rc1-py2.py3-none-any.whl",
      "hashes": {
        "sha256": "a091f5116bf0fab1268cd57a14e4f33a773ee622d73a3b58d3786bbb5d081f09"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/03/27/e5d0feb554e89eacef10138df727083e3068069d5d3f4f158f7bfe42729f/barbican-8.0.0.0rc1-py2.py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-8.0.0.0rc1.tar.gz",
      "hashes": {
        "sha256": "2b77f95e1588ea817d0b150708059bd4c3035a43100b78501c880114cefa5c13"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/13/96/3ae0071793fb95464919158957f66977a7cd52b1ba0443af4123355df8fa/barbican-8.0.0.0rc1.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-8.0.0-py2.py3-none-any.whl",
      "hashes": {
        "sha256": "e24766d4161a34c573dd1a061bf5de94fb2d074d50a393c05ec270246512c639"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/49/69/8ed38f54b1ae9ad74e16e70c56d73697bb94a1bb3ab2619c2b6fe4d19a51/barbican-8.0.0-py2.py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-8.0.0.tar.gz",
      "hashes": {
        "sha256": "80b42b4c4f4274d1a4a5f22656627ac296587cc1742741306415208e80354104"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/c7/77/3d4bda1a2f533a49de56ef9f3944a041faf25de6ecea4cc3c4544db573b5/barbican-8.0.0.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-8.0.1-py2.py3-none-any.whl",
      "hashes": {
        "sha256": "64f809b1d4973f21d5c007f4c70d8d28d56f980bf3fbcb5e7d807020432cf2dd"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/43/d9/0831e96e9642228525391079bb013133e39164b4be109b179e399bcd623c/barbican-8.0.1-py2.py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-8.0.1.tar.gz",
      "hashes": {
        "sha256": "2c2ae21ce7e9f4dc3cd08a2d8f639b6ad543104a1083a0cb89cae5d069fcc579"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/07/f2/fd006c128fcaffb28a302d9a67c4252ba241187a57b9a74462108048847e/barbican-8.0.1.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-9.0.0.0rc1-py2.py3-none-any.whl",
      "hashes": {
        "sha256": "d0975d2a4d4b6decb6b8408a6f19013ff4586ac3018d6f941ca694c68789aeaa"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/57/a1/514a5dbed2cd8779ee29075420c2207768f7b17ec90d3ca83936c1c1fd4d/barbican-9.0.0.0rc1-py2.py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-9.0.0.0rc1.tar.gz",
      "hashes": {
        "sha256": "103a06c6e775205bccb315c957b9aaad74ad6e6a93268544c2fe4c037b985eab"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/64/af/50076b444e1ae046c502f6ddf6c2ed1f2115bc41bf85b4457144e572d38f/barbican-9.0.0.0rc1.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-9.0.0-py2.py3-none-any.whl",
      "hashes": {
        "sha256": "134eb4e04f992ab10344e3a7ccb3eeae7274e589cbc293e55d5077a9e31da7c9"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/20/fd/59dc1408d22cf67700ddbde337a5af71671c0b3c50149035d62d8457689c/barbican-9.0.0-py2.py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-9.0.0.tar.gz",
      "hashes": {
        "sha256": "8d1f3d0a6bc338fe69196cfd2b90eb0a1b8d53f3ad7b9bc17838a348becd1c10"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/5c/91/c6c24739168ce05322e0d57b60cf833de5dcb4332ba7806a0632f4c356f5/barbican-9.0.0.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-9.0.1-py2.py3-none-any.whl",
      "hashes": {
        "sha256": "da0f90ec423f0cc1dc1f0c30faa567f54727e610cec909604a78a9bc7e014486"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/4c/84/4edfcad99739fec4d4a8973d5192e98fdea6edb8b37f034c0a6b728c4252/barbican-9.0.1-py2.py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-9.0.1.tar.gz",
      "hashes": {
        "sha256": "980ba9a19650b87900a380d1a4745ecb1162c9e1281e83eec9e68528bc7e272e"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/e6/80/2586f3eab531223a37f0b5aea321a6fd5aeceade11e879799c0296fdd099/barbican-9.0.1.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-10.0.0.0rc1-py2.py3-none-any.whl",
      "hashes": {
        "sha256": "800ab1f170369edeaba9ecf3ddef1d3c869589a2aa56fc7ef7031190b81f6253"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/7b/d3/bfb59e01374afb293372bee45f236875ff855886a7af07f1f3cf4e04b990/barbican-10.0.0.0rc1-py2.py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-10.0.0.0rc1.tar.gz",
      "hashes": {
        "sha256": "b2e1996f3cf113cb468b283334368d6d95acc20019833b22140b1b1b32c71405"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/b4/26/50512198aa737ac676469ada3c923dbd97c014883efaaaf38dca808857ed/barbican-10.0.0.0rc1.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-10.0.0-py2.py3-none-any.whl",
      "hashes": {
        "sha256": "c774f29304879ee57e6766aa45d6acf6821add423380026c6628c64cc66e6c6e"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/76/ca/07f0afba348b23023c8a00846917ac2edeee565913fc65ff1518b591e446/barbican-10.0.0-py2.py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-10.0.0.tar.gz",
      "hashes": {
        "sha256": "4b246cb0308211395702366de39a768eaf21100cc254df17f19f16166cdbdc40"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/87/c1/66c6cd01d0730aeda20e6ca3c8c62f09ecebb3259c4b4a230922aa4073b5/barbican-10.0.0.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-10.1.0-py2.py3-none-any.whl",
      "hashes": {
        "sha256": "0f6617872fae95731776360f7f443ce989b99658a7d47c83353f6e68fb343a5b"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/e1/a9/8bbf0ad9f6c198d5ff38e1cbaa183e8bec0e577af46024a6ec39c3072aad/barbican-10.1.0-py2.py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-10.1.0.tar.gz",
      "hashes": {
        "sha256": "1b0a390a7081a554fda4c39418529b790081e12fcd0ed0a47a7b73ef02e723bd"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/71/8a/b2922f21b51a1b84c459b47118bfc11fc7920622a7eaa7908e30b5db7ca3/barbican-10.1.0.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-11.0.0.0rc1-py3-none-any.whl",
      "hashes": {
        "sha256": "6ec981c4a7a61273973be67035ab78e028301ef297f1f2980286981b93253ef8"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/8f/90/ec22dc289b43c973e5b2e51eb061056aa7a143f59588605d047a2e7f7dd9/barbican-11.0.0.0rc1-py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-11.0.0.0rc1.tar.gz",
      "hashes": {
        "sha256": "7b218e834f2450eb6b1c4a947da993f5fb30f7de00feeef5d4300f9fa5298925"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/1f/82/0a785c979a44c38b9a57db165374be663d6fd9485b1619dbbbd33077b92c/barbican-11.0.0.0rc1.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-11.0.0-py3-none-any.whl",
      "hashes": {
        "sha256": "653484f0be8bbb9a1706b23816c0439747c1959965d0a22aa74674000d6895c8"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/76/7c/e9e828437e04d15d839587cab44ff7881d2d82ec9ed5d94503b3c788e7cd/barbican-11.0.0-py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-11.0.0.tar.gz",
      "hashes": {
        "sha256": "2b0aa92d1beafd6eba907507892c029483b0f9ec4d5264ac093e9cac268c3f88"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/8f/eb/3b51a197ec7fd12f9081bd3195067af00098011ed0386794730718452659/barbican-11.0.0.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-12.0.0.0rc1-py3-none-any.whl",
      "hashes": {
        "sha256": "71efb12ec383163efa9c467a0188d3cfa23ff069eafc85fc51c5e0111535571c"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/9c/8f/576dc6d67a7d29aef053478c2887b14d29e754b8fa00b203c9537b30661d/barbican-12.0.0.0rc1-py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-12.0.0.0rc1.tar.gz",
      "hashes": {
        "sha256": "3563e3983cf109bc33471b8c096f71cf5b14df87367a0ce24e5c7ac32585f5df"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/1c/9f/47e0c6e26db7412662775b07e27d3c15e7cb386e865403f4e1264ed24f64/barbican-12.0.0.0rc1.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-12.0.0.0rc2-py3-none-any.whl",
      "hashes": {
        "sha256": "0bccce1cbe197ad259e4b158a696ea10937d8efd78a0d226a988b4950d72d413"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/ae/c7/07aaf632d8e1441bd76dd4f2d7cdba5f9d5fd615fe59c0304eed22f9b9c3/barbican-12.0.0.0rc2-py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-12.0.0.0rc2.tar.gz",
      "hashes": {
        "sha256": "62f96e4096622698921558bf0a8299b8b2c712c8545ff655cba2ae9948eda414"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/21/a8/bcc9ab14814fff33cc3d1a02563f4a4968ac843ce6bbb1322decbecdfe6a/barbican-12.0.0.0rc2.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-12.0.0-py3-none-any.whl",
      "hashes": {
        "sha256": "ddb70de5125a3e9d958fe17505a3049d16b260af5dc2825ea357c653fd3583ef"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/90/d6/dbcc8c287f865437c538abe0420aec60c7518caa36095225a4c80bd24f59/barbican-12.0.0-py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-12.0.0.tar.gz",
      "hashes": {
        "sha256": "e39eaafd350ecff03c827c46bb7b7d81bc104325e4e6a3402b6d3bbbf47f278a"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/f8/9e/11ba79ba61f8c03ea821394e46112c34306817c12704556b3f8fa26e6398/barbican-12.0.0.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-13.0.0.0rc1-py3-none-any.whl",
      "hashes": {
        "sha256": "c097ec39bf5880f5c458ed8dd0c7b41f571304621369792c3ebf1ee0e28daba6"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/3b/b2/23c66be9909efa7b585dc9d02fe7f083805fe0041ee308577151bd18ca7c/barbican-13.0.0.0rc1-py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-13.0.0.0rc1.tar.gz",
      "hashes": {
        "sha256": "a3fd4a0005e8005f986088e7a66182c1c87134984ef5d72883652b01c0543da7"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/2c/21/4ac69a05762fe354a4266237b24d86ce8d85933ee35dbf52733e5595a070/barbican-13.0.0.0rc1.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-13.0.0-py3-none-any.whl",
      "hashes": {
        "sha256": "1ba79a72fff3fb6cab8001d0ad39a4ace93964283f53cd5b47d1f2f0883d8356"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/34/de/beebf35aea1716d64b9f2b526aef14a411779b98f832a5e60208e0db6d2d/barbican-13.0.0-py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-13.0.0.tar.gz",
      "hashes": {
        "sha256": "08a5285d9d283a99d88079ee14c6dde3cd6ffcdaccad6caef1ba8b921576e84e"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/92/2d/c59de2ce4d6d5bccdec28a4005df0b4f3d47dcb5cc058b95f3a8ed0089c0/barbican-13.0.0.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-14.0.0.0rc1-py3-none-any.whl",
      "hashes": {
        "sha256": "9c8e7925786e184ec114e9d3f84c2fe33529b62fa386800a2d4e68560529bfc4"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/86/3e/b115a43477d52ea69c5b0736099c8fc7706090f111839aa600370963be04/barbican-14.0.0.0rc1-py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-14.0.0.0rc1.tar.gz",
      "hashes": {
        "sha256": "0450a699500a9f757d18ea810aefa970230a402d65e92d4c39b6f1dda48d6e26"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/46/9c/8d2ef41e6bc95661e457c8317b9882534031d30d4722c2b902189d32656f/barbican-14.0.0.0rc1.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-14.0.0-py3-none-any.whl",
      "hashes": {
        "sha256": "27c679ba1d30a8a31545c9738fc70b43e58ba7e9fda0cc2415d1cac3825e5d95"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/8e/d2/e383faafb6ac8d47d447d26549a341bee52b619aa2a6d0d0c04c5d17e157/barbican-14.0.0-py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-14.0.0.tar.gz",
      "hashes": {
        "sha256": "1a034410189d045974bf70b703ecdce17c1a7b6a14814541e05ba5cb34f6e419"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/7a/93/551e43aefa86a6f57e1852d60568024e12a20f5f9bf316a37fc869c0c274/barbican-14.0.0.tar.gz",
      "yanked": false
    }
  ],
  "meta": {
    "_last-serial": 13345165,
    "api-version": "1.0"
  },
  "name": "barbican"
}
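
As a rough illustration of how a client might consume a response shaped like the one above, here is a minimal sketch. The helper name `installable_files` is hypothetical, and the sample is trimmed to two of the file entries shown above; skipping files without a `sha256` hash is an assumption made for the example, not something the PEP requires.

```python
import json

# Trimmed copy of the project detail response above (two entries only).
SAMPLE = json.loads("""
{
  "files": [
    {
      "filename": "barbican-14.0.0-py3-none-any.whl",
      "hashes": {"sha256": "27c679ba1d30a8a31545c9738fc70b43e58ba7e9fda0cc2415d1cac3825e5d95"},
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/8e/d2/e383faafb6ac8d47d447d26549a341bee52b619aa2a6d0d0c04c5d17e157/barbican-14.0.0-py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-14.0.0.tar.gz",
      "hashes": {"sha256": "1a034410189d045974bf70b703ecdce17c1a7b6a14814541e05ba5cb34f6e419"},
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/7a/93/551e43aefa86a6f57e1852d60568024e12a20f5f9bf316a37fc869c0c274/barbican-14.0.0.tar.gz",
      "yanked": false
    }
  ],
  "meta": {"_last-serial": 13345165, "api-version": "1.0"},
  "name": "barbican"
}
""")


def installable_files(detail: dict) -> list[dict]:
    """Return file entries that are not yanked and carry a sha256 hash.

    `yanked` may be a boolean or a string giving the reason, so any
    truthy value is treated as yanked.
    """
    return [
        f
        for f in detail["files"]
        if not f.get("yanked", False) and "sha256" in f.get("hashes", {})
    ]


for entry in installable_files(SAMPLE):
    print(entry["filename"], entry["hashes"]["sha256"][:12])
```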

I’m not going to purge the cache because that hammers our origin servers something fierce, but it should naturally become available over the next 24-48 hours or so.

Thanks everyone who contributed to making this PEP better!

Also want to echo thanks to all and especially to Donald for the original draft PEP and hammering the initial implementation through!

Now it’s finally time to start discussing xmlrpc deprecation!

I know I’m a little late to this, and I know it’s already been discussed and decided, but I also want to voice my concern that routing on headers essentially rules out entirely static indexes from ever implementing this PEP. Yes, I read this section, but I am unhappy with its conclusion. (The section basically says S3: no, GitHub Pages: no, Apache: yes, but it is no longer a “static site” if you need to run Apache to negotiate headers.)

It would essentially mean chriskuehl/dumb-pypi (a PyPI generator backed entirely by static files) would be dead in the water.

Not dead in the water, you would just have to choose one content-type to serve. I think it will be a while before package downloaders will only support the JSON API.

Also for S3, I would recommend setting up CloudFront anyway to enable authentication (for private indices) and HTTPS.

you would just have to choose one content-type to serve. I think it will be a while before package downloaders will only support the JSON API.

Yeah, which means it would not be able to adopt this PEP, which is my point. It would be unable to serve both, so it would be incompatible with either new tools or old tools, and it would also be unable to transition. Path-based JSON would be trivial to implement with static files, whereas header-based is impossible.

I would recommend setting up CloudFront anyway

As far as I know, CloudFront doesn’t help: it can only route by path, not by header. You can have it front a Lambda, but again that’s not static files.

The alternative to using a header is using different paths for json and html and expecting users to configure the right path, which is still entirely possible with PEP 691.

You’re not required to serve everything at one URL, you’re just able to. If you can’t do that it degrades to the same user experience we would have had otherwise.
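
To make the trade-off concrete, here is a minimal sketch of a client that copes with either setup: it can send one `Accept` header to a negotiating index, or be pointed at a fixed JSON or HTML URL on a static index, and in both cases it dispatches on the `Content-Type` it gets back. The function name `filenames_from_response` is hypothetical, the HTML handling is deliberately simplified (it only collects anchor text), and the q-values in `ACCEPT` are illustrative.

```python
import json
from html.parser import HTMLParser

JSON_V1 = "application/vnd.pypi.simple.v1+json"
HTML_V1 = "application/vnd.pypi.simple.v1+html"

# Header a client could send: prefer JSON, fall back to either HTML form.
ACCEPT = f"{JSON_V1}, {HTML_V1};q=0.2, text/html;q=0.01"


class _AnchorText(HTMLParser):
    """Collect the text content of every <a> element."""

    def __init__(self) -> None:
        super().__init__()
        self.names: list[str] = []
        self._in_anchor = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_anchor = True
            self.names.append("")

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_anchor = False

    def handle_data(self, data):
        if self._in_anchor:
            self.names[-1] += data


def filenames_from_response(content_type: str, body: str) -> list[str]:
    """Extract file names from a project page, whichever format was served."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == JSON_V1:
        return [f["filename"] for f in json.loads(body)["files"]]
    if media_type in (HTML_V1, "text/html"):
        parser = _AnchorText()
        parser.feed(body)
        parser.close()
        return [name.strip() for name in parser.names]
    raise ValueError(f"unsupported content type: {content_type!r}")
```

A static index that can only serve one format per URL still works with this kind of client; the user just has to configure the URL that matches the format they want.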

Many thanks to @dstufft and the other contributors for working on this PEP and getting it accepted!


I did one more read over the PEP since the last changes, and besides some small tweaks (which I made a PR for), I’ve noticed one more key detail that seemed overlooked…

Whether the name field in the Project List should be normalized or un-normalized has been discussed, but the name field in Project Detail has not.
For the same reason the url field was dropped from the Project List, I don’t see any need for the normalized name to be returned in the Project Detail response, because you wouldn’t even be able to get that response without already knowing that name in the first place.
With the url field you needed to combine known information (the name) with knowledge of the spec. But this field is literally just returning known information, which PEP 503 also didn’t do.

So, I see 2 options:

  1. This field is dropped from the response. This would be the most consistent with PEP503.
  2. The field will be the un-normalized name instead of the normalized name. And unlike with the Project List, there is no need to be ambiguous about the name being normalized or not, because it wasn’t in PEP 503 before.

I don’t precisely know what the allowed scope for changes is after a PEP has been accepted. But option 1 could be considered backwards-incompatible compared to the already accepted PEP 691, while option 2 doesn’t have to be.
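
For reference, the normalization in question is cheap to compute on the client. The sketch below uses the PEP 503 normalization rule; the helper `name_matches` is hypothetical and only illustrates that a client can compare the returned name field against the requested name whether the index sends it normalized or not.

```python
import re


def normalize(name: str) -> str:
    """PEP 503 normalization: collapse runs of -, _ and . to a single -, lowercase."""
    return re.sub(r"[-_.]+", "-", name).lower()


def name_matches(requested: str, detail: dict) -> bool:
    """Check that a Project Detail payload is for the project that was asked for.

    Both sides are normalized, so the check holds for either option above.
    """
    return normalize(detail["name"]) == normalize(requested)


print(normalize("oslo.concurrency"))
print(name_matches("Barbican", {"name": "barbican"}))
```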

That’s technically only true if you were given/calculated the URL via “discovery” from the PEP’s URL structure or the project index. But if you used some other mechanism to get the JSON data then that doesn’t hold true.

I would be fine with that, but I’m also fine with leaving it so the JSON data is a bit more self-contained (it’s also probably not a ton of bytes compared to the rest of the JSON payload, plus it may compress okay since it will match what’s in the wheel file name).

See PEP 691: JSON-based Simple API for Python Package Indexes - #61 by brettcannon for my view of the purposes of normalized vs. non-normalized names. I also don’t know how much the non-normalized name would add to the payload after compression, since it may not compress as well as the normalized name might.

If we made the change right now, before anyone has had time to really implement it, then the PEP authors would propose the change and I would make the call on whether to accept it.