PEP 691: JSON-based Simple API for Python Package Indexes

Why are you coupling your application data model and the serialisation data model in the first place? I would have expected you to have an internal Project class, and serialise/deserialise from the wire transport forms at the boundary of the application.

I understand the problem you’re describing, but it seems like it’s one of your own making. Remember that an index provider like Warehouse is only going to have one internal Project class, so there’s a practical constraint here that ensures that serialisation formats will always be convertible to a common class…

I don’t think the version number and the wire formats are tied together. I just don’t think that (de)serialization has to be exactly the same between formats.

To use your later example, if the projects key was a map in the JSON API, you could still write something like:

def filter_project(project: simple_index.v1.Project) -> bool:
    # Return True if the project is to be removed from the index.
    return project.name in blacklisted_projects

You would just do something like this:

import email.message

from dataclasses import dataclass
from urllib.parse import urljoin

import json
import html5lib
import requests

from packaging.utils import canonicalize_name


@dataclass(slots=True, frozen=True)
class Meta:
    api_version: str


@dataclass(slots=True, frozen=True)
class Project:
    name: str
    normalized: str
    url: str


@dataclass(slots=True, frozen=True)
class SimpleIndex:
    meta: Meta
    projects_set: set[Project]
    projects_dict: dict[str, Project]


def from_v1_html(base_url: str, content: bytes) -> SimpleIndex:
    html = html5lib.parse(content, namespaceHTMLElements=False)
    meta = Meta(
        api_version=html.find(
            ".//meta[@name='pypi:repository-version']"
        ).attrib["content"]
    )
    projects_set = set()
    projects_dict = {}

    for link in html.findall(".//a"):
        normalized = canonicalize_name(link.text)
        project = Project(
            name=link.text,
            normalized=normalized,
            # Resolve relative hrefs against the page URL, as the JSON parser does.
            url=urljoin(base_url, link.attrib["href"]),
        )
        projects_set.add(project)
        projects_dict[project.normalized] = project
    return SimpleIndex(
        meta=meta, projects_set=projects_set, projects_dict=projects_dict
    )


def from_v1_json(base_url: str, content: bytes) -> SimpleIndex:
    data = json.loads(content)
    meta = Meta(api_version=data["meta"]["api-version"])
    projects_set = {
        Project(name=v["name"], normalized=k, url=urljoin(base_url, f"{k}/"))
        for (k, v) in data["projects"].items()
    }
    projects_dict = {
        k: Project(
            name=v["name"], normalized=k, url=urljoin(base_url, f"{k}/")
        )
        for (k, v) in data["projects"].items()
    }

    return SimpleIndex(
        meta=meta, projects_set=projects_set, projects_dict=projects_dict
    )


def _parse_content_type(header: str) -> str:
    m = email.message.Message()
    m["content-type"] = header
    return m.get_content_type()


def get_simple_index(url: str) -> SimpleIndex:
    content_types = [
        "application/vnd.pypi.simple.v1+json",
        "application/vnd.pypi.simple.v1+html;q=0.2",
        "text/html;q=0.01",  # For legacy compatibility
    ]
    accept = ", ".join(content_types)

    resp = requests.get(url, headers={"Accept": accept})
    resp.raise_for_status()

    content_type = _parse_content_type(resp.headers.get("content-type", ""))
    match content_type:
        case "application/vnd.pypi.simple.v1+json":
            return from_v1_json(url, resp.content)
        case "application/vnd.pypi.simple.v1+html" | "text/html":
            return from_v1_html(url, resp.content)
        case _:
            raise Exception(f"Unknown content type: {content_type}")

Of course you wouldn’t have both projects_set and projects_dict; you’d just have projects and pick one of the two types (or maybe a list, or something else). I just included both to show that either is possible.

This is what I mean when I say that serialization formats don’t need to match 1:1 on the wire: as long as the underlying data model has all the same semantics, how that model is mapped to a specific format kind of doesn’t matter.

PEP 592 technically should have increased the minor version of the API, it just didn’t really matter because nothing was really using the minor version yet. The minor version is largely advisory so that something like pip can warn users that maybe they need a newer version of pip to fully understand the index they’re using.
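
As a rough illustration of that advisory role, a client could treat the two halves of the version string differently. This is only a sketch: the supported version constants and the function name `check_api_version` are made up for the example, and the choice to fail on a newer major version is an assumption about how a client would want to behave.

```python
import warnings

# Hypothetical: the newest API version this client was written against.
SUPPORTED_MAJOR = 1
SUPPORTED_MINOR = 0


def check_api_version(api_version: str) -> None:
    """Fail on a newer major version, warn on a newer minor version."""
    major_text, _, minor_text = api_version.partition(".")
    major, minor = int(major_text), int(minor_text or 0)
    if major > SUPPORTED_MAJOR:
        raise ValueError(f"Unsupported index API version: {api_version}")
    if major == SUPPORTED_MAJOR and minor > SUPPORTED_MINOR:
        # Advisory only: the response is still expected to be usable.
        warnings.warn(
            f"Index uses API version {api_version}; a newer client may be "
            "needed to understand everything it offers.",
            stacklevel=2,
        )
```

A caller would run this on the `api-version` value before interpreting the rest of the response.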

I agree. In the HTML case it can still be extracted from the data, while in the JSON case it can be calculated.

I have a use-case: users in VS Code want to install something via a UI and we want completions of project names as they type. That requires a complete list of project names.
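
A minimal sketch of that completion use-case, assuming the index response is JSON whose `projects` value is a list of objects with a `name` key. The function names `project_names` and `complete` are hypothetical, and a real editor integration would likely cache the list rather than re-parse it per keystroke.

```python
import json


def project_names(index_json: bytes) -> list[str]:
    """Extract every project name from a JSON project list."""
    data = json.loads(index_json)
    return sorted(project["name"] for project in data["projects"])


def complete(prefix: str, names: list[str], limit: int = 10) -> list[str]:
    """Return up to `limit` names that start with `prefix`, ignoring case."""
    needle = prefix.casefold()
    return [name for name in names if name.casefold().startswith(needle)][:limit]
```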

The spec could require the key identify whether it is a display name or canonical name. It could also allow for both or either keys to be provided as long as at least one of the keys is present. That would effectively make the display name optional but still allow for the canonical name to always be something you could calculate if you were not given it.
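
To make the "either key, at least one" idea concrete, here is a small sketch. The key names `display-name` and `canonical-name` are purely hypothetical (nothing in the thread settles on them); the normalization rule is the one from PEP 503.

```python
from __future__ import annotations

import re


def normalize(name: str) -> str:
    """PEP 503 normalization: collapse runs of '-', '_', '.' and lowercase."""
    return re.sub(r"[-_.]+", "-", name).lower()


def resolve_names(entry: dict[str, str]) -> tuple[str | None, str]:
    """Return (display name or None, canonical name) for one project entry."""
    display = entry.get("display-name")      # hypothetical key
    canonical = entry.get("canonical-name")  # hypothetical key
    if display is None and canonical is None:
        raise ValueError("entry needs a display name or a canonical name")
    if canonical is None:
        # The canonical name can always be calculated from the display name.
        canonical = normalize(display)
    return display, canonical
```

The reverse direction is not possible, which is why the display name would be the optional one.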

Otherwise we know what can go into v2. :wink:

And I don’t think they will because …

I will personally write that code for mousebender. As a user you have to make the networking call and you let the library handle the interpretation of the bytes based on the returned content-type.

I think we are all agreeing that an underlying Simple Index data-model is a good idea (at least on the client side). My whole point is that neither PEP-503 nor PEP-691 actually document the underlying data-model, so any implementation you have of this is brittle / subject to future change (as will be seen by those who assumed there would only be one hash type based on PEP 503, for example).

So to reiterate my proposal:

  • PEP-691 to document the underlying data-model of a Simple Index (using array, object, number, string, booleans and null, or some other primitive form (e.g. dataclasses, UML, etc.)), and not the serialization format. For JSON it is trivial to go from the model to the serialization. Even JSON itself is an OK way to define the underlying data-model, if this is less work / more palatable - the key thing is that the PEP provides a guarantee that no matter what serialization comes afterwards (under the same version), it can fit inside the underlying data-model losslessly (incl. order :wink:).
  • Version the Simple Index API based on the endpoints + data-model. New serializations do not require a new version. Changes to the underlying data-model may introduce changes to any/all of the serializations, this is when we change the version.
  • Ideal, but not essential: In recognition that Simple Index HTML serialization was ambiguous with regard to the underlying data-model, and that there are already interpretations of the underlying data-model which were different to the one being discussed here (see Brett’s example regarding hashes), call the data-model defined in PEP-691 v2. (I see very little cost to this, but also recognise there is very little value beyond setting ourselves up to do versioning correctly)

I’ve updated the PEP with the feedback in this thread, the main change being switching to a format like:

{
  "meta": {
    "api-version": "1.0"
  },
  "projects": [
    {"name": "Frob"},
    {"name": "spamspamspam"}
  ]
}

for the index.

You can see the changes on GitHub or rendered shortly once I’m able to merge.


Some specific replies:

I’ve mostly added this, but I’ve explicitly disclaimed things like order by using a set. None of the PEPs say anything about order, so I don’t believe order is part of the API model; it’s just a natural consequence of HTML and JSON not having a set type.
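
One way to picture the "order is not part of the model" point: if a client loads the project list into a set, two responses that differ only in ordering compare equal. A small sketch, where the function name `project_name_set` is made up for illustration:

```python
import json


def project_name_set(index_json: str) -> frozenset[str]:
    """Load a JSON project list into an unordered, hashable set of names."""
    return frozenset(p["name"] for p in json.loads(index_json)["projects"])
```

Anything that depends on the order of entries in the response would be relying on an accident of the serialization rather than on the model.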

This was already the case in the PEP. Brett’s comment was just providing a different way to think about why v1.0, but the versioning is independent of serialization (and the PEP allows different serializations to emit the same data in different ways, including excluding the data completely).

The PEP now leaves things at v1.0, but provides a FAQ about why.


I touched up the PEP via PEP 691: touch up (#2668) · python/peps@f1af4a7 · GitHub, but it was mostly grammar and formatting stuff.

After reading the PEP, I’m happy to pronounce my acceptance of the PEP as its delegate! While I expect some things will get added to the spec in the future (e.g. application/json is going to get debated pretty fast once this rolls out, an optional url key for each project in the index object), the PEP covers all the details in PEP 503 in a JSON encoding and in a way that code can gracefully handle the transition.


@brettcannon You missed a couple spots. Aside from two instances of “concpetual” and “it’s presence” and “it’s inclusion” instead of “its …”, the data models code shows dist_info_metadata as an attribute of page details when it should be an attribute of files. (Is this the right way to report these things?)

Thanks Brett!

I went ahead and finished up my Warehouse PR, and got it deployed.

The existing cached responses in our CDN are for pre PEP 691, so you have to wait for them to fall out of the cache before PEP 691 is fully available, however:

$ curl -Is https://pypi.org/simple/barbican/ | grep content-type:
content-type: text/html

$ curl -Is https://pypi.org/simple/barbican/ -H 'Accept: application/vnd.pypi.simple.v1+html' | grep content-type:
content-type: application/vnd.pypi.simple.v1+html

$ curl -Is https://pypi.org/simple/barbican/ -H 'Accept: application/vnd.pypi.simple.v1+json' | grep content-type:
content-type: application/vnd.pypi.simple.v1+json

$ curl -Is https://pypi.org/simple/barbican/ -H 'Accept: application/vnd.pypi.simple.v1+html;q=0.2, application/vnd.pypi.simple.v1+json, text/html;q=0.01' | grep content-type:
content-type: application/vnd.pypi.simple.v1+json

$ curl -Is https://pypi.org/simple/barbican/ -H 'Accept: application/vnd.pypi.simple.latest+json' | grep content-type:
content-type: application/vnd.pypi.simple.v1+json

$ curl -Is 'https://pypi.org/simple/barbican/?format=application/vnd.pypi.simple.latest+json' -H 'Accept: text/html' | grep content-type:
content-type: application/vnd.pypi.simple.v1+json

Data looks like:

{
  "files": [
    {
      "filename": "barbican-6.0.0.0b1-py2.py3-none-any.whl",
      "hashes": {
        "sha256": "bbf547f3b624714d9f7e94316cfea14aefe9e71472d198f0b104149c7aeb19f6"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/14/47/5295ed8dee1104ee2919b453e687e47c4b2ec6f5e1ee8cffff42c95c3759/barbican-6.0.0.0b1-py2.py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-6.0.0.0b1.tar.gz",
      "hashes": {
        "sha256": "4af39b8559de5640a11af6df9391718962eaad94dfdf79c24962ad83c9abc678"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/d4/33/b17816f19d213dd4cf0d92292c8f0841f2cfd8b3e780f418913036f0446d/barbican-6.0.0.0b1.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-8.0.0.0rc1-py2.py3-none-any.whl",
      "hashes": {
        "sha256": "a091f5116bf0fab1268cd57a14e4f33a773ee622d73a3b58d3786bbb5d081f09"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/03/27/e5d0feb554e89eacef10138df727083e3068069d5d3f4f158f7bfe42729f/barbican-8.0.0.0rc1-py2.py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-8.0.0.0rc1.tar.gz",
      "hashes": {
        "sha256": "2b77f95e1588ea817d0b150708059bd4c3035a43100b78501c880114cefa5c13"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/13/96/3ae0071793fb95464919158957f66977a7cd52b1ba0443af4123355df8fa/barbican-8.0.0.0rc1.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-8.0.0-py2.py3-none-any.whl",
      "hashes": {
        "sha256": "e24766d4161a34c573dd1a061bf5de94fb2d074d50a393c05ec270246512c639"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/49/69/8ed38f54b1ae9ad74e16e70c56d73697bb94a1bb3ab2619c2b6fe4d19a51/barbican-8.0.0-py2.py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-8.0.0.tar.gz",
      "hashes": {
        "sha256": "80b42b4c4f4274d1a4a5f22656627ac296587cc1742741306415208e80354104"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/c7/77/3d4bda1a2f533a49de56ef9f3944a041faf25de6ecea4cc3c4544db573b5/barbican-8.0.0.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-8.0.1-py2.py3-none-any.whl",
      "hashes": {
        "sha256": "64f809b1d4973f21d5c007f4c70d8d28d56f980bf3fbcb5e7d807020432cf2dd"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/43/d9/0831e96e9642228525391079bb013133e39164b4be109b179e399bcd623c/barbican-8.0.1-py2.py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-8.0.1.tar.gz",
      "hashes": {
        "sha256": "2c2ae21ce7e9f4dc3cd08a2d8f639b6ad543104a1083a0cb89cae5d069fcc579"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/07/f2/fd006c128fcaffb28a302d9a67c4252ba241187a57b9a74462108048847e/barbican-8.0.1.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-9.0.0.0rc1-py2.py3-none-any.whl",
      "hashes": {
        "sha256": "d0975d2a4d4b6decb6b8408a6f19013ff4586ac3018d6f941ca694c68789aeaa"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/57/a1/514a5dbed2cd8779ee29075420c2207768f7b17ec90d3ca83936c1c1fd4d/barbican-9.0.0.0rc1-py2.py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-9.0.0.0rc1.tar.gz",
      "hashes": {
        "sha256": "103a06c6e775205bccb315c957b9aaad74ad6e6a93268544c2fe4c037b985eab"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/64/af/50076b444e1ae046c502f6ddf6c2ed1f2115bc41bf85b4457144e572d38f/barbican-9.0.0.0rc1.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-9.0.0-py2.py3-none-any.whl",
      "hashes": {
        "sha256": "134eb4e04f992ab10344e3a7ccb3eeae7274e589cbc293e55d5077a9e31da7c9"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/20/fd/59dc1408d22cf67700ddbde337a5af71671c0b3c50149035d62d8457689c/barbican-9.0.0-py2.py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-9.0.0.tar.gz",
      "hashes": {
        "sha256": "8d1f3d0a6bc338fe69196cfd2b90eb0a1b8d53f3ad7b9bc17838a348becd1c10"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/5c/91/c6c24739168ce05322e0d57b60cf833de5dcb4332ba7806a0632f4c356f5/barbican-9.0.0.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-9.0.1-py2.py3-none-any.whl",
      "hashes": {
        "sha256": "da0f90ec423f0cc1dc1f0c30faa567f54727e610cec909604a78a9bc7e014486"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/4c/84/4edfcad99739fec4d4a8973d5192e98fdea6edb8b37f034c0a6b728c4252/barbican-9.0.1-py2.py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-9.0.1.tar.gz",
      "hashes": {
        "sha256": "980ba9a19650b87900a380d1a4745ecb1162c9e1281e83eec9e68528bc7e272e"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/e6/80/2586f3eab531223a37f0b5aea321a6fd5aeceade11e879799c0296fdd099/barbican-9.0.1.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-10.0.0.0rc1-py2.py3-none-any.whl",
      "hashes": {
        "sha256": "800ab1f170369edeaba9ecf3ddef1d3c869589a2aa56fc7ef7031190b81f6253"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/7b/d3/bfb59e01374afb293372bee45f236875ff855886a7af07f1f3cf4e04b990/barbican-10.0.0.0rc1-py2.py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-10.0.0.0rc1.tar.gz",
      "hashes": {
        "sha256": "b2e1996f3cf113cb468b283334368d6d95acc20019833b22140b1b1b32c71405"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/b4/26/50512198aa737ac676469ada3c923dbd97c014883efaaaf38dca808857ed/barbican-10.0.0.0rc1.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-10.0.0-py2.py3-none-any.whl",
      "hashes": {
        "sha256": "c774f29304879ee57e6766aa45d6acf6821add423380026c6628c64cc66e6c6e"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/76/ca/07f0afba348b23023c8a00846917ac2edeee565913fc65ff1518b591e446/barbican-10.0.0-py2.py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-10.0.0.tar.gz",
      "hashes": {
        "sha256": "4b246cb0308211395702366de39a768eaf21100cc254df17f19f16166cdbdc40"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/87/c1/66c6cd01d0730aeda20e6ca3c8c62f09ecebb3259c4b4a230922aa4073b5/barbican-10.0.0.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-10.1.0-py2.py3-none-any.whl",
      "hashes": {
        "sha256": "0f6617872fae95731776360f7f443ce989b99658a7d47c83353f6e68fb343a5b"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/e1/a9/8bbf0ad9f6c198d5ff38e1cbaa183e8bec0e577af46024a6ec39c3072aad/barbican-10.1.0-py2.py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-10.1.0.tar.gz",
      "hashes": {
        "sha256": "1b0a390a7081a554fda4c39418529b790081e12fcd0ed0a47a7b73ef02e723bd"
      },
      "requires-python": "",
      "url": "https://files.pythonhosted.org/packages/71/8a/b2922f21b51a1b84c459b47118bfc11fc7920622a7eaa7908e30b5db7ca3/barbican-10.1.0.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-11.0.0.0rc1-py3-none-any.whl",
      "hashes": {
        "sha256": "6ec981c4a7a61273973be67035ab78e028301ef297f1f2980286981b93253ef8"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/8f/90/ec22dc289b43c973e5b2e51eb061056aa7a143f59588605d047a2e7f7dd9/barbican-11.0.0.0rc1-py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-11.0.0.0rc1.tar.gz",
      "hashes": {
        "sha256": "7b218e834f2450eb6b1c4a947da993f5fb30f7de00feeef5d4300f9fa5298925"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/1f/82/0a785c979a44c38b9a57db165374be663d6fd9485b1619dbbbd33077b92c/barbican-11.0.0.0rc1.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-11.0.0-py3-none-any.whl",
      "hashes": {
        "sha256": "653484f0be8bbb9a1706b23816c0439747c1959965d0a22aa74674000d6895c8"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/76/7c/e9e828437e04d15d839587cab44ff7881d2d82ec9ed5d94503b3c788e7cd/barbican-11.0.0-py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-11.0.0.tar.gz",
      "hashes": {
        "sha256": "2b0aa92d1beafd6eba907507892c029483b0f9ec4d5264ac093e9cac268c3f88"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/8f/eb/3b51a197ec7fd12f9081bd3195067af00098011ed0386794730718452659/barbican-11.0.0.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-12.0.0.0rc1-py3-none-any.whl",
      "hashes": {
        "sha256": "71efb12ec383163efa9c467a0188d3cfa23ff069eafc85fc51c5e0111535571c"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/9c/8f/576dc6d67a7d29aef053478c2887b14d29e754b8fa00b203c9537b30661d/barbican-12.0.0.0rc1-py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-12.0.0.0rc1.tar.gz",
      "hashes": {
        "sha256": "3563e3983cf109bc33471b8c096f71cf5b14df87367a0ce24e5c7ac32585f5df"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/1c/9f/47e0c6e26db7412662775b07e27d3c15e7cb386e865403f4e1264ed24f64/barbican-12.0.0.0rc1.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-12.0.0.0rc2-py3-none-any.whl",
      "hashes": {
        "sha256": "0bccce1cbe197ad259e4b158a696ea10937d8efd78a0d226a988b4950d72d413"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/ae/c7/07aaf632d8e1441bd76dd4f2d7cdba5f9d5fd615fe59c0304eed22f9b9c3/barbican-12.0.0.0rc2-py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-12.0.0.0rc2.tar.gz",
      "hashes": {
        "sha256": "62f96e4096622698921558bf0a8299b8b2c712c8545ff655cba2ae9948eda414"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/21/a8/bcc9ab14814fff33cc3d1a02563f4a4968ac843ce6bbb1322decbecdfe6a/barbican-12.0.0.0rc2.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-12.0.0-py3-none-any.whl",
      "hashes": {
        "sha256": "ddb70de5125a3e9d958fe17505a3049d16b260af5dc2825ea357c653fd3583ef"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/90/d6/dbcc8c287f865437c538abe0420aec60c7518caa36095225a4c80bd24f59/barbican-12.0.0-py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-12.0.0.tar.gz",
      "hashes": {
        "sha256": "e39eaafd350ecff03c827c46bb7b7d81bc104325e4e6a3402b6d3bbbf47f278a"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/f8/9e/11ba79ba61f8c03ea821394e46112c34306817c12704556b3f8fa26e6398/barbican-12.0.0.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-13.0.0.0rc1-py3-none-any.whl",
      "hashes": {
        "sha256": "c097ec39bf5880f5c458ed8dd0c7b41f571304621369792c3ebf1ee0e28daba6"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/3b/b2/23c66be9909efa7b585dc9d02fe7f083805fe0041ee308577151bd18ca7c/barbican-13.0.0.0rc1-py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-13.0.0.0rc1.tar.gz",
      "hashes": {
        "sha256": "a3fd4a0005e8005f986088e7a66182c1c87134984ef5d72883652b01c0543da7"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/2c/21/4ac69a05762fe354a4266237b24d86ce8d85933ee35dbf52733e5595a070/barbican-13.0.0.0rc1.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-13.0.0-py3-none-any.whl",
      "hashes": {
        "sha256": "1ba79a72fff3fb6cab8001d0ad39a4ace93964283f53cd5b47d1f2f0883d8356"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/34/de/beebf35aea1716d64b9f2b526aef14a411779b98f832a5e60208e0db6d2d/barbican-13.0.0-py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-13.0.0.tar.gz",
      "hashes": {
        "sha256": "08a5285d9d283a99d88079ee14c6dde3cd6ffcdaccad6caef1ba8b921576e84e"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/92/2d/c59de2ce4d6d5bccdec28a4005df0b4f3d47dcb5cc058b95f3a8ed0089c0/barbican-13.0.0.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-14.0.0.0rc1-py3-none-any.whl",
      "hashes": {
        "sha256": "9c8e7925786e184ec114e9d3f84c2fe33529b62fa386800a2d4e68560529bfc4"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/86/3e/b115a43477d52ea69c5b0736099c8fc7706090f111839aa600370963be04/barbican-14.0.0.0rc1-py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-14.0.0.0rc1.tar.gz",
      "hashes": {
        "sha256": "0450a699500a9f757d18ea810aefa970230a402d65e92d4c39b6f1dda48d6e26"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/46/9c/8d2ef41e6bc95661e457c8317b9882534031d30d4722c2b902189d32656f/barbican-14.0.0.0rc1.tar.gz",
      "yanked": false
    },
    {
      "filename": "barbican-14.0.0-py3-none-any.whl",
      "hashes": {
        "sha256": "27c679ba1d30a8a31545c9738fc70b43e58ba7e9fda0cc2415d1cac3825e5d95"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/8e/d2/e383faafb6ac8d47d447d26549a341bee52b619aa2a6d0d0c04c5d17e157/barbican-14.0.0-py3-none-any.whl",
      "yanked": false
    },
    {
      "filename": "barbican-14.0.0.tar.gz",
      "hashes": {
        "sha256": "1a034410189d045974bf70b703ecdce17c1a7b6a14814541e05ba5cb34f6e419"
      },
      "requires-python": ">=3.6",
      "url": "https://files.pythonhosted.org/packages/7a/93/551e43aefa86a6f57e1852d60568024e12a20f5f9bf316a37fc869c0c274/barbican-14.0.0.tar.gz",
      "yanked": false
    }
  ],
  "meta": {
    "_last-serial": 13345165,
    "api-version": "1.0"
  },
  "name": "barbican"
}
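
For anyone consuming this, a sketch of how a client might use the `hashes` object from one of the `files` entries to check a download. The function name `verify_file` is hypothetical, and skipping algorithms that `hashlib` does not know is a design choice of this example, not something the PEP mandates.

```python
import hashlib


def verify_file(file_entry: dict, payload: bytes) -> bool:
    """Check `payload` against every listed hash that hashlib supports.

    Returns False on any mismatch, and also when no hash could be checked.
    """
    checked = False
    for algorithm, expected in file_entry.get("hashes", {}).items():
        if algorithm not in hashlib.algorithms_available:
            continue  # unknown algorithm: skip rather than fail
        digest = hashlib.new(algorithm, payload).hexdigest()
        if digest != expected:
            return False
        checked = True
    return checked
```

Because `hashes` is an object keyed by algorithm name, the same loop keeps working if an index starts publishing more than one hash per file.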

I’m not going to purge the cache because that hammers our origin servers something fierce, but it should naturally become available over the next 24-48 hours or so.

Thanks everyone who contributed to making this PEP better!


Also want to echo thanks to all and especially to Donald for the original draft PEP and hammering the initial implementation through!

Now it’s finally time to start discussing xmlrpc deprecation!


I know I’m a little late to this, and I know it’s already been discussed and decided – but wanted to also voice my concern that choosing headers to route essentially rules out entirely static indexes from ever implementing this PEP – yes I read this section but I am unhappy with the conclusion there. (the section basically says s3: no, github pages: no, apache: yes, but this is no longer a “static site” if you need to run apache to negotiate headers).

it would essentially mean chriskuehl/dumb-pypi (a PyPI generator backed entirely by static files) would be dead in the water

Not dead in the water, you would just have to choose one content-type to serve. I think it will be a while before package downloaders will only support the JSON API.

Also for S3, I would recommend setting up CloudFront anyway to enable authentication (for private indices) and HTTPS.

you would just have to choose one content-type to serve. I think it will be a while before package downloaders will only support the JSON API.

yeah which means it would not be able to adopt this PEP – which is my point. It would be unable to serve both which means it would be incompatible with either new tools or old tools – also making it unable to transition. path-based json would be trivial to implement with static files whereas header-based is impossible.

I would recommend setting up CloudFront anyway

as far as I know cloudfront doesn’t help – it can only route by path not by header. you can have it front a lambda but again that’s not static files

The alternative to using a header is using different paths for json and html and expecting users to configure the right path, which is still entirely possible with PEP 691.

You’re not required to serve everything at one URL, you’re just able to. If you can’t do that it degrades to the same user experience we would have had otherwise.

Many thanks to @dstufft and the other contributors for working on this PEP and getting it accepted!


I did one more read over the PEP since the last changes, and besides some small tweaks (which I made a PR for), I’ve noticed one more key detail that seemed overlooked…

The name field in Project List being normalized vs un-normalized has been discussed, but the name field in Project Detail was not.
For the same reason the url field was dropped from the Project list, I don’t see any need for the normalized name to be returned in the Project Detail response, because you wouldn’t even be able to get that response without already knowing that name in the first place.
With the url field you needed to combine known information (the name) with knowledge of the spec. But this field is literally just returning back known information, which PEP503 also didn’t do.

So, I see 2 options:

  1. This field is dropped from the response. This would be the most consistent with PEP503.
  2. The field will be the un-normalized name instead of the normalized name. And unlike with the Project List, there is no need to be ambiguous about the name being normalized or not, because it wasn’t in PEP503 before.

I don’t precisely know what the allowed scope for changes is after a PEP has been accepted. But option 1 could be considered backwards-incompatible compared to the already accepted PEP-0691, while option 2 doesn’t have to be.

That’s technically only true if you were given/calculated the URL via “discovery” from the PEP’s URL structure or the project index. But if you used some other mechanism to get the JSON data then that doesn’t hold true.

I would be fine with that, but I’m also fine with leaving it so the JSON data is a bit more self-contained (it’s also probably not a ton of bytes compared to the rest of the JSON payload, plus it may compress okay since it will match what’s in the wheel file name).

See PEP 691: JSON-based Simple API for Python Package Indexes - #61 by brettcannon for my view of the purposes of normalized vs. non-normalized. I also don’t know how much it would add to the payload, since it may not compress as well as the normalized name might.

If we make the change right now, before anyone has had time to really implement it, then the PEP authors can propose a change and I make a call to accept it or not.

I don’t have strong opinions either way, I included it because in practice PyPI’s implementation of /simple/foo/ included it, and it was the normalized name. It’s a relatively small amount of data so I wasn’t too concerned either way.

In PEP 694: Upload 2.0 API for Python Package Repositories, it states that unlike this PEP, the new upload API uses a new endpoint.

Unlike PEP 691, this PEP does not change the existing 1.0 API in any way, so servers will be required to host the new API described in this PEP at a different endpoint than the existing upload API.

Why was the upload API chosen to be a separate endpoint, while the JSON API focuses on a single endpoint differentiated by headers (with the option of course to point at different endpoints per format)? It’s definitely more natural to use a different endpoint for new upload semantics, but that same argument could be applied to the new download API as well.


Because the URL structure, and the semantics of what those URLs are, of the simple API hasn’t changed at all in PEP 691. It was just creating a different representation of the same data.

For the Upload API the semantics of the URLs are drastically different to the point that they have almost nothing in common with each other.


Just to close the loop here since there were some concerns with static mirrors.

I have working configurations for both Apache and Nginx for bandersnatch.

Assuming you have Apache configured to have mod_negotiation enabled and to allow .htaccess, you can implement basic support for PEP 691 by writing index.html, index.v1_html, and index.v1_json files for all of the URLs, and dropping a top level .htaccess that looks like:

Options -Indexes +Multiviews

DirectoryIndex index

AddType application/vnd.pypi.simple.v1+json v1_json
AddType application/vnd.pypi.simple.v1+html v1_html

That doesn’t support the latest version (Apache doesn’t make it easy to separate the returned content type from the content type specified in the Accept header) or the ?format= query param. Both of those things are supportable I think using mod_rewrite, but I didn’t have time to dig into it further.

A weird artifact of the Apache configuration is that it doesn’t have any option to configure a server side preference, so in cases where multiple content types are equally preferred by the client, it will return whichever response is the smallest.
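
For a mirror generator, producing the three files per URL could look roughly like this. It is a sketch only: `write_project_list` is a made-up name, the HTML is the bare minimum, and it only covers the project list page, not the per-project pages.

```python
import json
import re
from html import escape
from pathlib import Path


def normalize(name: str) -> str:
    """PEP 503 name normalization, used for the link targets."""
    return re.sub(r"[-_.]+", "-", name).lower()


def write_project_list(directory: Path, names: list[str]) -> None:
    """Write index.html, index.v1_html and index.v1_json into `directory`."""
    directory.mkdir(parents=True, exist_ok=True)
    links = "\n".join(
        f'    <a href="{normalize(name)}/">{escape(name)}</a>' for name in names
    )
    html = (
        "<!DOCTYPE html>\n<html>\n  <head>\n"
        '    <meta name="pypi:repository-version" content="1.0">\n'
        "  </head>\n  <body>\n" + links + "\n  </body>\n</html>\n"
    )
    payload = {
        "meta": {"api-version": "1.0"},
        "projects": [{"name": name} for name in names],
    }
    # The legacy and versioned HTML files carry the same bytes; only the
    # Content-Type the web server attaches to them differs.
    (directory / "index.html").write_text(html, encoding="utf-8")
    (directory / "index.v1_html").write_text(html, encoding="utf-8")
    (directory / "index.v1_json").write_text(json.dumps(payload), encoding="utf-8")
```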

The Nginx configuration is a little more complex, to see the whole thing you’re best off looking at the bandersnatch issue, but the important parts are:

http {

    # ...

    map $http_accept $mirror_suffix {
        default ".html";

        "~*application/vnd\.pypi\.simple\.latest\+json" ".v1_json";
        "~*application/vnd\.pypi\.simple\.latest\+html" ".v1_html";

        "~*application/vnd\.pypi\.simple\.v1\+json" ".v1_json";
        "~*application/vnd\.pypi\.simple\.v1\+html" ".v1_html";

        "~*text/html" ".html";
    }

    map $arg_format $mirror_suffix_via_url {
        "application/vnd.pypi.simple.latest+json" ".v1_json";
        "application/vnd.pypi.simple.latest+html" ".v1_html";

        "application/vnd.pypi.simple.v1+json" ".v1_json";
        "application/vnd.pypi.simple.v1+html" ".v1_html";

        "text/html" ".html";
    }

    server {

        # ...

        location /simple/ {
            index index$mirror_suffix_via_url index$mirror_suffix;

            types {
                application/vnd.pypi.simple.v1+json v1_json;
                application/vnd.pypi.simple.v1+html v1_html;
                text/html html;
            }
        }
    }
}

This doesn’t actually implement conneg, in that Nginx is not parsing the Accept header and doing the full content negotiation algorithm as recommended by the RFC, and instead it’s just doing a regex match against the Accept header (and a basic string equals against the ?format= parameter) and mapping that to a file extension that gets set in the index directive.

In practice, this should be fine. The main downside is it won’t let clients express a relative preference between the content types in their Accept header (they can specify it, nginx just won’t pay attention to it). The RFCs don’t require the server to take the relative client preferences into account, so it’s valid not to do that, it’s just somewhat better if you do.
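
For comparison, honoring the client's relative preferences is not much code in a server that can run some. The following is a simplified sketch, not a full implementation of the RFC algorithm: it ignores wildcards such as `*/*`, and the names `parse_accept` and `negotiate` are made up for the example.

```python
from __future__ import annotations


def parse_accept(header: str) -> list[tuple[str, float]]:
    """Return (media type, q) pairs, highest q first; ties keep header order."""
    entries = []
    for part in header.split(","):
        media_type, *params = (piece.strip() for piece in part.split(";"))
        if not media_type:
            continue
        quality = 1.0  # a missing q parameter means q=1
        for param in params:
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        entries.append((media_type.lower(), quality))
    # sorted() is stable, so equally weighted types stay in header order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def negotiate(header: str, available: list[str]) -> str | None:
    """Pick the client's most preferred type that the server can produce."""
    for media_type, quality in parse_accept(header):
        if quality > 0 and media_type in available:
            return media_type
    return None
```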

Unlike the Apache example, the Nginx example allows setting the default value you want when there is no Accept header, or the Accept header doesn’t contain one of the specified content types, which is controlled by the default value in the first map. It also allows the server to express a preference between the content types, controlled by putting the preferred, non-default, option higher in the map.

In addition, the nginx example also supports:

  • The latest version, which will return the correct Content-Type.
  • The ?format= query string, which correctly overrides the Accept header.
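
If it helps to see the selection logic outside of Nginx syntax, here is an approximate Python rendering of the two maps and the index directive. It is a sketch: the function names are invented, plain substring checks stand in for the regexes, and the raw (undecoded) query value is compared on the assumption that this mirrors how the config's string match behaves, so a literal `+` in `?format=` stays a `+`.

```python
from urllib.parse import urlsplit

# Order matters: earlier entries win, which is how the server states a preference.
SUFFIXES = {
    "application/vnd.pypi.simple.latest+json": ".v1_json",
    "application/vnd.pypi.simple.latest+html": ".v1_html",
    "application/vnd.pypi.simple.v1+json": ".v1_json",
    "application/vnd.pypi.simple.v1+html": ".v1_html",
    "text/html": ".html",
}
DEFAULT_SUFFIX = ".html"


def raw_query_arg(url: str, name: str) -> str:
    """Return the undecoded value of a query parameter, or '' if absent."""
    for pair in urlsplit(url).query.split("&"):
        key, _, value = pair.partition("=")
        if key == name:
            return value
    return ""


def index_file_for(url: str, accept: str) -> str:
    """Choose which index file to serve for a request."""
    requested = raw_query_arg(url, "format")
    if requested in SUFFIXES:
        # The ?format= parameter overrides the Accept header.
        return "index" + SUFFIXES[requested]
    lowered = accept.lower()
    for media_type, suffix in SUFFIXES.items():
        if media_type in lowered:
            return "index" + suffix
    return "index" + DEFAULT_SUFFIX
```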

Those two web servers probably cover the bulk of all static mirrors out there, and of course (as mentioned in the PEP), if someone is in a situation where they cannot use conneg, the PEP still supports using independent URLs for different versions, and selecting html or json by configuring your index url in the client.
