PEP 691: JSON-based Simple API for Python Package Indexes

Just to close the loop here since there were some concerns with static mirrors.

I have working configurations for both Apache and Nginx for bandersnatch.

Assuming you have Apache configured to have mod_negotiation enabled and to allow .htaccess, you can implement basic support for PEP 691 by writing index.html, index.v1_html, and index.v1_json files for all of the URLs, and dropping a top level .htaccess that looks like:

Options -Indexes +Multiviews

DirectoryIndex index

AddType application/vnd.pypi.simple.v1+json v1_json
AddType application/vnd.pypi.simple.v1+html v1_html

That doesn’t support the latest version (Apache doesn’t make it easy to separate the returned content type from the content type specified in the Accept header) or the ?format= query param. Both of those things are supportable I think using mod_rewrite, but I didnt’ have time to dig into it further.

A weird artifact of the Apache configuration is that it doesn’t have any option to configure a server side preference, so in cases where multiple content types are equally preferred by the client, it will return which ever response is the smallest.

The Nginx configuration is a little more complex, to see the whole thing you’re best off looking at the bandersnatch issue, but the important parts are:

http {

    # ...

    map $http_accept $mirror_suffix {
        default ".html";

        "~*application/vnd\.pypi\.simple\.latest\+json" ".v1_json";
        "~*application/vnd\.pypi\.simple\.latest\+html" ".v1_html";

        "~*application/vnd\.pypi\.simple\.v1\+json" ".v1_json";
        "~*application/vnd\.pypi\.simple\.v1\+html" ".v1_html";

        "~*text/html" ".html";
    }

    map $arg_format $mirror_suffix_via_url {
        "application/vnd.pypi.simple.latest+json" ".v1_json";
        "application/vnd.pypi.simple.latest+html" ".v1_html";

        "application/vnd.pypi.simple.v1+json" ".v1_json";
        "application/vnd.pypi.simple.v1+html" ".v1_html";

        "text/html" ".html";
    }

    server {

        # ...

        location /simple/ {
            index index$mirror_suffix_via_url index$mirror_suffix;

            types {
                application/vnd.pypi.simple.v1+json v1_json;
                application/vnd.pypi.simple.v1+html v1_html;
                text/html html;
            }
        }
    }
}

This doesn’t actually implement conneg, in that Nginx is not parsing the Accept header and doing the full content negotiation algorithm as recommended by the RFC, and instead it’s just doing a regex match against the Accept header (and a basic string equals against the ?format= parameter) and mapping that to a file extension that gets set in the index directive.

In practice, this should be fine. The main downside is it won’t let clients express a relative preference between the content types in their Accept header (they can specify it, nginx just won’t pay attention to it). The RFCs don’t require the server to take the relative client preferences into account, so it’s valid not to do that, it’s just somewhat better if you do.

Unlike the Apache example, the Nginx example allows setting the default value you want when there is no Accept header, or the Accept header doesn’t contain one of the specified content types, which is controlled by the default value in the first map. It also allows the server to express a preference between the content types, controlled by putting the preferred, non-default, option higher in the map.

In addition, the nginx example also supports:

  • The latest version, which will return the correct Content-Type.
  • The ?format= query string, which correctly overrides the Accept header.

Those two web servers probably cover the bulk of all static mirrors out there, and of course (as mentioned in the PEP), if someone is in a situation where they cannot use conneg, the PEP still supports using independent URLs for different versions, and selecting html or json by configuring your index url in the client.

1 Like