Thanks everyone for the great feedback and questions! We are keeping an eye on them and plan to respond in more detail on technical matters and administrative questions, but I think the most important thing to discuss right now is this:
The question then is whether we believe that a more diverse federation of independent indexes is better, feasible, and usable. PEP 759 definitely takes the position that PyPI is the primary, authoritative index for the Python community, and that additional indexes are unwieldy to maintain and difficult to use. Unfixably so? Let’s dive in and see where it leads us.
PEP 708, which has been provisionally accepted, is part of the trust mix that reduces the chances for dependency confusion attacks across multiple indexes. It doesn’t address the ease of setting up and maintaining alternate indexes and doesn’t address the end user UX of enabling multiple indexes. Even so, it’s not clear what the status and prognosis are for meeting its full acceptance criteria and for adoption across the ecosystem.
Let’s say that simpleindex makes it at least as easy to set up and maintain an alternative index as it would be to set up and maintain a PEP 759 external host. There’s an example in simpleindex’s README for enabling S3 routes, and in our minds we were thinking about S3 as a possibility for PEP 759 external hosted URLs. Let’s further say that simpleindex or something like it could handle the sustained and peak loads it could potentially be put under for large, popular indexes.
The question then is, can we make the UX for multiple indexes work?
Imagine a complex dependency graph that ends up hitting four indexes: PyPI, and indexes AI, BI, and CI. All four host packages that need to be installed for the user’s application to work. Their top-level dependency lives on PyPI, so they just `pip install mydep`. That would break at some point when the package they need refers to AI. So then they add `--extra-index-url` for AI and try to install again, but now it fails a little later trying to access BI. Rinse and repeat.
Or worse, it seems to work but in fact doesn’t really install the correct mix of package and versions.
There’s no way for that user, who may not even know anything about extra indexes or deep transitive dependencies, to get a working venv out of the box, let alone after minutes or hours of frustrating interwebs searching. That scenario will leave a bad taste, so the “out of the box experience for non-experts” has to be concretely addressed. I’d go so far as to say that’s the most important UX problem to solve.
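To make the "rinse and repeat" loop concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the package names `pkgA`, `pkgB`, and `pkgC`, the mapping of packages to indexes, and the `try_install` helper are invented for illustration and are not how `pip` works internally. It is also optimistic, since it assumes each failure tells the user exactly which index to add next, which a real failed install would not do.

```python
# Hypothetical data: which index hosts each package, and what each package requires.
HOSTED_ON = {"mydep": "PyPI", "pkgA": "AI", "pkgB": "BI", "pkgC": "CI"}
REQUIRES = {"mydep": ["pkgA"], "pkgA": ["pkgB"], "pkgB": ["pkgC"], "pkgC": []}


def try_install(top_level, enabled_indexes):
    """Walk the dependency graph.

    Returns the first index that is needed but not enabled,
    or None if every package could be found.
    """
    pending = [top_level]
    seen = set()
    while pending:
        name = pending.pop()
        if name in seen:
            continue
        seen.add(name)
        index = HOSTED_ON[name]
        if index not in enabled_indexes:
            return index  # the install "breaks" here
        pending.extend(REQUIRES[name])
    return None


def attempts_until_success(top_level):
    """Count install attempts, adding one missing index after each failure."""
    enabled = {"PyPI"}  # the only index a new user has configured
    attempts = 0
    while True:
        attempts += 1
        missing = try_install(top_level, enabled)
        if missing is None:
            return attempts, enabled
        enabled.add(missing)  # the user's next --extra-index-url


attempts, enabled = attempts_until_success("mydep")
print(attempts, sorted(enabled))
```

With this toy graph the user needs four attempts before the install succeeds, and that is the best case where they always know which index to add.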
From a quick read, @sinoroc’s idea wouldn’t help here, but also, how do you make such pre-configurations portable to other desktops or CI machines? I don’t think you really want to have to copy-and-paste long `pip` command lines either.
But let’s say we solve that too. Now comes the problem of index priority, which has been brought up elsewhere. E.g. if each of AI, BI, and CI host `rootpkg`, how will `pip` know which index is the “right” one to get it from? How will it know that BI’s version is out of date, so get it from CI? Extend that to many packages in the transitive dependency graph, and now you have an even more complex configuration problem. I’d need to tell `pip`: AI is the highest priority for `pkgA`, but BI is the highest priority for `pkgB` and `pkgZ`, CI is the highest priority for `pkgC`, and use PyPI for everything else. Both discovering this and configuring this are going to be extremely challenging, I suspect.
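As a rough sketch of what that per-package routing amounts to, here it is written out as data in Python. This is purely illustrative: the catalog contents, the version tuples, and the `PREFERRED_INDEX` table are all made up, and none of it mirrors any real `pip` or `uv` configuration syntax.

```python
# Hypothetical catalog: index name -> {package name: latest version hosted there}.
CATALOG = {
    "PyPI": {"otherpkg": (1, 4)},
    "AI": {"pkgA": (1, 0), "rootpkg": (1, 0)},
    "BI": {"pkgB": (2, 1), "pkgZ": (0, 9), "rootpkg": (1, 1)},
    "CI": {"pkgC": (3, 0), "rootpkg": (2, 0)},
}

# Hypothetical routing table the user would have to discover and maintain by hand.
PREFERRED_INDEX = {"pkgA": "AI", "pkgB": "BI", "pkgZ": "BI", "pkgC": "CI"}
DEFAULT_INDEX = "PyPI"  # "use PyPI for everything else"


def pick_index(package):
    """Return the index this package should come from under the routing table."""
    index = PREFERRED_INDEX.get(package, DEFAULT_INDEX)
    if package not in CATALOG[index]:
        raise LookupError(f"{package} is not hosted on {index}")
    return index


def candidates(package):
    """List every index that hosts the package, to show where a choice is needed."""
    return sorted(name for name, hosted in CATALOG.items() if package in hosted)


print(pick_index("pkgZ"))     # routed by the table
print(candidates("rootpkg"))  # hosted on several indexes, with no rule to choose
```

Note that `rootpkg` has no entry in the table, so `pick_index("rootpkg")` falls through to PyPI and fails, even though three other indexes host it at different versions. Someone still has to decide which of those is the “right” one, for every such package in the graph.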
A further challenge is implementing all of this across the ecosystem. As @dustin points out, these aren’t challenges that can be overcome by standards. You’d have to work it out in `pip` and `uv` at a minimum, find UX and configuration that works for both tools, wait for those changes to roll out, and have some suboptimal answer for the long tail.
I think that’s an enormous effort, and I’m skeptical as a practical matter that it can happen. But I’m also open and eager to read the persuasive refutations I expect will follow soon! 😃
In contrast, PEP 759 unequivocally doubles down on PyPI being both the canonical source of truth for package metadata, and the default such index, with all the trust that implies.