Adding a global config to specify package indexes

While we’re throwing ideas into the mix, I would love to have:

  • per-site settings (i.e. somewhere under sys.prefix, so that my environment/distro/venv can have its own repos)
  • “final” markers (so that no other configuration is able to override/remove it, because in many security-conscious environments this is the best way to handle it), or alternatively:
  • remapping indexes, that take a URL prefix and replace it (same reasons - if anyone specifies https://pypi.io/... in an env var then my site/global config can point it back to our internal mirror)

Those last two suggestions would obviously not be recommended for frequent use - not every setting has to be used all the time - but they are the only way to enforce these kinds of restrictions in the spec.
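To make the remapping idea concrete, here is a minimal sketch. The remap table, the mirror URL and the `remap_index_url` helper are all hypothetical names invented for illustration; no existing tool is claimed to work this way.

```python
# Hypothetical site-level remap table: URL prefix -> replacement prefix.
# The mirror host below is made up for the example.
REMAPS = {
    "https://pypi.io/": "https://mirror.example.internal/pypi/",
}


def remap_index_url(url: str, remaps: dict[str, str]) -> str:
    """Rewrite `url` using the longest matching prefix, if any matches."""
    matches = [prefix for prefix in remaps if url.startswith(prefix)]
    if not matches:
        return url  # nothing to remap, leave the URL untouched
    prefix = max(matches, key=len)
    return remaps[prefix] + url[len(prefix):]
```

A site config could apply this to every index URL after all other config sources have been read, so a value coming from an env var would still end up at the internal mirror.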

This is a very important question.

There is a pain point with the current pip config: other tools wrap around it but extend it in their own ways, resulting in fragmentation and each tool reinventing the others’ wheels. My understanding is that this is likely the primary reason why so many people in this thread are throwing ideas onto the table and want to see improvements in this area.

So, a question to raise here is who this PEP would be for. Is it for pip, the tools, or both?
The userbase of pip is ginormous, so pip can’t just break backwards compatibility, like you say @pf_moore. The other tools likely have an easier time doing this, especially if this starts solving long-standing problems.

One thing to consider is to improve the existing pip.ini instead. But wouldn’t this mean that a handshake with the pip development team needs to take place, taking both their roadmap and funding into consideration, making it even harder to nail a PEP and get it passed?

EDIT: typos, wording…

We don’t need a PEP if it’s just for pip, just a feature request. But then what about other installers? I know there aren’t any right now, realistically, but I’m a very strong believer that we mustn’t lock ourselves back into a state where we cannot replace pip because too many things depend on its exact behaviour.

So the PEP has to be for everyone to agree on a common means of specifying some config information. So there are a few questions:

  1. What config information precisely? @FFY00 specifically wants index details. I suggested network config. Anything else?
  2. What functionality do we need? Global/per-venv/user levels of setting? Setting via file and environment variables? What about per-invocation (command line options)? Pip’s config system is pretty complex, not because we wanted it to be, but because users wanted that level of flexibility. Can we realistically replace that with a less-complex system? If not, what will happen to users who rely on the flexibility pip offers?
  3. What about backward compatibility? How will we migrate from the existing pip-specific settings to the new standard? During the transition, what will we do about people who have both types of settings?

At the moment, we have a lot of questions, but very few answers.

IMO, someone needs to make some decisions, come up with a proposal, and let it be challenged. Otherwise, we’ll just continue asking more and more questions, but not making progress.

Here’s a strawman proposal, not to be taken seriously, but to give us something to measure other proposals against, and also in the hope that if people challenge it, we’ll get more clarity on what the actual problems are with the status quo:

The Status Quo

  • We retain pip’s existing configuration system, and document it as “the standard approach”.
  • Only the --index-url, --no-index and --extra-index-url options are considered “standard”. This meets the scope defined by @FFY00 - we can add more if people want, by naming pip options explicitly.
  • Tools that want to read the configuration can read it in the same way as pip does. If there is sufficient interest, pip’s option handling code could be extracted to a separate utility library, which pip would then vendor. The pip devs likely won’t do that; it would be down to a 3rd party to take that work on.
  • There are a couple of warts with this proposal, but these are not considered showstoppers:
    • The filenames include “pip” in the name. This could be changed, but it’s not clear that the compatibility impact justifies such a change.
    • Pip stores additional configuration in the files, which is not covered by this standard. That’s again sub-optimal but acceptable.
  • There are no backward compatibility problems to consider, because nothing is changing.

I repeat - this is not a serious proposal¹!!! But it can act as a benchmark against which genuine proposals can be measured.

¹ Except in the sense that it’s what will happen if nobody suggests anything else…
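For a feel of what “read it in the same way as pip does” could mean for another tool, here is a rough sketch that pulls only the three “standard” options out of an INI-formatted config. It is a simplification under stated assumptions: it only looks at the `[global]` section and ignores config file locations, environment variables and command line options, all of which pip also considers. `read_index_settings` and the example URLs are hypothetical.

```python
import configparser

# Example file contents in the INI format used by pip's config files.
# The URLs are placeholders.
SAMPLE = """
[global]
index-url = https://mirror.example.internal/simple
extra-index-url =
    https://a.example.org/simple
    https://b.example.org/simple
"""

STANDARD_KEYS = ("index-url", "extra-index-url", "no-index")


def read_index_settings(text: str) -> dict[str, list[str]]:
    """Return only the 'standard' options found in the [global] section."""
    parser = configparser.RawConfigParser()  # no %-interpolation of values
    parser.read_string(text)
    settings: dict[str, list[str]] = {}
    if parser.has_section("global"):
        for key in STANDARD_KEYS:
            if parser.has_option("global", key):
                # A multi-line value becomes one entry per whitespace-separated item.
                settings[key] = parser.get("global", key).split()
    return settings
```

Anything else in the file (the “additional configuration” wart above) is simply skipped by this reader.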

Improve how? I’m not being awkward here, someone needs to explain precisely what’s wrong with pip.ini, otherwise how can we change it?

This thread started as “let’s share the URLs to the package indexes”, and seems to have taken a turn towards “let’s share as much config as possible”. I have definitely seen a need for the former (see this for example, which involves the indexes of pip, poetry and tox), and would also be on board for the latter.

Maybe all that’s needed to get this started is to write a (PyPA) PEP to say: "we have this PEP-standardized ~/.config/pypa/pepxxx.toml file and we will slowly start adding sections and fields to it, just like we did with pyproject.toml". The first section might be a [pep503.repositories]. Then build up from there.


Probably out of scope but related…

One thing I would like to see, in the same vein, is a shared download cache. Wheels and sdists downloaded by pip, poetry, etc. are all the same, but as far as I know they are not shared. Once the config for PEP 503 indexes is standardized, it could be a good idea to have a shared cache where poetry could see that pip already downloaded some of the distributions from that exact same package index.

IMO going after different configs with unclear practical use-cases might be a bit too ambitious. Personally, I believe having all of this complexity in config discovery is a bad thing, but there are use-cases that require it, so we may need to think about that.

I really only want the config options already supported in pip.conf, but available to everyone. So, my proposal would be to introduce a new configuration file that simply defines the same index settings that pip.conf does (format can be different, but same information). It would be adopted by pip and the old configuration options deprecated. During the adoption, users might have to have both configs, if their pip version does not yet support the new config, which seems perfectly reasonable to me.

Afterwards, people that want to tackle different use-cases can submit a new PEP outlining their needs and proposing additions, either to the config itself or the config discovery, if needed.

Coming back to the index proposal, here’s a bunch of questions for the PEP to answer:

  • Should we take indexes as a list of strings, even if there’s only 1?
  • How many places do we want to allow this configuration to be loaded from?
  • How do the various files interact with each other? Merge with override (similar to dict.update)?
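To show what “merge with override” would imply, here is a small sketch. The layer names, the `timeout` key and the URLs are hypothetical; the point is only the `dict.update` semantics.

```python
def merge_configs(*layers: dict) -> dict:
    """Merge config layers in order; later layers override earlier ones."""
    merged: dict = {}
    for layer in layers:
        merged.update(layer)  # whole values are replaced, never combined
    return merged


# Hypothetical site-wide and per-user layers.
site = {"indexes": ["https://mirror.example.internal/simple"], "timeout": 15}
user = {"indexes": ["https://pypi.org/simple"]}
merged = merge_configs(site, user)
```

Note that with plain `dict.update` the user’s `indexes` list replaces the site’s list wholesale rather than extending it, which is exactly the kind of interaction the PEP would need to spell out.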

If you want some novelty that’ll create more design and consensus-building work:

  • Do we want to have “index shadowing”? That is, if pkgA is in the first index, use it from there and don’t check whether there’s a newer version in index B.

As I mentioned above, I think it would be best to just leave the scope of this PEP to one location.

However many locations is fine, provided they are well defined (for all platforms we care to define them for) and the interactions between them are well defined (i.e. append vs. prepend vs. replace vs. final).
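Here is one way the append vs. prepend vs. replace vs. final interactions could be pinned down, purely as a sketch. The `Layer` class, its field names and the rule that a final layer stops processing are all assumptions for illustration, not a proposal for the actual spec.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """One config location, processed from lowest to highest priority."""
    indexes: list[str]
    mode: str = "replace"  # "append", "prepend" or "replace"
    final: bool = False    # if True, all later layers are ignored


def combine(layers: list[Layer]) -> list[str]:
    """Fold the layers into a single ordered list of indexes."""
    result: list[str] = []
    for layer in layers:
        if layer.mode == "append":
            result = result + layer.indexes
        elif layer.mode == "prepend":
            result = layer.indexes + result
        elif layer.mode == "replace":
            result = list(layer.indexes)
        else:
            raise ValueError(f"unknown mode: {layer.mode!r}")
        if layer.final:
            break  # nothing after a final layer may change the result
    return result
```

With a site layer marked `final=True`, a user layer that tries to append another index has no effect, which is the security-conscious behaviour described earlier in the thread.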

I don’t think it has to change pip’s current settings at all, just provide another default location to load the values from. If pip’s own config/env/CLI exists, then that can take precedence for pip.

I would like to have this, or something to this effect (I’d prefer an index constraint for packages listed in a constraints file, I think…), but it’s out of scope for this.

Arguably the only thing that is needed for specifying indexes is a name=url mapping, so that “--index foo” and “-r foo” and “--channel foo” in three different tools can all look up foo in the same place.

And perhaps it is most easily handled by just adding it to packaging and saying “dear tool developer, this isn’t a formal standard, but if you want to save yourself the trouble of choosing where to put config files and writing docs for your users, you can just use this”, which is at least as compelling as a written standard would be.

I’m not clear what the actual use case for this feature is, so I have no real feel for whether pip’s config taking precedence is actually useful behaviour. What would someone putting a setting in this new file be expecting to happen?

I’ve begun drafting something.

This is something I’ve not thought of before. What would the ideal default behavior be here, if this was a per-repository attribute?

I’m guessing you normally don’t want “shadowing”, if you think of it as “variable shadowing”?
Meaning, if you had shadow=false, a pkg was found in index1 and index2, it would take it from index1. Right?

EDIT: Managed to get the logic wrong in my example…

Pip’s documented behaviour is that it’s undefined. If you have files for package-a version X.Y in both index1 and index2, then you can take either as they are expected to be identical.

In reality pip does pick one over the other, but it’s implementation-defined. This is a contentious behaviour (it’s linked to the “dependency confusion” debate that happened not too long ago). Whatever you decide, someone’s bound to hate you :wink:

There are two types of shadowing in play here:

  1. When the same package and version appear in both index A and B, prefer one index over the other.

  2. When the same package appears in both index A and B, and no specific version was requested, use the highest compatible version from A (or B) even if there is a newer version in B (or A).
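The difference between the two types is easier to see in code. This is a sketch only: versions are plain integer tuples, “compatible” is ignored, and the `pick` function, its `shadow` flag and the index contents are invented for the example. It does not describe what pip does.

```python
Version = tuple[int, ...]

# Hypothetical index contents: index name -> {package: [versions]}.
INDEXES: dict[str, dict[str, list[Version]]] = {
    "A": {"pkg": [(1, 0), (1, 1)]},
    "B": {"pkg": [(1, 1), (2, 0)]},
}
ORDER = ["A", "B"]  # priority order, highest priority first


def pick(package, order, indexes, shadow):
    """Return (index name, version) of the chosen candidate, or None.

    shadow=True  -> type 2: the first index that has the package wins,
                    even if a later index has a newer version.
    shadow=False -> highest version across all indexes; when the same
                    version is in several indexes, the earlier index
                    is preferred (type 1).
    """
    candidates = []
    for rank, name in enumerate(order):
        versions = indexes.get(name, {}).get(package, [])
        if not versions:
            continue
        # -rank makes the earlier index win a tie on version under max().
        candidates += [(version, -rank, name) for version in versions]
        if shadow:
            break  # do not look at lower-priority indexes
    if not candidates:
        return None
    version, _, name = max(candidates)
    return name, version
```

With the data above, `shadow=True` picks version 1.1 from A, while `shadow=False` picks version 2.0 from B.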

It has been a bit silent here. Just want to chime in and say I am still drafting something up together with a few others. Will post the link to the draft and invite everyone to the party as soon as we have ironed out the basics.

I don’t want to give you an ETA, really. Life with work, family and kids makes these things always take a bit longer than you expect. :slight_smile:

EDIT: By the way, I’m doing this because I thought it would be rewarding to have the experience of - and interesting to be part of - drafting a PEP. So I don’t know what the process is usually like. Please don’t take offence if you feel I am going about this the wrong way. Let me know what you think!

The conversation over at my draft has gone stale for some time and so I’m posting it here:

If anyone has any input/feedback, that would be greatly appreciated. I think the top priority right now is to define the scope, to avoid scope creep.