PEP 755: Implicit namespace policy for PyPI

You’re missing my point. You specifically said:

I’m saying that the idea that we might want to work on explicit namespaces, the precise solution to this problem that you rejected in favour of this proposal, is an admission that this proposal is not the best solution, but merely a half-measure that you feel is cheap enough to get implemented now.

If companies value namespaces enough, they could fund work on explicit namespaces. The fact that they aren’t willing to do so indicates that the cost isn’t worth the benefit to them. And it appears not to be for open source projects, so if it’s not worth it for companies either, why should we do it? If implementing implicit namespaces is enough to get companies to pay for the feature, we should absolutely not direct that money towards a further solution in this area - companies have what they are willing to pay for, so we should use the money elsewhere, for projects the community feels are worthwhile.

While I’m not saying this is what you intend, the way you present this feels like a “bait and switch” - get the community to agree to implicit namespaces on the promise of it generating funding, and then use that funding to improve namespaces, rather than provide benefits for the community. At this point, you really need to focus more on trying to bring the community on board, rather than discussing how companies might gain from it - doing that feels like it’s losing you support. I suggest you think about what the benefits of this proposal would be if nobody was willing to pay to use it. If there isn’t a case for the feature in that situation, then in a very real sense, this is a case of pure funded work for the benefit of the people paying. And expecting volunteers to spend time on such work, rather than hiring contributors to work on it, feels wrong to me.

I think if explicit namespaces were implemented already that this would still be a valid proposal (unless the flat namespace goes away, which it won’t). I view the concept of implicit namespaces over the flat namespace as a prerequisite for explicit namespaces.

I don’t. I view the explicit namespace idea (using a new syntax to specify a namespaced package, leaving the existing flat namespace unchanged) as a way of avoiding the need to carve up the existing flat namespace. And that’s fundamentally what I object to in this proposal - having a way for an organisation to claim a chunk of the existing namespace, ignoring or overriding current naming conventions like flufl.*[1].

You seem to be ignoring the idea that there are people who like the existing open, flat, namespace.


  1. Particularly the fact that only organisations can claim chunks of the namespace, not individuals ↩︎

Explicit namespaces already exist in the form of alternate index servers/repositories, and I’d argue that the companies who value them enough are already paying to create and/or use them.
What we really find is missing here is a user experience that leads users to be able to use them easily and safely.

A good example of “easily” is conda’s -c option replacing simple names with a hosted URL by that name (e.g. -c pytorch implies -c https://repo.anaconda.org/pytorch,[1] which is owned by the PyTorch team).

An example of not-safely is torchtriton.

For these reasons, big orgs prefer to release into the flat PyPI namespace, because it ensures their users get the best experience. But I’m sure if these problems were solved in a different way, then there’d be less pressure to restrict the PyPI namespace.


  1. Or something like that, don’t quote me. It’s definitely on anaconda.org, which is a GitHub-like site for hosting your own conda and PyPI-like feeds. ↩︎

I argued that, but @ofek put it in the rejected section stating that alternate index servers don’t address the issue :person_shrugging:

Agreed, better UX would help. That’s not a standards matter, though. I believe uv is working on that. Pip had someone (I can’t recall who) offer to try to get funding for work on it, but it hasn’t materialised yet.

Agreed. So let’s reject this proposal in favour of those other solutions. And if there’s company pressure behind this proposal, let’s politely suggest that such pressure gets turned into concrete funding for UX studies and work on tool UI improvements…

Someone could start by adding this to psf/fundable-packaging-improvements on GitHub (“Packaging improvements that could be funded”).

I doubt it. I’m curious whether either of you actually disagrees with the section. I’ll reproduce it here:


Critically, this imposes a burden on projects to maintain their own infra. This is an unrealistic expectation for the vast majority of companies and a complete non-starter for community projects.

This does not help in most cases because the default behavior of most package managers is to use PyPI so users attempting to perform a simple pip install would already be vulnerable to malicious packages.

In this theoretical future every project must document how to add their repository to dependency resolution, which would be different for each package manager. Few package managers are able to download specific dependencies from specific repositories and would require users to use verbose configuration in the common case.

The ones that do not support this would instead find a given package using an ordered enumeration of repositories, leading to dependency confusion. For example, say a user wants two packages from two custom repositories X and Y. If each repository has both packages but one is malicious on X and the other is malicious on Y then the user would be unable to satisfy their requirements without encountering a malicious package.
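
The two-repository scenario above can be made concrete with a small simulation. This is only a sketch of first-match resolution over an ordered list of indexes, not how any particular installer behaves: the index contents, the package names `pkg-a` and `pkg-b`, and the helper functions are all hypothetical.

```python
from itertools import permutations

# Hypothetical index contents: each repository serves both packages,
# but one copy on each repository is malicious.
INDEXES = {
    "X": {"pkg-a": "malicious", "pkg-b": "genuine"},
    "Y": {"pkg-a": "genuine", "pkg-b": "malicious"},
}


def resolve_first_match(package, index_order):
    """Return (index, status) from the first index in index_order that serves package."""
    for index in index_order:
        if package in INDEXES[index]:
            return index, INDEXES[index][package]
    raise LookupError(f"{package} not found on any index")


def resolve_all(packages, index_order):
    """Resolve every package using the same ordered enumeration of indexes."""
    return {pkg: resolve_first_match(pkg, index_order) for pkg in packages}


def safe_orderings(packages):
    """List every index ordering that yields only genuine packages."""
    return [
        order
        for order in permutations(INDEXES)
        if all(
            status == "genuine"
            for _, status in resolve_all(packages, order).values()
        )
    ]


wanted = ["pkg-a", "pkg-b"]
print(resolve_all(wanted, ("X", "Y")))  # pkg-a comes from X, where it is malicious
print(resolve_all(wanted, ("Y", "X")))  # pkg-b comes from Y, where it is malicious
print(safe_orderings(wanted))           # no ordering avoids both malicious copies
```

Pinning each package to one repository (here, `pkg-a` from Y and `pkg-b` from X) would avoid the problem, which is the capability the section says few package managers offer.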

It’s accurate enough, but it presumes that nobody does anything to make it any better. We already have a PEP that deals with dependency confusion (if/when installers implement it), and there was an old proposal standardising repository config which would go a long way to handling the rest.

I think nothing can be done to make the first part better except for offering free infrastructure.

Critically, this imposes a burden on projects to maintain their own infra. This is an unrealistic expectation for the vast majority of companies and a complete non-starter for community projects.

I’m not sure it has to. What if user.pypi.org was a generated subdomain/index that effectively exposed a mirror of that user’s packages? There’s a lot still unexplored here.

I don’t think this is true. And even if it is, what’s to stop someone (the PSF, or someone else) from setting up a public index server, using something like devpi, offering hosting for anyone with a PyPI organisation (so companies still pay to use the service, giving us a revenue stream to support it)? I did mention this option previously…

That’s precisely what “improving the UI” would fix, so dismissing this as a solution because of fixable issues seems unreasonable to me.

Again, this is what funding UX improvement studies and work would address. It’s very easy to dismiss alternative suggestions as not workable because they don’t exist yet, but it’s hardly a valid reason to support your proposal (which also doesn’t exist yet).

Again, fixable, and also addressed (to an extent) by PEP 708 (which needs implementation effort, and working on this proposal would divert needed resources from that work).

Precisely this. And presuming nobody does anything to make it better while at the same time suggesting people do work to implement your preferred solution isn’t particularly convincing.

And there is prior art here, e.g. Anaconda organizations can create their own channels. Of course that’s a company with their own (perhaps precarious) funding source for infrastructure. But it is an example of a centralized system for something like this.

It’s a bit of a chicken-and-egg problem, though, and it ends up feeling like pitching a Kickstarter to companies: “If you fund this upfront, you might get namespaces”. And that’s before even getting into whether you get credits or something if there’s still a component of paying for something on PyPI later on.

Now, if we accepted a PEP that said, “this will definitely be available once the PSF has enough funds to make it happen,” then you might be able to make it work.

That’s an interesting idea, especially if org accounts get a namespace as part of it and it isn’t a cost burden on PyPI. I guess you would then need a UX that didn’t make it a burden to list all of those indexes, as you would have to cut out PyPI as a default to guarantee you only got packages from the indexes you trusted.

But the thing is we already do offer free infrastructure, namely PyPI. It’s just that the UX for using that free infrastructure is all funneled through a single repository and the widely used tools assume that repository is the default or even the only one. From an infrastructure perspective it doesn’t seem like it would be insanely more resource-intensive if the URLs were different and users uploaded packages into “channels” a la conda (or similar to @mikeshardmind’s suggestion) instead of into a global namespace.

Of course none of these are problems with your proposal per se. They’re just ways of saying “maybe we should work on a more comprehensive solution that is a bit more complex but fixes a bigger chunk of the pain points”.

The thing is, Python is already the chicken and has laid the egg: companies already get to use PyPI to distribute their packages for free.

I think the idea of coupling funding with namespaces is not one worth discussing. We (Sentry) are happily funding things, but the value proposition of controlling a namespace is dubious and we would rather not have that coupled to a payment. We control a namespace on NPM and NuGet, and we might have paid for that if that was the only way to do so, but probably just a nominal amount. Sure, we’re just a data point of one, but namespaces are generally established in other package managers too, and nowhere are they used to generate a reasonable amount of revenue.

I’ve read the discussion and I find it really interesting, and would like to explain how it looks from the perspective of a non-corporate, big open-source, vendor-neutral project that might be a good user of this - Apache Airflow (also mentioned by @ofek in the PEPs).

There are two distinct issues discussed here, I think:

  1. whether it’s ok to use an implicit namespace as defined by a prefix in the package name, or whether there should be a new packaging feature to somehow associate packages with their parent organisation

  2. whether it’s ok to offer it as a “mostly paid” feature for biggish organisations (I assume, and I suggested this some time ago, that some organisations - like the Apache Software Foundation, which are effectively open-source non-profit stewards - should also join the club).

Since I am not very active here and have no merit to make or lobby for decisions, I will rather explain how it looks from our side - but of course the decision on what to do is in the hands of the people who have merit in PSF/packaging.

For 1) I have somewhat mixed feelings.

On one hand, for years we have followed such a scheme: apache-airflow-providers-PROVIDER_NAME is the standard way we name our provider packages (we have over 90 of those). And whenever we add a new integration, we always have this moment of thrill: “is the package name already taken by someone?”. We recently had an example with apache-airflow-providers-teradata, which was created (2 years ago) by someone from our community who had no idea that this could be confusing. When the Teradata company actually contributed their provider to Airflow, we had a bit of a problem: how to name it. Fortunately this was a good member of the community, his provider was not really used by anyone, and he agreed to transfer ownership. But I can easily imagine a situation where someone proactively pushes malware packages named “apache-airflow-providers-NEW_FANCY_INTEGRATION” before we manage to reserve the name (we typically reserve package names via the “organisation” feature when we start discussing adopting them). It only takes someone being malicious towards us, for whatever reason, and getting there first: our naming convention is known and our users know what to expect as a name there.
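
To illustrate the kind of check a prefix reservation implies, here is a minimal sketch. It is not PyPI’s implementation and not the rule set defined in either PEP: the reservation table, the organisation identifier `"apache"`, and the `may_register` helper are hypothetical. The only borrowed piece is the project name normalization from PEP 503.

```python
import re

# Hypothetical reservation table: normalized prefix -> owning organisation.
RESERVED_PREFIXES = {
    "apache-airflow-providers-": "apache",
}


def normalize(name):
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def may_register(project_name, uploader_org):
    """Return False if the name falls under a prefix reserved by another organisation."""
    normalized = normalize(project_name)
    for prefix, owner in RESERVED_PREFIXES.items():
        if normalized.startswith(prefix) and uploader_org != owner:
            return False
    return True


# The owning organisation can publish under its own prefix.
print(may_register("apache-airflow-providers-teradata", "apache"))
# An unaffiliated uploader is blocked, even with different separators and case.
print(may_register("Apache_Airflow.Providers-NEW_FANCY_INTEGRATION", None))
# Names outside the reserved prefix stay open to everyone.
print(may_register("airflow-teradata-helper", None))
```

Comparing normalized names matters because otherwise a look-alike such as `apache_airflow_providers_x` would slip past a plain string comparison.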

And having “apache-” upfront is a soft (not enforceable for now) requirement of the Foundation. One of the important aspects for the ASF is its trademark policy (the Apache Software Foundation Trademark Policy) - a lot of the policy is about making sure that our users can easily distinguish “ASF” released and maintained code from “3rd-party” managed code. It’s all about trademarks and brand. Trademarks and brand are the only way the ASF can protect its communities and assets, and the main mechanism by which the foundation can assert its rights. It is because of trademarks and brand that it can build sustainable communities that can be stronger than vendors who might want to take ownership or control of some of the ASF’s open-source projects.

And the ASF has an informal (and not enforceable) convention during incubation of projects that package names should be “apache-” (where PMC is the project) - precisely to avoid the confusion.

So having PEP 755/752 implemented would make it much easier for us to follow the convention.

But on the other hand I see that there is a risk of abusing it by reserving some obvious and “precious” chunks of the namespace. I am not sure if just “paying” is enough of a gate. Which leads me to problem 2)

  1. What criteria to apply - I personally like what has been proposed earlier in the discussion - base it on actual trademarks registered by the organisation that wants to reserve the name. This has some obvious problems (which country? trademarks for what scope? etc.) - but I think that is solvable - and at least it will be a significant barrier against bad actors. This way PyPI would simply be outsourcing verification of such claims to the authorities.

Of course maybe I am biased - because the ASF’s strength (besides community) is all about trademarks: the ASF has the “Apache” trademark for software in multiple regions, and “Airflow” has been a registered trademark in the US since June. So both “apache-” and “apache-airflow-” prefixes could be up for grabs for us. Maybe the criteria should be combined:

  • you have a trademark
  • and you either pay or are a reputable non-profit - in which case such a thing would be treated as a “donation” to the ASF, and the ASF would then list PyPI as a “sponsor” or “partner” - similarly to what the ASF does with JetBrains, GitHub and a number of others.

Not judging what the final decision should be - but I wanted to present our view as one of the potentially important users of that feature.

I want to echo this sentiment and reiterate just how much I am against everything about this proposal.

Would this proposal potentially address (some of?) the concerns folks have about PEPs 752/755? It’s a bit of a different way of attacking the problem, but could have similar benefits?