Is PEP 541 still the correct solution?

Background

PEP 541 introduced an approval system for name transfers on PyPI. My understanding is that it was a response to the increasing number of zombie projects and typo squatters. In the former case, when unable to contact the owner of a dead project, a user can request an admin override and a name transfer. In the latter case, PyPI blocks projects whose names are similar to those of other projects, to prevent the uploading of malware that gets downloaded when users make a typo. A user who wishes to upload a legitimate project under such a name can point to it and, again, request an admin override.

I and my colleagues have used this system a few times to request name transfers here. Unfortunately, by the very nature of this system, only a trusted few can address these requests. This causes understandable, but very long, delays. Given a steady state and a predictable delay this might be manageable, but there are reasons to believe the delay will only ever increase: since the number of projects on PyPI trends upwards, so does the number of potential name clashes and dead projects.

Ultimately, this trend suggests that the system will become less and less usable. Perhaps we need a new/modified one before then. I had a few ideas I wanted to put forward, although I recognise their flaws. I also welcome other ones:

Expiry Times on Project Ownership
For example, if the owner of a project does not log in for a specified period, their project names could be released to any who wish to claim them. The period could be monitored and adjusted as needed to keep the number of zombie projects down.
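
A rough sketch of what such a check could look like, assuming the index records a last-login timestamp per owner. The three-year limit, the function names, and the rule that every owner must be inactive are all my own assumptions for illustration, not anything PyPI implements:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical policy knob: how long an owner may go without logging in.
INACTIVITY_LIMIT = timedelta(days=3 * 365)


def owner_is_inactive(last_login: datetime, now: datetime,
                      limit: timedelta = INACTIVITY_LIMIT) -> bool:
    """True if the owner's last login is older than the allowed limit."""
    return now - last_login > limit


def project_is_claimable(owner_last_logins: list[datetime], now: datetime,
                         limit: timedelta = INACTIVITY_LIMIT) -> bool:
    """A name is released only if the project has owners and all are inactive."""
    return bool(owner_last_logins) and all(
        owner_is_inactive(login, now, limit) for login in owner_last_logins
    )


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
old = datetime(2020, 1, 1, tzinfo=timezone.utc)
recent = datetime(2023, 6, 1, tzinfo=timezone.utc)
print(project_is_claimable([old], now))          # only owner is long gone
print(project_is_claimable([old, recent], now))  # one owner is still active
```

Adjusting `INACTIVITY_LIMIT` is the "monitored and adjusted" part: a shorter limit frees more names but risks taking them from owners who are merely quiet.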

Automatic Project Verification
If a project, say on GitHub, could be verified as non-malware automatically, it could be uploaded under a name similar to another project's without worry.

Allow Typo Squatting
Unpalatable, but if the resources are not there to police it, it may be better to simply allow similarly named projects. This increases the risk to users, but they also have a responsibility to check their software. If the companies that use Python libraries want better policing, they can pay for it. In practice, that won’t happen, but at least everyone who uses PyPI would know and accept the balance of risks/costs.

This is mostly a stream of consciousness about a system that I feel is broken, based on my experiences. I would be interested to hear from others.

In cases of typosquatting and malware, those projects are removed under our Acceptable Use Policy.

Namesquatting, abandoned projects, and intellectual property are the motivation of PEP 541.

Taking over an existing project name is a sensitive action on the index. I don’t foresee us removing much of the red tape here.

It sounds like your primary concern is more to do with our blocking of names which have been used for malware in the past, names which conflict with standard library modules, and names which “ultra normalize” to the same name (e.g. project-x and ProjectX).
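
To illustrate the last point, here is a minimal sketch of the difference between the PEP 503 normalization and a stricter comparison. The `ultra_normalize` function is a simplified assumption based only on the example above (separators dropped, case ignored); the actual rule used by PyPI may be stricter, for example by also folding look-alike characters, and that is not modeled here:

```python
import re


def pep503_normalize(name: str) -> str:
    """PEP 503: runs of '-', '_' and '.' become a single '-', then lowercase."""
    return re.sub(r"[-_.]+", "-", name).lower()


def ultra_normalize(name: str) -> str:
    """Simplified, assumed form: drop separators entirely, then lowercase."""
    return re.sub(r"[-_.]+", "", name).lower()


def names_collide(a: str, b: str) -> bool:
    """True if two names are treated as the same under the stricter rule."""
    return ultra_normalize(a) == ultra_normalize(b)


# Distinct under PEP 503 normalization...
print(pep503_normalize("project-x"), pep503_normalize("ProjectX"))
# ...but the same name under the stricter comparison.
print(names_collide("project-x", "ProjectX"))
```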

Can you share some specific examples of the issue you’re trying to solve?

Hi, that makes sense. I can see the use of the red tape.

Here are a couple of recent examples:

Another related one: PEP 541 Request: pvi · Issue #1558 · pypi/support · GitHub

I wonder how long it usually takes for a PEP 541 request to take over the name of a zombie project (in my case, the name exists only in the test instance of PyPI, not in production) to propagate through the system. My request (PEP 541 Request: pybot · Issue #2877 · pypi/support · GitHub) has been sitting there without any visible progress or even an acknowledgement for 3 weeks.

Had another quick look today: Issues · pypi/support · GitHub

It doesn’t look as though any PEP 541 requests have been acknowledged since Jan/Feb this year, around 7 months. I would almost suggest it is preferable to decline the requests due to lack of resources than leave them in perpetual limbo.

Some have been handled in easy cases (where there is no previous owner to contact), but they do seem to be piling up.

Is there a temporary shortage of people who are able to review, or has it become too much effort now that PyPI has gotten so large?

I know it’s done with volunteered time, but maybe there is some way to speed up the dead-name squatting requests? I’m not sure how many of the ~200 PEP 541 requests fall under this, but it seems like it’s been over 9 months since any have been processed where the name has to be removed by an admin (even in cases where there is no release history/repo available: PEP 541 Request: dolphin · Issue #2562 · pypi/support · GitHub).

Relatedly, I’ve been wondering if there’s a way to get more volunteers involved in 541 support and processing.
For example, I’d be happy to throw an hour here or there at triage to tell the support team “this looks like the process has been followed”, but only if my doing so would have any benefit. If the support team has to redo the work to verify, maybe it doesn’t help enough. Or maybe it makes it easier to chip away at the review queue because triage volunteers can handle cases where the request is improper or users have questions.

My assumption is that there is some vetting process for admin access to pypi, so it’s nontrivial to add more personnel who can do the work of changing ownership.

I opened an issue to volunteer to prune the request queue. There’s no clear link on how to volunteer. Of the 230 requests open, a number of them can probably be addressed or triaged easily.

It’s not temporary; PyPI has always had a volunteer/staffing shortage.

Correct, since people are making decisions about changing what gets installed from PyPI.

Hi @brettcannon - how does one volunteer? I couldn’t see any link anywhere. I’d be happy to help clear out some of the chaff, it looks like others here are as well. I understand that volunteer intake takes time and effort.

The real problem is trust, as name-related things are rather sensitive. My guess is you need to get involved with Warehouse in general and build trust with those running the project.

Volunteer in this case means that it’s not really a paid position with guaranteed hours, not that it’s something anyone from the public can volunteer for. The PyPI access needed for this, and the decisions involved, require an extremely high level of trust.

The best way to get involved is somewhat adjacent to it. Become familiar with Warehouse (the code that runs PyPI), then start triaging issues, making high-quality PRs, answering questions, etc. The more work you can help with there, the more time the people who do have trusted access will have for this work. It’s not an easy or quick thing to do; you need to stick with it for a long time. But it’s something anyone can do, and establishing long-term commitment and trust through that will go a long way to maybe being able to handle these requests.

[Warning: Naive take ahead, assume I have no idea what I am talking about. And also maybe it has been discussed already.]

Is there a way to pay someone to do this work? Not the coding work, but the management of these kinds of things.

I get the feeling there is no other way to get out of this situation. Volunteer time is limited. Becoming a trusted volunteer takes time in itself and I can only assume that it would also take time from the already active volunteers (to train and gauge if the new candidates can be trusted or not).

So, is the solution to get the Python Software Foundation (or whoever can) to hire someone? PyPI is quite an important piece of the Python ecosystem, so it might be worth it.

I believe @ewdurbin, @dustin and @miketheman are all employed by the PSF to work on PyPI already? And I think @dstufft has permission from his employer to spend part of his work time on PyPI? But I’m not sure. Actually, “who runs PyPI?” would be a good thing to put somewhere on the PyPI docs or blog.

What I could find: What does it take to power the Python Package Index? - Dustin Ingram | PyPI hires a Safety & Security Engineer - The Python Package Index Blog | A New Home · caremad

(PS: Since I realize this may not have been clear, I’m absolutely not suggesting anything like “these people are paid to work on PyPI, therefore they must work on these requests”. Obviously there are a whole lot of other things PyPI needs.)

I’m not employed by the PSF, I’m just a volunteer.

It’s here, but I guess you’re asking for specific names? https://pypi.org/help/#maintainers

I am employed by the Python Software Foundation as Director of Infrastructure; PyPI is a small portion of what I spend my time on, and my focus is primarily on the operations/upkeep. Mike Fiedler is employed by the Python Software Foundation as PyPI Safety and Security Engineer, and his primary focus is improvements to security for PyPI. Donald and Dustin are volunteers.

In this case, yes, there should be paid support staff for PyPI to do this work, but there are none as such. Neither of the paid staff is focused even part time on support. Ultimately, our capacity for support is prioritized based on urgency as follows:

  1. Security and Malware
  2. Production incidents
  3. Maintenance
  4. Basic user-support
  5. Account Recovery
  6. 541/name requests
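
As a purely illustrative sketch of what this ordering means for a queue, the list above can be modeled as a priority queue in which name requests are only reached once everything more urgent is done. The category keys and ticket titles are hypothetical, and this is not how the support tracker is actually implemented:

```python
import heapq
from itertools import count

# Lower number = handled first, mirroring the list above (keys are made up).
PRIORITY = {
    "security-malware": 1,
    "production-incident": 2,
    "maintenance": 3,
    "user-support": 4,
    "account-recovery": 5,
    "name-request": 6,
}


def triage_order(tickets):
    """Return ticket titles by category priority; ties keep arrival order."""
    seq = count()  # tie-breaker so equal priorities stay first-in, first-out
    heap = [(PRIORITY[category], next(seq), title) for category, title in tickets]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


tickets = [
    ("name-request", "claim name A"),
    ("security-malware", "remove malicious upload"),
    ("account-recovery", "lost 2FA device"),
    ("name-request", "claim name B"),
]
print(triage_order(tickets))
```

As long as higher-priority tickets keep arriving, the name requests at the bottom never reach the front, which is the effect described in this thread.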

Even if (once?) we had paid support, 541/name requests would remain lowest priority as there is always the self-service option of “choose another name”.

Related: PyPI account recovery process triaging on halt
