Community testing of packaging tools against non-Warehouse indexes

Background is here.

Yesterday, the maintainer of pip released version 22.0. This release contains a major change in the way that HTML content from package indexes is parsed and processed, and fairly quickly after the release a number of people noticed that they were unable to obtain packages from their own indexes (not PyPI).

It turns out that most (possibly all) of the commercial software products that support Python packages are not actually compliant with PEP 503, which the new parser expects them to be.

In this case the maintainer went way beyond expectations and worked with the community to isolate the issue and get a new version (22.0.2) released that can fall back to the old parser. But this problem should not have first been found after the release. The fact that it was not found before the release is a failing on the part of the pip user community, because this code could have been tested against these non-Warehouse indexes weeks (or months) ago.

So, it’s on us to help solve this problem, and we can do that by putting together a coordinated group of testers to test new versions of the PyPA tools (at minimum build, pip, and twine) against the indexes that we have available to us. Certainly it would be good if the providers of those index tools took on this burden of testing, but let’s take the first step and hopefully they will follow.

In order to do this, we’ll need to determine some things:

  • Which versions should be tested? Pre-releases on PyPI, or release candidates somewhere else, or tags on GitHub, or every PR?

  • How will they be tested? Are there sufficient tests already run with these tools against Warehouse that can be pointed to alternative indexes, or will tests need to be created?

  • How will the results be reported?

  • Who can commit to implementing automated testing against one or more third-party index tools? Personally I can commit to testing against up-to-date releases of Sonatype’s Nexus Repository Manager.

It’s not reasonable to expect the PyPA maintainers to take on any of this burden, but if we can provide them a low-cost/low-friction way to find out whether future versions of these tools will be found incompatible with third-party index software, it may help them avoid creating an uproar in the community when software they don’t control is suddenly ‘broken’.


I think the first question to ask is: Do the pip release managers agree they have the resources to do release candidates?

I’ve been getting involved with the pip project this last year and it seems like the volunteer time of the release managers is currently the scarcest resource for the project. Perhaps first consider some way to ease this burden?

I’m not sure we do. Particularly if we get the type of response that we currently get for a full release after an RC. But conversely, people can install from any commit on the pip main branch, so the community could simply choose to say that they’ll test against the main branch as of the first day of the month before the scheduled release (or whatever alternative criterion they choose).


FYI I (and I’m sure others) can’t do that inside the corporate network I work in, because interacting with GitHub that way is blocked. We do, however, have a process for grabbing releases from PyPI.

While I can empathize with your desire to improve things here, and appreciate how you are trying to drive a change, I disagree that it’s the community’s job to fix this.

I think that such testing is absolutely the responsibility of for-profit third-party services (not their users, and definitely not the PyPA maintainers as you noted), especially when there are existing standards that are defined and have been defined for a long time that they’re not in compliance with.

There’s no reason that an organization that experienced breakage as a result of this change couldn’t have treated every commit as a release candidate, run their own integration tests against their own product, and caught this before the release was made, if they cared about avoiding such issues.

For a community project, by all means, let’s figure out how the community can make our projects more resilient to issues like this. But for a commercial project, it’s not our job to do their job for them.


Oh, I certainly agree, and I’m also certain that many users of such products will push their vendors to do a better job in this area.

In my case I’m using the OSS version of Nexus Repository Manager, and as such the vendor has no obligation to even accept bug reports from me, let alone do anything about them :slight_smile:

In any case, putting together a well-understood testing regime and reporting process will be beneficial to any vendors who decide to participate as a first-party tester.


Do we have a representative set of community projects? I know of one-off indexes that pytorch and scipy maintain. There’s piwheels that has its own wheels. There’s pypiserver that implements the relevant things. What else?

If there’s any vendor reading this, I’m sure it would be really good to see you engage positively here and let us know that you’re doing something about this. :slight_smile:

They’ve got to have some sort of community support story there? Again though, I think the answer there is for the for-profit arm of the provider to figure out how to avoid such breakages.


I don’t disagree, and my usage of ‘us’ in the OP does not preclude the vendors from participating; after all, they are part of the community too. My purpose in starting this topic was to at least inspire a discussion, and get some sort of ‘interest group’ formed who could define the parameters and expected results of what this sort of testing would need to be. If people who are not vendors choose to test and report results, that’s fine, but if the vendors do it that’s even better; and if we have some of both, that’s still a win in my book.

One reasonable outcome of such an ‘interest group’ could be indicating that vendors of such tools need to specify that they are “PEP 503” compliant, and not just “support Python package repositories”, and that they’ve used some sort of community-blessed test suite to verify their compliance.
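For what it’s worth, PEP 503 itself is small enough that the seed of such a test suite is easy to sketch. The snippet below is only a rough illustration (the sample HTML and the index URL in it are invented, and a real suite would fetch live pages): it applies the name-normalization rule taken verbatim from the PEP, and checks that every project listed on a simple-index page is an anchor with an href.

```python
import re
from html.parser import HTMLParser

def normalize(name: str) -> str:
    """Project-name normalization exactly as defined in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()

class SimpleIndexParser(HTMLParser):
    """Collect (anchor text, href) pairs from a simple-index page."""
    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")

    def handle_data(self, data):
        if self._href is not None and data.strip():
            self.links.append((data.strip(), self._href))
            self._href = None

# Invented sample of an index root page, for illustration only.
sample = '<html><body><a href="/simple/my-package/">My_Package</a></body></html>'
parser = SimpleIndexParser()
parser.feed(sample)
for text, href in parser.links:
    # PEP 503 requires each listed project to appear as an anchor with a URL.
    assert href, f"anchor for {text!r} has no href"

print(normalize("My_Package"))  # → my-package
```

A community-blessed suite would of course need much more (serial numbers, hashes, yanked-file handling, and the newer PEP 629/658 metadata), but even a check this small would have flagged indexes whose pages the new parser rejects.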

I would use a Docker image based on the up-to-date Python image with the main-branch pip (and other packaging tools) installed in our CI (specifically in our Python multi-version testing pipeline).

I would also champion and maintain this image if the community would bless it.

I didn’t know of this change before pip’s release. I read these Discourse forums regularly, so the first I heard of it was in the pip release thread. Perhaps communication also needs working on, even if it’s just a note on Twitter/Discourse/python-dev (and maybe in pip’s output) a week before a new pip release, saying a new release is coming out soon.

pip releases are now done on a schedule, so this won’t be necessary.

Pip is not exactly a big enough part of my life to make a calendar reminder :smiley:

That wasn’t my point though: notifying the community of an impending release would be an additional burden for the maintainers, and there’s no need to do that. Since there is a schedule, there are probably a thousand or more ways that interested parties could be notified of an upcoming scheduled release, none of which would require effort by a person beyond the initial setup.


FWIW, even I don’t have a calendar reminder for pip releases – but I’m sure I’m on the other side of the spectrum.

Your definition of community is different from mine. :slight_smile:

None the less, I agree with the spirit of what you’re saying. I also think that (a) those vendors have to initiate the conversation and let us know that they want to participate in good faith, (b) they have to take on the workload to make sure their product evolves as Python packaging does, and (c) let us know when we have a bug.

I’m happy to work with them (heck, I’d appreciate having a channel to talk to some of these enterprise solutions; because it’ll make things easier with my maintainer hat on). That said, I’m not keen on doing their work for them.


OK, so how about:

  1. Ask your company for it to be unblocked.
  2. Download the commit from git at home, and bring it into work. If necessary, get it approved for installing on a corporate machine.
  3. Set up a private build machine on the internet somewhere, and set it to make builds of pip, and publish them on a webpage that is accessible from your corporate network.

I’ve done variations of all of these at my work, at one time or another.

And if none of the above work, sort out something that does. Again, it shouldn’t be up to me or any of the other pip maintainers to help you work with your company’s policies. And if your company refuses to support you, we’re back in the one-sided situation where all of the effort is expected to come from the open source community, for free.


I agree with everything you’re saying, and I am not trying to elicit any sympathy for the company I work for.

I’m just highlighting the realities of being someone who wants to assist and raise issues to pip before a big fallout happens. It’s no skin off my back if I can’t assist though, I don’t upgrade pip automatically and it doesn’t interrupt my workflow if I have to wait for the company index or pip is forced to change course because so many users complain.

Yes, all your suggestions are unworkable for me; however, release candidates on PyPI are workable. But as you may have noticed, I was the first person to bring up pip release manager capacity on this Discourse thread, so I guessed this was fairly unlikely.

P.s., I tried to find a way to download Pradyun’s current changes, I could not get around restrictions.

Thanks. I’m a pip maintainer, and I’ve worked for many years in the sort of environment you describe. So I do appreciate the limitations and difficulties. Many times, I’ve made decisions with my pip maintainer hat on that make my life more difficult with my “work” hat on. We really do understand the constraints here. That’s not the problem. The problem is with (some) people[1] who believe that their situation should be the top priority for the pip maintainers, even though a huge majority of pip’s user base don’t work under those sorts of constraint.

The point for the pip maintainers is not that we don’t want to fix regressions, or that we don’t care about our users. It’s that we don’t want to stagnate. Users in big companies typically value stability, backward compatibility, and “long term support” types of release. But that’s not the pip maintainers’ priority, like it or not. We prioritise moving the ecosystem forward, improving standards and standards-compliance, flexibility (to handle multiple classes of user) and maintainability[2] (because we’re such a small group).

Ideally, there would be a “long term support” branch of pip for risk-averse users. But that’s not possible - we don’t have the manpower, or frankly the interest. Maybe some other group could maintain such a project, somehow. I don’t know. Personally, I’m all for more competition in the packaging tools space, so if anyone wants to do something like this, then I’d say go for it.

But in the meantime, every time we make a change that doesn’t align with the needs of our (relatively) small group of risk-averse, tightly constrained users, we get massive pushback, and a complete lack of respect for the possibility that we may simply have different priorities.

  1. Not the people participating here. ↩︎

  2. The HTML 5 issue was a maintainability change, we are reducing our dependencies and simplifying our code, to help us better maintain it. ↩︎


Maybe you can build that on top of ci-images.


Just wanted to share my perspective (and experience) of another big open-source project - Apache Airflow. I know it has a different governance, stakeholder structure, number of volunteers, and money involved. Sure. In Airflow there are some maintainers (including myself) who are sponsored or paid for the community work, but a lot of our contributors and committers are volunteers. I won’t go into a “commercial” discussion about pip’s governance model and who “should” and who “should not” pay for whose job.

I am usually in the camp of “I see a problem” → “I think it through and propose solutions”, so here is just an example; maybe it can be an inspiration. And I believe things won’t sort themselves out on their own, so I usually try to make some constructive proposal.

But coming back to the experience - we have had quite a bit of success in this area by engaging our users. As an Apache Software Foundation project, we follow the “Community over Code” motto - and stability for our users (whom we consider part of our community) is important.
We also try hard to improve the world around us. More than two years ago we introduced a huge breaking change with Airflow 1 → Airflow 2, and we tried to involve a lot of our community members in testing alpha/beta/pre-release/release candidates. It was not easy to gather feedback, and we did break a number of things. Fixed since.

Since then we have introduced a rather straightforward way of engaging our users in RC testing (we release up to 70 PyPI packages a month with the so-called “providers”, plus a main release of Airflow every few months).

Initially, when I proposed the solution described below - which involved engaging contributors and users who had raised issues - we had a very strong push-back in our community: “is it needed?”, “people can test anyway”, “we have no resources to test all that”. I was quite persistent and started an experiment, which turned into a regular process that has turned out pretty successful.

Every month we get something like a 60-70% success rate of community members testing our release candidates (both those whose reported issues were solved and those who implemented or commented on PRs).

As a result we have - a few times - decided to cancel and redo a release with a follow-up RC2, RC3, etc. when our users managed to find some bugs.

I believe we turned engagement from our users on its head by following a few simple principles:

  • don’t rely on people watching the release schedule or announcements. Call them out specifically by @handle in a GitHub issue (the issue is prepared automatically, so it involves no “regular” maintenance effort)

  • make it as easy as possible to install the new version, and reduce the friction of testing as much as possible - the issue contains a direct link to the RC being released (automated, of course) and information on what has changed (in the form of GitHub issue links - you guessed it, automated)

  • make it personal - each person is mentioned next to the relevant issue, and we ask for their help personally, in a positive spirit, thanking them in advance

  • we ask people to comment in a single issue when they have tested, with multiple people adding a “positive feedback loop” and reinforcement (if others tested, why shouldn’t I help too?), plus positive reactions and cheering when people did test. The nice thing about it is that you create a small “sub-community” of people who participate in each test round, which IMHO is a very nice “social” effect

  • we thank the people who tested when we close the issue
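The automated call-to-action issue described above is, at its core, just string formatting over release metadata. Here is a minimal sketch of that idea; every name in it (the package, version, PR numbers and handles) is invented, and Airflow’s real tooling is more elaborate:

```python
# Sketch: build an RC-testing issue body that @-mentions the people
# involved in each change. All package names, versions, handles and
# PR data below are invented for illustration.
def build_issue_body(package: str, rc_version: str, changes: list) -> str:
    lines = [
        f"We are considering {package} {rc_version} for release.",
        "",
        f"Install it with: `pip install {package}=={rc_version}`",
        "",
        "Please test the changes that involved you and comment here:",
        "",
    ]
    for change in changes:
        mentions = " ".join(f"@{handle}" for handle in change["authors"])
        lines.append(f"- #{change['pr']}: {change['title']} ({mentions})")
    lines += ["", "Thanks in advance for your help!"]
    return "\n".join(lines)

body = build_issue_body(
    "example-provider", "2.1.0rc1",
    [{"pr": 101, "title": "Fix index parsing", "authors": ["alice", "bob"]}],
)
print(body)
```

Feeding such a body to the issue tracker’s API (or a CLI like `gh issue create`) is the only maintainer-facing step left, which is what keeps the recurring cost so low.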

After the initial (yes) investment in automating it, I have personally been able to release waves of providers every month since, with very little effort of my own (and sometimes I have released 70 packages at a time).

You can see the list of our issues so far:

Example issue here:

Now - summarizing my point.

I am not saying it should be copied or done the same way. In the discussion above I see that there might be various reasons why you would or wouldn’t like to do something similar. And I am not even trying to argue with that.

But I just want to say that there are ways to make it happen. I believe they require maintainer initiative and coordination, but when done in a smart way - with “social” thinking and a positive, community-building approach - they can succeed and result in a process that requires very little effort and lets you delegate a lot of the work to those interested.
