Community testing of packaging tools against non-Warehouse indexes

Background is here.

Yesterday, the maintainer of pip released version 22.0. This release contains a major change in the way that HTML content from package indexes is parsed and processed, and fairly quickly after the release a number of people noticed that they were unable to obtain packages from their own indexes (not PyPI).

It turns out that most (possibly all) of the commercial software products that support Python packages are not actually compliant with PEP 503, which the new parser expects them to be.

In this case the maintainer went way beyond expectations and worked with the community to isolate the issue and get a new version (22.0.2) released which can fall back to the old parser, but this problem should not have been found only after the release. The fact that it was not found before the release is a failing on the part of the pip user community, because this code could have been tested against these non-Warehouse indexes weeks (or months) ago.

So, it's on us to help solve this problem, and we can do that by putting together a coordinated group of testers to test new versions of the PyPA tools (at minimum build, pip, and twine) against the indexes that we have available to us. Certainly it would be good if the providers of those index tools took on this burden of testing, but let's take the first step and hopefully they will follow.

In order to do this, we'll need to determine some things:

  • Which versions should be tested? Pre-releases on PyPI, or release candidates somewhere else, or tags on GitHub, or every PR?

  • How will they be tested? Are there sufficient tests already run with these tools against Warehouse that can be pointed to alternative indexes, or will tests need to be created? (A rough sketch of one possible test follows this list.)

  • How will the results be reported?

  • Who can commit to implementing automated testing against one or more third-party index tools? Personally I can commit to testing against up-to-date releases of Sonatype's Nexus Repository Manager.
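To make the second and fourth bullets concrete, here is a minimal sketch of the kind of smoke test I have in mind. The index URL and package name are placeholders for whatever index and known-good package a tester has available; the test simply checks that whatever pip is on the path can parse the index's project page and fetch a file from it.

```python
# A rough sketch of a smoke test against a non-Warehouse index.
# INDEX_URL and PACKAGE are placeholders for your own index and a
# package known to exist on it.
import subprocess
import sys
import tempfile

INDEX_URL = "https://nexus.example.com/repository/pypi-internal/simple"  # placeholder
PACKAGE = "example-package"  # placeholder

def test_pip_can_download_from_internal_index():
    with tempfile.TemporaryDirectory() as dest:
        result = subprocess.run(
            [sys.executable, "-m", "pip", "download",
             "--no-deps", "--index-url", INDEX_URL,
             "--dest", dest, PACKAGE],
            capture_output=True, text=True,
        )
        # If index parsing regresses, pip fails here and stderr explains why.
        assert result.returncode == 0, result.stderr

if __name__ == "__main__":
    test_pip_can_download_from_internal_index()
    print("ok")
```

Run with whichever pip build is under test (a pre-release, a main-branch install, etc.); the same test can be parameterised over several indexes and packages.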

It's not reasonable to expect the PyPA maintainers to take on any of this burden, but if we can provide them a low-cost/low-friction way to find out whether future versions of these tools will be found incompatible with third-party index software, it may help them avoid creating an uproar in the community when software they don't control is suddenly 'broken'.

4 Likes

I think the first question to ask is: Do the pip release managers agree they have the resources to do release candidates?

I've been getting involved with the pip project this last year and it seems like the volunteer time of the release managers is currently the scarcest resource for the project. Perhaps first consider some way to ease this burden?

I'm not sure we do. Particularly if we get the type of response that we currently get for a full release after an RC. But conversely, people can install from any commit on the pip main branch, so the community could simply choose to say that they'll test against the main branch as at the first day of the month before the scheduled release (or whatever alternative criterion they choose).
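For anyone who can reach GitHub, a minimal sketch of that kind of check might look like the following; the environment path is arbitrary, and it assumes git is available on the machine.

```python
# A minimal sketch of "test against the main branch": create a scratch
# virtual environment, install pip from the pypa/pip main branch into it,
# and report the version that would be exercised by your tests.
import subprocess
import sys
import venv

ENV_DIR = "pip-main-test-env"  # scratch environment; delete it afterwards

venv.EnvBuilder(with_pip=True).create(ENV_DIR)
python = f"{ENV_DIR}/bin/python"  # on Windows use Scripts\\python.exe instead

# Upgrade the environment's pip to the current main-branch code.
subprocess.run(
    [python, "-m", "pip", "install",
     "git+https://github.com/pypa/pip@main"],
    check=True,
)
# Show which development version your index tests will now run against.
subprocess.run([python, "-m", "pip", "--version"], check=True)
```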

1 Like

FYI I (and I'm sure others) can't do that inside the corporate network I work in, because interacting with GitHub that way is blocked. We do, however, have a process of grabbing releases from PyPI.

While I can empathize with your desire to improve things here, and appreciate how you are trying to drive a change, I disagree that it's the community's job to fix this.

I think that such testing is absolutely the responsibility of for-profit third-party services (not their users, and definitely not the PyPA maintainers as you noted), especially when there are existing standards that are defined and have been defined for a long time that they're not in compliance with.

There's no reason that an organization that experienced breakage as a result of this change couldn't have treated every commit to https://github.com/pypa/pip/tree/main as a release candidate and run their own integration tests against their own product and caught this before the release was made, if they cared about avoiding such issues.

For a community project like https://www.piwheels.org/, by all means, let's figure out how the community can make our projects more resilient to issues like this. But for a commercial project, it's not our job to do their job for them.

10 Likes

Oh, I certainly agree, and I'm also certain that many users of such products will push their vendors to do a better job in this area.

In my case I'm using the OSS version of Nexus Repository Manager, and as such the vendor has no obligation to even accept bug reports from me, let alone do anything about them :slight_smile:

In any case, putting together a well-understood testing regime and reporting process will be beneficial to any vendors who decide to participate as a first-party tester.

1 Like

Do we have a representative set of community projects? I know of one-off indexes that pytorch and scipy maintain. There's piwheels that has its own wheels. There's pypiserver that implements the relevant things. What else?

If there's any vendor reading this, I'm sure it would be really good to see you engage positively here and let us know that you're doing something about this. :slight_smile:

They've got to have some sort of community support story there? Again though, I think the answer there is for the for-profit arm of the provider to figure out how to avoid such breakages.

1 Like

I don't disagree, and my usage of 'us' in the OP does not preclude the vendors from participating; after all, they are part of the community too. My purpose in starting this topic was to at least inspire a discussion, and get some sort of 'interest group' formed who could define the parameters and expected results of what this sort of testing would need to be. If people who are not vendors choose to test and report results, that's fine, but if the vendors do it that's even better, and if we have some of both, that's still a win in my book.

One reasonable outcome of such an 'interest group' could be indicating that vendors of such tools need to specify that they are "PEP 503" compliant, and not just that they "support Python package repositories", and that they've used some sort of community-blessed test suite to verify their compliance.
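No such blessed suite exists yet, but to give a flavour of what one of its checks might look like, here is a rough sketch that fetches a project page from an index's /simple/ root using the PEP 503 name-normalisation rule and verifies that the page contains at least one file link. The base URL and project name are placeholders.

```python
# A sketch of a single PEP 503 check (not a blessed test suite):
# does the project page at the normalized URL exist and contain file links?
import re
import urllib.request
from html.parser import HTMLParser

BASE_URL = "https://nexus.example.com/repository/pypi-internal/simple"  # placeholder
PROJECT = "My_Example.Package"  # placeholder

def normalize(name: str) -> str:
    # Name normalization rule from PEP 503.
    return re.sub(r"[-_.]+", "-", name).lower()

class AnchorCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.hrefs = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(v for k, v in attrs if k == "href" and v)

url = f"{BASE_URL}/{normalize(PROJECT)}/"
with urllib.request.urlopen(url) as resp:
    html = resp.read().decode(resp.headers.get_content_charset() or "utf-8")

parser = AnchorCollector()
parser.feed(html)
assert parser.hrefs, f"no file links found on {url}"
print(f"{url}: {len(parser.hrefs)} file link(s)")
```

A real suite would obviously check much more (trailing slashes, hashes, data-requires-python, the root project list, and so on), but even a handful of checks like this would let a vendor say "PEP 503 compliant" with some evidence behind it.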

I would use a Docker image based on the up-to-date Python image with the main-branch pip (and other packaging tools) installed in our CI (specifically in our Python multi-version testing pipeline).

I would also champion and maintain this image if the community would bless it.
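As a flavour of what the multi-version pipeline could assert before running its real tests, here is a tiny sanity-check sketch to run inside such an image. It assumes, as is the case for pip's main branch today, that development builds carry a ".dev" suffix in their version string.

```python
# A small sanity check (sketch) a CI job could run inside such an image to
# confirm it is really exercising a pre-release pip built from the main
# branch rather than the latest stable release.
import subprocess
import sys

out = subprocess.run(
    [sys.executable, "-m", "pip", "--version"],
    check=True, capture_output=True, text=True,
).stdout
print(out.strip())
# Development builds from the main branch carry a ".dev" version suffix.
assert ".dev" in out.split()[1], "pip in this image is not a main-branch build"
```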


I didn't know of this change before pip's release. I read these Discourse forums regularly, so the first I heard of it was in the pip release thread. Perhaps communication also needs working on, even if it's just on Twitter/Discourse/python-dev (and maybe in pip's output) a week before a new pip release, saying a new release is coming out soon.

pip releases are now done on a schedule, so this won't be necessary.

Pip is not exactly a big enough part of my life to make a calendar reminder :smiley:

That wasn't my point though: notifying the community of an impending release would be an additional burden for the maintainers, and there's no need to do that. Since there is a schedule, there are probably a thousand or more ways that interested parties could be notified of an upcoming scheduled release, none of which would require effort by a person beyond the initial setup.
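As one example of the kind of zero-maintenance notification I mean: pip currently releases on a roughly quarterly cadence (new feature releases around January, April, July and October), so even something as simple as the following sketch, run daily from cron or CI, would nag interested parties ahead of the next window. The two-week lead time is arbitrary.

```python
# A sketch of one low-effort notification, assuming pip's roughly quarterly
# release cadence (Jan/Apr/Jul/Oct). Run daily from cron/CI.
import datetime

WARN_DAYS = 14  # arbitrary lead time before the release window

today = datetime.date.today()
# First day of the next quarterly release window.
next_month = ((today.month - 1) // 3 + 1) * 3 + 1
year = today.year + (1 if next_month > 12 else 0)
window = datetime.date(year, (next_month - 1) % 12 + 1, 1)

if (window - today).days <= WARN_DAYS:
    print(f"pip's next scheduled release window opens around {window}; "
          f"time to start testing against the main branch.")
```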

1 Like

FWIW, even I don't have a calendar reminder for pip releases - but I'm sure I'm on the other side of the spectrum.

Your definition of community is different from mine. :slight_smile:

None the less, I agree with the spirit of what you're saying. I also think that (a) those vendors have to initiate the conversation and let us know that they want to participate in good faith, (b) they have to take on the workload to make sure their product evolves as Python packaging does, and (c) they have to let us know when we have a bug.

I'm happy to work with them (heck, I'd appreciate having a channel to talk to some of these enterprise solutions; because it'll make things easier with my maintainer hat on). That said, I'm not keen on doing their work for them.

2 Likes

OK, so how about:

  1. Ask your company for it to be unblocked.
  2. Download the commit from git at home, and bring it into work. If necessary, get it approved for installing on a corporate machine.
  3. Set up a private build machine on the internet somewhere, and set it to make builds of pip, and publish them on a webpage that is accessible from your corporate network.

I've done variations of all of these at my work, at one time or another.

And if none of the above work, sort out something that does. Again, it shouldn't be up to me or any of the other pip maintainers to help you work with your company's policies. And if your company refuses to support you, we're back in the one-sided situation where all of the effort is expected to come from the open source community, for free.

5 Likes

I agree with everything you're saying, and I am not trying to elicit any sympathy for the company I work for.

I'm just highlighting the realities of being someone who wants to assist and raise issues to pip before a big fallout happens. It's no skin off my back if I can't assist, though: I don't upgrade pip automatically, and it doesn't interrupt my workflow if I have to wait for the company index, or if pip is forced to change course because so many users complain.

Yes, all your suggestions are unworkable; however, release candidates on PyPI are workable. But as you may have noticed, I was the first person to bring up pip release manager capacity on this thread, so I guessed this was fairly unlikely.

P.S. I tried to find a way to download Pradyun's current changes, but I could not get around the restrictions.

Thanks. I'm a pip maintainer, and I've worked for many years in the sort of environment you describe. So I do appreciate the limitations and difficulties. Many times, I've made decisions with my pip maintainer hat on that make my life more difficult with my "work" hat on. We really do understand the constraints here. That's not the problem. The problem is with (some) people[1] who believe that their situation should be the top priority for the pip maintainers, even though a huge majority of pip's user base don't work under those sorts of constraint.

The point for the pip maintainers is not that we don't want to fix regressions, or that we don't care about our users. It's that we don't want to stagnate. Users in big companies typically value stability, backward compatibility, and "long term support" types of release. But that's not the pip maintainers' priority, like it or not. We prioritise moving the ecosystem forward, improving standards and standards-compliance, flexibility (to handle multiple classes of user) and maintainability[2] (because we're such a small group).

Ideally, there would be a "long term support" branch of pip for risk-averse users. But that's not possible - we don't have the manpower, or frankly the interest. Maybe some other group could maintain such a project, somehow. I don't know. Personally, I'm all for more competition in the packaging tools space, so if anyone wants to do something like this, then I'd say go for it.

But in the meantime, every time we make a change that doesn't align with the needs of our (relatively) small group of risk-averse, tightly constrained users, we get massive pushback, and a complete lack of respect for the possibility that we may simply have different priorities.


  1. Not the people participating here. ↩︎

  2. The HTML 5 issue was a maintainability change: we are reducing our dependencies and simplifying our code, to help us better maintain it. ↩︎

3 Likes

Maybe you can build that on top of ci-images.

1 Like

Just wanted to share my perspective (and experience) from another big open-source project - Apache Airflow. I know it's a different governance, stakeholder structure, number of volunteers, money involved. Sure. In Airflow there are some maintainers (including myself) who are sponsored or paid for the community work, but a lot of our contributors and committers are volunteers. I would not go into a "commercial" discussion about pip's governance model and who "should" and who "should not" pay for whose job.

I am usually in the "I see a problem" -> "I think out and propose solutions" camp, so here is just an example; maybe it can be an inspiration. And I believe things won't sort themselves out on their own, so I usually try to make some constructive proposal.

But coming back to the experience: we have had quite a bit of success in this area by engaging our users. As an Apache Software Foundation project, we follow the "Community over Code" motto - and stability for our users (whom we consider part of our community) is important.
We try to improve the world around us a lot as well. More than 2 years ago we introduced a huge breaking change with Airflow 1 -> Airflow 2, and we did try to involve a lot of our community members in testing alpha/beta/pre-release/release candidates. It was not easy to gather feedback and we did break a number of things. Fixed since.

Since then we have introduced a rather straightforward way of engaging our users in RC testing (we release up to 70 pip packages a month with so-called "providers", plus a main release of Airflow every few months).

Initially, when I proposed the solution described below, which involved engaging our contributors and the users who raised issues, we had a very strong push-back in our community: "is it needed?", "people can test anyway", "we have no resources" to test all that. I was quite persistent and started an experiment, which turned into a regular process that turned out pretty successful.

Every month we see something like a 60-70% success rate of community members testing our release candidates (both those who raised the issues that were solved and those who implemented or commented on the PRs).

As a result we have - a few times - decided to cancel and re-do a release with a follow-up RC2, RC3, etc. when our users managed to find some bugs there.

I believe we turned engagement from our users on its head by following a few simple principles:

  • don't rely on people watching the release schedule or an announcement. Call them out specifically in a GitHub issue by mentioning them by @handle (the issue is prepared automatically, so it involves no "regular" maintenance effort; a rough sketch of this kind of automation appears after this list)

  • make it as easy as possible to install the new version. Reduce the friction of testing as much as possible - the issue contains a direct link to the RC being released (automated of course) and information on what has been changed (in the form of GitHub issue links - you guessed it, automated)

  • make it personal - each person is mentioned next to their issue, and we personally ask for help, reaching out in a positive spirit and thanking them in advance

  • we ask people to comment when they have tested, in a single issue with multiple people, to add a "positive feedback loop" and reinforcement (if others tested, why shouldn't I help too?) and positive reactions and cheering where people did test. The nice thing about it is that you create a small "sub-community" of people who participate in this test round, which IMHO is a very nice "social" effect.

  • we thank the people who tested when we close the issue
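This is not Airflow's actual tooling, but a rough sketch of the kind of issue-creation automation described in the first bullet above could look like this; the repository, token, RC version, and handle list are all placeholders.

```python
# A rough sketch (placeholder values throughout) of opening a GitHub issue
# that @-mentions the people whose changes are in a release candidate and
# asks them to test it.
import json
import os
import urllib.request

REPO = "example-org/example-project"              # placeholder
RC_VERSION = "1.2.3rc1"                           # placeholder
TESTERS = {"@alice": ["#123"], "@bob": ["#456"]}  # handle -> related PRs/issues

lines = [
    f"We have a release candidate `{RC_VERSION}` out:",
    f"`pip install example-project=={RC_VERSION}`",
    "",
    "Could you check that the changes you were involved in still work?",
    "",
]
lines += [f"- {handle}: {', '.join(refs)}" for handle, refs in TESTERS.items()]
lines.append("\nPlease comment here with your results. Thanks in advance!")

request = urllib.request.Request(
    f"https://api.github.com/repos/{REPO}/issues",
    data=json.dumps(
        {"title": f"Status of testing {RC_VERSION}", "body": "\n".join(lines)}
    ).encode(),
    headers={
        "Authorization": f"token {os.environ['GITHUB_TOKEN']}",
        "Accept": "application/vnd.github+json",
    },
    method="POST",
)
with urllib.request.urlopen(request) as resp:
    print("Created issue:", json.load(resp)["html_url"])
```

In practice the tester list and the changelog links would be generated from the release's merged PRs rather than hard-coded.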

After an initial (yes) investment in automating it, I have personally been able to release waves of providers every month since, with very little effort on my part (and sometimes I released 70 packages at a time).

You can see the list of our issues so far:

Example issue here:

Now - summarizing my point.

I am not saying it should be copied or done the same way. In the discussion above I see that there might be various reasons why you would or wouldn't want to do something similar. And I am not even trying to argue with that.

But I just want to say that there are ways to make it happen. I believe they require maintainer initiative and coordination, but when done in a smart way, with "social" thinking and a positive, community-building approach, they might succeed and result in a solution that requires very little effort and where you can delegate a lot of the work to those interested.

1 Like