Yesterday, the maintainer of pip released version 22.0. This release contains a major change in the way that HTML content from package indexes is parsed and processed, and fairly quickly after the release a number of people noticed that they were unable to obtain packages from their own indexes (not PyPI).
It turns out that most (possibly all) of the commercial software products that support Python packages are not actually compliant with PEP 503, while the new parser expects them to be.
In this case the maintainer went well beyond expectations and worked with the community to isolate the issue and get a new version (22.0.2) released that can fall back to the old parser, but this problem should not have been discovered only after the release. The fact that it was not found before the release is a failing on the part of the pip user community, because this code could have been tested against these non-Warehouse indexes weeks (or months) ago.
So, it's on us to help solve this problem, and we can do that by putting together a coordinated group of testers to test new versions of the PyPA tools (at minimum build, pip, and twine) against the indexes that we have available to us. Certainly it would be good if the providers of those index tools took on this burden of testing, but let's take the first step and hopefully they will follow.
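To make that concrete, here is a rough sketch (Python, standard library only) of the kind of smoke test I have in mind; the index URL and package name are placeholders for whatever a given tester has available, and a real suite would of course need to cover build and twine uploads as well:

```python
"""Sketch of a smoke test for pip against a third-party index.

INDEX_URL and PACKAGE are placeholders; point them at whatever index
(Nexus, Artifactory, devpi, ...) and package you actually have access to.
"""
import subprocess
import sys
import tempfile

INDEX_URL = "https://nexus.example.internal/repository/pypi-proxy/simple"  # placeholder
PACKAGE = "requests"  # any project known to exist on that index


def pip(*args: str) -> None:
    # Run the pip that belongs to this interpreter, so whichever version is
    # under test (a pre-release, a main-branch build, ...) is the one exercised.
    subprocess.run([sys.executable, "-m", "pip", *args], check=True)


def test_download_from_index() -> None:
    # `pip download` makes pip fetch and parse the index pages without
    # modifying the current environment.
    with tempfile.TemporaryDirectory() as dest:
        pip("download", "--no-deps", "--dest", dest,
            "--index-url", INDEX_URL, PACKAGE)


if __name__ == "__main__":
    test_download_from_index()
```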
In order to do this, we'll need to determine some things:
Which versions should be tested? Pre-releases on PyPI, or release candidates somewhere else, or tags on GitHub, or every PR?
How will they be tested? Are there sufficient tests already run with these tools against Warehouse that can be pointed to alternative indexes, or will tests need to be created?
How will the results be reported?
Who can commit to implementing automated testing against one or more third-party index tools? Personally I can commit to testing against up-to-date releases of Sonatype's Nexus Repository Manager.
It's not reasonable to expect the PyPA maintainers to take on any of this burden, but if we can provide them a low-cost/low-friction way to find out whether future versions of these tools will be found incompatible with third-party index software, it may help them avoid creating an uproar in the community when software they don't control is suddenly "broken".
I think the first question to ask is: Do the pip release managers agree they have the resources to do release candidates?
I've been getting involved with the pip project this last year and it seems like the volunteer time of the release managers is currently the scarcest resource for the project. Perhaps first consider some way to ease this burden?
I'm not sure we do. Particularly if an RC gets the type of response that we currently get for a full release. But conversely, people can install from any commit on the pip main branch, so the community could simply choose to say that they'll test against the main branch as of the first day of the month before the scheduled release (or whatever alternative criterion they choose).
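For example (a sketch, not an official recipe; the environment name is arbitrary and the ref could equally be a specific commit SHA), installing main-branch pip into a throwaway virtual environment looks roughly like this:

```python
"""Sketch: install pip from its main branch (or any commit) into a fresh
virtual environment, then report which version ended up installed."""
import subprocess
import venv

ENV_DIR = "pip-main-env"   # arbitrary name for the throwaway environment
PIP_REF = "main"           # or a specific commit SHA / tag

# Create a fresh venv that starts with the released pip.
venv.EnvironmentBuilder(with_pip=True).create(ENV_DIR)
python = f"{ENV_DIR}/bin/python"  # use Scripts\python.exe on Windows

# Replace that pip with the one from the chosen ref on GitHub (requires git).
subprocess.run(
    [python, "-m", "pip", "install",
     f"git+https://github.com/pypa/pip@{PIP_REF}"],
    check=True,
)
subprocess.run([python, "-m", "pip", "--version"], check=True)
```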
FYI I (and I'm sure others) can't do that inside the corporate network I work in because interacting with GitHub that way is blocked. We do however have a process of grabbing releases from PyPI.
While I can empathize with your desire to improve things here, and appreciate how you are trying to drive a change, I disagree that it's the community's job to fix this.
I think that such testing is absolutely the responsibility of for-profit third-party services (not their users, and definitely not the PyPA maintainers, as you noted), especially when the standards they're not in compliance with exist and have been defined for a long time.
There's no reason that an organization that experienced breakage as a result of this change couldn't have treated every commit to https://github.com/pypa/pip/tree/main as a release candidate and run their own integration tests against their own product and caught this before the release was made, if they cared about avoiding such issues.
For a community project like https://www.piwheels.org/, by all means, let's figure out how the community can make our projects more resilient to issues like this. But for a commercial project, it's not our job to do their job for them.
Oh, I certainly agree, and I'm also certain that many users of such products will push their vendors to do a better job in this area.
In my case I'm using the OSS version of Nexus Repository Manager, and as such the vendor has no obligation to even accept bug reports from me, let alone do anything about them.
In any case, putting together a well-understood testing regime and reporting process will be beneficial to any vendors who decide to participate as first-party testers.
Do we have a representative set of community projects? I know of one-off indexes that PyTorch and SciPy maintain. There's piwheels, which has its own wheels. There's pypiserver, which implements the relevant things. What else?
If there's any vendor reading this, I'm sure it would be really good to see you engage positively here and let us know that you're doing something about this.
They've got to have some sort of community support story there? Again though, I think the answer there is for the for-profit arm of the provider to figure out how to avoid such breakages.
I don't disagree, and my usage of "us" in the OP does not preclude the vendors from participating; after all, they are part of the community too. My purpose in starting this topic was to at least inspire a discussion, and get some sort of "interest group" formed who could define the parameters and expected results of what this sort of testing would need to be. If people who are not vendors choose to test and report results, that's fine, but if the vendors do it that's even better, and if we have some of both, that's still a win in my book.
One reasonable outcome of such an "interest group" could be indicating that vendors of such tools need to specify that they are "PEP 503" compliant, and not just that they "support Python package repositories", and that they've used some sort of community-blessed test suite to verify their compliance.
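As a strawman for what a single check in such a test suite might look like (the index URL and project name below are placeholders, and this covers only a tiny slice of PEP 503, namely name normalization and the presence of file links):

```python
"""Strawman PEP 503 check: the project page must be reachable under the
normalized project name and must expose at least one file link.
INDEX_URL and PROJECT are placeholders; this is far from a full test suite."""
import re
import urllib.request
from html.parser import HTMLParser

INDEX_URL = "https://nexus.example.internal/repository/pypi-hosted/simple"  # placeholder
PROJECT = "My_Example.Package"  # placeholder


def normalize(name: str) -> str:
    # Name normalization as specified by PEP 503.
    return re.sub(r"[-_.]+", "-", name).lower()


class AnchorCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(v for k, v in attrs if k == "href" and v)


def check_project_page() -> None:
    url = f"{INDEX_URL}/{normalize(PROJECT)}/"
    with urllib.request.urlopen(url) as resp:
        assert resp.status == 200, f"project page not found at {url}"
        html = resp.read().decode(resp.headers.get_content_charset() or "utf-8")
    parser = AnchorCollector()
    parser.feed(html)
    assert parser.hrefs, "project page lists no downloadable files"


if __name__ == "__main__":
    check_project_page()
```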
I would use a Docker image based on the up-to-date Python image with the main-branch pip (and other packaging tools) installed in our CI (specifically in our Python multi-version testing pipeline).
I would also champion and maintain this image if the community would bless it.
I didn't know of this change before pip's release. I read these Discourse forums regularly, so the first I heard of it was in the pip release thread. Perhaps communication also needs working on, even if it's just a post on Twitter/Discourse/python-dev (and maybe a note in pip's output) a week before a new pip release, saying that a release is coming soon.
That wasn't my point though: notifying the community of an impending release would be an additional burden for the maintainers, and there's no need to do that. Since there is a schedule, there are probably a thousand or more ways that interested parties could be notified of an upcoming scheduled release, none of which would require effort by a person beyond the initial setup.
FWIW, even I don't have a calendar reminder for pip releases, but I'm sure I'm on the other side of the spectrum.
Your definition of community is different from mine.
Nonetheless, I agree with the spirit of what you're saying. I also think that (a) those vendors have to initiate the conversation and let us know that they want to participate in good faith, (b) they have to take on the workload to make sure their product evolves as Python packaging does, and (c) they have to let us know when we have a bug.
I'm happy to work with them (heck, I'd appreciate having a channel to talk to some of these enterprise solutions, because it'll make things easier with my maintainer hat on). That said, I'm not keen on doing their work for them.
Download the commit from git at home, and bring it into work. If necessary, get it approved for installing on a corporate machine.
Set up a private build machine on the internet somewhere, and set it to make builds of pip, and publish them on a webpage that is accessible from your corporate network.
I've done variations of all of these at my work, at one time or another.
And if none of the above work, sort out something that does. Again, it shouldn't be up to me or any of the other pip maintainers to help you work with your company's policies. And if your company refuses to support you, we're back in the one-sided situation where all of the effort is expected to come from the open source community, for free.
I agree with everything you're saying and I am not trying to elicit any sympathy for the company I work for.
I'm just highlighting the realities of being someone who wants to assist and raise issues to pip before a big fallout happens. It's no skin off my back if I can't assist, though: I don't upgrade pip automatically, and it doesn't interrupt my workflow if I have to wait for the company index, or if pip is forced to change course because so many users complain.
Yes, all your suggestions are unworkable; however, release candidates on PyPI are workable. But as you may have noticed, I was the first person to bring up pip release manager capacity in this thread, so I guessed this was fairly unlikely.
P.S. I tried to find a way to download Pradyun's current changes, but I could not get around the restrictions.
Thanks. I'm a pip maintainer, and I've worked for many years in the sort of environment you describe. So I do appreciate the limitations and difficulties. Many times, I've made decisions with my pip maintainer hat on that make my life more difficult with my "work" hat on. We really do understand the constraints here. That's not the problem. The problem is with (some) people[1] who believe that their situation should be the top priority for the pip maintainers, even though a huge majority of pip's user base don't work under those sorts of constraints.
The point for the pip maintainers is not that we don't want to fix regressions, or that we don't care about our users. It's that we don't want to stagnate. Users in big companies typically value stability, backward compatibility, and "long term support" types of release. But that's not the pip maintainers' priority, like it or not. We prioritise moving the ecosystem forward, improving standards and standards-compliance, flexibility (to handle multiple classes of user) and maintainability[2] (because we're such a small group).
Ideally, there would be a "long term support" branch of pip for risk-averse users. But that's not possible: we don't have the manpower, or frankly the interest. Maybe some other group could maintain such a project, somehow. I don't know. Personally, I'm all for more competition in the packaging tools space, so if anyone wants to do something like this, then I'd say go for it.
But in the meantime, every time we make a change that doesn't align with the needs of our (relatively) small group of risk-averse, tightly constrained users, we get massive pushback, and a complete lack of respect for the possibility that we may simply have different priorities.
Just wanted to share my perspective (and experience) from another big open-source project, Apache Airflow. I know it has a different governance and stakeholder structure, number of volunteers, and amount of money involved. Sure. In Airflow there are some maintainers (including myself) who are sponsored or paid for their community work, but a lot of our contributors and committers are volunteers. I won't go into a "commercial" discussion about the pip governance model and who "should" and who "should not" pay for whose job.
I am usually in the camp of "I see a problem" -> "I think it through and propose solutions", so this is just an example; maybe it can be an inspiration. And I believe things won't sort themselves out on their own, so I usually try to make some constructive proposal.
But coming back to the experience: we have had quite a lot of success in this area by engaging our users. As an Apache Software Foundation project, we follow the "Community over Code" motto, and stability for our users (whom we consider part of our community) is important.
We also try hard to improve the world around us. More than two years ago we introduced a huge breaking change going from Airflow 1 to Airflow 2, and we did try to involve a lot of our community members in testing the alphas, betas, pre-releases, and release candidates. It was not easy to gather feedback, and we did break a number of things. They have been fixed since.
Since then we have introduced a rather straightforward way of engaging our users in RC testing (we release up to 70 packages a month to PyPI with the so-called "providers", plus a main release of Airflow every few months).
Initially, when I proposed the solution described below, which involved engaging the contributors and users who raised issues, we had very strong push-back in our community: "is it needed?", "people can test anyway", "we have no resources to test all that". I was quite persistent and started an experiment, which turned into a regular process that turned out to be pretty successful.
Every month we get something like a 60-70% success rate of members of our community testing our release candidates (both those who raised the issues that were solved and those who implemented or commented on the PRs).
As a result we have, a few times, decided to cancel and redo the release with a follow-up RC2, RC3, etc. when our users managed to find bugs in it.
I believe we turned engagement from our users on its head by following a few simple principles:
don't rely on people watching the release schedule or announcements. Call them out specifically in a GitHub issue, mentioning them by @handle (the issue is prepared automatically, so it involves no "regular" maintenance effort)
make it as easy as possible to install the new version and reduce the friction of testing as much as possible: the issue contains a direct link to the RC being released (automated, of course), and information on what has been changed (in the form of GitHub issue links; automated, you guessed it). A sketch of this kind of automation follows this list
make it personal: each person is mentioned next to the issue they were involved in, and we ask personally for help, reaching out with a positive spirit and thanking them in advance for their help
we ask people to comment when they have tested, in a single issue with multiple people, to add a "positive feedback loop" and reinforcement (if others tested, why shouldn't I help too?), plus positive reactions and cheering when people do test. The nice thing about it is that you create a small "sub-community" of people who participate in this test round, which IMHO is a very nice "social" effect
we thank the people who tested when we close the issue
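To illustrate (this is not Airflow's actual tooling; the RC version, package name, PRs, and handles below are made-up placeholders), the automated part can be as simple as a script that drafts the issue body, with posting it via the GitHub API left out:

```python
"""Rough sketch of drafting an RC-testing issue body that @-mentions the
people involved in each change. All data here is a made-up placeholder;
a real setup would collect it automatically from the merged PRs."""

RC_VERSION = "1.2.3rc1"  # placeholder
INSTALL_HINT = f"pip install example-package=={RC_VERSION}"  # placeholder package

# Placeholder change data: PR number, title, and the people to ping.
CHANGES = [
    {"pr": 1234, "title": "Fix index URL handling", "people": ["@alice", "@bob"]},
    {"pr": 1250, "title": "Add retry on timeouts", "people": ["@carol"]},
]


def issue_body() -> str:
    lines = [
        f"We are testing release candidate **{RC_VERSION}**.",
        f"Install it with `{INSTALL_HINT}` and please report back in this issue.",
        "",
        "Changes and the people involved in them:",
    ]
    for change in CHANGES:
        who = " ".join(change["people"])
        lines.append(f"- #{change['pr']}: {change['title']} ({who})")
    lines += ["", "Thanks a lot for helping to test this release!"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(issue_body())
```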
After an initial (yes) investment in automating it, I have personally been able to release waves of providers every month since with very little effort of my own (and sometimes I released 70 packages at a time).
You can see the list of our issues so far:
Example issue here:
Now - summarizing my point.
I am not saying it should be copied or done in exactly the same way. In the discussion above I see that there might be various reasons why you would or wouldn't like to do something similar. And I am not even trying to argue with that.
But I just want to say that there are ways to make this happen. I believe they require maintainer initiative and coordination, but when done in a smart way, with "social" thinking and a positive, community-building approach, they can succeed and result in a solution that requires very little effort and lets you delegate a lot of the work to those who are interested.