Yesterday, the maintainer of pip released version 22.0. This release contains a major change in the way that HTML content from package indexes is parsed and processed, and fairly quickly after the release a number of people noticed that they were unable to obtain packages from their own indexes (not PyPI).
It turns out that most (possibly all) of the commercial software products that support Python packages are not actually compliant with PEP 503, while the new parser expects them to be.
In this case the maintainer went well beyond expectations and worked with the community to isolate the issue and get a new version (22.0.2) released that can fall back to the old parser, but this problem should not have been discovered only after the release. The fact that it was not found before the release is a failing on the part of the pip user community, because this code could have been tested against these non-Warehouse indexes weeks (or months) ago.
So, it's on us to help solve this problem, and we can do that by putting together a coordinated group of testers to test new versions of the PyPA tools (at minimum build, pip, and twine) against the indexes that we have available to us. Certainly it would be good if the providers of those index tools took on this burden of testing, but let's take the first step and hopefully they will follow.
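To make that concrete, here is a rough sketch (Python, standard library only) of the kind of smoke test I have in mind; the index URL and package name are placeholders for whatever a given tester has available, and a real suite would of course need to cover build and twine uploads as well:

```python
"""Sketch of a smoke test for pip against a third-party index.

INDEX_URL and PACKAGE are placeholders; point them at whatever index
(Nexus, Artifactory, devpi, ...) and package you actually have access to.
"""
import subprocess
import sys
import tempfile

INDEX_URL = "https://nexus.example.internal/repository/pypi-proxy/simple"  # placeholder
PACKAGE = "requests"  # any project known to exist on that index


def pip(*args: str) -> None:
    # Run the pip that belongs to this interpreter, so whichever version is
    # under test (a pre-release, a main-branch build, ...) is the one exercised.
    subprocess.run([sys.executable, "-m", "pip", *args], check=True)


def test_download_from_index() -> None:
    # `pip download` makes pip fetch and parse the index pages without
    # modifying the current environment.
    with tempfile.TemporaryDirectory() as dest:
        pip("download", "--no-deps", "--dest", dest,
            "--index-url", INDEX_URL, PACKAGE)


if __name__ == "__main__":
    test_download_from_index()
```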
In order to do this, we'll need to determine some things:
Which versions should be tested? Pre-releases on PyPI, or release candidates somewhere else, or tags on GitHub, or every PR?
How will they be tested? Are there sufficient tests already run with these tools against Warehouse that can be pointed to alternative indexes, or will tests need to be created?
How will the results be reported?
Who can commit to implementing automated testing against one or more third-party index tools? Personally I can commit to testing against up-to-date releases of Sonatype's Nexus Repository Manager.
It's not reasonable to expect the PyPA maintainers to take on any of this burden, but if we can provide them a low-cost/low-friction way to find out whether future versions of these tools will be found incompatible with third-party index software, it may help them avoid creating an uproar in the community when software they don't control is suddenly "broken".
I think the first question to ask is: Do the pip release managers agree they have the resources to do release candidates?
I've been getting involved with the pip project this last year and it seems like the volunteer time of the release managers is currently the scarcest resource for the project. Perhaps first consider some way to ease this burden?
I'm not sure we do. Particularly if an RC gets the type of response that we currently get for a full release. But conversely, people can install from any commit on the pip main branch, so the community could simply choose to say that they'll test against the main branch as of the first day of the month before the scheduled release (or whatever alternative criterion they choose).
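For example (a sketch, not an official recipe; the environment name is arbitrary and the ref could equally be a specific commit SHA), installing main-branch pip into a throwaway virtual environment looks roughly like this:

```python
"""Sketch: install pip from its main branch (or any commit) into a fresh
virtual environment, then report which version ended up installed."""
import subprocess
import venv

ENV_DIR = "pip-main-env"   # arbitrary name for the throwaway environment
PIP_REF = "main"           # or a specific commit SHA / tag

# Create a fresh venv that starts with the released pip.
venv.EnvironmentBuilder(with_pip=True).create(ENV_DIR)
python = f"{ENV_DIR}/bin/python"  # use Scripts\python.exe on Windows

# Replace that pip with the one from the chosen ref on GitHub (requires git).
subprocess.run(
    [python, "-m", "pip", "install",
     f"git+https://github.com/pypa/pip@{PIP_REF}"],
    check=True,
)
subprocess.run([python, "-m", "pip", "--version"], check=True)
```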
FYI I (and I'm sure others) can't do that inside the corporate network I work in because interacting with GitHub that way is blocked. We do however have a process of grabbing releases from PyPI.
While I can empathize with your desire to improve things here, and appreciate how you are trying to drive a change, I disagree that it's the community's job to fix this.
I think that such testing is absolutely the responsibility of for-profit third-party services (not their users, and definitely not the PyPA maintainers, as you noted), especially when the standards they're not in compliance with exist and have been defined for a long time.
There's no reason that an organization that experienced breakage as a result of this change couldn't have treated every commit to https://github.com/pypa/pip/tree/main as a release candidate and run their own integration tests against their own product and caught this before the release was made, if they cared about avoiding such issues.
For a community project like https://www.piwheels.org/, by all means, let's figure out how the community can make our projects more resilient to issues like this. But for a commercial project, it's not our job to do their job for them.
Oh, I certainly agree, and I'm also certain that many users of such products will push their vendors to do a better job in this area.
In my case I'm using the OSS version of Nexus Repository Manager, and as such the vendor has no obligation to even accept bug reports from me, let alone do anything about them.
In any case, putting together a well-understood testing regime and reporting process will be beneficial to any vendors who decide to participate as first-party testers.
Do we have a representative set of community projects? I know of one-off indexes that PyTorch and SciPy maintain. There's piwheels, which has its own wheels. There's pypiserver, which implements the relevant things. What else?
If there's any vendor reading this, I'm sure it would be really good to see you engage positively here and let us know that you're doing something about this.
They've got to have some sort of community support story there? Again though, I think the answer there is for the for-profit arm of the provider to figure out how to avoid such breakages.
I don't disagree, and my usage of "us" in the OP does not preclude the vendors from participating; after all, they are part of the community too. My purpose in starting this topic was to at least inspire a discussion, and get some sort of "interest group" formed who could define the parameters and expected results of what this sort of testing would need to be. If people who are not vendors choose to test and report results, that's fine, but if the vendors do it that's even better, and if we have some of both, that's still a win in my book.
One reasonable outcome of such an "interest group" could be indicating that vendors of such tools need to specify that they are "PEP 503" compliant, and not just that they "support Python package repositories", and that they've used some sort of community-blessed test suite to verify their compliance.
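As a strawman for what a single check in such a test suite might look like (the index URL and project name below are placeholders, and this covers only a tiny slice of PEP 503, namely name normalization and the presence of file links):

```python
"""Strawman PEP 503 check: the project page must be reachable under the
normalized project name and must expose at least one file link.
INDEX_URL and PROJECT are placeholders; this is far from a full test suite."""
import re
import urllib.request
from html.parser import HTMLParser

INDEX_URL = "https://nexus.example.internal/repository/pypi-hosted/simple"  # placeholder
PROJECT = "My_Example.Package"  # placeholder


def normalize(name: str) -> str:
    # Name normalization as specified by PEP 503.
    return re.sub(r"[-_.]+", "-", name).lower()


class AnchorCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(v for k, v in attrs if k == "href" and v)


def check_project_page() -> None:
    url = f"{INDEX_URL}/{normalize(PROJECT)}/"
    with urllib.request.urlopen(url) as resp:
        assert resp.status == 200, f"project page not found at {url}"
        html = resp.read().decode(resp.headers.get_content_charset() or "utf-8")
    parser = AnchorCollector()
    parser.feed(html)
    assert parser.hrefs, "project page lists no downloadable files"


if __name__ == "__main__":
    check_project_page()
```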
I would use a Docker image based on the up-to-date Python image with the main-branch pip (and other packaging tools) installed in our CI (specifically in our Python multi-version testing pipeline).
I would also champion and maintain this image if the community would bless it.
I didn't know of this change before pip's release. I read these Discourse forums regularly, so the first I heard of it was in the pip release thread. Perhaps communication also needs working on, even if it's just a post on Twitter/Discourse/python-dev (and maybe a note in pip's output) a week before a new pip release, saying that a release is coming soon.
That wasn't my point though: notifying the community of an impending release would be an additional burden for the maintainers, and there's no need to do that. Since there is a schedule, there are probably a thousand or more ways that interested parties could be notified of an upcoming scheduled release, none of which would require effort by a person beyond the initial setup.
FWIW, even I don't have a calendar reminder for pip releases, but I'm sure I'm on the other side of the spectrum.
Your definition of community is different from mine.
Nonetheless, I agree with the spirit of what you're saying. I also think that (a) those vendors have to initiate the conversation and let us know that they want to participate in good faith, (b) they have to take on the workload to make sure their product evolves as Python packaging does, and (c) they have to let us know when we have a bug.
I'm happy to work with them (heck, I'd appreciate having a channel to talk to some of these enterprise solutions, because it'll make things easier with my maintainer hat on). That said, I'm not keen on doing their work for them.
Download the commit from git at home, and bring it into work. If necessary, get it approved for installing on a corporate machine.
Set up a private build machine on the internet somewhere, and set it to make builds of pip, and publish them on a webpage that is accessible from your corporate network.
I've done variations of all of these at my work, at one time or another.
And if none of the above work, sort out something that does. Again, it shouldn't be up to me or any of the other pip maintainers to help you work with your company's policies. And if your company refuses to support you, we're back in the one-sided situation where all of the effort is expected to come from the open source community, for free.
I agree with everything you're saying and I am not trying to elicit any sympathy for the company I work for.
I'm just highlighting the realities of being someone who wants to assist and raise issues to pip before a big fallout happens. It's no skin off my back if I can't assist, though: I don't upgrade pip automatically, and it doesn't interrupt my workflow if I have to wait for the company index, or if pip is forced to change course because so many users complain.
Yes, all your suggestions are unworkable; however, release candidates on PyPI are workable. But as you may have noticed, I was the first person to bring up pip release manager capacity in this thread, so I guessed this was fairly unlikely.
P.S. I tried to find a way to download Pradyun's current changes, but I could not get around the restrictions.
Thanks. I'm a pip maintainer, and I've worked for many years in the sort of environment you describe. So I do appreciate the limitations and difficulties. Many times, I've made decisions with my pip maintainer hat on that make my life more difficult with my "work" hat on. We really do understand the constraints here. That's not the problem. The problem is with (some) people[1] who believe that their situation should be the top priority for the pip maintainers, even though a huge majority of pip's user base don't work under those sorts of constraints.
The point for the pip maintainers is not that we don't want to fix regressions, or that we don't care about our users. It's that we don't want to stagnate. Users in big companies typically value stability, backward compatibility, and "long term support" types of release. But that's not the pip maintainers' priority, like it or not. We prioritise moving the ecosystem forward, improving standards and standards-compliance, flexibility (to handle multiple classes of user) and maintainability[2] (because we're such a small group).
Ideally, there would be a "long term support" branch of pip for risk-averse users. But that's not possible: we don't have the manpower, or frankly the interest. Maybe some other group could maintain such a project, somehow. I don't know. Personally, I'm all for more competition in the packaging tools space, so if anyone wants to do something like this, then I'd say go for it.
But in the meantime, every time we make a change that doesn't align with the needs of our (relatively) small group of risk-averse, tightly constrained users, we get massive pushback, and a complete lack of respect for the possibility that we may simply have different priorities.
Just wanted to share my perspective (and experience) from another big open-source project, Apache Airflow. I know it has a different governance and stakeholder structure, number of volunteers, and amount of money involved. Sure. In Airflow there are some maintainers (including myself) who are sponsored or paid for their community work, but a lot of our contributors and committers are volunteers. I won't go into a "commercial" discussion about the pip governance model and who "should" and who "should not" pay for whose job.
I am usually in the camp of "I see a problem" -> "I think it through and propose solutions", so this is just an example; maybe it can be an inspiration. And I believe things won't sort themselves out on their own, so I usually try to make some constructive proposal.
But coming back to the experience: we have had quite a lot of success in this area by engaging our users. As an Apache Software Foundation project, we follow the "Community over Code" motto, and stability for our users (whom we consider part of our community) is important.
We also try hard to improve the world around us. More than two years ago we introduced a huge breaking change going from Airflow 1 to Airflow 2, and we did try to involve a lot of our community members in testing the alphas, betas, pre-releases, and release candidates. It was not easy to gather feedback, and we did break a number of things. They have been fixed since.
Since then we have introduced a rather straightforward way of engaging our users in RC testing (we release up to 70 packages a month to PyPI with the so-called "providers", plus a main release of Airflow every few months).
Initially, when I proposed the solution described below, which involved engaging the contributors and users who raised issues, we had very strong push-back in our community: "is it needed?", "people can test anyway", "we have no resources to test all that". I was quite persistent and started an experiment, which turned into a regular process that turned out to be pretty successful.
Every month we get something like a 60-70% success rate of members of our community testing our release candidates (both those who raised the issues that were solved and those who implemented or commented on the PRs).
As a result we have, a few times, decided to cancel and redo the release with a follow-up RC2, RC3, etc. when our users managed to find bugs in it.
I believe we turned engagement from our users on its head by following a few simple principles:
don't rely on people watching the release schedule or announcements. Call them out specifically in a GitHub issue, mentioning them by @handle (the issue is prepared automatically, so it involves no "regular" maintenance effort)
make it as easy as possible to install the new version and reduce the friction of testing as much as possible: the issue contains a direct link to the RC being released (automated, of course), and information on what has been changed (in the form of GitHub issue links; automated, you guessed it). A sketch of this kind of automation follows this list
make it personal: each person is mentioned next to the issue they were involved in, and we ask personally for help, reaching out with a positive spirit and thanking them in advance for their help
we ask people to comment when they have tested, in a single issue with multiple people, to add a "positive feedback loop" and reinforcement (if others tested, why shouldn't I help too?), plus positive reactions and cheering when people do test. The nice thing about it is that you create a small "sub-community" of people who participate in this test round, which IMHO is a very nice "social" effect
we thank the people who tested when we close the issue
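To illustrate (this is not Airflow's actual tooling; the RC version, package name, PRs, and handles below are made-up placeholders), the automated part can be as simple as a script that drafts the issue body, with posting it via the GitHub API left out:

```python
"""Rough sketch of drafting an RC-testing issue body that @-mentions the
people involved in each change. All data here is a made-up placeholder;
a real setup would collect it automatically from the merged PRs."""

RC_VERSION = "1.2.3rc1"  # placeholder
INSTALL_HINT = f"pip install example-package=={RC_VERSION}"  # placeholder package

# Placeholder change data: PR number, title, and the people to ping.
CHANGES = [
    {"pr": 1234, "title": "Fix index URL handling", "people": ["@alice", "@bob"]},
    {"pr": 1250, "title": "Add retry on timeouts", "people": ["@carol"]},
]


def issue_body() -> str:
    lines = [
        f"We are testing release candidate **{RC_VERSION}**.",
        f"Install it with `{INSTALL_HINT}` and please report back in this issue.",
        "",
        "Changes and the people involved in them:",
    ]
    for change in CHANGES:
        who = " ".join(change["people"])
        lines.append(f"- #{change['pr']}: {change['title']} ({who})")
    lines += ["", "Thanks a lot for helping to test this release!"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(issue_body())
```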
After an initial (yes) investment in automating it, I have personally been able to release waves of providers every month since with very little effort of my own (and sometimes I released 70 packages at a time).
You can see the list of our issues so far:
Example issue here:
Now - summarizing my point.
I am not saying it should be copied or done in exactly the same way. In the discussion above I see that there might be various reasons why you would or wouldn't like to do something similar. And I am not even trying to argue with that.
But I just want to say that there are ways to make this happen. I believe they require maintainer initiative and coordination, but when done in a smart way, with "social" thinking and a positive, community-building approach, they can succeed and result in a solution that requires very little effort and lets you delegate a lot of the work to those who are interested.