Packaging and Python 2

Found it! Add option and support to install and compile packages for a different Python interpreter · Issue #5472 · pypa/pip

@ncoghlan Do you think there’s any way Red Hat could be convinced to support Py2 in pip without having to fork py2-pip and py3-pip into separate lines of development?

OK, I’m going to add a personal perspective here.

When I was working on PEP 517 support, by far the most frustrating, time consuming, and demotivating aspect of the whole process was making sure the test suite continued to pass. Everything from stupid typos through to big design mistakes caused test failures (huge thanks at this point to everyone who’s ever contributed a test to pip - our test suite is an awesome resource 🙂). And those test failures can be a massive PITA to debug, because we call so many layers of subprocesses and wrappers.

So far, nothing too exciting. Part of getting the job done.

But there were a number of times when only the Python 2 tests failed. And those often left me wanting to drop the whole exercise completely. I had no interest in fixing weird 2.7 Unicode bugs, or whatever silly quirk my latest change had triggered. And yet I had to spend ages debugging and diagnosing, just to get the blasted test suite to pass on a platform I couldn’t care less about.

That is the cost of keeping pip working on Python 2.7. It’s developer burnout and frustration.

From my POV, what would be ideal is if I could have simply ignored Python 2-only test failures, and left them for someone else to deal with (or not, I don’t care) as part of maintaining 2.7 support. I’d be happy for the fixes to be made on master, or otherwise merged back into the mainline; I just wouldn’t want to have to care about them myself, or have the work I wanted to do blocked by them.

Whether such a model (ideally with the Python 2.7 compatibility maintainers being funded by someone like Red Hat who gets paid to keep Python 2.7 support going) is workable, I don’t know, but it would suit my preference for “let me ignore Python 2.7” without pinning Python 2.7 at an older feature level (unless those Python 2.7 maintainers aren’t willing to put in the effort to keep up with new features). Personally, I wouldn’t even treat “tests don’t pass on 2.7” as a release blocker, but that would be an RM’s call, and would ultimately be contingent on how responsive the hypothetical “Python 2.7 maintenance team” were in practice.

But you know what? Fewer and fewer projects are going to maintain compatibility with 2.7. Major projects are already moving away, or will be doing so within a year at most. Wanting to maintain compatibility with 2.7 means dealing with an obsolete toolchain becomes your day job, exactly like wanting to maintain compatibility with, say, Ubuntu 12.04 (which, incidentally, is still several years younger than Python 2.7).

And that “obsolete toolchain”, of course, includes Python itself.

+1 to this! If someone wants to stay on 2.7 for the next 10 years, it’s not in our interest to intentionally make it hard on them. OTOH, we don’t owe them anything either.

Sorry if I’m misinformed (I’m not too familiar with the details of PyPI, package hosting, etc.). Does it make sense to have a Python 2.7 release that switches to a different hostname for PyPI? E.g. use py2.pypi.python.org? That would achieve two things. First, it would be easier to trace the PyPI traffic that is coming from Python 2 installs. Second, if we decide we want to outsource the maintenance of Python 2 packages, we can point that domain name to an organization that will do it.
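
For what it’s worth, the client side of that experiment would be tiny, since the default index is just a URL. Here is a minimal sketch of the idea, assuming the hypothetical py2.pypi.python.org host existed (it doesn’t today):

```python
# Hypothetical sketch only: "py2.pypi.python.org" is the hostname floated
# above, not a real endpoint. A Python-2-specific pip build could derive
# its default index from the interpreter it runs under, making Py2
# traffic separable (and redirectable) on the server side.
import sys

if sys.version_info[0] == 2:
    DEFAULT_INDEX_URL = "https://py2.pypi.python.org/simple/"
else:
    DEFAULT_INDEX_URL = "https://pypi.org/simple/"

print(DEFAULT_INDEX_URL)
```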

I mean, in a way that’s kind of the nature of the beast, isn’t it? I don’t necessarily care at all about Windows, and I’ve had to deal with Windows-only breakages (I’ve even had to brownbag whole releases and spin up VMs to reproduce issues that only affected Windows), and I assume you’ve likewise had to do so for Linux or macOS. I don’t see Python 2 as any different from that.

That’s not unique to Python 2; any row in the testing/support matrix has that problem, because every contributor to a repository has a disparate “production” environment (even if production is just their desktop), and while those environments overlap, they are not wholly uniform across the entire contributor base.

I also think PEP 517 is likely the worst case example, since it was touching code that hasn’t changed in years and is kind of gnarly and poorly factored to start with.

That being said, just because it’s not unique to Python 2 doesn’t mean it’s not a problem. The larger our support matrix is, the more likely it is that those cases occur on a platform the current contributor doesn’t care about, or possibly doesn’t even have easy access to. That’s why finding the right balance for what we support is important. In the past we’ve always used a usage-based metric for determining when we dropped support, because the extra cost of keeping things working on a given Python was worth it while a significant share of users were still using that platform.

So I am completely sympathetic to the idea that we should strive to trim our support matrix wherever we can, I’ve been one of the main drivers for the versions of Python that pip has dropped support for recently. I’m even sympathetic to the idea that Python 2 represents a higher than typical burden, and thus the balance should skew towards dropping support for it earlier rather than later.

What I’m arguing for is that saying 2020 is the date, simply because that’s when CPython decided their date was, is the wrong way to approach it. I think we should approach it basically the same way we’ve approached every other Python runtime drop: look at usage, determine what % of users we’re willing to “leave behind”, monitor the usage numbers, and drop when we hit that target. It doesn’t have to be the < ~5% that we’ve historically used, because again I think it’s reasonable to say that Python 2 has a more significant cost than the other runtimes we’ve dropped support for. Maybe the correct number to target is 20% or even 25% or perhaps even 30%. I don’t think that 50-65% of PyPI is the right number though.

I think that an LTS branch is worthless, because I think that mainline will drift to the point where you cannot reasonably backport features without effectively reimplementing them independently. Meanwhile, keeping a frozen snapshot in time “working” is zero effort (pip 0.2 still works perfectly fine on 2.7, let alone a modern pip), so without feature backports, an LTS branch is of limited utility.

Likewise, I think the idea of “let people just merge stuff to mainline with tests failing on 2.7” is pointless unless we get someone to commit to fixing any test failure prior to a release, and we gate all releases on getting tests passing again… but realistically I think that isn’t going to happen. Even if it did, it way overcomplicates things, because there will still need to be a list of Python 3-isms that you simply cannot use in pip because they can’t be backported to 2.7… but with no testing to ensure that someone doesn’t inadvertently start using them. I think the most likely outcome is that even if we blocked a release on tests passing on 2.7, it would just stall the release (and our policy now is that master stays releasable; we don’t put master in an interim state).

If we don’t gate releases on 2.7 support, then I think that’s realistically just dropping support for 2.7, except without any of the tooling knowing we’re doing it, and users will just YOLO it: maybe they get a pip that works on 2.7, maybe not. It’s probably the worst of all possible options, because if we don’t know whether 2.7 is supported (but it might be!) then our packaging metadata is going to say that we support 2.7 when we don’t actually know if we do or not.

I think that there are really only 3 workable options:

  • We drop support for 2.7 wholly and unambiguously.
  • We make it possible to target a Python other than the one we’re running in, and we support targeting 2.7 (though as said above, this doesn’t solve the problem for build backends; see the sketch after this list for what that would involve).
  • We continue to support 2.7 (where support means keeping tests running; it doesn’t need to include bugfixes that only reproduce on 2.7) until we hit some target usage, and then drop it.
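
For the second option, the core mechanic is that pip would have to stop reading environment details from its own process and instead interrogate the target interpreter. A rough, illustrative sketch of that idea follows (this is not pip’s actual implementation; see issue #5472):

```python
# Illustrative sketch, not pip's actual implementation: to target an
# interpreter other than the one pip runs in, pip would have to ask the
# *target* interpreter for the details it normally reads from its own
# process (version, platform, install paths, ...).
import json
import subprocess
import sys

PROBE = (
    "import json, sys, sysconfig; "
    "print(json.dumps({"
    "'version': list(sys.version_info[:3]), "
    "'platform': sysconfig.get_platform(), "
    "'paths': {k: sysconfig.get_path(k) for k in ('purelib', 'platlib', 'scripts')}"
    "}))"
)

def probe_interpreter(python):
    """Run a probe in the target interpreter (e.g. 'python2.7')."""
    out = subprocess.check_output([python, "-c", PROBE])
    return json.loads(out.decode("utf-8"))

if __name__ == "__main__":
    # Defaults to probing the current interpreter if none is given.
    print(probe_interpreter(sys.argv[1] if len(sys.argv) > 1 else sys.executable))
```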

Everything else to my eyes is either not solving the real problem, or is just adding so many additional problems that it’s not worth it.

I want to be clear that I’m not dictating anything here. I think it’s going to stagnate packaging improvements for years if we’re more aggressive than most of our users in dropping support for 2.7. That being said, if the ultimate decision is that we’re dropping support for 2.7, I’m not going to “take my ball and go home” or anything like that.

I mean, that’s great and all, but there’s like 60 total projects on that list and 165k projects on PyPI (or 65k with a release in the last year). Unfortunately metadata about what version of Python a project supports is… inconsistent (72% of the releases in the last year don’t specify).

However, of the top 20 we have:

  • urllib3 - Has https://github.com/urllib3/urllib3/issues/883, does not appear to have any solid plans to drop support for 2.7 (though likely wouldn’t until pip did anyways).
  • pip - Heh.
  • six - Goes away without Python 2.
  • botocore - Has no plans to drop 2.7 currently and still supports 2.6.
  • python-dateutil - Plans to drop 2.x support in 2020.
  • s3transfer - Has no plans to drop 2.7 currently and still supports 2.6.
  • pyyaml - Does not appear to have any plans to drop 2.7, still supports 2.6.
  • requests - Does not appear to have any solid plans to drop support for 2.7 (though likely wouldn’t until pip did anyways).
  • docutils - Cannot find any information on plans (is it even still maintained?).
  • pyasn1 - Does not appear to have any plans to drop support for 2.7.
  • jmespath - Has no plans to drop 2.7 currently and still supports 2.6.
  • awscli - Has no plans to drop 2.7 currently and still supports 2.6.
  • idna - Does not appear to have any plans to drop support for 2.7.
  • rsa - Does not appear to have any plans to drop support for 2.7, but it doesn’t look maintained either.
  • setuptools - Pretty much same boat as pip.
  • certifi - Don’t believe there are any plans to drop support for 2.7 (but certifi has no meaningful code in it).
  • futures - Not a lot of point to it without 2.7.
  • colorama - Does not appear to have any plans to drop support for 2.7.
  • chardet - Does not appear to have any plans to drop support for 2.7.
  • simplejson - Does not appear to have any plans to drop support for 2.7.

So out of the top 20, only 1 of those libraries appears to have any current plans to drop support for Python 2.7 (maybe top 18/19 if you remove pip/setuptools). This wasn’t an extensive survey; I mostly went through and looked at all the repositories and issue trackers to see if there was any discussion or indication that they were planning to drop 2.7. But as best I can tell, projects that are currently planning to drop support for 2.7 are in the vast minority.

We can already trace the PyPI traffic for Python 2 (well, the downloads that come from it). We could do something similar for all HTTP requests to /simple/ if we wanted to (this would eliminate some of the caching / already-installed issues that the other numbers had). So far in the month of January, 65% of all downloads from PyPI were initiated from Python 2.
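
For context on how that classification works: modern pip sends a structured user agent (the pip version followed by a JSON blob that includes the Python version), so download or /simple/ requests can be bucketed by major version. A rough sketch of that bucketing, with the caveat that the field names are from memory and this is not PyPI’s actual pipeline:

```python
# Rough sketch of bucketing pip requests by Python major version. Modern
# pip sends a user agent like: pip/19.0 {"python": "2.7.15", ...}
# (field names from memory; not PyPI's actual pipeline).
import json

def python_major_version(user_agent):
    """Return '2', '3', or None for a pip-style user agent string."""
    prefix, _, blob = user_agent.partition(" ")
    if not prefix.startswith("pip/"):
        return None
    try:
        data = json.loads(blob)
    except ValueError:  # also covers json.JSONDecodeError on Python 3
        return None
    major = data.get("python", "").split(".")[0]
    return major or None

print(python_major_version('pip/19.0 {"python": "2.7.15"}'))  # -> 2
```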

I don’t think adding another domain really helps, though.

I think this is the core of our disagreement, then, as I see maintaining Python 2 compatibility as categorically different from maintaining platform support. Platform support affects the experience of newcomers to the Python ecosystem, as well as the upgrade experience of all existing users on that particular platform, while Python 2 support only affects long-term Python users with existing projects to maintain; and for those long-term users, stability (i.e. not breaking things that currently work) is often going to matter more than having access to the latest and greatest features.

That’s the way maintenance of Python 2.7 itself has worked for years (pretty much since Python 3.3 was released). It mostly gets left alone; contributors that only care about enhancing Python 3 could pretty much pretend the 2.7 branch didn’t exist any more, and only when the absence of a feature was having significant negative network effects outside the community of Python 2 users did we look into what we could do to resolve those particular issues (hence PEP 466, PEP 476, and PEP 493). Compared to the annoyance of working on 2.6 & 2.7 in parallel with 3.0/3.1/3.2, working on 3.3+ was a joy (and even in the maintain-in-parallel era, developers could decide that a new change was going to be Python 3 only if they felt it was too intrusive to include in a Python 2.x release).

Note also that there’s a middle ground short of “Rip out Python 2.7 support from the main branch entirely, and allow the two code bases to completely diverge” that’s still more contributor-friendly than the status quo.

That middle ground might look something like the following:

  1. Python 2.7 CI on the main branch would be moved to an advisory job, rather than being a merge blocker. Contributors that personally or professionally care about Python 2.7 compatibility can make sure that CI is passing before they merge, but contributors that only care about Python 3 can just get the main Python-3-only CI passing and call it done (in cases where the Py2 compatibility fix isn’t a trivially obvious problem that can be resolved during the PR review). (Maintenance branches would keep the Python 2.7 compatibility requirement, so that security fixes always hit all supported versions at the same time.)
  2. Restoration of full Python 2.7 compatibility would become a pip release criterion, not a PR merge criterion.
  3. Contributors that are willing to investigate and resolve Python 2.7 CI failures would then deal with them separately, rather than everyone contributing to the main line of development having to worry about Python 2.7 for every change.

That model would then be pretty close to the way CPython deals with its Buildbot fleet, where PRs have to pass CI on Linux/Windows/macOS, but then post-merge CI runs on a broader range of platforms, and hence can pick up issues with specific Linux distros, AIX, FreeBSD, etc.

The commits to fix Buildbot failures often come from contributors other than those that made the initial commit that introduced the failure.

This model does make release management more painful, and requires that you have some dedicated “I care about maintaining compatibility with X” contributors, but it means that the burden of maintaining compatibility with the legacy platform falls specifically on those folks, not on everyone.

(A slight variant on this would be to have a separate Python 2.7 compatible branch, and have a bot that automatically merged in the latest Python 2.7 compatible code. However, that only makes sense if you’re going to introduce a separate pip2 package, and that seems like it would introduce significant pain for nowhere near enough gain)

I’m not really concerned about the end users in terms of 2.7 as much as I am about the developers. For example, the cryptography project is pretty popular and is unlikely to drop 2.7 support anytime soon. If we drop support for 2.7 in pip 20, then they cannot rely on any new packaging standard that doesn’t exist prior to pip 20 until they drop 2.7 support, which is likely years away.

That doesn’t just affect people on 2.7, because it means that packaging improvements will not exist even for people wholly on 3.x when they are attempting to install the cryptography project.

That isn’t unique to them. It is going to exist for every single project that chooses not to drop 2.7, and the evidence suggests that 3.x-only projects are still vastly in the minority. The network effects are going to be huge, and are going to lock away improvements in packaging for new and old users alike.

The alternative way to frame this is that it makes it more likely you burn out that particular contributor, since they’re likely going to be spending most of their time playing clean-up duty after other people’s code contributions rather than being able to meaningfully contribute their own code. It also raises the question of what we do if Python 2.7 is broken on master and it’s time to release. Do we just expect the Release Manager to go through and fix everything? Do we hold up the release indefinitely until someone fixes it? Do we go “surprise, no more 2.7!” and release anyway?

We just instituted the rule that we don’t commit partially done code to pip master because it put us in exactly that scenario, and it sucked hard. Rolling that back would be a bad decision IMO.

If there’s going to be some magic person who comes along and restores 2.7 compatibility, why can’t they do it in the PR, before it gets merged, instead of after? A PR isn’t required to have only a single author; we could add an “awaiting 2.7 compatibility” tag. If we’re gating a release on 2.7 compatibility anyway, then expecting it to be done before the merge doesn’t change the fact that someone is going to have to write that code. It just means that we don’t block other features on 2.7 compatibility for a specific PR. Contributors would then have a choice: they can add 2.7 compatibility themselves to their PR, or wait until this hypothetical person is able to add 2.7 support to their PR for them.

I wanted to further address this. I don’t think this accurately represents the actual policy of the CPython buildbots anyway. In a thread in ’17, Victor brought up the point that if the buildbots weren’t fixed quickly, they just bit-rotted further and further. So he proposed reverting changes that broke one of the “major” buildbots and were not trivial to fix, to force either the original author or someone else to fix the patch before it could be reapplied to CPython. Pretty much everyone on that thread agreed with the idea.

The policy you mentioned pretty much only applies in practice to esoteric platforms with little usage (like AIX), not the major platforms, and certainly not the primary platform still being used.

Yes. But “major” platforms are pretty much Linux, macOS and Windows with not-too-exotic configurations (e.g. if a commit were to break, say, a non-glibc Linux platform, it’s not obvious it would be reverted). @vstinner can elaborate.

My notes on buildbots:

I concur with Antoine. If there is a regression on AIX, I will not revert immediately but will help developers write a fix. I wouldn’t qualify AIX as a “well supported platform” yet, but we made great progress on AIX support last year. It depends on which CI is broken. If Travis CI is broken, there is no need to discuss: the regression must be reverted immediately, since it prevents merging any new change.

Here I’m talking about recent changes, not ones that are a year old.

Over the last 3 months, random failures on buildbots have become very rare. We are getting fewer and fewer emails on the buildbot-status mailing list about buildbot failures. Some tests fail randomly but pass when run again; they are mostly race conditions which fail when other tests are running in parallel.

It means that when a buildbot fails, it is very likely an obvious regression, especially when many buildbots fail on the same commit.

Looks like “major” also includes FreeBSD and, possibly more relevant to this discussion, EOL’d versions of macOS and Windows.

Whether the CPython developers want to admit it or not, Python 2.x is still a major platform in those terms. I mean, thus far in January, 65% of all downloads from PyPI originate from a Python 2.7 installation. It’s not just a major platform, it is the dominant platform, with almost twice as many downloads as all versions of Python 3 combined.

There is no supporting evidence that Python 2.7 is more like AIX in this parallel than like one of the major supported platforms.

While we’re talking about pip, another idea that I recall now (from @xafer, IIRC) was to add warnings about the EOL of 2.7 to pip, without committing to a date. Basically, pip saying “You’re running Python 2.7, which will reach end-of-life in 2020 <blah> <blah>” (the exact message would be a lot better than this, possibly linking to the PEP and https://python3statement.org/). I expect that would positively affect the trend of people moving off of Python 2.
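
Mechanically this is a very small change. A minimal sketch of the kind of check being suggested, with placeholder wording (the real logic and message would live inside pip itself):

```python
# Illustrative only: the actual wording and plumbing would be decided in
# pip; this just shows the shape of the check.
import sys
import warnings

if sys.version_info[:2] == (2, 7):
    warnings.warn(
        "You are using Python 2.7, which will reach end-of-life on "
        "January 1st, 2020. A future version of pip will drop support "
        "for Python 2.7. See https://python3statement.org/ for details.",
        DeprecationWarning,
    )
```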

Agreed. I don’t want us to be too aggressive – dropping support for the dominant platform is a bad idea. That said, I do think it comes down to this: how long is it reasonable to expect volunteers to do the work to keep enterprise software, which hasn’t moved on from Python 2.7 for years, working? [1]

Dropping support in pip for Python 2.7 in Q3 2020 is the most we should do as volunteers, IMO. To extend support beyond that, someone has to do the work – either add support for managing environments other than the one pip is installed in, or commit to maintaining 2.7 support (and keeping the 2.7 CI green) in mainline pip for X months/years.

[1]: in an ideal world, my answer would be “not more than a few hours”, but this isn’t the ideal world : )

The amount of volunteer developer time pip gets is already pretty low and adding more work to do during release time will do more harm than good.

Like I said before, I’m perfectly fine with adding a warning like that. I think it’s a good move and will hopefully encourage people to move off of supporting Python 2.7, which makes it way easier for us to do so.

I’m even fine with saying we close all 2.7-only bug reports (or otherwise tag them to make it easy to filter them out) with the message that 2.7 is in “legacy support mode” and that the maintainers are not expending further effort to fix bugs for 2.7.

I think framing it as enterprise software is doing a disservice here. If it were just enterprise software, then I’d be perfectly fine saying that we can drop it and those people can follow up with a company that wants to sell them support. I’m trying to support other open source projects who are, for one reason or another, continuing to support 2.7. We’re not harming enterprise software here, we’re harming other volunteers just like us. When a Python has little usage, it’s easy to argue that the harm is minimal because those projects obviously have few users on that Python, but the same isn’t true for 2.7.

I think targeting a date is the wrong metric to use. What if we get to Q3 2020 and 65% of downloads from PyPI are still for 2.7? What if we hit Q3 2019 and Py2 usage is dropping like a rock and we can drop support earlier? Setting a target usage number allows us more flexibility while still making sure we’re not leaving users (both developers and end users) out in the cold.

Provided you fix the collection of that usage data. I don’t suppose you can look at the source of the “is pip up to date” requests, rather than downloads?

It would not be hard to start capturing metrics for any HTTP request made to PyPI. It could be the “is pip up to date” ping or fetches of the simple API (which don’t have the same caching / already-installed issues that the download data has). I’ve been thinking about adding the latter for a while anyway.
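
As a toy illustration of how captured user agents could feed the usage-based gate discussed above (the threshold, data, and log format here are entirely made up; this is not PyPI’s real pipeline):

```python
# Toy illustration of a usage-based drop gate: compute the Python 2 share
# from a batch of pip-style user agents and compare it to a hypothetical
# drop threshold. All numbers and formats are made up for illustration.
import json

DROP_THRESHOLD = 0.20  # hypothetical: drop 2.7 once its share falls below 20%

def py2_share(user_agents):
    """Fraction of classifiable pip requests made from Python 2."""
    counts = {"2": 0, "3": 0}
    for ua in user_agents:
        _, _, blob = ua.partition(" ")
        try:
            major = json.loads(blob).get("python", "").split(".")[0]
        except ValueError:
            continue
        if major in counts:
            counts[major] += 1
    total = sum(counts.values())
    return counts["2"] / float(total) if total else 0.0

agents = [
    'pip/19.0 {"python": "2.7.15"}',
    'pip/19.0 {"python": "3.7.2"}',
    'pip/19.0 {"python": "3.6.8"}',
]
share = py2_share(agents)
print("Python 2 share: %.0f%%; below drop threshold: %s"
      % (share * 100, share < DROP_THRESHOLD))
```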

If folks are ok with using a usage-based metric but would rather use one of those two metrics (or some combination of the three), I can do the work to start tracking that.

Just a subtle point re: PyPI stats and overall adoption of Python 3.

Looking at PyPI stats without also looking at Anaconda and conda-forge stats will understate the adoption rate of Python 3. By how much it understates, I don’t know, but I expect it is statistically significant.

The usage metrics proposed by Donald are reasonable. The metrics would reflect a conservative view of usage and adoption of Python 3 vs. Python 2.

Thanks to the PyPA/pip/PyPI folks for the ongoing work on metrics and transparency. These docs are excellent.

So I’d argue that in terms of pip, overall Python 3 adoption matters less than the adoption of Python 3 among pip users. If 100% of conda users are using 3.x but 100% of pip users are using 2.7, then the fact that conda users are using 3.x exclusively doesn’t really matter in terms of what pip supports. I’d feel roughly the same about getting the numbers of 2.x vs. 3.x usage from some Linux distros.

In terms of setuptools and other build backends there is more of an argument I think because they have to get invoked by the conda (or Debian/RHEL/etc) build systems.

Completely agree re: pip. Sorry I wasn’t clearer.

Since everyone seems to be on board with 2.7 printing some kind of message, I’ve gone ahead and created https://github.com/pypa/pip/pull/6147.

I’ve purposely made the warning vague in terms of when pip itself will drop support for 2.7 since we don’t really have a good sense of that yet, but if anyone has suggestions on wording, feel free to comment on that PR.
