Packaging and Python 2

I ran a query (grouped by project, python_version, week and year; top 16000 rows by count; past 6 months) on the downloads table in BigQuery (context), and dumped the data into a GitHub Gist. The simple API records are too numerous to query from that table without hitting my puny 1 TB free tier limit.

Be warned, opening the Gist will result in a huge HTML page being served to you (JSON is ~2.5 MB and GitHub serves all of it highlighted – the loaded DOM is ~500 MB).

The GitHub Gist: https://gist.github.com/pradyunsg/f6ebfd39a5b21a097179f8b5f8b5fcb9


If someone knows of a cool way to generate insights from this, that’d be nice. In the interest of maintaining everyone’s sanity, here’s the raw (plain text) dump of the Gist files:

The raw query: https://gist.githubusercontent.com/pradyunsg/f6ebfd39a5b21a097179f8b5f8b5fcb9/raw/abb15158674b576c896814e69a8ffe1b9f9dbf22/query.sql
The raw resulting JSON: https://gist.githubusercontent.com/pradyunsg/f6ebfd39a5b21a097179f8b5f8b5fcb9/raw/abb15158674b576c896814e69a8ffe1b9f9dbf22/results-20190621-195401.json
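
One possible way to get a first insight out of a dump like this is to compute the Python 2 share of downloads per project. This is only a sketch: the field names `project`, `python_version` and `download_count` are assumptions about the JSON rows (check them against query.sql), and the sample rows are made up.

```python
from collections import defaultdict


def python2_share(rows):
    """Return {project: fraction of its downloads that came from Python 2.x}.

    Assumes each row is a dict with the hypothetical keys "project",
    "python_version" and "download_count"; adjust to the real schema.
    """
    totals = defaultdict(int)
    py2 = defaultdict(int)
    for row in rows:
        count = int(row["download_count"])
        totals[row["project"]] += count
        # Rows with an unknown version count toward the total only.
        version = row.get("python_version") or ""
        if version.startswith("2"):
            py2[row["project"]] += count
    return {name: py2[name] / totals[name] for name in totals if totals[name]}


# Made-up sample rows, not real PyPI numbers.
sample = [
    {"project": "example", "python_version": "2.7", "download_count": "300"},
    {"project": "example", "python_version": "3.7", "download_count": "100"},
]
print(python2_share(sample))
```

With the real file, `rows` would come from `json.load()` on the downloaded results, and the same function could be applied per week to see a trend.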


And, with that I hit the “3 consecutive posts only” limit of Discourse.

I wrote an article to advocate using Python 3 for awscli.

https://dev.to/methane/use-pip3-to-install-awscli-44pk

I also made a pull request to the Travis documentation.

Let’s advocate Python 3 more!

1 Like

I think about 14 (awscli and its dependencies) * 200k DL/day do not come from ordinary users. I suspect AWS downloads them.
I filed an issue about it.

1 Like

Thanks @methane for the investigation, the blog post and for filing the issue w/ AWS. Much appreciated. =D

1 Like

I found awslogs-agent-setup.py installs awscli-cwlogs after installing pip<7.0.0.

I have already reported it on GitHub.

BTW, awscli-cwlogs is downloaded about 330k times/day, and awslogs-agent-setup.py downloads 17 packages (pip, virtualenv, awscli-cwlogs, and its dependencies). 330k * 17 is about 5.6M DL/day.

If awslogs-agent-setup.py is updated to download from S3 instead of PyPI, 5.6M DL/day will disappear from the PyPI download stats!
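
For anyone who wants to redo the estimate with newer numbers, the arithmetic is just the following. It assumes, as the post does, that every awscli-cwlogs download corresponds to one run of the setup script pulling all 17 packages.

```python
# Figures taken from the estimate in this thread.
CWLOGS_DOWNLOADS_PER_DAY = 330_000  # awscli-cwlogs downloads per day
PACKAGES_PER_RUN = 17               # packages fetched by one script run

total = CWLOGS_DOWNLOADS_PER_DAY * PACKAGES_PER_RUN
print(f"{total / 1e6:.2f}M downloads/day")
```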

2 Likes

Update regarding the general Python 2 sunset:

Here’s the simple FAQ to circulate widely, originally published in early September (focusing on the fact that updates to Python 2 will stop on Jan. 1, 2020), and here’s the press release that just went out (the last minor release of Python 2.7 will be released in April 2020).

I’d be curious as to whether the projects listed in the Python 3 statement have seen declines in the proportions of Python 2-related downloads from PyPI since we published the FAQ in September.

I made charts for some projects, between January 2016 and November 2019 (pip, six, coverage.py, Django, Flake8, pylint haven’t signed it):

(Jan 2016 - Oct 2019: https://dev.to/hugovk/python-version-share-over-time-5-7ik)

1 Like

@jaraco has merged and released a PR dropping Python 2 support for setuptools here. We still have the compatibility shims in place, so it shouldn’t stop working if people use it even in Python 2, but we are still discussing whether or not to revert this PR.

If other PyPA projects have strong feelings one way or the other, now is a good time to comment.

I’m on record as wanting pip to drop Python 2 support sooner rather than later, so please take this comment with that in mind.

As pip vendors pkg_resources (which is part of setuptools), we have to pick a version to vendor (I suppose we could in theory vendor two versions, and pick one at runtime, or something like that, but that’s a lot messier). Am I correct in that assumption, or is there some clever way of keeping up to date while still supporting Python 2 that I’m missing?

If we do have to (in effect) pin pkg_resources, what will be the impact on interoperability standards development? Do such standards typically affect pkg_resources? I’m not so worried about build processes, as we install an independent copy of setuptools in the build environment.

My feeling is that pip will need to pin pkg_resources, but this won’t be as significant a problem as it would be if we relied on the broader setuptools package. I do think that we should have a discussion, and projects should be careful about unilaterally dropping support for Python 2, so thanks for raising this, @pganssle.

I think the key thing we need to consider here is moving the ecosystem forward. If all that mattered was support and bug fixes, I think that telling Python 2 users to just use the last version of the packaging tools with Python 2 support would be fine. But the real question is how do we drive adoption of new standards like pyproject.toml and later versions of manylinux if Python 2 projects are required to pin their packaging tools. I don’t think it’s impossible (for example, pure Python projects can just build universal wheels using Python 3 - they might need to change their workflow, but I’m OK with that) but if we could offer transition advice then that would likely smooth the process.

This is definitely something we should discuss at the packaging summit, IMO.

2 Likes

https://github.com/pypa/virtualenv/issues/1493 appears relevant as well.

Edit: Apparently (see this pip issue) some people build their own wheel mirrors, and don’t necessarily remember to extract the Requires-Python metadata and expose it via the index. Unfortunately, this seems like it would be a common mistake to make, which somewhat reduces the effectiveness of the data-requires-python tag :slightly_frowning_face:

We have an open issue in pip to transition from pkg_resources to importlib_metadata. It doesn’t address the larger issue, but would make setuptools dropping support for Python 2 less of a risk to pip specifically.

1 Like

As much as I would love to drop Python 2 on pip, it seems that a significant number of our users aren’t there yet.
Concerning setuptools, I think a deprecation notice of a few months would have made sense, and I feel that the drop was quite sudden.
All the libraries using setuptools and still supporting Python 2 are now likely in an uncomfortable place of choosing between pinning setuptools or having a different version (and thus different options) depending on the interpreter version…

I wouldn’t be surprised if importlib_metadata were to also drop support for Python 2 in the near future :wink:

1 Like

The desupport did in fact include appropriate Requires-Python metadata, and so should in theory have been seamless for users. However, it looks as if in practice, Requires-Python (and more specifically the data-requires-python tag for index servers) isn’t always exposed on custom indexes, and that’s what caused the worst problems.

So I think that the lessons here should be:

  1. Requires-Python (and data-requires-python) isn’t as robust a transition mechanism as we’d like.
  2. Core PyPA tools (like pip, setuptools and virtualenv) are used heavily in automated setups that pick up the latest release without warning and without in-advance testing. We can bemoan that as bad practice as much as we want, but it’s a reality that we need to deal with.

While projects should remain able to decide for themselves how they handle Python 2 desupport, it’s not unreasonable to expect sufficient advance warning, at least so that the various PyPA tools can provide a co-ordinated approach. We’ve had a number of “new release broke X” issues recently, and while they mostly haven’t been things we can control (see the points above) we should do what we can to be more prepared.

Do we know which tools might need patching to expose the data? Or is it all custom servers instead of bandersnatch/devpi-based stuff?

The case reported in the pip ticket was someone building their own wheels and exposing them as an index by just pointing Apache at the directory. There’s not really much we can do about that sort of scenario.

Ultimately the problem is that the “simple” repository API isn’t that simple any more, and needs actual work to implement correctly…
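
To check whether a custom index page exposes the attribute at all, something like the following could be used. It is a rough sketch using only the standard library; the page content and the `example_pkg` filenames are hypothetical, and a missing attribute is only a hint to look closer, since a file may legitimately have no Requires-Python.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect (href, data-requires-python) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href:
            # HTMLParser already unescapes values such as "&gt;=3.5".
            self.links.append((href, attr_map.get("data-requires-python")))


def links_missing_requires_python(page_html):
    """Return hrefs whose anchor carries no data-requires-python attribute."""
    parser = AnchorCollector()
    parser.feed(page_html)
    return [href for href, requires in parser.links if requires is None]


# Hypothetical index page: the second link lacks the attribute, as every
# link would in a plain directory listing.
page = """
<a href="example_pkg-2.0.0-py2.py3-none-any.whl" data-requires-python="&gt;=3.5">example_pkg-2.0.0</a>
<a href="example_pkg-1.0.0-py2.py3-none-any.whl">example_pkg-1.0.0</a>
"""
print(links_missing_requires_python(page))
```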

I guess pip could potentially be taught to respect Requires-Python even when given bare wheel files? Even if getting that info into the resolver is hard, it could still error out when asked to install a py3-only wheel on a py2 environment…

It does this already.

That’s the crux. It would require downloading and unpacking the wheel, which is something we do for dependencies, but it’s costly and we’d prefer not to do it any more often than we have to.
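
To illustrate why this needs the file itself: Requires-Python lives in the METADATA file inside the wheel, so reading it means opening the archive. Below is a minimal sketch using only the standard library. It is not how pip does it internally, and the `example_pkg` wheel is built in memory purely for demonstration.

```python
import io
import zipfile
from email.parser import Parser


def requires_python_from_wheel(wheel_file):
    """Return the Requires-Python value from a wheel, or None if absent."""
    with zipfile.ZipFile(wheel_file) as archive:
        for name in archive.namelist():
            # METADATA sits in the top-level "<name>-<version>.dist-info" dir.
            if name.count("/") == 1 and name.endswith(".dist-info/METADATA"):
                text = archive.read(name).decode("utf-8")
                # METADATA uses email-style headers.
                return Parser().parsestr(text).get("Requires-Python")
    return None


# Build a tiny fake wheel in memory (hypothetical project).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "example_pkg-1.0.0.dist-info/METADATA",
        "Metadata-Version: 2.1\n"
        "Name: example-pkg\n"
        "Version: 1.0.0\n"
        "Requires-Python: >=3.5\n",
    )
buffer.seek(0)
print(requires_python_from_wheel(buffer))
```

For a remote index, the whole wheel would have to be downloaded before this check can run, which is the cost mentioned above.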

Thinking some more about it, why is setuptools 45.0.0 a universal wheel??? Surely the simplest solution here would be for the wheel to be setuptools-45.0.0-py3-none-any.whl, rather than setuptools-45.0.0-py2.py3-none-any.whl?
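
The difference between the two filenames can be checked mechanically, since the Python tag is the third field from the end of a wheel filename. A simplified sketch follows; it only looks at `py2*` and `cp2*` tags and ignores other implementations, and the helper names are made up.

```python
def wheel_python_tags(filename):
    """Return the list of Python tags encoded in a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # Layout: name-version[-build]-python-abi-platform
    if len(parts) < 5:
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    return parts[-3].split(".")


def claims_python2(filename):
    """True if the filename advertises compatibility with Python 2."""
    return any(
        tag.startswith(("py2", "cp2")) for tag in wheel_python_tags(filename)
    )


print(claims_python2("setuptools-45.0.0-py2.py3-none-any.whl"))
print(claims_python2("setuptools-45.0.0-py3-none-any.whl"))
```

An installer on Python 2 would never consider the second file, regardless of whether the index exposes any metadata.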

@pganssle is this something that the setuptools maintainers could/should have done, or is there a problem with it that I’ve missed? My suspicion is that we’ve all become so used to “universal wheel” = “pure Python” that we’ve just forgotten that this option exists…

Assuming that’s a realistic solution, maybe we should start recommending that projects which drop Python 2 support should simply stop publishing py2.py3 wheels at that point?

1 Like

Without Requires-Python metadata, pip would still prefer and install the 45.0.0 sdist (except with the --prefer-binary option).

Rats. Apparently my reply must be at least 10 characters, but I don’t have anything more to say beyond “rats” :slight_smile:

5 Likes

pip’s maintainers have been keeping track of Python 2 usage trends, and based on them, we’ve decided that pip 20.3 will be the last pip version with Python 2 support.

More details can be found here: https://github.com/pypa/pip/issues/6148#issuecomment-616213532

13 Likes