Packaging and Python 2

I tried this query:

  COUNT(*) AS count
  file.project = "awscli"
  count DESC

The result is here:

Oh, “null” is the most common distro! I don’t know which environments are included in “null”.
Ubuntu 16.04 is the second most common distro, but I believe Ubuntu 18.04 will overtake it at some point.

I found a very interesting comment.

For CLI v2 we’re planning on distributing standalone installers for all platforms to avoid these and other issues.

Not that I don’t share this goal, but IIRC Donald has 20% of his time allocated to OSS, so Amazon is one of the few companies paying people to support packaging. :wink: Of course, the burden spreads further than Donald, but I like to take every opportunity I can to recognize companies that are actually doing The Right Thing™ and supporting OSS. :smile:

(Especially since py2 doesn’t ship with pip.)

( ?? ->

Oh yeah, for sure. I think it’s more than 20% even. I’m not saying we should get out the pitchforks :-). It’s much appreciated. But, just to overstate the obvious, it doesn’t mean they can stay on py2 forever while volunteers like Paul and Pradyun support them. So they still need some kind of plan.

Huh, whoops. Today I learned!

You’re forgiven :slight_smile: I forgot that there is at least one good reason why ensurepip isn’t as well known in Python 2.7. Besides the fact that it was added late in 2.7’s life, we also elected to make its installation opt-in, so, unlike with Python 3, you need to explicitly add --with-ensurepip to ./configure.
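For anyone who wants to check their own interpreter, here is a minimal sketch. It only assumes that `ensurepip.version()` and `pip.__version__` are available when those modules are importable; the helper name `pip_status` is made up for illustration.

```python
def pip_status():
    """Report the pip version bundled with ensurepip and the pip version installed.

    Either value is None when the corresponding module cannot be used.
    """
    try:
        import ensurepip
        # Version of the pip wheel that ensurepip would install.
        bundled = ensurepip.version()
    except Exception:
        # ensurepip is missing (older interpreters) or patched out by a distro.
        bundled = None

    try:
        import pip
        installed = pip.__version__
    except ImportError:
        # pip was never bootstrapped into this environment.
        installed = None

    return {"bundled": bundled, "installed": installed}


if __name__ == "__main__":
    print(pip_status())
```

If `installed` comes back as `None` while `bundled` is set, running `python -m ensurepip` should bootstrap pip from the bundled copy.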


Is it included by default in the installer?

Edit: Yes. says:

On Windows and Mac OS X, the CPython installers now default to installing pip along with CPython itself (users may opt out of installing it during the installation process).

I’m pretty sure it’s possible to craft a query that shows what proportion of Py2 downloads come from what packages.

If we see a substantial number of downloads for a certain package, that’ll be an indicator of whether there are other places to look. (It’s basically the same analysis that @methane and @njs applied to aws-cli, but applied more generally.)
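A rough sketch of that analysis in plain Python, once the rows are out of BigQuery. The field names (`project`, `python_version`, `count`) and the sample numbers are assumptions for illustration, not the actual schema or real download figures.

```python
from collections import defaultdict


def py2_share_by_project(rows):
    """Rank projects by their share of all Python 2 downloads.

    rows: iterable of dicts with the (assumed) keys
    "project", "python_version" and "count".
    Returns a list of (project, py2_downloads, share) sorted by downloads.
    """
    py2 = defaultdict(int)
    for row in rows:
        # python_version can be missing for some clients; treat it as unknown.
        version = row["python_version"] or ""
        if version.startswith("2."):
            py2[row["project"]] += int(row["count"])

    total = sum(py2.values())
    if total == 0:
        return []
    ranked = sorted(py2.items(), key=lambda item: item[1], reverse=True)
    return [(project, count, count / total) for project, count in ranked]


# Made-up rows, only to show the shape of the input.
sample_rows = [
    {"project": "awscli", "python_version": "2.7", "count": 600},
    {"project": "awscli", "python_version": "3.6", "count": 100},
    {"project": "requests", "python_version": "2.7", "count": 300},
    {"project": "requests", "python_version": None, "count": 50},
    {"project": "six", "python_version": "2.6", "count": 100},
]

for project, count, share in py2_share_by_project(sample_rows):
    print(f"{project}: {count} ({share:.0%})")
```

Projects whose share is far above the rest would be the first candidates for a closer look.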

I ran a query (grouped by project, python_version, week, and year; top 16000 by count; past 6 months) on the downloads table in BigQuery (context), and dumped the data into a GitHub Gist. There are too many simple API records to query without hitting my puny 1 TB free-tier limit.

Be warned, opening the Gist will result in a huge HTML page being served to you (JSON is ~2.5 MB and GitHub serves all of it highlighted – the loaded DOM is ~500 MB).

The GitHub Gist:

If someone knows of a cool way to generate insights from this, that’d be nice. In the interest of maintaining everyone’s sanity, here’s the raw (plain text) dump of the Gist files:

The raw query:
The raw resulting JSON:

And, with that I hit the “3 consecutive posts only” limit of Discourse.

I wrote an article to advocate using Python 3 for awscli.

I also made a pull request to the Travis documentation.

Let’s advocate Python 3 more!

I think about 14 (awscli and its dependencies) * 200k DL/day are not from normal users. I suspect AWS itself downloads them.
I filed an issue about it.

Thanks @methane for the investigation, the blog post and for filing the issue w/ AWS. Much appreciated. =D

I found installs awscli-cwlogs after installing pip<7.0.0.

I have already reported it on GitHub.

BTW, awscli-cwlogs is downloaded about 330k times/day, and each run downloads 17 packages (pip, virtualenv, awscli-cwlogs, and its dependencies). 330k * 17 is about 5.6M DL/day.

If is updated to download from S3 instead of PyPI, 5.6M DL/day will
disappear from PyPI download stats!
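The back-of-the-envelope numbers in this thread are easy to re-check. This sketch uses only the figures quoted above; the function name is made up for illustration.

```python
def daily_downloads(installs_per_day, packages_per_install):
    """Estimate PyPI downloads/day generated by an automated installer."""
    return installs_per_day * packages_per_install


# awscli-cwlogs: about 330k runs/day, 17 packages each.
cwlogs = daily_downloads(330_000, 17)

# awscli: about 200k runs/day, 14 packages each (awscli and its dependencies).
awscli = daily_downloads(200_000, 14)

print(f"awscli-cwlogs: {cwlogs / 1e6:.2f}M DL/day")
print(f"awscli:        {awscli / 1e6:.2f}M DL/day")
```

These are rough estimates: they assume every run downloads the full set of packages and that nothing is served from a cache.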