Handling modules on PyPI that are now in the standard library?

As a followup, I programmatically investigated how many packages on PyPI have project names matching stdlib top-level package/module names. I used the following quick n’ dirty code:

import sys

import requests

stdlib_module_names_filtered = {name.strip("_") for name in sys.stdlib_module_names}
results = {name: requests.get(f"https://pypi.org/project/{name}/").status_code
           for name in stdlib_module_names_filtered}
pypi_packages = {name for name, status in results.items() if status == 200}
print("\n".join(sorted(list(pypi_packages))))

which produced a total of 59 project results (a combination of backports and mostly un-maintained squatters—I haven’t yet done a complete manual survey):

List of PyPI projects sharing names with stdlib modules
antigravity
argparse
ast
asyncio
blake2
calendar
chunk
configparser
contextvars
csv
ctypes
dataclasses
datetime
dis
distutils
elementtree
email
enum
functools
future
graphlib
hashlib
hmac
html
http
importlib
io
ipaddress
logging
mailbox
modulefinder
multiprocessing
nntplib
numbers
pathlib
pydecimal
pyio
readline
resource
secrets
select
selectors
sha1
sha256
sha3
shelve
signal
ssl
statistics
time
token
turtle
typing
unittest
uuid
wave
weakrefset
winapi
wsgiref

One notable example of the latter is the turtle PyPI package, which came up in the discussion of whether to allow someone maintaining nntplib to claim the otherwise-protected name on PyPI. It is an ancient, unmaintained package that has a totally different purpose to the stdlib one that only ever had two releases, 0.0.1 and 0.0.2 both on the same day in 2009, has its homepage URL now blocked as a malware site, and was only ever Python 2 (maybe even 2.6) compatible, yet has 40k downloads per month.

I’ve analyzed it in more detail in the case study that follows (uncollapse to view):

Case study of the `turtle` package

Given it is mostly just an empty placeholder anyway, cannot work with any Python version with the stdlib turtle installed, nor any newer than 2.7 (and potentially even 2.6), and only gets a few dozen downloads per day on the latter, implying that with the former it is basically infintesimal, it seems it should just be deleted per PEP 541 as a placeholder and the name reserved like the others. However, it is perhaps a useful case study to note, in case there are more such packages.

It’s worth nothing that for turtle, around 95-98% of downloads are on Python 3, for which the package will error out on import, with half a percent on Python 2 and around 3% unknown (which I’m guessing may also be mostly Python 2). This is substantially higher than other packages like requests, which is at around 93-94% Python 3, 3.5% unknown and 2.5% Python 2 (the proportions of which appear to be mostly dominated by containers, CIs, services, etc. rather than directly installed, being very deep in the stack), or at the other end of the spectrum, Spyder (an end user application directly installed by users, mostly via means other than pip) at around 85% 3, 10% Null and 5% 2.

Looking at Python minor version for turtle, 3.11 has a large plurality, at over 30% on average, with 3.10 taking 2nd at modestly over 20%, and 3.8 and 3.9 being tied for third at around 15% each, 3.7 at 5-10%, with the remaining being Python 2 and increasingly smaller amounts in earlier versions. By contrast, both requests and spyder follow the conventional Python adoption curve, of a normal distribution with a peak around Python 3.9 and trailing off to both sides. This, coupled with the high proportion of Windows users for turtle (60%) relative to spyder (around 35%) seems to suggest that this is likely a high proportion of confused beginners downloading Python for the first time (rather than, as some initially speculated, users trying to download turtle on Linux distros that didn’t ship it).

2 Likes