Any list of well-known distribution/import name mismatches?

As part of the tutorial content I’m planning to write up on Codidact, I want to cover the fact that the name used to install a package from PyPI doesn’t necessarily match the name used to import it in code. To be more immediately useful as a reference, I’d like to include a table of common cases where there’s a mismatch.

I’m sure there are many more, but whenever I ask myself about this only two come to mind: OpenCV (pip install opencv-python / import cv2) and PIL (pip install pillow / import PIL).

What other not-too-obscure cases have you encountered? (Even if the mismatch is as simple as hyphen vs underscore - I might point out and round up those cases separately.)

That’s also the case for scikit-learn and scikit-image, which are imported as sklearn and skimage.

Ah, I also use pyserial and pyusb, imported as serial and usb respectively.

pygame-ce, a fork of pygame that keeps the import name.

More generally, names with a hyphen on PyPI are imported with an underscore in code.

Ah, right, forks are another common case for this sort of thing. Aside from the PIL situation (where the original died out), there’s also the case of manim (community-maintained, formerly manimce) vs. manimgl (the original, formerly manimlib… I think). All of which are imported as manim.

pip install beautifulsoup4 vs import bs4

I came across these three today:

  • pip install python-multipart / import multipart
  • pip install multipart / import multipart (no mismatch but silently shadowed by the above)
  • pip install multipart-reader / import multipart_reader (hyphen vs underscore)

Another common class is namespace packages, e.g. pip install google-auth | import google.auth

It might be possible to pip install 'google.auth', though, because of the normalization rules. Correct?

pip install setuptools vs. import pkg_resources (and import distutils?) and all the other distribution packages that contain more than one top-level import package or import module.

True, in this case it works, although in general namespace packages aren’t required to follow that convention.

The flipside is still something to look out for, since the package is called google-auth and you might expect to import google_auth based on other experience.

pyyaml is the main one I encounter that is broader than my niche.

Arguably any stub-only package could fit the bill. For example, pip install pandas-stubs corresponds to import pandas.

Other examples that come to mind:

  • pip install django-debug-toolbar vs import debug_toolbar
  • pip install python-dateutil vs import dateutil
  • pip install matlabengine vs import matlab
  • pip install more-itertools vs import more_itertools

People already named most of the ones I know, but there are quite a few PyPI distribution packages with a python-X prefix (or the aforementioned pyX) whose import packages are just X. For example, the very popular dateutil has the import name dateutil but the distribution name python-dateutil, and pyOpenSSL has the distribution name pyOpenSSL and the import name OpenSSL.

One other, much rarer case is where the widely known project name is different from both the import and distribution name, e.g. pytorch, where both the distribution and import package are torch, yet it is nearly universally referred to by its project name, PyTorch, including consistently within the project’s readme. The mistake is common enough that there’s a placeholder pytorch package specifically to warn people and avoid typosquatting.

Correct, google.auth, Google-Auth and GOOGLE__AUTH are all equivalent for the purposes of distribution package tooling due to the normalization originally specified in PEP 503.
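To make the equivalence concrete, here is a minimal sketch of that normalization: runs of `-`, `_` and `.` collapse to a single `-`, and the result is lowercased. The function name `normalize` is just an illustrative choice, and this only covers distribution names, not import names.

```python
import re


def normalize(name: str) -> str:
    """Normalize a distribution name in the way PEP 503 describes."""
    # Collapse every run of '-', '_' and '.' into one '-', then lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()


names = ["google.auth", "Google-Auth", "GOOGLE__AUTH"]
normalized = {normalize(n) for n in names}
print(normalized)  # {'google-auth'}
```

All three spellings collapse to the same normalized name, which is why tooling such as pip treats them as the same distribution.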

Just to be clear, the name of the project is Pillow, not PIL. PIL was the original project, which Pillow forked and eventually replaced; Pillow kept PIL as the import name for drop-in backward compatibility.

When I wrote the first post, there was a nagging thought in the back of my mind that there was a third common case I knew about - it was this one.

Interesting. I should definitely include a bit about the normalization rules and naming conventions in the Q&A.

Interesting, but probably out of scope for what I want to write up. This is supposed to be for people who are having difficulty installing a library as a user, rather than as a developer. I imagine that anyone who has a use for stub-only packages will understand PyPI and/or the library in question well enough already.

Yes, it looks like these are two popular classes. Good point about PyTorch, too. I guess they felt the py is redundant and poor style for distribution, but good SEO for the webpage… ? Anyway, I hadn’t actually been thinking in these terms, even though I was apparently aware of it in the OpenCV case (and thanks for the related correction about Pillow).

I use:
python-ffmpeg, imported as ffmpeg.
pycryptodome, imported as Crypto
pycryptodomex, imported as Cryptodome

Shouldn’t this be something that someone with access to PyPI’s db could answer with a query?
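Short of querying PyPI itself, a rough local approximation is possible. The sketch below is not a PyPI query: it only inspects distributions installed in the current environment, using `importlib.metadata.packages_distributions()` (available from Python 3.10). The helper names `normalize` and `find_mismatches` are made up for illustration, and treating hyphen vs underscore as "not a mismatch" is an assumption of this sketch.

```python
import re
from typing import Mapping, Sequence


def normalize(name: str) -> str:
    # Same rule used for distribution names: runs of -, _ and . become '-'.
    return re.sub(r"[-_.]+", "-", name).lower()


def find_mismatches(mapping: Mapping[str, Sequence[str]]) -> dict:
    """Map each import name to the distributions whose names differ from it.

    `mapping` goes from top-level import name to the distributions that
    provide it. Differences in case or in -/_/. are ignored.
    """
    result = {}
    for import_name, dists in mapping.items():
        differing = sorted(
            {d for d in dists if normalize(d) != normalize(import_name)}
        )
        if differing:
            result[import_name] = differing
    return result


if __name__ == "__main__":
    # Requires Python 3.10+; reports only what is installed locally.
    from importlib.metadata import packages_distributions

    found = find_mismatches(packages_distributions())
    for import_name, dists in sorted(found.items()):
        print(f"import {import_name:<20} <- pip install {', '.join(dists)}")
```

Run in an environment with a reasonable number of packages installed, this should surface the same kind of pairs listed in this thread, including distributions that provide more than one top-level name.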

10 posts were split to a new topic: Determining top-level import package names programmatically

Per request of OP, as discussion had drifted to another related topic

Pyserial uses import serial

(Also mentioned previously; was one of my first thoughts as well :)
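For the reference table the original post asks for, the pairs stated explicitly in this thread could be collected into a small lookup. This is only a sketch: `INSTALL_HINTS` and `install_hint` are hypothetical names, the table is far from complete, and falling back to "same name as the import" for unknown packages is a guess, not a rule.

```python
# Import name -> distribution name(s), limited to pairs given in this thread.
INSTALL_HINTS = {
    "cv2": ["opencv-python"],
    "PIL": ["pillow"],
    "sklearn": ["scikit-learn"],
    "skimage": ["scikit-image"],
    "serial": ["pyserial"],
    "usb": ["pyusb"],
    "bs4": ["beautifulsoup4"],
    "dateutil": ["python-dateutil"],
    "debug_toolbar": ["django-debug-toolbar"],
    "OpenSSL": ["pyOpenSSL"],
    "Crypto": ["pycryptodome"],
    "Cryptodome": ["pycryptodomex"],
    # Forks that keep the import name: more than one candidate.
    "pygame": ["pygame", "pygame-ce"],
}


def install_hint(import_name: str) -> str:
    """Suggest a pip command for a (possibly dotted) import name."""
    top_level = import_name.split(".")[0]
    # Unknown names fall back to the import name itself (an assumption).
    dists = INSTALL_HINTS.get(top_level, [top_level])
    return " or ".join(f"pip install {d}" for d in dists)


print(install_hint("cv2"))
print(install_hint("PIL.Image"))
```

Keying the table by import name fits the troubleshooting direction: a user who hits a ModuleNotFoundError knows the import name and needs the install name.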