Requesting clarification for PEP 427/wheel file names and build numbers

If you read PEP 427’s file names spec, it says that the build number is a “tie breaker if two wheels have the same version”. But what is a “wheel” in this context? Do the wheel tags have to match, or is it any wheel for the same project with the same version?

For instance, for a project named spam at version 1.0, let’s assume you have the following wheels:

  • spam-1.0-cp39-cp39-manylinux.whl
  • spam-1.0-42-py3-none-any.whl

Would you put the second wheel ahead of the first thanks to it having a build number, or would you give the first wheel higher priority since it has the tighter wheel tags? Whatever the PEP’s intent turns out to be, I can update it based on this conversation.

For context, I’m asking in terms of how you would group and sort wheel files:

  1. List sorted by build number, grouped by wheel tags, e.g. [{'py3-none-any', ...}, {'cp39-cp39-manylinux', ...}, ...]
  2. Group by wheel tags and then sort each resulting list by build number, e.g. {'py3-none-any': [...], 'cp39-cp39-manylinux': [...], ...}
  3. You can’t really know the answer as it’s up to the installer, so don’t even try to group by either and just treat every build-number/wheel-tag pair as an equal but unique thing
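
To make option 2 concrete, here is a minimal sketch of grouping by wheel tags and sorting each group by build number. The helper names (`split_wheel`, `group_by_tags`) are made up for illustration, the file-name parsing is deliberately naive, and a missing build tag is assumed to be an empty tuple so that it compares lower than any explicit one.

```python
import re
from collections import defaultdict


def split_wheel(filename):
    """Return (tags, build) for a wheel file name (hypothetical helper)."""
    parts = filename[: -len(".whl")].split("-")
    build = ()  # assumption: no build tag compares lower than any explicit one
    if len(parts) == 6:  # name-version-build-python-abi-platform
        number, suffix = re.fullmatch(r"(\d+)(.*)", parts[2]).groups()
        build = (int(number), suffix)
    return "-".join(parts[-3:]), build


def group_by_tags(filenames):
    """Group wheels by their tag triple, highest build number first."""
    groups = defaultdict(list)
    for filename in filenames:
        tags, build = split_wheel(filename)
        groups[tags].append((build, filename))
    return {
        tags: [name for _, name in sorted(items, reverse=True)]
        for tags, items in groups.items()
    }


wheels = [
    "spam-1.0-cp39-cp39-manylinux.whl",
    "spam-1.0-42-py3-none-any.whl",
    "spam-1.0-py3-none-any.whl",
]
groups = group_by_tags(wheels)
```

With these three files, the `py3-none-any` group lists the build-42 wheel ahead of the one without a build tag, and the `cp39-cp39-manylinux` wheel sits alone in its own group.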

Is there actually a “tie” in your example if you consider what PEP-425 has to say on tag ordering?

Sometimes there will be more than one supported built distribution for a particular version of a package. For example, a packager could release a package tagged cp33-abi3-linux_x86_64 that contains an optional C extension and the same distribution tagged py3-none-any that does not. The index of the tag in the supported tags list breaks the tie, and the package with the C extension is installed in preference to the package without because that tag appears first in the list.

I always assumed that there’s some specific ordering to these tags, and then if there’s a tie, the one with the higher build number is used, i.e. the tags have the precedence over build number.

That’s certainly how I read things. I have no real-world examples of build numbers being used, to make a judgement from, so this is purely theoretical as far as I’m concerned, but I think that tag priority beating build number makes more sense.

TBH, I’d be completely OK with deprecating and ultimately removing build number. It feels like an interesting idea that when it came down to it, didn’t have any actual use cases.

Please do not remove it. It proved useful when things went wrong trying to release wheels for a stable project where new hardware or OS versions occur more frequently than the project wishes to bump the version. The project (cffi) ended up mistakenly creating broken wheels for macOS when trying to build new wheels for aarch64, and needed to somehow override them. Here is some of the discussion on the issue, and a Stack Overflow answer. Updating the version would have required needless churn for hundreds of dependent projects, as tools like dependabot and brew detect new releases.

Cool, that’s really useful to know. To bring this back on-topic, did you need the build number to be higher or lower than the wheel compatibility when installers chose the wheel to install, in order for your solution to work? Or does it not matter, because the way you build means that there’s no platform where you ship more than one compatible wheel?

For an additional data point (since it doesn’t seem to have been mentioned by anyone yet), pip sorts the build tag before platform tags, so spam-1.0-42-py3-none-any.whl is preferred over spam-1.0-cp39-cp39-manylinux.whl.

From a theoretical perspective (I have personally never had to resolve this kind of precedence either), I would want the logic to be the other way around, i.e. the build tag is a tie breaker only if the platform tags also match (i.e. I think pip’s current implementation is wrong). If a more generic wheel is superior and always preferred over more specifically-built ones, a new (post release) version is probably warranted, so the build tag can be used to only override one specific wheel without affecting any other wheels.

Thinking about this from another direction, any source change would likely need a version bump since sdists can’t have build tags. So a build tag can only be used for changes not related to the actual implementation, IOW packaging faults. Since metadata are tied to a specific distribution file in Python packaging (instead of to a version of distributions), a build tag should only raise the precedence over distributions that are otherwise tagged identically, not different distributions of the same version.
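
Here is a minimal sketch of the two orderings being compared, applied to the spam example from the first post. Everything in it is an assumption made for illustration: the two-entry `SUPPORTED` list, the helper names, and the naive file-name parsing. `build_first` only mirrors the behaviour attributed to pip above; it is not pip’s actual code.

```python
import re

# Hypothetical supported-tags list for the installing interpreter,
# most specific (most preferred) first.
SUPPORTED = ["cp39-cp39-manylinux", "py3-none-any"]


def parse(filename):
    """Return (tag rank, build tag) for a wheel file name."""
    parts = filename[: -len(".whl")].split("-")
    build = ()  # assumption: no build tag compares lower than any explicit one
    if len(parts) == 6:  # name-version-build-python-abi-platform
        number, suffix = re.fullmatch(r"(\d+)(.*)", parts[2]).groups()
        build = (int(number), suffix)
    return SUPPORTED.index("-".join(parts[-3:])), build


def build_first(filename):
    """Build tag outranks tag specificity (the ordering attributed to pip)."""
    rank, build = parse(filename)
    return (build, -rank)


def tags_first(filename):
    """Tag specificity outranks build tag (the ordering argued for here)."""
    rank, build = parse(filename)
    return (-rank, build)


wheels = ["spam-1.0-cp39-cp39-manylinux.whl", "spam-1.0-42-py3-none-any.whl"]
best_build_first = max(wheels, key=build_first)
best_tags_first = max(wheels, key=tags_first)
```

Under `build_first` the generic build-42 wheel wins; under `tags_first` the cp39 wheel wins, and the build tag would only matter between two wheels with identical tags.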

I’ll wait for @mattip to give a more definitive answer, but a quick glance at the thread suggests that build numbers tie-breaking would have worked as they just needed to override the wheel that macOS users would have received. IOW I think it was a way to do a yank at the wheel level instead of at the release version level (see CFFI 1.14.3 files for macOS and notice the 2 build number for e.g. cffi-1.14.3-2-cp39-cp39-macosx_10_9_x86_64.whl).

Yeah, I found that code but the reason why it was that way eluded me.

That’s what my brain keeps wanting as well. And if I am right with what the CFFI team would want, it would give them what they are after.

If I remember correctly, Brett accurately describes the situation: they wanted to override a single, faulty, platform-specific wheel without incrementing the version and rebuilding all 30+ wheels.

If a more generic wheel is superior and always preferred over more specifically-built ones, a new (post release) version is probably warranted

It is quite a reach, but what about the scenario where (after a few years) a stable-and-difficult-to-build package with platform-specific acceleration code loses access to a build platform, and an issue is reported that invalidates only that platform-specific wheel? Then it would be convenient to bump the build number for all the other platforms + non-accelerated generic package. Users of the special platform would be able to use the package and pip would install the non-accelerated generic version. This easily beats the option of having to build a whole new version of the package just to avoid the accelerated variant for the problematic platform.

The build number is like the release number in RPM. It is necessary because the person who does the build might not be the same person who released the versioned software, and they need to be able to correct a mistake. It should sort after the tags.

Installers decide on the version number separate from any wheel logic. Then the installer gets a list of all available wheels (and the source) for that version of the software.

Suppose a Python interpreter supports these wheel tags. A tag’s position in the list can be read as the distance between a wheel with that tag and the install target. We want the wheel that is closest to the current interpreter. The first tag means “compiled on this machine”. Later / farther tags mean “compatible, but not compiled for/on this specific interpreter+machine”. Wheels with tags not in the list are thrown out as incompatible.

[(0, 'py37-none-linux_x86_64'),
 (1, 'py3-none-linux_x86_64'),
 (2, 'py36-none-linux_x86_64'),
 (3, 'py35-none-linux_x86_64'),
 (4, 'py34-none-linux_x86_64'),
 (5, 'py33-none-linux_x86_64'),
 (6, 'py32-none-linux_x86_64'),
 (7, 'py31-none-linux_x86_64'),
 (8, 'py30-none-linux_x86_64'),
 (9, 'py37-none-any'),
 (10, 'py3-none-any'),
 (11, 'py36-none-any'),
 (12, 'py35-none-any'),
 (13, 'py34-none-any'),
 (14, 'py33-none-any'),
 (15, 'py32-none-any'),
 (16, 'py31-none-any'),
 (17, 'py30-none-any')]

90% of the time it is easy because there is only one compatible wheel.

Suppose we are installing beaglevote-1.0 that can be compiled without its extension modules. The installer finds

beaglevote-1.0-py37-none-linux_x86_64.whl
beaglevote-1.0-1-py37-none-linux_x86_64.whl
beaglevote-1.0-1a-py37-none-linux_x86_64.whl
beaglevote-1.0-py37-none-macosx_10_6_x86_64.whl
beaglevote-1.0-py3-none-any.whl

It throws out the macOS wheel. The remaining wheels are ranked as follows:

(0, (1, "a")) beaglevote-1.0-1a-py37-none-linux_x86_64.whl
(0, (1, "")) beaglevote-1.0-1-py37-none-linux_x86_64.whl
(0, (-1, "")) beaglevote-1.0-py37-none-linux_x86_64.whl
(10, (-1, "")) beaglevote-1.0-py3-none-any.whl

Sort by the smallest tag rank, then by the largest numeric and alphanumeric parts of the build tag. If no build tag is given, an implicit build number (-1, "") is used, which ranks below any explicit build tag.
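
The ranking above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `SUPPORTED` rebuilds the example tag list from this post, the helper names are made up, and the file-name parsing is naive (a real installer would use a proper wheel file-name parser).

```python
import re

# Rebuild the example supported-tags list: py37, py3, py36 ... py30,
# first for linux_x86_64 and then for any.
interpreters = ["py37", "py3"] + [f"py3{minor}" for minor in range(6, -1, -1)]
SUPPORTED = [
    f"{interp}-none-{plat}"
    for plat in ("linux_x86_64", "any")
    for interp in interpreters
]


def rank_and_build(filename):
    """Return (tag rank or None, build tag) for a wheel file name."""
    parts = filename[: -len(".whl")].split("-")
    build = (-1, "")  # implicit build number when no build tag is given
    if len(parts) == 6:  # name-version-build-python-abi-platform
        number, suffix = re.fullmatch(r"(\d+)(.*)", parts[2]).groups()
        build = (int(number), suffix)
    tags = "-".join(parts[-3:])
    rank = SUPPORTED.index(tags) if tags in SUPPORTED else None
    return rank, build


def rank_wheels(filenames):
    """Drop incompatible wheels, then order by tag rank and build tag."""
    scored = [(rank_and_build(name), name) for name in filenames]
    compatible = [(key, name) for key, name in scored if key[0] is not None]
    compatible.sort(key=lambda item: item[0][1], reverse=True)  # largest build first
    compatible.sort(key=lambda item: item[0][0])  # stable sort: smallest rank first
    return compatible


found = [
    "beaglevote-1.0-py37-none-linux_x86_64.whl",
    "beaglevote-1.0-1-py37-none-linux_x86_64.whl",
    "beaglevote-1.0-1a-py37-none-linux_x86_64.whl",
    "beaglevote-1.0-py37-none-macosx_10_6_x86_64.whl",
    "beaglevote-1.0-py3-none-any.whl",
]
ranked = rank_wheels(found)
```

The two-pass sort relies on Python’s sort being stable: ordering by build tag first and by tag rank second leaves the build tag acting purely as a tie breaker among wheels with the same rank.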

We have also discovered that you can have identically named wheels in your search path. Installers tend to prefer wheels earlier in the search path, e.g. an already downloaded wheel takes precedence over one from PyPI.

In this case, users on that special platform would no longer have a way to install that specific wheel except asking for it specifically (e.g. by URL), IOW functionally the same as yanking the wheel.

Ahh, right. My scenario would be covered by yanking the offending wheel. Thanks.

Thanks everyone for the clarifications! I will tweak PEP 427 slightly to make it clear that build numbers break ties when all other aspects of the wheel file name are the same.

I also think that my proposed grouping is going to change and only attach the wheel with the highest build number and leave the others out since they are purposefully no longer valid (e.g. Matti’s yank scenario).
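
Roughly what I have in mind, as a sketch only: keep a single wheel per set of tags, namely the one with the highest build number. The function names are placeholders rather than a real API, the parsing is naive, and a missing build tag is assumed to compare lower than any explicit one.

```python
import re


def tags_and_build(filename):
    """Return (tags, build) for a wheel file name (placeholder helper)."""
    parts = filename[: -len(".whl")].split("-")
    build = ()  # assumption: no build tag compares lower than any explicit one
    if len(parts) == 6:  # name-version-build-python-abi-platform
        number, suffix = re.fullmatch(r"(\d+)(.*)", parts[2]).groups()
        build = (int(number), suffix)
    return "-".join(parts[-3:]), build


def latest_builds(filenames):
    """Map each tag triple to the single wheel with the highest build tag."""
    best = {}
    for filename in filenames:
        tags, build = tags_and_build(filename)
        if tags not in best or build > best[tags][0]:
            best[tags] = (build, filename)
    return {tags: filename for tags, (_, filename) in best.items()}


latest = latest_builds([
    "spam-1.0-py3-none-any.whl",
    "spam-1.0-2-py3-none-any.whl",
    "spam-1.0-cp39-cp39-manylinux.whl",
])
```

Here the `py3-none-any` entry ends up pointing at the build-2 wheel, and the superseded wheel without a build tag is left out entirely.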

Someone should probably also raise an issue against pip for this (or check that there isn’t already one…)

I can do that after I fix the PEP.

PEP is updated and pip issue filed.

Aside: how did you get that neat issue template?!? Are you opted into some beta feature or is this just something you stumbled upon which has not been documented yet?

I believe we’ve opted into a beta feature on github. But I don’t know any details, I’m afraid.

Edit: Use new issue forms for bug reports by webknjaz · Pull Request #9550 · pypa/pip · GitHub
