I was thinking that this is a very particular use-case that should not be used in a standard workflow. A maintainer uses pip wheel --actually-build-two-wheels-this-time [--with-shim] my_project/ for the releases where they want to upload parallel wheels[1].
You wouldn’t use any of that machinery to install an sdist, as there’s no need to care about the format of your wheel in that situation.
Doesn’t alter the fact that this would require changing the PEP 517 interface, and that’s a massive disruption, which is what we’re trying to avoid here…
Check that installer is compatible with Wheel-Version. Warn if minor version is greater, abort if major version is greater.
If Root-Is-Purelib == ‘true’, unpack archive into purelib (site-packages).
Else unpack archive into platlib (site-packages).
Spread. (a.k.a. do the rest of the unpacking)
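For reference, the version check in the first step can be sketched roughly as follows. This is only an illustration of the rule quoted above, not how any particular installer implements it; `SUPPORTED_WHEEL_VERSION`, `UnsupportedWheel`, `parse_wheel_version`, and `check_compatibility` are names made up for the example.

```python
import warnings

# Assumption: the highest Wheel-Version this hypothetical installer understands.
SUPPORTED_WHEEL_VERSION = (1, 0)


class UnsupportedWheel(Exception):
    """Raised when the wheel's major version is newer than the installer supports."""


def parse_wheel_version(wheel_file_text: str) -> tuple[int, int]:
    """Extract (major, minor) from the text of a .dist-info/WHEEL file."""
    for line in wheel_file_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "wheel-version":
            parts = value.strip().split(".")
            major = int(parts[0])
            minor = int(parts[1]) if len(parts) > 1 else 0
            return major, minor
    raise ValueError("WHEEL file has no Wheel-Version field")


def check_compatibility(version, supported=SUPPORTED_WHEEL_VERSION):
    """Abort on a newer major version, warn on a newer minor version."""
    if version[0] > supported[0]:
        raise UnsupportedWheel(
            f"wheel version {version[0]}.{version[1]} is not supported"
        )
    if version > supported:
        warnings.warn(
            f"wheel version {version[0]}.{version[1]} is newer than "
            f"{supported[0]}.{supported[1]}; installing anyway"
        )


version = parse_wheel_version("Wheel-Version: 1.0\nRoot-Is-Purelib: true\n")
check_compatibility(version)  # same version: no warning, no error
```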
I don’t see anything in here that would prevent us already from increasing the version number in WHEEL in a way that causes the spreading phase to be different. If installers aren’t checking the version, they’re contrary to the spec. Perhaps they’re making assumptions that would be less performant under a new spreading model, but it wouldn’t be incorrect.
Are we really trying this hard to avoid following the design we already spec’d out? (and which installers are in theory already following)
The PEP goes over the major version check in the wheel spec, and even references PEP 427 where the current spec was defined. The point of putting Wheel-Version in the metadata is so resolvers can factor it into their decisions before installation time. The issue with leaving this up to install time is that if we allow people to publish new wheels (with the same extension) right away, older resolvers will pick those wheels and then the installer will error out at the installation phase. This means that we would have to choose one of three options:
1. Ban uploads for a long time (~5 years?), which would greatly slow adoption of new features and make testing difficult.
2. Allow uploads, accept the errors it will cause and let people figure out what to do. This will practically make people adopt the new wheels much slower because they won’t want to break users. I also predict it will be largely unpopular as it will cause many failures in people’s CI pipelines and the finger will get pointed at packaging as the cause.
3. Make new packages not selected by current resolvers, and require users to upgrade to access the latest packages. Notify the user if they are missing out on an update that requires upgrading pip.
To me (and based on the discussion so far, many others) it sounds like 3 is the most appealing, and it is what this PEP specifies.
I suspect the most popular and correct option would be to error out if the user is missing out on an update, so that they can pin/constrain their dependencies explicitly. Silently or quietly choosing an earlier version (or building from source) is worse, and can’t be overridden in the same way.[1]
Plus it requires users to update to an installer version that will properly resolve to an earlier version. They don’t do that right now.
And even if we achieve that, unless we batch up all the incompatible changes into the new suffix, there’ll just be a series of changes over time as we update the format, and these will essentially be guarded by the wheel version in exactly the same way as they already are.
I would propose to discourage uploads, and warn package maintainers that the new wheel features are new and shiny and not widely available, but there’s no need to ban them. Package maintainers are the best placed to decide when they’re ready to update, not us.
[1] Well, I guess you can add minimum version constraints to all your dependencies so that they won’t resolve to older wheels and will only get the new ones or sdists, but that’s going to melt minds worse than adding a constraints file of maximum versions.
Reading through the PEP and the thread, I understand the desire to separate “how can we create a transition process to a wheel 2.0” from “what should wheel 2.0 do”. But I’d say we do need to think about what we’d want to accomplish with later changes in wheel 2.0, because we need to be sure that the transition process we consider is general enough to allow such changes. The examples above are good ones. I would ask whether the transition process allows future wheel evolution that could eventually...
support totally externalized metadata (i.e., “everything you ever wanted to know about package X” is obtainable without ever downloading or installing the entire package)?
resolve the issues that crop up in various packaging threads of the form “we can’t do this because pip doesn’t have access to information X at the right time in the resolve/install process and/or can’t backtrack later”?
move away from storing any important information at all in the wheel filename (which seems to me to be obviously unsustainable)?
Conda, incidentally, already does this, although typically with hardlinks rather than symlinks. And the relevance to the current discussion is that (as usual) I would like to ask “if we do this, will it allow us to eventually solve all the problems like this that conda already solves”? (There are also some problems that conda doesn’t solve, of course, but I don’t see the point in heading down a particular road if it’s not even going to get us to where conda is today.)
I agree wholeheartedly with this and I have significant trepidation about it. I think the main way to do this is to make sure the disruption is really worth it in the sense that using the new format will bring tangible benefits to users. I don’t think we can know that without looking forward a bit from “how do we transition” to “what are we going to transition to”.
Wholeheartedly agree. I have a long list of improvements that are achievable on a variety of time scales. I made this PEP specify as little as possible about the contents of a .whlx file other than .dist-info/METADATA and that it is a zip file because I want the format to be flexible enough to support these improvements. Here are some of the potential changes I’ve been thinking over:
Nearer term:
information on top-level packages
introduce metadata.json
move compatibility tags out of filename
(related to previous) variants
stricter dist-info naming? (I need to look into this one)
Wheel-Feature (i.e. feature flags for wheels, instead of major version bumps)
Support for specifying dynamic library dependencies across wheels
Longer term:
zstd compression
remove .dist-info/METADATA
I’m probably forgetting a few as well, but these are the ones I have at top of mind.
If you have more information about these kinds of issues, I’d be interested in learning how changes to the wheel spec could improve the status quo.
I think that if wheel 2 arrives alongside variants, that may be enough to get many users interested. However, I don’t think the features of wheel 2 need to be specified in this PEP. Wheel 2 will hopefully be the first in a series of improvements that should address many user pain points.
Users typically may upgrade for other reasons, so they’ll get there at some point. It’s just that we shouldn’t break their usage just for the sake of having them upgrade.
Do you mean, influence for or against? If, as I think is likely, there’s no way to evolve the wheel standard beyond v1 without changing the extension, at least I hope we only have to change the extension once while still allowing for evolution in the future.
Isn’t this a point against zip, not for it, since it locks us into zip or we’re back to needing a new extension? I understand that there is an appeal to using an existing format rather than a bespoke format where the version is just at a fixed offset, but I don’t think this in particular is a point in favor of zip.
The over-arching goal of PEP 777 and wheel-next is to accomplish significant changes to wheels (such as compression, variants, and symlinks) while allowing these changes to be adopted iteratively. I don’t want to re-write everything about wheels into a new format completely, because I think that would take much more work than is needed. If you have a feature that is impossible with the current zip format, or think the current format has issues that couldn’t be fixed in a future PEP, I definitely would like to hear about that.
That being said, one advantage of zip files that just occurred to me is that we could include the wheel version as a binary prefix before the actual wheel contents. Since a zip file’s central directory entry is at the end of the file (see ZIP (file format) - Wikipedia), arbitrary bytes may be put at the front of the file (which is how things like self-extracting ZIP files work). So we could put a header in the front of the wheel file with relevant metadata in some binary encoding. That would side-step the need to read the zip at all and make resolvers not need range requests for performance purposes. I am a bit worried we will want to start stuffing more and more information in that prefix over time. But maybe that is a better tradeoff than forcing things to extract metadata in the zip file.
The downside of this of course is that any information in the prefix would be lost if someone unzips the file normally, as the zip only points to the first file or directory entry which starts after the binary prefix. If we were to do the binary prefix, I’d probably want the spec to require the contents match some human readable text files in the zip.
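To make the prefix idea concrete, here is a minimal sketch using only Python’s standard `zipfile` module. The header format (`HEADER`), the helper `build_prefixed_wheel`, and the `demo-1.0.dist-info` name are all invented for illustration; nothing here is specified by the PEP. It shows that the leading bytes can be read without touching the zip, that the archive still opens, and that the prefix can be checked against a human-readable file inside the zip.

```python
import io
import zipfile


def build_prefixed_wheel(prefix: bytes, files: dict[str, str]) -> bytes:
    """Build an in-memory zip and put arbitrary bytes in front of it."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, text in files.items():
            zf.writestr(name, text)
    return prefix + buf.getvalue()


# Hypothetical header; a real design might use a binary encoding instead.
HEADER = b"WHEEL-VERSION 2.0\n"
data = build_prefixed_wheel(
    HEADER, {"demo-1.0.dist-info/WHEEL": "Wheel-Version: 2.0\n"}
)

# A resolver could learn the version from the first few bytes alone.
header_version = data[: len(HEADER)].split()[1].decode()

# zipfile locates the central directory from the end of the file,
# so the archive is still readable despite the leading bytes.
with zipfile.ZipFile(io.BytesIO(data)) as zf:
    names = zf.namelist()
    wheel_text = zf.read("demo-1.0.dist-info/WHEEL").decode()

# The prefix is not one of the archive members, so extracting loses it;
# a spec could require it to match the text file inside the zip.
zip_version = wheel_text.split(":", 1)[1].strip()
assert header_version == zip_version
```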
Okay, I can see how re-writing the entire format would be more work than anyone wants right now, and I don’t think a custom format needs to be a full rewrite, but we can still do better than locking in zip for eternity (at least without asking for .whlx where x actually is a version number) while keeping handling simple now.
Prepending a version as a single byte at offset 0 (network order) is already supported by zip files, as you pointed out, but I would want the only assumption to be that the rest of the file is in whatever format that version specifies.
Any future wheel version that has a reason to change the inner archive format can do so: the version is the very first byte, and the version says what the remainder of the file is. In the future that might be .tar.zst, or maybe someone comes up with something even better than that, but we’re not boxing ourselves into zip forever.
While it’s entirely unlikely that we exhaust that many major (incompatible) versions, as long as it is addressed prior to reaching major version 255, there’s always a way to extend this because reading the first byte gives enough information for tools to check if they know what to do with anything else. 255 could mean “read next byte to determine what to do”.
Tools consuming a wheel read the first byte, and if it corresponds to a version they understand, use the file as specified for that version (in this case, as a zip file matching the current specification and current needs).
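A rough sketch of what I mean, with the caveat that the version numbers, the `HANDLERS` table, and the function names are all made up for illustration: the consumer looks at the first byte only, and the handler chosen for that version decides how to interpret everything after it.

```python
import io
import zipfile

# Reserved value, per the idea above: "read next byte to determine what to do".
ESCAPE = 255


def open_zip_payload(payload: bytes) -> zipfile.ZipFile:
    """Handler for versions whose payload is an ordinary zip archive."""
    return zipfile.ZipFile(io.BytesIO(payload))


# Hypothetical mapping: major version 2 means "the rest is a zip file".
HANDLERS = {2: open_zip_payload}


def read_wheel(data: bytes):
    """Dispatch on the leading version byte; assume nothing else about the file."""
    if not data:
        raise ValueError("empty file")
    version = data[0]
    if version == ESCAPE:
        raise NotImplementedError("extended version encoding is not defined yet")
    handler = HANDLERS.get(version)
    if handler is None:
        raise ValueError(f"unsupported wheel major version {version}")
    return handler(data[1:])


# Build a tiny zip payload and put the version byte in front of it.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("demo-1.0.dist-info/WHEEL", "Wheel-Version: 2.0\n")
wheel_bytes = bytes([2]) + buf.getvalue()

with read_wheel(wheel_bytes) as archive:
    members = archive.namelist()
```

A tool that only knows version 2 rejects anything else up front, without needing to guess what the remaining bytes are.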
I was thinking this should not replace the .dist-info files and should only be an implementation detail for tools that create and consume wheels; I wouldn’t suggest that unpacking stop yielding reasonably understandable data. This also means that currently compliant tooling would still be able to read the .dist-info files and see that the wheel version is higher than it understands, so there should never be a single major version upgrade where a tool can’t tell that a file may simply be a valid wheel it doesn’t understand. (Going from the current status quo straight to some future non-zip format, however, would be too far. In fact, the only tools unable to detect a version mismatch should be those that don’t understand the next version and are handed some further future version that may no longer be a zip file.)
If a version in the future wants to put enough data at fixed offsets that it becomes a real concern rather than 1 byte extra, we can ensure robust tools are available to reconstruct human parsable files as a prerequisite of that.
What’s the specific need for changing the extension?
Could we not just emit the metadata version in a way that’s compatible with both versions, but set it to some value that indicates the data will adhere to the wheel 2 format?
I will note that I routinely use unzip -p to display the content of the metadata file in a wheel. Switching from zip format would require a dedicated tool, which would be significantly less convenient for the sort of quick initial debugging I’m thinking about.
Unless there’s a really good reason for dropping zip format, I’m -1. Arguments that “it might be more flexible in the future” feel too theoretical for me.
So that older tools, which fail on seeing an incompatible version, ignore new style wheels. This is discussed in the PEP, so I suggest you read the details there if you haven’t already.
Technically yes, but please don’t. We already have two metadata files that can be found within the zip and that seem unlikely to go anywhere, so let’s just use either of them. All we’d gain from this idea is the ability to switch away from a ZIP file entirely, but then the prefix may prevent us from choosing other formats.
The downsides of being a non-standard ZIP file are significant, and will inhibit broad adoption way more than any of the other ideas.
I’m sure it was discussed somewhere, but if we really want installers to automatically select an old version because the installer is too old to install the current version, then surely adding a wheel-requires data field to the index is more in line with how we do the same thing for Python versions? (I’m -1 on both ideas, to be clear, because I think automatically installing an old version when you could’ve just updated pip is a terrible and potentially dangerous experience.)
I read the main section regarding resolution, which seems to indicate that the main need for the change is that the wheel version will be added to the metadata and existing tools won’t use it.
But existing tools already fail if the metadata version is bumped up to a new major version,
so could we not just do that?
The aim is to avoid those failures, so that if your installer is old, you will get whatever is the normal behaviour of your installer for when no wheels are available (usually, that’s to build from sdist, but you can often customise it to select an older version instead).
And existing tools need to download the wheel before they learn the metadata version, so this change would save a lot of time for them. Again this is in the PEP.
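As a toy illustration of the point about the extension (the file names and the `usable_files` helper are hypothetical, and real installers do considerably more than a suffix check): an old installer that only recognises `.whl` and sdists never considers the new-style file at all, so it falls back to its usual behaviour instead of downloading a wheel and failing on it.

```python
# Assumption: suffixes each hypothetical installer generation recognises.
KNOWN_SUFFIXES_OLD = (".whl", ".tar.gz")
KNOWN_SUFFIXES_NEW = (".whl", ".whlx", ".tar.gz")


def usable_files(index_files, known_suffixes):
    """Keep only the files whose extension the installer recognises."""
    return [name for name in index_files if name.endswith(known_suffixes)]


# Hypothetical index listing for a project with a new-style 2.0 release.
index_files = [
    "demo-2.0-py3-none-any.whlx",
    "demo-2.0.tar.gz",
    "demo-1.0-py3-none-any.whl",
]

# The old installer sees only the sdist and the older wheel.
old_view = usable_files(index_files, KNOWN_SUFFIXES_OLD)
# An upgraded installer sees everything, including the new-style wheel.
new_view = usable_files(index_files, KNOWN_SUFFIXES_NEW)
```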