While talking with people about a wheel 2.0 design, it became very clear that before we could talk about what a wheel 2.0 could look like, we needed to talk about how to get there (beyond just incrementing the wheel major version number!).
This PEP defines a path to making wheel evolution easier, so that future PEPs can focus on the changes to the format and not get bogged down by details of how to deploy the update.
My hope is that once we have a compatibility story, we can move forward with discussions about what a wheel 2.0 should look like. If you're interested in discussing that, come join us at the wheel-next ideas repo or in the #wheel-next channel on Discord!
An immediate but very minor nit. If we're going to require a new wheel file extension, and it's not going to be 3 characters anyway, why not just use .wheel?
I'm still digesting the actually interesting parts of the PEP.
The x suffix in whlx is intended to evoke an advancement of the whl format, but I have no attachment to the naming. I figured that particular detail will get bikeshed here a fair bit. I'm fine with wheel, but for the purposes of the main content of the PEP I don't think it really matters.
One thing not covered in the PEP is why not store the major version in the file extension, e.g., whl2?
Actually, using .whlx threw me initially as I thought it was a placeholder for the major version digit, so I would either make that explicit or just make the switch to .wheel now in case anyone else gets confused.
I don't want to reveal any spoilers, but the plan is that there is an extension mechanism such that you won't need to rev the major version number of the wheel spec in a backward-incompatible way again.
Yeah, I definitely need to add this to rejected ideas. I have another draft PEP (that Barry alluded to) that I hope to polish soon that would introduce feature flags to wheels (with similar semantics as a major version bump, but allowing for clearer communication of intent). I think feature flags better encode the idea behind some changes, but others definitely seem like a real major version bump.
I think there are three issues with using whl2:
You need to encode the major version in the wheel name going forward, otherwise you'd have the confusing situation of a wheel of major version 3 named whl2.
Part of the brittleness of the current wheel spec comes from encoding so much information into the filename. Filenames aren't well suited for storing complicated structured information. I hope with wheel 2, we can have a wheel format that encodes not much more than the name and version of the distribution. So putting the wheel version into the name goes against this goal.
It becomes a lot harder to define "what is a wheel?" and it requires tools to adapt to every new wheel major version. If I'm making a Windows file association for wheels, how many versions do I register? How forwards compatible is that?
I'll jump on the bike shed early. Please let's pick an extension that's not pronounced "wheel", which is how everyone I've talked with pronounces "whl". "Did you mean a '.wheel' file or a '.whl' file?" sounds like confusion waiting to happen.
It would stay .wheel or .whlx or whatever we bikeshed going forward. I will be explicit about this.
Thank you!
I chose this invariant because tools will need to read .dist-info/METADATA or .dist-info/WHEEL to be able to tell what the wheel major version is and if they can install a file on disk. Unless we go with .whl2, whl3, etc., this will need to continue to work for all future versions of the wheel specification. I should probably clarify the rationale for this in the PEP.
I can understand that this mechanism needs to be invariant moving forward. If there's any reason at all to switch to something else, it would need to be now, while changing the extension.
That's not to say that it should change; the only other option I can think of is a tarball, and that doesn't seem obviously better.
Maybe a tar (with metadata files at the beginning of the archive if reading some files is desirable without having to read the whole archive) combined with a stream compression algorithm like zstd? I have no idea though how much reduction in file size this would actually give for real world packages compared to zip.
Wouldn't this be the perfect time to switch to .dist-info/METADATA.json? Since it has a different extension, an installer needs to know about the extension to read it, so might as well change now. Though a METADATA file could/would be required as well for a while for extraction into site-packages. Maybe that could be Python version specific?
Agreed the change would need to happen now. I don't think we should change it, however, for a few reasons:
A future wheel version could provide better compression by putting non-metadata files into a .tar.zstd or some other compressed tar file and require installers to decompress that in some way. The metadata would be accessible exactly as in past versions, but large shared libraries or other content could be compressed significantly. The outer compression format does not need to change to take advantage of compression.
I don't think it's a good idea to boil the ocean on the format. We could make something completely different from a wheel, but that would require significantly more work for tools and a much more involved migration. Unless there is some reason an outer zip file is a problem (see the next point to the contrary), I don't think it makes sense to change things.
Zip files have some nice features tar files don't, such as random access. pip and uv both use this to do HTTP range requests when supported if an index doesn't serve the metadata file, and this wouldn't be possible with an outer tar file.
I'll include these points in a rejected idea about changing the outer wheel format.
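The random access point can be illustrated with a rough sketch using only the standard library. It builds an in-memory zip with one large member and a small METADATA file, then counts how many bytes zipfile actually reads to extract the metadata. CountingReader and the file names are made up for the demo; a range-request-backed file object would serve the same seek-and-read pattern over HTTP, and this is not a description of how pip or uv implement it.

```python
import io
import zipfile


class CountingReader(io.BytesIO):
    """BytesIO that records how many bytes were returned by read()."""

    def __init__(self, data: bytes):
        super().__init__(data)
        self.bytes_read = 0

    def read(self, size=-1):
        chunk = super().read(size)
        self.bytes_read += len(chunk)
        return chunk


def build_archive() -> bytes:
    """Create a zip with a 1 MB stored payload and a small metadata file."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.writestr("demo/big_library.so", b"\0" * 1_000_000)
        zf.writestr(
            "demo-1.0.dist-info/METADATA",
            "Metadata-Version: 2.1\nName: demo\nVersion: 1.0\n",
        )
    return buf.getvalue()


def read_metadata_only(data: bytes) -> tuple[str, int]:
    """Return the METADATA text and the number of bytes read to get it."""
    reader = CountingReader(data)
    with zipfile.ZipFile(reader) as zf:
        # zipfile seeks to the central directory at the end of the archive,
        # then jumps straight to the requested member.
        text = zf.read("demo-1.0.dist-info/METADATA").decode("utf-8")
    return text, reader.bytes_read
```

Calling read_metadata_only(build_archive()) returns the metadata after touching only a small fraction of the archive, because the large member is never read. A plain tar stream has no index at the end, so a reader generally has to walk the headers in order to find a given member.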
I think that is a topic that would best be put in a wheel 2.0 PEP specifying changes to the file format, not this PEP that specifies how to change the file format in such a PEP. When I do write up the 2.0 format spec, I plan on including a metadata.json file.
Sorry for triggering a big bike shedding argument straight off, but I agree, the rest seems good.
One substantive question I have is around the other places core metadata is stored. Would metadata in sdists and on disk in installed distributions be expected to omit the wheel version, or will it be optional but meaningless in those places? This PEP will need to more formally define the new metadata item (in the same sort of format as the existing definitions - for reference, "Dynamic" is an example of an existing item that is only meaningful in one file format).
I was expecting the new extension to be bikeshed, so no worries. Glad you like the rest! Would you be content with .whlx if I added a section going over some of the mentioned alternatives in rejected ideas and clarified that x does not mean the major version when introducing .whlx?
My thinking on this is that it should only be allowed in wheels, served from an index via PEP 658 (when pulled from a wheel), or potentially on disk in the installed directory. I'm not as sure about the last one as the other two. It's not a big ask for installers to just strip it out at install time, but maybe someone will want to inspect the information? I don't think there's a reason not to let it be installed into .dist-info/METADATA, so I think I would err on the side of not making the installation process more complicated.
FWIW I would personally avoid saying that a field MUST NOT appear in another context, but only that it MUST NOT be used to change the interpretation of that format, if found.
If you say MUST NOT, then any tool that wants to validate will need to enforce that rule even if it makes no difference to the operation of that tool. Ignoring extraneous metadata is a simple, forward-compatible default.
My main dislike of the x is that it feels reminiscent of its use in .docx and .xlsx to mean "extended version", and in Windows SuchAndSuchEx APIs with the same meaning. Because it's common in Microsoft products, I have a vague feeling that it's some sort of "corporate over-engineering". It's also a dead end, in that if we ever need to do this again, .whlxx just feels silly.
I can certainly live with it, but my main complaint is why not use a readable extension like .wheel? @ericvsmith mentioned the potential for confusion when speaking because .whl and .wheel could be pronounced the same, and I guess that's a fair point, but I hope we don't all end up referring to "Wheel-X" files, so I think verbal distinction is just something we'll need to sort out as we go along ("New wheel" works just fine for me...)
It is bikeshedding, though, and if you say the PEP's going to choose .whlx, then that's your right as the author. I appreciate you taking the question seriously, but I'm not going to make a fuss about it.
My feeling is:
It should be prohibited in sdists.
It should be mandatory in (new) wheels.
PEP 658 metadata files have to match what's in the file itself - the PEP says:
The metadata must only be served for standards-compliant distributions such as wheels [wheel] and sdists [sdist], and must be identical to the distribution's canonical metadata file, such as a wheel's METADATA file in the .dist-info directory [dist-info].
The hard one is installed distributions. I really don't want to add complexity to the process of installing a wheel - at the moment, it's "unpack and copy a bunch of files". If we require modifying the metadata, that means that file needs to be rewritten, and the RECORD file needs modifying to correct the size and hash of the METADATA file. And I bet we'll end up with mistakes being made resulting in installations where RECORD wasn't corrected.
Overall, I think we should require that installing a distribution from a wheel must continue to copy METADATA and RECORD unchanged. So the wheel version metadata may be present in an installed distribution. However, while there's no standard saying how to install a package from anything other than a wheel, there's nothing prohibiting a user doing that manually. So I think we have to say that the wheel version metadata is optional when a package was not installed from a wheel.
I wonder how distributions will view this? I believe they create their distro packages by building and installing wheels into an isolated area, and then repackaging that into a distro-specific format. I could interpret that as being a case of not installing from a wheel, although I doubt anyone would actually care.
Long story short - IMO for installed packages the wheel version metadata should be optional, but the spec for installing from a new-style wheel should explicitly state that METADATA (and its RECORD entry) must be copied unchanged (so that the wheel version is always present for packages installed from a wheel).
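For anyone wondering why rewriting METADATA at install time is more than a one-line change, here is a small sketch of the RECORD bookkeeping involved. It uses the existing RECORD row layout (path, sha256= followed by the urlsafe base64 digest without padding, then size). The field name Example-Field and the "strip it at install time" step are purely hypothetical stand-ins, since the actual name of the new metadata item is up to the PEP.

```python
import base64
import hashlib


def record_entry(path: str, content: bytes) -> str:
    """Build one RECORD row: path, sha256=<urlsafe b64, no padding>, size."""
    digest = hashlib.sha256(content).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"{path},sha256={encoded},{len(content)}"


METADATA_PATH = "demo-1.0.dist-info/METADATA"

# METADATA as shipped in the wheel. "Example-Field" is a placeholder name.
original = b"Metadata-Version: 2.1\nName: demo\nVersion: 1.0\nExample-Field: 1\n"

# Hypothetical installer behaviour: drop the field while installing.
stripped = original.replace(b"Example-Field: 1\n", b"")

before = record_entry(METADATA_PATH, original)
after = record_entry(METADATA_PATH, stripped)

# Both the hash and the size change, so the installer would also have to
# rewrite the METADATA row in RECORD, not just the METADATA file itself.
print(before)
print(after)
```

Copying both files unchanged avoids this step entirely, which is the simpler rule argued for above.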