This got me thinking some more about PEP 825 and its role as a specification of the data model underlying wheel variants. I see three key ways in which variant data needs to be handled in a data model:
1. Recording the compatible variant values for a wheel.
2. Recording the priorities to be used when selecting a wheel for an environment.
3. Recording the supported variant values for an environment.
Item (3) is important when considering cross-environment installs, as in the use case @cjames23 described above. It’s not so much about the UI for cross installs, but more about having a format with which we can persist the description of an environment for use in such a UI.
I’ll discuss the 3 items in turn.
Wheel variant values
PEP 825 supports this pretty well right now. The description of the wheel filename format, labels, and the variant.json file format cover this very effectively. The one qualification I’d make is that for this part of the spec, the default-priorities part of variant.json is not required. See below under “Variant ordering” for a discussion of this data.
For this set of data, the index-level variants.json files are a pure optimisation. When interpreting the label of a wheel served from an index, you can read the index-level variants.json file and use that to interpret the label, rather than having to read and unpack the wheel itself. There’s no need to consider multiple indexes, as everything here is operating on a single wheel from a single index.
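As a rough sketch of what that optimisation amounts to: the lookup is just "label in, variant properties out", with no wheel download involved. The JSON shape below is a simplified assumption for illustration, not the schema from the PEP, and the label, namespace and feature names (`gpu12`, `gpu`, `api_version`) and the `properties_for_label` helper are made up.

```python
import json

# Hypothetical, simplified index-level data. The real schema is whatever
# the PEP specifies; this only illustrates the label -> properties mapping.
INDEX_VARIANTS_JSON = """
{
  "variants": {
    "gpu12": {"gpu": {"api_version": ["12"]}},
    "plain": {}
  }
}
"""


def properties_for_label(index_data: dict, label: str) -> set[tuple[str, str, str]]:
    """Return the (namespace, feature, value) triples recorded for a label."""
    variants = index_data["variants"]
    if label not in variants:
        raise KeyError(f"label {label!r} is not described by this index file")
    return {
        (namespace, feature, value)
        for namespace, features in variants[label].items()
        for feature, values in features.items()
        for value in values
    }


index_data = json.loads(INDEX_VARIANTS_JSON)
gpu_properties = properties_for_label(index_data, "gpu12")
plain_properties = properties_for_label(index_data, "plain")
```

Nothing here needs to know about other wheels or other indexes, which is the point: this part of the data model is purely per-wheel.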
Variant ordering
This is less clear, IMO. The problem is that ordering of variant values is not a wheel-level property. Individual wheels have no concept of “other wheels” - in all existing standardised contexts, a wheel can be viewed in isolation, with no need to know where it came from, or what other wheels might exist.
I have no problem with the PEP’s approach of having a default-priorities table to define ordering, or of allowing tools to define a UI to override the provided defaults, but I do have a problem with publishing that data at the wheel level. It simply isn’t wheel-level information. At the most fundamental level, ordering data is a resolution-level data item, but in the absence of a standardised idea of a resolver, the best we can do is to say that it’s a “group of wheels” level item.
The obvious groups of wheels that exist are indexes, and (not standardised) directories full of wheels (pip’s --find-links option). IMO, ordering data should be standardised at that level, and should not be present in individual wheels. That still leaves us with a need to define how we order wheels that come from different groups (indexes or other sources of ordering data), but at least the problem is clear now - we can’t know which of two wheels is “better” unless we have a single ordering relation, so we have to merge the data from the indexes the two wheels come from to obtain that ordering relation.
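To make the merging problem concrete, here is a minimal sketch. It assumes each source of ordering data boils down to a list of names in priority order, and it uses one possible merge policy (earlier sources win, later sources only contribute names not already seen). That policy is my assumption for illustration, not something the PEP defines, and the names are invented.

```python
from itertools import combinations


def merge_orderings(*orderings: list[str]) -> list[str]:
    """Merge priority lists; earlier sources take precedence (assumed policy)."""
    merged: list[str] = []
    for ordering in orderings:
        for name in ordering:
            if name not in merged:
                merged.append(name)
    return merged


def conflicting_pairs(first: list[str], second: list[str]) -> list[tuple[str, str]]:
    """Pairs that both sources rank, but in opposite relative order."""
    common = [name for name in first if name in second]
    return [
        (higher, lower)
        for higher, lower in combinations(common, 2)
        if second.index(higher) > second.index(lower)
    ]


# Hypothetical ordering data published by two different indexes.
index_a = ["gpu", "cpu_features"]
index_b = ["cpu_features", "blas"]

merged = merge_orderings(index_a, index_b)
no_conflicts = conflicting_pairs(index_a, index_b)
real_conflict = conflicting_pairs(["gpu", "cpu_features"], ["cpu_features", "gpu"])
```

The second helper is the interesting one: when two sources disagree about the relative order of the same pair, some rule has to pick a winner, and that rule is exactly what would need standardising.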
Additionally, by making ordering data a standalone concept applied to a group of wheels, it becomes possible for installers to support overrides by simply having a --variant-ordering flag that takes a file of ordering data in the standard format. I don’t know if there are use cases for that much configurability, but the fact that it’s possible can’t be a bad thing. It is worth noting that for something like this to work, it would be necessary to be able to store ordering data independently from the mapping between labels and variant properties. I don’t think that’s a bad thing, though - storing them in the same file is (as I’ve noted) an artificial association, not justified by the underlying data model.
Environment information
This one currently isn’t covered by PEP 825 at all, and IMO it needs to be. We should define a file format for storing the variant values supported by an environment. That format could be nothing more than the same variant.json format that we use for wheel compatibility (with or without the ordering data; I don’t have a good feel for whether that’s needed), but the key here is that we state explicitly that it’s the standard format for storing an “environment description”.
Having a standard file allows for “offline capture” of an environment, as well as manual creation of a description file in cases where the target environment is so locked down that normal discovery methods aren’t permitted. It also simplifies UI decisions for tools - it’s far easier to support a single --target-env-file option than to need a complicated array of individual flags.
Once we have an environment description format, future PEPs describing discovery of variant properties for an environment (e.g., plugins) can be described simply in terms of adding data to that file format, and the specification of the variant selection process can be described in terms of reading the data from that file format. A lot of interdependencies go away if we have a well-defined interoperability data structure.
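Here is a sketch of how small the consumer side becomes once such a file exists. Everything in it is an assumption for illustration: the `supported` key, the nested namespace/feature/values layout, the file name, and the rule that a wheel is compatible when all of its properties appear in the environment description.

```python
import json
import tempfile
from pathlib import Path

Property = tuple[str, str, str]  # (namespace, feature, value)


def flatten(table: dict) -> set[Property]:
    """Turn {namespace: {feature: [values]}} into a set of triples."""
    return {
        (namespace, feature, value)
        for namespace, features in table.items()
        for feature, values in features.items()
        for value in values
    }


def load_environment(path: Path) -> set[Property]:
    """Read a (hypothetical) environment description file."""
    data = json.loads(path.read_text(encoding="utf-8"))
    return flatten(data["supported"])


def is_compatible(wheel_properties: set[Property], environment: set[Property]) -> bool:
    """Assumed rule: every property the wheel declares must be supported."""
    return wheel_properties <= environment


# "Offline capture": write the description once, read it back elsewhere.
description = {"supported": {"gpu": {"api_version": ["11", "12"]}}}
with tempfile.TemporaryDirectory() as tmp:
    env_file = Path(tmp) / "target-env.json"
    env_file.write_text(json.dumps(description), encoding="utf-8")
    environment = load_environment(env_file)

gpu_wheel = flatten({"gpu": {"api_version": ["12"]}})
newer_wheel = flatten({"gpu": {"api_version": ["13"]}})
plain_wheel = flatten({})
```

Discovery (plugins or otherwise) would only ever need to produce that file, and selection would only ever need to read it. Neither side has to know how the other works.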
Disclaimer
I want to be clear on what my intention is when posting this. I am juggling various roles (PEP delegate, pip maintainer, interested community member) when posting here, and I want to be sure people don’t misinterpret what I’m saying.
This should be read as a post by an “interested community member”. It’s a possible redesign of how PEP 825 could present the variant data format(s) that feels significantly cleaner to me, and I want to offer it as a suggestion. As a pip maintainer, I doubt it will significantly affect how pip implements variants.
As PEP delegate, I obviously have a liking for this framing of PEP 825, but I’m not saying “rework the PEP this way or I won’t accept it”. To be honest, I do think I’d find it easier to accept a PEP that was written from this perspective, but that’s mostly because I see it as clearer than the current version of the PEP. At a minimum, I’d like the PEP authors to consider whether this approach would work for them.