Implementation variants: rehashing and refocusing

One problem here, which I’ve alluded to before, is when neither A nor B depends on the other, but the semantics of a particular variant set require that both A and B use the same variant value. If A is already installed, pip install B has no awareness of the existence of A, and therefore cannot know which variant of A is installed. The numpy/scipy relationship is almost like this; it’s just that because scipy depends on numpy, pip will see both in the “normal” cases. You can generate the issue artificially, though: if someone does pip install scipy followed by pip uninstall numpy[1], maybe by accident, then a subsequent attempt to fix things by running pip install numpy would require pip to look at the scipy already in the environment, and the existing resolution process won’t discover that need.

Of course, pip could include all installed packages in its solution constraints, rather than what it currently does (which is to only include the specified packages and their dependencies, recursively). But that would potentially make the resolution process significantly more complex (and hence slower) when installing into an environment with many packages installed, because the dependency graph is significantly larger (and resolution algorithms are exponential in the size of the dependency graph[2]).

If we need to consider everything that’s installed, then we’ll have to accept the impact that will have. But let’s make sure the benefit is worth the cost. At the moment, that’s not obvious to me.


  1. Which will work - neither pip nor uv blocks uninstalls that break the environment ↩︎

  2. Technically, the resolution problem is NP-hard ↩︎

This post has a high risk of sidetracking things. I think it is relevant to Paul’s point, so I’m putting it up, but my point here is to discuss what is necessary for a sane solve, not to nitpick or put down existing approaches.

This has always seemed like a risk to me. In my work assembling environments for production purposes, I have taken to always specifying all of my existing packages as well as any new specs. The pattern ends up looking like this (see the sketch after the list):

  • build a bunch of wheels
  • download them into one place
  • run pip download, providing all of my built wheels as specs, to satisfy a coherent set of dependencies
  • provide the wheels and the downloaded dependencies, as a whole, to one install command
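
A minimal sketch of that pattern, assuming the wheels have already been built into a built-wheels directory (the directory names are illustrative):

import subprocess
from pathlib import Path

built = sorted(str(path) for path in Path("built-wheels").glob("*.whl"))

# Resolve one coherent dependency set for all of the built wheels at once.
subprocess.run(["pip", "download", "--dest", "deps", *built], check=True)

# Install the wheels plus the downloaded dependencies in a single command,
# without going back to the index.
subprocess.run(
    ["pip", "install", "--no-index",
     "--find-links", "built-wheels", "--find-links", "deps", *built],
    check=True,
)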

If I get complacent and split my installs across more than one command, I can easily end up with some package in my environment upgraded past the constraints listed in another package. This frequently happens when assembling the test environment on top of the runtime environment. That feels very counter-intuitive to me. If something is in my environment, I expect its dependency requirements to continue to be met.

I suspect that this was the end result of a long discussion elsewhere, and I’ll gladly defer to any decision made there, especially since it is established behavior now. I’d certainly appreciate a pointer to that discussion, though. I suspect that my inclination for whole-environment solutions from my conda-based upbringing will bias me strongly here, but I’ll work to keep an open mind.

One thing that might make it easier for variant handling is that the number of packages with variants, and the number of variant values, is likely very small compared to the number of packages in an environment, so the cost of considering them as a whole shouldn’t be as bad as considering the whole-environment package dependency graph.

3 Likes

FWIW, there’s an open issue about pip “fixing” its behaviour and enforcing consistent environments: Take installed packages into account when upgrading another package · Issue #9094 · pypa/pip · GitHub

It is technically feasible for pip’s resolver to take existing packages into account when doing upgrades. IIRC, I’d pushed back on implementing it at the time since I didn’t trust the metadata in the ecosystem and we didn’t have any mechanism for ignoring bad metadata within the resolver (we still don’t, but both can be fixed/changed). There was also limited time available in the grant-funded work, and we couldn’t have resolved this in time.

I do agree that we should avoid going into more detail about pip’s resolver behaviours here – if someone wants to do that, let’s do that in a separate place/topic/whatever.

3 Likes

I wouldn’t see any of that intelligence living in installers; they’d keep doing what they’re explicitly told to do. In the absence of more specific requests in the given requirements set (including the already installed variants of packages in that set), they’d continue to grab the default variants of any dependencies (which would presumably continue to emphasise portability over performance).

Variant selectors would instead live at a similar level to locking tools like PDM, poetry and pipenv: given a vaguely specified list of dependencies, they would turn it into a concrete list of specific variants and versions.

Most likely, I would see such selectors being defined by the projects that actually offer non-trivial numbers of different build variants, like numpy, pytorch, tensorflow, etc. I’m not even sure they would need a standardised API - they could just be regular Python libraries that accept a system description (in a format they define) and return a ranked list of variant recommendations for that selector. They would also likely need to offer a way to build system descriptions from actual systems, as well as to check that a given system matches a particular system description (à la environment markers).

Locking tools that wanted to manage particular selectors more optimally could then integrate these libraries as they chose.

Maybe a plausible selector API standardisation proposal would then emerge over time, but I don’t believe one needs to be defined up front for the idea of build variants to be useful.

It’s just namespacing that avoids claiming all possible future names for variant categories as environment marker terms, and allows existing environment marker terms to be used as variant category names (given the way some packages work, platforms effectively are build variants, since their dependency metadata changes by platform).

If both numpy and scipy used the same selector package, then that package could check whether either of the target packages is installed, figure out which variant is in use, and emit only that variant. If neither is installed, then it could emit all variants. Or users could be explicit about the variant they want/need.
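
A hypothetical sketch of that shared selector logic (the function name, the variant labels, and the idea of reading the installed variant from a metadata field are all assumptions):

from importlib import metadata

ALL_VARIANTS = ["mkl", "openblas"]  # invented variant names

def preferred_variants():
    for name in ("numpy", "scipy"):
        try:
            dist = metadata.distribution(name)
        except metadata.PackageNotFoundError:
            continue
        installed = dist.metadata.get("Provides-Variant")  # hypothetical field
        if installed:
            return [installed]  # one target is installed: emit only its variant
    return ALL_VARIANTS  # neither is installed: emit all variants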

However, as you point out, there are many ways to get into this situation already. We should keep it in mind, but I don’t want us to block making progress on solving the variant challenge based on this behavior. I don’t think variants would make the situation appreciably worse for this case and, as Michael points out, there are procedural workarounds and best practices to avoid it.

OK, I think you’re looking at variants as something that users specify, and I’m looking at them as a way for installers to choose binaries compatible with a given host/environment. Those are different cases. I agree sometimes users need to be able to specify the variant, and if our first iteration relies on that it will still be progress. My ultimate goal, though, is to have the hardware detection information provided automatically, by having the target package indicate which selector package can be used to feed information into the file selection process after a dependency is resolved.

I’m having trouble understanding what that means. I think you’re saying the workflow for managing variants exists outside of the installer?

I’m looking for pip install torch to do the “right” thing for an environment. And maybe pip install --variant accelerator=cuda torch as a way for the user to optionally express that they want to override some selector package information. What workflow are you expecting?

Yes, that’s the model for delivering them that I have in my head, too.

I don’t anticipate lock files being a prerequisite for using selectors. Selectors should integrate with locking semantics, but users shouldn’t be required to use lock files to get the right binaries in the simple case.

OK, that makes sense. I can go along with it, but I think we could live without it, too.

1 Like

What are “the right binaries” though? If the user is trying to build a portable environment and the installer tightly couples it to the build hardware, then the installer has done the wrong thing.

Hence my suggestion that we view this as a two level problem:

  • install packages that will work in the current environment while meeting the specified requirements (baseline installer capability)
  • install packages that meet a given definition of “best” (optional enhanced installer capability)

The parallel with lock files is that while lock files are recommended for many use cases, you can build a compliant Python package installer without supporting locking. My suggestion is that we approach optimising installers the same way: you can build a compliant installer that treats build variants as opaque strings and does exactly what it is told to do (grabbing default variants otherwise), but you should also be able to build a “smart” installer that understands the semantics of the build variants defined by key packages and can thus choose appropriate build variants (other than the defaults) when given a desired outcome like “portability” or “speed on this hardware” or “low memory use”.

We’ve had locking tools for years, and we’re only just starting to get a sufficient grip on the problem space to see a potential path to standardising it well enough for it to become a feature that pip offers (it’s not there yet, but Brett’s latest investigation into the idea is looking promising). If locking support had been made an a priori requirement for picking a baseline installer, we’d potentially be a decade behind where we currently are.

Optimising variant selection for particular purposes is absolutely going to be an important use case for build variants, but I don’t think it is something that needs to be defined as a “de jure” standard up front. Instead, I think we could tackle the easier (but still not easy) task of making optimising installers possible, wait to see what de facto best practices emerge, and then, if it seems worthwhile to do so, infer a written standard for build variant optimisation based on that. (For “extras”, for example, while semantic conventions around things like declaring docs and test dependencies have emerged, only build dependencies have so far risen to the level of needing formal standardisation.)

1 Like

I agree, those are different cases that should be tackled in the order you list them.

In my earlier post, I (lazily) used “best” to mean “work for current hardware” or even “most optimized for current hardware”.

I’m less confident it’s going to make sense to build the semantics into the installer, but I don’t think anything we’ve talked about would prevent it. I mostly worry that it tightly couples the installer itself to concerns that might be very specific to a given software stack.

So instead of teaching pip or uv to understand variants, you want someone to fork them to create a completely different installer that understands the feature? Or do you mean that just for an installer that understands specific kinds of variants and does the “best” optimization?

Speaking for pip, I’d be inclined to assume that we would provide the “baseline” functionality and leave providing specialised, “enhanced” capabilities to other tools. Like you say, I don’t think I’d want pip to be that closely tied to specifics of the semantics of particular variants.

Where I differ from @ncoghlan is that I’m not convinced that this is a good outcome. For years now, we’ve had a backlog of feature requests for pip that represent functionality that people wanted in an installer. Many could have been implemented as wrappers to pip - not even requiring a fork. Nobody has ever done that, though, suggesting that having different, specialised installers for particular tasks is not something that end users want or would support.

Unfortunately, I don’t have a better answer here. The way variants are being presented here makes me feel that “enhanced” functionality isn’t ever likely to be something that a generalised installer could support, so I don’t really know what the next step is. It’s possible that with more detail about the precise use cases, we could find some middle ground between “baseline” and “enhanced”, but I feel that in order to do that we’d need to give up on the idea of a general mechanism.

2 Likes

What use cases are there for an “enhanced mode” that wouldn’t be met by letting the user (or a wrapper) tell the installer that instead of looking up variant settings using selector plugins it should use values passed in by the user (either directly on the command line or in a lock file of some sort)? I think the “portable wheel set” case is met with that approach. What other cases are you thinking of, @ncoghlan ?

I’m getting confused now. I thought it was you, @dhellmann, who were arguing that installers should pick the “best” solution (for example, picking x86_64_v4 over x86_64_v1 if the target machine supports it).

Maybe that’s not what you mean by “best” - but that’s why I said I didn’t want pip to have to understand variant semantics in my previous post, because it’s not clear to me what people expect or want beyond “something that works”.

1 Like

One of the main areas where our perspectives differ is that you see the notion of “selector plugins” at the baseline installer level as substantially more viable than I do. We still don’t even have platform plugins that would allow pip et al to ask the underlying platform (conda, Fedora, Debian, etc) if it can provide particular dependencies (especially non-Python build dependencies like a C/C++ or Fortran compiler), so the idea that we’d be able to come up with a viable plugin API for a completely new aspect of dependency management that we don’t have any experience with seems overly optimistic to me. Coming up with a standardised way to ask a variant selector to pick the “best” option is far from a simple problem, since the platform features that matter depend greatly on the specific build variants that a given library happens to define (more on that below).

The natural candidates for taking early advantage of published build variant metadata are the folks already using transitive dependency locking tools to get reproducible builds (introducing additional selection logic to obtain more optimal builds), as well as the folks consuming Python packages and repackaging them for other installation formats (such as conda and the Linux distributions).

As an example of how that could work as a wrapper that takes advantage of existing tools without changing their behaviour (beyond basic build variant support), consider a tool that is designed to switch the generic pytorch build out for a more optimal one (perhaps available as python -m torch optimize-installed-build-variants --unlocked=requirements.in --locked=requirements.txt or similar). Given a fully locked tree derived from a requirement set (or the installed environment via pip freeze), it can run through looking for any packages that define “torch-style” selectors, assuming they mean the same thing as they do for torch. After swapping all of those for a more specific variant, it can feed them, plus the original unlocked requirement set, back into the resolver and see if it still resolves, and then check the result to see if it has pulled in any more default builds that need to be swapped out, and so on. Running multiple full resolution iterations like that is a slow way to approach the problem, so the next step would be to figure out a more efficient way to directly integrate the variant checks into the transitive dependency locking process, but it would still provide a way to figure out the variant selection logic itself before having to work out a viable selector plugin API for dependency lockers and/or package installers.
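
A rough sketch of that iterative loop (the function and callback names, and the pin representation, are all invented; the real tool would have to supply them):

def optimize_variants(unlocked_reqs, locked_pins, *,
                      has_known_selector, is_default_variant,
                      pick_better_variant, resolve):
    """Swap default builds for more specific variants, re-resolving after
    each pass until the locked set stops changing."""
    current = dict(locked_pins)  # {project name: pinned requirement}
    while True:
        swaps = {
            name: pick_better_variant(name, requirement)
            for name, requirement in current.items()
            if has_known_selector(name) and is_default_variant(requirement)
        }
        if not swaps:
            return current  # nothing left to swap out
        current.update(swaps)
        # Confirm the swapped pins still satisfy the original unlocked
        # requirements; the re-resolution may pull in further default builds
        # that get swapped on the next pass.
        current = resolve(unlocked_reqs, pins=current)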

As folks work on filtering the available build variants based on their specific needs, it may turn out that there’s a viable path to move smart variant selection further down the stack and make it a baseline installer capability (including integrating with dependency lockers, and then following that capability down the stack once a lockfile format is eventually standardised). Even if we never reach that step, though, simply allowing projects to define the build variants that are available would be a major step forward in metadata clarity, since it would make clear what folks are giving up by using the pre-published binaries instead of running their own source builds and optimising them for their hardware.

Mostly just the three I mentioned:

  • “Optimise for portability” (this is the status quo, and should remain the default)
  • “Optimise for fastest possible execution on a given set of hardware” (the lack of this is the main downside of the status quo for ML/AI end users on machines with powerful vector processing capabilities that the portable libraries may not be exploiting to their full extent)
  • “Optimise for lowest possible runtime memory consumption” (while even “embedded” use cases are often starting to measure RAM in gigabytes these days, there are still some situations where minimising the amount of memory used is more important than raw execution speed)

I don’t see a lot of major variations emerging beyond the classic space/speed/portability trade-offs, and it may be that folks seeking low memory consumption will be rare enough to be negligible compared to the other two cases (while the popularity of turning off docstrings shows folks do care about memory consumption, leaving them in at runtime is also pretty much a pure cost with no compensating benefit in many situations).

However, one specific area where attempting to define a general purpose selector API worries me is cross-builds (where the system downloading and unpacking distributions may not have the same platform details as the eventual deployment target). I definitely don’t want to see things go backwards on the partial steps that have already been made towards properly supporting those, so any standardised selection logic proposal that only worked when the deployment target is also the system running the install would be a non-starter for me. I am also genuinely doubtful of our ability to come up with a standardised selector API design that is sufficiently specific as to allow selectors to make good choices about the available build variants, while also being sufficiently flexible as to allow the design to gracefully adapt to future evolutions in computer hardware design (a major factor in my skepticism on that front is the fact that our current collective level of support for “build for platform compatibility tag X-Y-Z rather than for the currently running platform” is still as weak as it is, since that’s theoretically a simpler problem, with more well-established precedents to follow from other ecosystems).

2 Likes

Yeah, “best” is very fuzzy. Let me try to be more precise.

Each version of a package will have an ordered set of selection rules, and each rule will include some variables.

Packagers will define the selection rules and their order based on how the variants of the package are built. Only packagers know what is in each file and packagers are therefore most informed about what rules might apply to choosing one file over another.

Selector packages should report values for variables that can be used to evaluate the selection rules to pick a file. Any given variable can have multiple values, so the selector should report them in the order in which they should be tried (for example, from most specific instruction set to least specific). Package publishers and selector package authors have to agree on what the variables mean, but those meanings are domain-specific knowledge that the installer doesn’t have to understand. It just has to evaluate the rules.

The installer should use the already implemented characteristics (version, platform tags, etc.) to choose a candidate. That will give it a list of potential files from which it needs to make a selection – the actual wheel file to download. The installer should iterate over the rules and variables until it gets one combination to evaluate to true. The file associated with that rule is the file to use. We need to specify how the loops are nested (whether the installer loops over all of the rules with a given variable value, or loops over all of the variables for one rule before moving on to the next). Let’s set that question aside for now and come back to it.

If no rules ever evaluate to true, there is no usable build of the candidate and it should be rejected. The existing logic for dealing with that by starting again and choosing another version should be applied. This is equivalent to someone only publishing an x86_64 wheel, with no sdist, and then someone trying to install that on an aarch64 system. Applying the same logic should make it easy for users to understand why a version might be rejected, and it should make implementation in the installer simpler because it will be localized to the file selection code.

Packagers can provide a “default” file for a package by including a wheel with a selector rule sure to evaluate to true (for the hardware acceleration case, maybe the rule is even empty). That probably points towards looking at each rule/wheel with all of the possible variable values before moving on to the next rule/wheel.
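
A minimal sketch of that selection loop, using the nesting just described (all of the variable values are tried for one rule/wheel before moving on to the next); the rule representation here is invented for illustration:

from itertools import product

def select_file(rules, selector_values):
    """rules: ordered list of (variable_names, predicate, filename) tuples,
    as defined by the packager. selector_values: mapping of variable name to
    the values reported by the selector package, most specific first."""
    for variable_names, predicate, filename in rules:
        value_lists = [selector_values.get(name, []) for name in variable_names]
        # A rule with no variables yields one empty combination, so a
        # "default" wheel whose predicate always returns True matches here.
        for combination in product(*value_lists):
            if predicate(dict(zip(variable_names, combination))):
                return filename
    return None  # no rule evaluated to true: reject this version entirely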

Placing the rules in order gives the installer a chance to select something the publisher and selector package author think is optimal based on whatever considerations apply for their use case (taking advantage of hardware capabilities, default library compatibility settings, etc.). When I’ve said “best”, this is what I’ve meant.

Whether that is the “best” outcome, though, depends on what the user wants. I assert that in most cases it will be, but @ncoghlan and others have rightly brought up the portable wheel set as an example of one way the definition of “best” may differ from what the rules would give. If the user wants to define “best” their own way, they need to give the installer instructions for that new definition. To me, the most logical way to do that is to have them override the selector package, so that the variables going into the rules have values the user chooses instead of values the code in the selector package chooses. That gives the user direct control of which rules evaluate to true. I think of that like pinning the version in the candidate selection process.

Asking the user for variable values, instead of variant names, comes at the cost of them having to know the name of that variable and the value to provide, but to compensate for that UX cost, the same variable would apply to all packages they are installing, regardless of what other axes packagers might have used to create additional variants. So the user can ask for the lowest common denominator hardware compatibility without also having to specify library compatibility requirements, or vice versa.

So, what I was asking is: what other cases do we have for the user to “redefine best” that cannot be handled by giving them the ability to just specify the variables directly? Why do we need more than ordered selection rules, variables, and user-given overrides?

I do see the selector problem as easier. For one thing, it’s entirely within the PPA’s control to define the standard (with community input, etc., of course). It doesn’t require a bunch of completely independent ecosystems to agree on a standard, though.

And the selector API is purposefully described as just providing values to evaluate rules. “What hardware acceleration is supported on this host?” is an isolated question that doesn’t require that code to interact with anything else. The library compatibility question is a little different, but it can still be described as “list the library variants in the order they should be tried”.

I want the order of those first two to be switched, especially in these AI/ML cases. For the vast majority of packages, portability will be the default because there won’t even be variants. When someone goes to the trouble to publish variants, though, they do it for a reason and it should be possible for the variants to be considered before the default, if that’s what the package publisher thinks best meets their users’ needs.

I can see memory consumption being a significant, if not primary, concern on edge devices. So, yes, as a user consuming packages I would want a way to express that. Possibly even to indicate permanently that packages with low-memory variants should always be selected over other choices.

Definitely. As someone working in this space with a specific outcome in mind, I have the same needs. That’s why I think we need a way to override the selector plugins. But the way I override may be different than the way you override – I’m going to definitely ask for specifically optimized variants because I know where the code I’m packaging is going to be running. That’s still an override, though, because I’m definitely not building the container images on the hosts with fancy GPUs. :slight_smile:

I get that. I’m pushing for a flexible design that lets the package authors and publishers define rules precisely because I don’t think we can predict all of the ways we will need different types of builds in the future. Maybe WASM builds become another variant. Maybe the communication interconnect between accelerators in clustered environments triggers another set of variants. I have no idea what’s going to be needed. But using variables in rules with values coming from selector packages lets us make progress while reusing existing features of the installers, and the approach at least looks flexible because the package publisher can make up whatever rules they want, as long as they also publish a selector package to provide values for the variables.

1 Like

This probably deserves its own thread, but pluggable build variants also helps with this problem.

Imagine a variant with a rule that included “operating_system_vendor = ‘Fedora’”. That then requires a selector package to at least say whether a system is running Fedora (maybe it reports all operating systems or all Linux variants; it doesn’t matter).

A file with a rule indicating that its variant is built for Fedora could have additional platform-specific metadata, defined and injected into the wheel however the PPA likes, to list the RPMs that need to be installed for the Fedora-specific build to work. The installer wouldn’t have to do anything with that metadata on its own, and could completely ignore it.

I could envision another type of plugin (or extension if you don’t like plugins) for the installer, not even connected to the variant system. Call it a pre-install-integration plugin. The Fedora variant of the target package could indicate that it has a pre-install-integration plugin dependency on a package that knows how to read the extra metadata from the package that is being installed and use it to install any system-level dependencies defined in the metadata.

So, a package built for Fedora could link to external libraries in RPMs matching versions available on Fedora, and could express the need to have those libraries installed in order for the package to work. If the user doesn’t like those libraries for any reason, they can override the selection process to avoid choosing the Fedora variant of the wheel.
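
A purely hypothetical sketch of what such a pre-install-integration plugin might do (the metadata file name, the hook signature, and the dnf invocation are all assumptions, not an existing interface):

import json
import subprocess
from zipfile import ZipFile

def pre_install(wheel_path):
    """Install any RPM dependencies declared inside a Fedora-variant wheel."""
    with ZipFile(wheel_path) as wheel:
        try:
            extra = json.loads(wheel.read("fedora-system-requires.json"))
        except KeyError:
            return  # no platform-specific metadata, nothing to do
    rpms = extra.get("rpms", [])
    if rpms:
        subprocess.run(["dnf", "install", "-y", *rpms], check=True)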

That entire system can be implemented without the Fedora community or dnf authors having to agree to any sort of API at all. Ideally they’d be on board with the idea, because they would be well placed to own the integration plugin and selector package and I’m sure Fedora users would want it shipped in Fedora by default. But anyone who really wanted it could take up that work by writing the plugins on their own.

1 Like

Thanks. I think I follow the process you’re describing here, but it’s only a part of the much larger resolution algorithm, and what you’ve (deliberately) omitted is integrating this description into the larger picture.

My reservation here is that because dependency resolution is already NP-hard, adding complexity to it is risky at best. And given that we have relatively few robust and well-known algorithms (backtracking, SAT and PubGrub are the main ones I know of), ensuring that we don’t add complexity that no known algorithm can handle is a genuine concern.

So I’m willing to accept your definition of “best” as well-defined, but I’m seriously concerned that it might not be practical or achievable. I don’t know how long we can reasonably continue discussing options when some of them have “might not be implementable in practice” caveats attached to them. Alyssa makes a good point that we’ve reached a similar impasse in other areas - compatibility with platform-supplied libraries and cross-platform builds are the two well known cases. I’m not sure why this case is different.

Maybe I’m wrong, and this idea is implementable. The only way I know of to be reasonably confident of that would be to document the complete resolution process, taking into account index lookups, candidate selection, discovery and running of selectors, dependency resolution and handling of building from source. With such a document we can at least have a shared view of the complexity, whereas right now we keep coming back to a position where we just agree to differ on whether the idea is practical.

3 Likes

My inclination would be to focus on the user-supplied scenario first, and encourage installers (or whoever needs the list of resources/preferences “supplied”) to allow it to be set in a configuration file. Then the problem of determining the “best” automatically becomes a case of updating configuration files, or at worst, telling the user what they have on their machine.

Apart from the implicit intradependency case (i.e. “numpy used MKL, so scipy should too”), it seems most variants are inherent properties of the environment, rather than actually deriving from package preferences. So setting the variant option once per environment would be most likely, and then intradependency consistency comes from there. Changing the option is a manual step, which means documentation exists, and it can contain the warnings about how badly everything can break.

3 Likes

I cannot absorb a thread of ~80 posts (many quite long) at once, so apologies if some of this got covered already, but here are some things that stood out to me:

There’s also:

  • license variants (e.g. some packages opt in or out of GPL dependencies), which is an even less structured problem space.

I’ve collected use cases and references for this here, which is a somewhat-related feature request for conda (i.e. while conda is able to take variant constraints into account when resolving the environment, the user-facing API/UX for it is… poor).

This sounds very much like what conda calls run_constrained: (the “run” is from runtime). Generally, this is useful for version constraints where you know that something breaks otherwise, e.g.

# say for package foo 3.0
requirements:
  [...]
  run:
    - regular_dependency
  run_constrained:
    # bar <1.1 still uses foo's deprecated API that got removed in v3.0
    - bar >=1.1

What this means is that if bar is present next to our foo 3.0 (it doesn’t need to be), then it has to satisfy that version constraint, otherwise foo 3.0 is not an eligible install choice.

Unsurprisingly, that same mechanism can be used to enforce variants like MKL, e.g. by setting something like run_constrained: blas=*=mkl.

So for enforcing uniformity of the BLAS-implementation (for example), it would suffice that packages add a constraint on which BLAS flavour they expect. The downside of this is that for any BLAS implementation to be viable, all BLAS-dependent packages a user cares about would then need to publish variants for that flavour.

How conda-forge solves this is that, with very few exceptions (e.g. where someone is explicitly depending on MKL-only APIs), we compile all packages against the “reference” BLAS/LAPACK API from netlib, which allows us to achieve ABI compatibility between those flavours. That means that even though there’s only one published variant, the actual BLAS implementation can be freely chosen (e.g. according to user preference) at installation time. In other words, packages we build against BLAS/LAPACK don’t actually have a variant constraint. Realistically though, this approach is too fragile for an author-driven publishing model.

Coming back to Steve’s example, what IMO would work, though, is for packages to have a fallback generic variant (without the constraint) where they vendor BLAS (or whatever package) and everything is mangled for internal use. That way you could get away with having only an MKL variant plus the generic fallback, giving everyone something to install regardless of what other BLAS constraints they have in their environment, and without causing an undue build/publishing burden for the package maintainers.

5 Likes

OK, I genuinely think we (or at least me, @dhellmann, and @steve.dower) are all actually hoping for a pretty similar end state, we’re just disagreeing on what the most viable next step towards that goal would look like, as well as some of the specific technical details.

I also think I see a way of defining Doug’s selector module idea less in terms of actual accessible-at-runtime importable code (which is what was making me most skeptical about the idea), and more as a declaration of “variant semantics for this module are taken from this other module” (which seems like a plausible way of distributing the semantic definition problem in a way that is manageable without requiring fully centralised decision making).

Nominating interpretations for build variant semantics

I still like my tentatively proposed source metadata fields:

  • Provides-Variant: repeatable field defining available variants for source trees & sdists, singular field defining the built variant for wheels & installed packages (omitted for anonymous default variants)
  • Default-Variant: give the default variant a more meaningful name than just :default:
  • Requires-Dist-Variant: use the new specifier syntax that accepts variant details

To those could be added a fourth optional field:

  • Variant-Selectors: a comma-separated list of distribution names that define the variant categories used by this distribution

Omitting Variant-Selectors would be equivalent to specifying the distribution’s own name as the selector (i.e. it is defining its own variant categories without reference to any other project). That way coincidental name collisions between projects shouldn’t cause any problems.

When a distribution publishes variants without a named category, the distribution name would become the implied category. For example, if numpy declares mkl and openblas variants, then the corresponding variants would be numpy=mkl and numpy=openblas for any other project that declares Variant-Selectors: numpy.
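
As a concrete, purely illustrative sketch, assuming numpy named its BLAS variants mkl and openblas and picked openblas as the default, numpy’s own sdist metadata might contain:

Provides-Variant: mkl
Provides-Variant: openblas
Default-Variant: openblas

while a project reusing numpy’s variant categories would declare something like:

Variant-Selectors: numpy
Provides-Variant: numpy=mkl
Provides-Variant: numpy=openblas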

Whether the coupling between the variant of the distribution setting Variant-Selectors and the distributions named in the setting is loose or tight would be controlled by the dependencies specified in Requires-Dist-Variant. It would even be fine for a project to specify a variant selector that it doesn’t depend on at all (e.g. another project might find the way numpy or torch define their build variants useful without actually using the packages themselves at runtime).

Build variants in environment markers

In the original post, I suggested using variant == "name" as the way to represent variants in environment markers, but then also went ahead and suggested the notion of compound variants that consist of a comma separated list of names. Adding categories makes the mismatch with equality checking even worse.

Treating variants as sets of features suggests that "featurename" in variant and "featurename" in variant.category would be a better way of spelling the checks. That spelling also allows clear subset checking via {"featurename1","featurename2"} in variant.

Build variants in dependency specifiers

I still like the parentheses based syntax I proposed previously, but now believe we should define two special variant names (any and default) that are disallowed as variant category names rather than just reserving default:

  • distname, distname(:any:): any build variant is acceptable
  • distname(), distname(:default:): specifically the default build variant (or a compatible variant) is wanted
  • distname(featurename): build variant must include featurename (comma-separated to specify a set of feature names)
  • distname(categoryname=featurename): build variant must include featurename in categoryname (comma-separated to specify a set of feature names)

I think build variants do genuinely need a bit more freedom than wheels have to alter their dependency trees (reasoning backwards from the existing projects that decline to abide by the assumption that all wheels built from a given sdist will declare the same dependency metadata), so I do think we want to make them conceptually an intermediate tier between source distributions and wheels, even though their physical artifact would be the same sdist as the default variant.

However, I also like the notion of combining a “preferred variant list” for the given selectors with the existing “preferred platform compatibility tags list” for binary artifacts.

Rather than attempting to query live Python modules for preferred variant lists for each selector, installers could instead get them from an input configuration file as Steve suggests. In the absence of such a config file, or when a distribution specifies an unknown selector, installers would fall back to the existing behaviour of preferring default variants for everything. When multiple selectors were listed, the overall list would combine each of their preferred variant lists combinatorially in the order given to choose an overall preferred build variant (with unknown selectors simply not contributing any new combinations).

The list of candidates would then be culled based on the active variant requirements for that distribution before being compared against the available artifacts.

To avoid weird order-dependent quirks in the resolution process, the config file would also need to define a strict ordering for applying the known selectors when multiple names are listed for one distribution (otherwise you might get different resolution results depending on which package was resolved first, when one specified Variant-Selectors: torch,numpy while the other specified Variant-Selectors: numpy,torch).
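
A minimal sketch of that combination step (the config structure and the variant labels here are invented for illustration):

from itertools import product

# Hypothetical per-selector preference lists read from the config file,
# most preferred first.
PREFERRED = {
    "torch": ["accelerator=cuda", "accelerator=cpu"],
    "numpy": ["numpy=mkl", "numpy=openblas"],
}

def preferred_combinations(selector_names):
    """Combine the preference lists for the selectors a distribution names,
    in the configured order; unknown selectors contribute nothing."""
    lists = [PREFERRED[name] for name in selector_names if name in PREFERRED]
    return list(product(*lists))

# For a distribution declaring Variant-Selectors: torch,numpy the first two
# preferred combinations are:
print(preferred_combinations(["torch", "numpy"])[:2])
# [('accelerator=cuda', 'numpy=mkl'), ('accelerator=cuda', 'numpy=openblas')]

Applying the selectors in a single configured order is what keeps torch,numpy and numpy,torch from producing differently ordered preferences.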

Unfortunately, while this approach would address my concerns about the viability of runtime plugin APIs (let’s not do that if we can define a viable static config file format instead), I doubt it will assuage @pf_moore’s resolution performance concerns. Any time the word “combinatorial” legitimately enters the description of a process I get worried, let alone when it’s adding another case of it to a problem that is already NP-hard.


Tangent to add a belated reply to an older post in the thread:

Newer wheels for default variants should get the same names as they did before build variants were a possibility, so old versions of pip and other installers should continue working without any problems (even for newly released versions of projects that start defining build variants), blissfully unaware that build variants even exist.