Require any of several alternative package dependencies

Following up on several Stack Overflow questions (mine and other people’s), I’d like to propose a new syntax for package requirements: a way to specify equally good alternatives for a required package.

System package managers already support alternatives. For example, Debian’s apt uses the following syntax to achieve the goal:

Depends: libc6 (>= 2.2.1), default-mta | mail-transport-agent

So, I suggest something similar. Take a look at the following line:

package-a | package-b > 3.14 | package-c; env_markers

In the example, the requirement is considered satisfied if

  • either package-a is installed,
  • or package-b is installed, and its version is > 3.14,
  • or package-c is installed, provided that env_markers (anything PEP 508 allows) evaluate to True.

If the requirement is not satisfied, the first package that fulfills the environment markers should be installed.
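To make the rule concrete, here is a minimal sketch of the check described above. Everything in it is an assumption for illustration: the `Alternative` class, the `package_to_install` function, versions as plain integer tuples, and environment markers reduced to a pre-evaluated boolean. It is not an existing pip or packaging API.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence, Tuple

Version = Tuple[int, ...]


@dataclass(frozen=True)
class Alternative:
    """One entry of a hypothetical `a | b > 3.14 | c; markers` requirement."""
    name: str
    newer_than: Optional[Version] = None  # exclusive lower bound, e.g. (3, 14)
    marker_ok: bool = True                # environment marker, already evaluated


def package_to_install(alternatives: Sequence[Alternative],
                       installed: Dict[str, Version]) -> Optional[str]:
    """Return the package to install, or None if nothing needs installing."""
    for alt in alternatives:
        version = installed.get(alt.name)
        if version is None or not alt.marker_ok:
            continue
        if alt.newer_than is None or version > alt.newer_than:
            return None  # an acceptable alternative is already present
    for alt in alternatives:
        if alt.marker_ok:
            return alt.name  # first alternative whose markers hold
    # Assumption: if no marker holds, the requirement is skipped, as for an
    # ordinary PEP 508 requirement whose marker evaluates to False.
    return None


requirement = [
    Alternative("package-a"),
    Alternative("package-b", newer_than=(3, 14)),
    Alternative("package-c", marker_ok=True),
]

print(package_to_install(requirement, {}))
print(package_to_install(requirement, {"package-b": (3, 15)}))
print(package_to_install(requirement, {"package-b": (3, 14)}))
```

With an empty environment the sketch picks package-a; with package-b 3.15 installed nothing is needed; with package-b 3.14 installed the version check fails and package-a is picked again.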

The syntax I propose doesn’t break any existing code.

Real-life cases where such a syntax would be of great help include (but are not limited to)

  • picking a Qt binding by pyqtgraph or QtPy,
  • installing only one (not all at once) MS Excel writer by pandas[excel].

On the first read, this seems good. I wonder if something like this has been discussed before. Maybe the closest thing is the concept of “default extra”, but those two solutions do not overlap exactly.

Let’s take the example of App, which depends on Default or Other. With the “default extra” solution, the user would install App or App[default] to depend on Default, and App[other] to depend on Other. With the solution presented here, the user would install App, or App together with Default, to depend on Default, and App together with Other to depend on Other.

| Already installed | Install dependency | Default extra | This solution |
|---|---|---|---|
| None | None, Default, or Other | | |
| Default | None, or Default | | |
| Default | Other | A | D |
| Other | None | B | |
| Other | Default | C | E |
| Other | Other | | |

The cases B and C seem to actually be the same. In a default extra context installing App or App[default] should probably yield the exact same result (be it success or fail).

What would (or should) happen in cases A, B, C, D and E? It seems like in all cases it would result in both Default and Other being installed.

I would expect so, yes.

Another question. If I have package A that depends on B and C, and B depends on D|E and C depends on E|D, which of E and D would you expect to be installed (assuming an initially empty environment)? And how would a resolver know what to do in that situation? Is it acceptable for the answer to be “it depends, and the only way to know is to try it”? Is it acceptable if the answer changes each time you run the command?

In thinking about this situation, it’s important to remember that there’s no assurance that the resolver will try B and C in that order.

For extra fun, consider what happens if E depended on F|D. Possible solutions include {A, B, C, D}, {A, B, C, E, F}, or {A, B, C, D, E}. Whatever semantics you expect from this case, you’ll need to explain how the resolver should achieve that result. Dependency resolution is a lot harder than you expect… (Note that if F is already installed, you could also get {A, B, C, D, E, F}…)
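The ambiguity can be checked by brute force. The sketch below is an illustration only, not how any real resolver works: it encodes the example as a hypothetical `DEPENDENCIES` table, lists every set of packages containing A in which each requirement has at least one alternative present, and marks the sets that have no smaller consistent subset.

```python
from itertools import combinations

# Hypothetical encoding of the example: each package maps to a list of
# requirements, and each requirement is a tuple of acceptable alternatives.
DEPENDENCIES = {
    "A": [("B",), ("C",)],
    "B": [("D", "E")],
    "C": [("E", "D")],
    "D": [],
    "E": [("F", "D")],
    "F": [],
}


def is_consistent(installed):
    """True if every requirement of every installed package is met."""
    return all(
        any(alt in installed for alt in requirement)
        for package in installed
        for requirement in DEPENDENCIES[package]
    )


def consistent_sets(root):
    """Yield every consistent set of packages that contains `root`."""
    others = sorted(set(DEPENDENCIES) - {root})
    for size in range(len(others) + 1):
        for extra in combinations(others, size):
            candidate = frozenset((root, *extra))
            if is_consistent(candidate):
                yield candidate


solutions = list(consistent_sets("A"))
# A solution is minimal if no other solution is a proper subset of it.
minimal = [s for s in solutions if not any(t < s for t in solutions)]

for s in solutions:
    print("".join(sorted(s)), "(minimal)" if s in minimal else "")
```

Under these assumptions there are two distinct minimal answers, ABCD and ABCEF, so "install as little as possible" alone does not settle which one a resolver should return.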


In the example presented by @StSav012 in their StackOverflow question, the alternative dependencies all seem to have their own unique import namespace (different top-level import package names), so at first it seems like it would be fine to have both D and E. I assume B and C contain some logic to decide at run-time which one they want to import. But then on second thought, the intent might be that B and C both interact with the same dependency (either D or E exclusively) because they need to exchange object instances and whatnot. So yes, even if it is possible to resolve dependencies this might fail at run time.

So what happens now if we work with exclusive OR (XOR)?

Requires-Dist: Default ^ Other
| Already installed | Install dependency | Default extra | This solution |
|---|---|---|---|
| None | None, Default, or Other | | |
| Default | None, or Default | | |
| Default | Other | A | D |
| Other | None | B | |
| Other | Default | C | E |
| Other | Other | | |

The cases B and C seem to actually be the same. In a default extra context installing App or App[default] should probably yield the exact same result (be it success or fail).

It seems like in all cases A, B, C, D, and E, the dependency resolution should fail.

Interesting. I assumed that the alternatives used the same import name. Which adds another element of potential confusion to the whole idea.

To be clear, I’m not against the idea. I’m just concerned that it would actually be far harder to define and use than a superficial consideration would suggest.

I feel like whether or not the import names are the same should not matter. I mentioned it to try to understand if the operator should be an OR or an XOR. But since it was always possible to install libraries that overwrite each other’s top-level import packages (if I am not mistaken), it should not matter for this proposal either, should it?

Anyway:

A = B & C
B = D ^ E  # default D
C = E ^ D  # default E

still leaves us with two solutions (ABCD and ABCE), right? Maybe in a first version, the specification could define that the resolution should fail, and the specification could be amended later to define a better behavior.

To clarify with the requirements:

A | B

Would B only get installed if no candidates are found for A? Or would you expect some other event (e.g. a failed build of A) to trigger the installation of B?

I think this would be useful for numerical codes which can depend on a number of packages as backends, which is becoming more common thanks to the array-api efforts.

If the rule is that all alternatives are equivalent solutions, it would only require a convention that whatever is encountered first is tried first. In the sense that, when building the set of missing dependencies, the first encountered dependency D|E adds D to the set, and the second encountered dependency E|D sees that one of E and D is already requested.

If you want to force installation of E, you can ask for it manually. Similarly, I would expect an extra [with-E] that specifically depends on E to force that package.

The order in which things are encountered is not predictable or guaranteed - that’s basically my point.

If you make a rule like that, you effectively prohibit certain types of resolver implementation - notably, the one that pip uses 🙁 (Before anyone comments, this is something of an over-simplification, but it is true that “whatever is encountered first” constrains the possible implementation choices in ways that the current semantics don’t).

Indeed, and my point was that you get whatever happens to be encountered first in that particular installation, without it being predictable or guaranteed to be reproducible.
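The exchange above can be illustrated with a toy model of the “whatever is encountered first” rule. This is a sketch under stated assumptions: `greedy_resolve` and the `DEPENDENCIES` table are hypothetical, the traversal is a plain breadth-first queue, and it does not reflect how pip’s resolver actually works.

```python
from collections import deque

# Hypothetical encoding of the A/B/C/D/E/F example from earlier in the thread.
DEPENDENCIES = {
    "B": [("D", "E")],
    "C": [("E", "D")],
    "D": [],
    "E": [("F", "D")],
    "F": [],
}


def greedy_resolve(root_requirements):
    """Pick the first listed alternative unless one is already selected."""
    selected = []
    queue = deque(root_requirements)
    while queue:
        alternatives = queue.popleft()
        if any(alt in selected for alt in alternatives):
            continue  # requirement already satisfied by an earlier choice
        choice = alternatives[0]
        selected.append(choice)
        queue.extend(DEPENDENCIES[choice])  # visit the new package's needs later
    return selected


# The same dependencies of A, visited in two different orders.
b_first = greedy_resolve([("B",), ("C",)])
c_first = greedy_resolve([("C",), ("B",)])

print(sorted(b_first))
print(sorted(c_first))
```

Visiting B first ends with B, C and D; visiting C first ends with B, C, E and F. The rule is simple to state, but its outcome depends entirely on a traversal order that nothing currently guarantees.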


Having alternatives (or “variants”) is somewhat similar to having versions of packages. When considering versions, you could also get a nearly endless number of solutions, but we restrict and order them by essentially preferring “the latest” versions. The difference between versions and alternatives is that the former are on an ordinal scale and the latter are not. If, however, we defined a mapping in which alternatives are sorted by preference (on a global level), then I think the problem ends up being the same as for versions.
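As a rough sketch of that idea, a global ranking makes the choice independent of how each package happens to list its alternatives. The `PREFERENCE` list and the `pick` helper are hypothetical, and who would maintain such a ranking is left open.

```python
# Hypothetical global ranking: earlier entries are preferred.
PREFERENCE = ["D", "E", "F"]


def pick(alternatives, preference=PREFERENCE):
    """Return the alternative that ranks highest in the global preference."""
    return min(alternatives, key=preference.index)


print(pick(("D", "E")))  # B's requirement, written as D|E
print(pick(("E", "D")))  # C's requirement, written as E|D
```

Both calls select D, so B and C agree regardless of the order in which they wrote their alternatives.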

This is indeed a topic that I think gets more prominent with numerical-related packages.

Note that we have a similar issue when we consider a lock file from, say, Poetry. In poetry2nix we want reproducible evaluation and builds. The lock file can offer both a wheel and an sdist. Which to pick? We set a preference.

There’s one more risk to consider. Given:

a >= 1.3 | b

What happens if a < 1.3 is installed?

If I read the first post correctly, it would mean that b is installed to fulfill the dependency. If that’s the case, then I’m afraid you’re going to sooner or later end up with people mixing the above with code such as:

try:
    import a
except ImportError:
    import b

(i.e. where dependencies include version constraints but the actual code doesn’t)
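For contrast, here is a sketch of a run-time check that applies the same version constraint as the requirement `a >= 1.3 | b`. The distribution names `a` and `b` come from the example; `parse_version` and `pick_backend` are hypothetical helpers, and the version parsing is deliberately crude (real code would more likely use `packaging.version.Version`).

```python
from importlib import metadata


def parse_version(text):
    """Crude parser: '1.10.2' -> (1, 10, 2). Non-numeric parts are dropped."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def pick_backend(get_version=metadata.version):
    """Return 'a' if a >= 1.3 is installed, else 'b' if b is installed."""
    try:
        if parse_version(get_version("a")) >= (1, 3):
            return "a"
    except metadata.PackageNotFoundError:
        pass  # a is not installed at all
    try:
        get_version("b")
    except metadata.PackageNotFoundError:
        raise ImportError("neither a >= 1.3 nor b is installed") from None
    return "b"
```

The `get_version` parameter exists only so the function can be exercised without installing anything. A bare try/except ImportError around `import a` would instead accept an outdated `a`, even though the installer fell back to `b`.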


If the developers can’t agree on whether they work with alternative dependencies with version checks or not, it’s not a problem of Python. Bugs may appear for many reasons, and the code logic is one of the main ones.


I don’t suppose the setup process should consider the dependencies further than the current line in the pyproject.toml file. It’s absolutely possible that, at the end, there will be all the possible dependencies installed. But still, I’m sure it’s better to allow some alternatives’ statements rather than to build a package with hard-coded dependencies or with a default extra, essentially hoping for the best.

A package maintainer can state in the docs that the package requires any one of a list of packages to be present in the system. However, who reads the docs? The users who do should be immortalized in stone. I state twice, right next to the header of the docs, that my code requires Python >= 3.8, but recently someone complained that the code crashes on Python 3.2 (point two, not point twelve). When a user faces an error message with a button, they click the button regardless of the message text and the button title. So, I don’t consider even the most thorough explanation viable for an end-user.

Off-topic: A curious case of a message button

At work, we got far too frustrated with an app that crashed twice a day for no apparent reason. When it crashed again, I took a screenshot of the error message and set it as the desktop background, stretching it proportionally so that it covered most of the desktop. I didn’t move the desktop items, so some of them covered the picture. Two hours later, I got a phone call from my boss. He said that he couldn’t close an error message. Although the picture was some 4 times larger than the original message (so the pixels were clearly visible) and was partially covered by desktop items, all the boss noticed was the OK button.

As the first attempt, I propose leaving the resolution order to the developer. That is, the requirements should be processed (tested and installed if needed) one by one as they appear, and the alternatives should be processed in the order they’re listed, left to right. Again, if a developer opts in to the alternatives’ approach, they should be aware of the possible consequences, including the installation of more packages than the possible minimum.