Implementation variants: rehashing and refocusing

My immediate thought is that “collect variants by recursing dependency tree” is problematic, because the dependency tree doesn’t exist (as a completely defined entity) at this point. Because Python packages can have different dependencies depending on which version or even which wheel you select, the dependency tree is only discovered incrementally, during the resolution process. That’s the key point which I’ve been trying to get across, but no-one seems to be picking up on.

For a very simple example, suppose we have package A. Version 2.0 of A depends on B, version 1.0 does not depend on B. B has a variant X. If I request installation of A, do I need variant X? At this point, I don’t know whether I’ll pick A 1.0 or A 2.0 (it might depend on whether A 2.0 has a compatible wheel for my platform, for example).

Now imagine that happening at the bottom of a deep dependency tree, with boto (which has hundreds, if not thousands, of versions) somewhere in the middle.

No practical algorithm exists which allows you to consider “the dependency tree” as a concrete, fully-known, entity. This is a fundamental complexity of Python packaging, which most other dependency resolution problems don’t have to deal with. We’ve had extended discussions about the possibility of requiring all wheels for a given version of a project to have the same metadata (which would be a step towards addressing this, but would not be sufficient by itself) and even that has proved impossible to get consensus on.