Selecting variant wheels according to a semi-static specification

msarahan · May 16, 2024, 1:00pm

Thanks for this writeup! My initial thought is that we should avoid things that require new implementation in installer tools and in wheelhouse. The extra TOML file does limit the black-box nature of the lookup package, relative to a PEP 517 build backend approach, but adding new functionality to installers feels like it dramatically increases the potential effort to adopt this feature.

Perhaps we could achieve something similar if the PEP 517 build backend “dispatch” packages contained (or could obtain/download) enough metadata about their potential dispatches to construct a more complete static environment picture. This would limit their flexibility and correctness, of course, since remote options may change depending on updates and partial mirroring, but maybe it’s enough to get the right compromise.

One thing that GPU packages have to do pretty often right now is host some or all packages on an external index. The size constraint, plus the practical tedium of PyPA staff managing manual overrides, means that we can’t assume that all variant packages live in the same place. Do you see a clear way to allow referencing external indexes? Could it be just another field in this TOML file?

Yes, I think the selector should be at a higher level than a single package. Once installed, it should activate logic for any package with a matching selection to be made.

xref What to do about GPUs? (and the built distributions that support them) - #71 by msarahan - I think it is important to be able to change the selector and re-evaluate an environment. I don’t know what the right implementation is, but treating variant packages exactly the same as other packages seems like it will inevitably mean nuking and recreating environments instead of swapping packages in place. If variant packages are plain dependencies alongside others, then any environment spec recorded from that env is hardware-specific.

From my experience, building envs from scratch is unfortunately very rare in many workflows. Conda works great at first, but the more history an environment has, the harder things get. People could avoid so many problems by nuking/recreating envs, but in practice, they mostly iterate on envs and keep them around until the env is unmanageable.

Do you mean caching the hardware metadata lookup in the selector package, or caching the package lookup based on that metadata? I think you can cache decently well. It is volatile system state, but that state doesn’t change often. Knowing when to invalidate that cache is the hard part, as usual.
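One possible shape for the first kind of cache, as a minimal sketch: store the detected metadata next to a cheap "fingerprint" of the system state, and re-detect whenever the fingerprint changes. The `load_or_detect` helper, the cache file layout, and the idea of using something like a driver version string as the fingerprint are all assumptions for illustration, not an existing API.

```python
import json
import tempfile
from pathlib import Path


def load_or_detect(cache_path: Path, fingerprint: str, detect):
    """Return cached metadata if the stored fingerprint matches;
    otherwise run the (expensive) detection and rewrite the cache."""
    try:
        cached = json.loads(cache_path.read_text())
        if cached["fingerprint"] == fingerprint:
            return cached["metadata"]
    except (OSError, ValueError, KeyError, TypeError):
        pass  # missing, unreadable, or malformed cache: fall through and re-detect
    metadata = detect()
    cache_path.write_text(json.dumps({"fingerprint": fingerprint, "metadata": metadata}))
    return metadata


# Demo with a stand-in detector that counts how often it is called.
calls = []


def fake_detect():
    calls.append(1)
    return {"gpu_vendor": "nvidia", "cuda_major": "12"}


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "hardware.json"
    first = load_or_detect(cache, "driver-1", fake_detect)   # runs detection
    second = load_or_detect(cache, "driver-1", fake_detect)  # served from cache
    third = load_or_detect(cache, "driver-2", fake_detect)   # fingerprint changed
    detect_count = len(calls)
```

This only moves the hard part into choosing the fingerprint: it has to be much cheaper to compute than the full detection, and it has to change whenever the detected metadata would.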