What to do about GPUs? (and the built distributions that support them)

FRidh · February 11, 2021, 12:18pm

I’ve updated my post with what I could quickly find about other ecosystems.

This would be a good first step. I do want to emphasize that in the examples I gave (Haskell, Rust, …) the package manager and the build system are the same, whereas with Python we have that now decoupled into a front-end and back-end. Since other build systems typically allow/require you to declare dependencies, I think we should use their functionality, and extend PEP 517 with additional hooks for obtaining the non-Python dependencies from the build system. Of course, build systems that don’t support it could use a certain key in pyproject.toml. Anyway, now I am going too far into the details!

angerson · February 11, 2021, 8:47pm

One such potential issue: Python packages would be coupled to the availability of their dependencies. For example, ~~TensorFlow bundles CUDA~~ if TensorFlow were to depend on CUDA such that CUDA was installed from somewhere else during TF’s installation, every finished package would be dependent on the CUDA install method itself, and all of our popular old versions could suddenly be permanently broken if that method stopped working. It also might be hard to merge with system-installed CUDA packages.

Edit: TF doesn’t bundle CUDA itself in its Python packages; users must have it installed externally.

dustin · February 11, 2021, 8:48pm

But if CUDA somehow came from PyPI (and not some external host) and this used normal dependency resolution, would that be acceptable?

rgommers · February 11, 2021, 8:56pm

Does it? The TensorFlow docs say the tensorflow packages on PyPI have GPU support, and the largest wheel on tensorflow · PyPI is <400 MB. CUDA itself is a lot larger - I think you expect the user to have it installed already. That’s also what Install TensorFlow with pip says. And those docs also say only CUDA 11 is supported, so the problem is narrowed down a little compared to what CuPy, PyTorch et al. do.

angerson · February 11, 2021, 9:55pm

Shoot, that’s right; I edited my post. I was thinking of all the CUDA-related compiled stuff TensorFlow includes, but that doesn’t include CUDA itself. Thank you.

Still, depending on external packages via PyPI could have the problems I described. I clarified since my original example was wrong.

But if CUDA somehow came from PyPI (and not some external host) and this used normal dependency resolution, would that be acceptable?

CUDA from PyPI seems like it would be helpful.

dstufft · February 12, 2021, 4:21pm

I don’t have super relevant insight into this, but I do want to point out it doesn’t have to be pure python, just that if it can’t be pure python it needs to change relatively infrequently, and we need to get it into Python itself.

steve.dower · February 12, 2021, 7:25pm

Or into, say, a selector package that can contain some pure Python code to try and load its native code, which presumably will fail for unsupported platforms, and so it can then request a different actual package.
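A minimal sketch of the selector idea, assuming hypothetical backend module names (`mypkg_cuda11`, `mypkg_cpu`) that are not real packages: pure-Python code tries each native backend in turn and reports the first one that loads. How the installer would then request the matching package is left open, as it is in the thread.

```python
import importlib

# Hypothetical backend modules, ordered from most to least specific.
CANDIDATE_BACKENDS = ["mypkg_cuda11", "mypkg_cpu"]


def select_backend(candidates=CANDIDATE_BACKENDS):
    """Return (name, module) for the first backend that imports cleanly."""
    errors = {}
    for name in candidates:
        try:
            return name, importlib.import_module(name)
        except (ImportError, OSError) as exc:
            # Native code is missing or cannot be loaded on this platform.
            errors[name] = str(exc)
    raise ImportError(f"no usable backend; tried: {errors}")
```

The returned name is what a tool could use to decide which actual package to ask for.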

dustin · February 12, 2021, 7:39pm

You’re right, I should have said it either has to be pure Python (so libraries like pypa/packaging can detect GPUs without needing to ship multiple wheels) or it needs to be in the stdlib.

ncoghlan · February 16, 2021, 9:54pm

Just noting that if anyone wants to follow up on the specification of external dependencies, a draft PEP for that was started a few years back that could be dusted off and modernised to account for pyproject.toml et al, rather than having to start from a blank page: Adding the draft status PEP for external dependency expression by tleeuwenburg · Pull Request #30 · pypa/interoperability-peps · GitHub

rgommers · February 16, 2021, 11:05pm

Thanks @ncoghlan! There’s some parts of that that are indeed useful, in particular the “reasonable communications layer by which information can be shared between those two separated ecosystems”. I wouldn’t want to reuse the build related parts, e.g. 'include!libblas.h' is not a healthy idea. Back in 2015 the “build from source” problem was still a lot more relevant than it is now. Today all important libraries provide wheels on at least Windows/Linux/macOS, and aarch64 and ppc64le wheels are starting to get traction too. So really runtime dependencies are what matters.

There’s also a lot more detail that’s needed. For example, CUDA and MKL are single-vendor and you should be able to rely on them as runtime dependencies independent of whether they were installed with, e.g., apt or mamba. For other runtime dependencies that won’t be true necessarily. So do they need to be treated differently, yes or no?

Writing a PEP should come later I believe; a clear description of use cases, current packaging practices, and problems to be solved seems needed to refer to and get people on the same page first.

westurner · February 17, 2021, 12:55am

Would you say that support for external dependencies would create 10X more configuration combinations for package authors to diagnose?

Could we write a tool to list all versions of all visible libs and packages? What could the lurking variable(s) be? It should probably also show the LD_PRELOAD list.
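As a rough starting point for such a tool, the sketch below lists the Python distributions visible to the current interpreter and the entries in `LD_PRELOAD`. It uses only the standard library; the function name `environment_report` is made up for illustration, and it does not attempt to enumerate system shared libraries, which would need platform-specific tooling.

```python
import os
import re
from importlib import metadata


def environment_report(environ=os.environ):
    """Collect installed distributions and the LD_PRELOAD entries."""
    packages = sorted(
        (dist.metadata["Name"] or "", dist.version or "")
        for dist in metadata.distributions()
    )
    # LD_PRELOAD entries may be separated by spaces or colons.
    raw = environ.get("LD_PRELOAD", "")
    preload = [entry for entry in re.split(r"[\s:]+", raw) if entry]
    return {"packages": packages, "ld_preload": preload}


if __name__ == "__main__":
    report = environment_report()
    for name, version in report["packages"]:
        print(f"{name}=={version}")
    print("LD_PRELOAD:", report["ld_preload"])
```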

Are you expecting that pip will run these other package managers as root (in order to use package managers that unarchive as root and then set permissions and extended file attributes as root)?

westurner · February 17, 2021, 1:04am

It’s possibly worth mentioning here that virtualenvwrapper has add2sitepackages and toggleglobalsitepackages.

When you add2sitepackages, that’s no longer a “hermetically sealed” build/install: you then forfeit the build/install isolation that is the whole point of virtualenv. You’re also expanding the “attack surface” (e.g. the available gadgets); and, for a production deployment, pip doesn’t yet (?) fix permissions of internal or external dependencies.

https://virtualenvwrapper.readthedocs.io/en/latest/command_ref.html#path-management :

jordanhubbard · February 19, 2021, 10:07pm

I’m not sure that any of the proposed solutions are even close to attaining “silver bullet status” given that almost all of the CUDA software in question is (a) large and (b) needs to support a wide variety of GPUs if you want it to be broadly applicable and not force the user into navigating the GPU version namespace manually and/or suffer long PTX JIT times at startup, assuming that PTX is even an option for all of the GPU kernels you want to publish. The price of hiding such details is large wheels - it’s like a speed of light constant.

I would propose that one solution might be wildcard redirects for very specific vetted vendors. This is to say that rather than doing per-file redirects, which caused the QoS problems that PEP-470 addressed, PyPI just accepts that certain families of packages which are identified as part of a pre-agreed namespace (for example: ^cuda-.|^nvidia-.) do a bulk redirect to the vendor in question, that obviously being Nvidia in my example.

The “vetting” part would also probably involve agreeing to certain QoS obligations. Files covered under a registered wildcard wouldn’t be removed before years, would never be updated in place, would be made available with SLAs on latency and average global bandwidth, etc. If the agreement also specified “trust but verify”, it would also be easy enough to expose certain CDN statistics or have bots randomly download targeted files from various parts of the world and report in on whether the external provider was any worse than the PSF’s designated CDN. If an external vendor started failing to meet their obligations, they would be under threat of losing their wildcard redirect.

TL;DR: I am suggesting that the blast radius of redirection be limited to a small handful of large entities who can pay their CDN bills at scale and meet the overall QoS needs of PyPI while providing large file support.
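To make the wildcard-redirect proposal concrete, here is a small sketch of how an index might route file requests by project-name prefix. Everything in it is hypothetical: the registry, both base URLs, and the function name are invented for illustration, and the pattern is only modelled on the example namespace above. PyPI does not work this way today.

```python
import re

# Hypothetical registry of vetted namespaces -> vendor-hosted base URL.
WILDCARD_REDIRECTS = [
    (re.compile(r"^(?:cuda|nvidia)-"), "https://pypi.vendor.example/files/"),
]

# Hypothetical default file host for everything else.
DEFAULT_BASE = "https://files.index.example/"


def resolve_download_url(project, filename):
    """Return the URL a file would be served from under a bulk redirect."""
    # Normalize the project name as described in PEP 503.
    normalized = re.sub(r"[-_.]+", "-", project).lower()
    for pattern, base in WILDCARD_REDIRECTS:
        if pattern.match(normalized):
            return f"{base}{normalized}/{filename}"
    return f"{DEFAULT_BASE}{normalized}/{filename}"
```

Because the rule is per namespace rather than per file, the index only has to track a handful of patterns, which is the "limited blast radius" part of the suggestion.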

I could also suggest more radical solutions like IPFS being adopted as a global data store for PyPI and allow this to be sharded across the internet as a whole, but now we’re departing the realm of science and getting more into science fiction.

kpfleming · February 19, 2021, 11:23pm

This is beginning to sound like adding provides metadata elements, where multiple packages can fulfill the same requirement. When the installer can see that there are multiple options available to the user, it obligates the user to choose one.

pradyunsg · February 20, 2021, 7:16am

FWIW, we have the provides-dist metadata key that no one is using. Making various packaging tools use it is a whole other beast that no one has yet poked.

uranusjr · February 20, 2021, 7:31am

Note that for the Provides mechanism to be practical, PyPI needs to sort of bring back the register-a-name-without-releases mechanism, so a package name can be designated as virtual to use in the field. Otherwise we’d have an issue similar to dependency confusion if a name both exists as a real package and is listed as another package’s Provides. The current approach of requesting a name reservation from PyPI admins would not scale well if Provides gets wide usage.
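The ambiguity described here can be sketched in a few lines. The example assumes a toy in-memory index that maps each real project name to the names it declares via Provides-Dist; the project names and the `resolve_virtual` helper are hypothetical, and no current installer resolves requirements this way.

```python
def resolve_virtual(requirement, index):
    """Return the projects that could satisfy `requirement`.

    `index` maps real project names to the list of names they provide.
    """
    providers = sorted(
        name for name, provides in index.items() if requirement in provides
    )
    is_real_project = requirement in index
    if is_real_project and providers:
        # The name is both a real project and a provided (virtual) name:
        # the confusion scenario the post warns about.
        raise LookupError(
            f"{requirement!r} is a real project but is also provided by {providers}"
        )
    if is_real_project:
        return [requirement]
    # Several providers mean the user has to pick one.
    return providers


index = {
    "cuda-runtime-vendor-a": ["cuda-runtime"],
    "cuda-runtime-vendor-b": ["cuda-runtime"],
    "numpy": [],
}
print(resolve_virtual("cuda-runtime", index))
```

Reserving `cuda-runtime` as a virtual name, with no releases of its own, is what would keep the error branch from ever being reachable.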

rgommers · February 20, 2021, 11:05am

I agree - there’s no silver bullet. The only things that can be shaved off are the libraries people now statically link or bundle as DLLs, and maybe 1-2 SIMD/PTX variants.

This will apply to a very small group of vendors/projects, so I’m not sure it’s all that helpful. For example just raising the limit to 1.5 GB or 2 GB for that small group and asking them for a reasonable size PyPI sponsorship to cover the costs will be much more effective for everyone than some complicated redirect mechanism.

And regarding sizes, it sounded like the overall size of PyPI is at least as important as a set file size limit. Setting an overall per-project limit so people are forced to stop uploading nightlies for such large packages (or really, all packages) would have a bigger positive effect than the per-wheel limit. Example: the top project by sum-of-package-size has wheels in the 20-50 MB range, the problem is it does almost daily pre-releases: lalsuite · PyPI.

dustin · February 20, 2021, 5:21pm

We already have an overall project size limit, it’s 10GB per project, though we have made some exemptions.
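A quick back-of-the-envelope check shows why release frequency matters as much as wheel size. The sketch assumes the 10 GB figure means 10 × 10⁹ bytes (the thread does not say whether it is GB or GiB), and the file list is made up.

```python
# Assumption: "10GB" is read as 10 * 10**9 bytes.
PROJECT_LIMIT_BYTES = 10 * 10**9


def project_usage(files):
    """Return (total_bytes, over_limit) for an iterable of (filename, size)."""
    total = sum(size for _, size in files)
    return total, total > PROJECT_LIMIT_BYTES


# Hypothetical project: one 40 MB wheel uploaded for each of 300 pre-releases.
nightlies = [(f"pkg-1.0.dev{i}-py3-none-any.whl", 40 * 10**6) for i in range(300)]
total, over_limit = project_usage(nightlies)
print(f"{total / 10**9:.1f} GB used, over limit: {over_limit}")
```

With those made-up numbers, a project with modest 40 MB wheels crosses the per-project limit in well under a year of daily uploads.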

rgommers · February 20, 2021, 7:38pm

Thanks for pointing that out @dustin. It looks like we automatically got limit increases, so I never noticed. I just reduced the total size numpy takes up by 6% by cleaning up some very old pre-releases. We host our nightlies elsewhere, because uploading them to PyPI feels like a bit of an abuse of the system. Out of interest, why don’t you make the largest users of space with pre-releases clean up after themselves, or implement an automatic cleanup policy for dev releases after a given period of time? It looks like this can reduce the size of PyPI by 10-20% fairly quickly.

dustin · February 20, 2021, 8:24pm

We already ask this for projects that have exemptions on total project size, or are close to the 10GB limit.

I think I can speak on behalf of the other PyPI maintainers: generally our goal is for PyPI to have whatever exists on PyPI exist exactly as it was uploaded, ~indefinitely (unless the owner chooses to remove it).

This is why we don’t allow releases to be overwritten, don’t change metadata on existing releases, and don’t have a “cleanup” policy like this. There are undoubtedly some users for whom such a policy would be disruptive.