Draft PEP: Disabling Manylinux

Hi All

Due to some changes in pip 20, I ran into problems with disabling the installation of manylinux wheels. I created a PR on the packaging GitHub tracker (https://github.com/pypa/packaging/pull/262), and it was suggested that the best way of getting the PR merged would be to write a PEP. I’ve opened a PR on the peps GitHub repo (https://github.com/python/peps/pull/1286), as Discourse prevents me from pasting in the contents because they contain too many URLs.

James

From a quick look at the issues, it seems like what you really want isn’t a way to disable manylinux wheels, but rather a way to disable fetching pre-built wheels from PyPI, while still allowing pre-built wheels from a local package store? Are there other rationales?

With my current setup (using devpi as a cache to PyPI, plus personal packages for testing), I’m not sure it would be possible to tell the difference between different indexes, as everything is proxied through a local devpi instance. I’m not sure what other people do, though. Also, using an environment variable (which I think is the more useful change) makes it easier to enable and disable manylinux wheels on a per-command basis, whereas projects like https://pypi.org/project/no-manylinux/ are more heavyweight solutions. With a lightweight switch I’d be more inclined to test the manylinux wheels I do come across and to help fix any issues I find, as it wouldn’t require editing the right text file in the right environment.

So do you want devpi to have an option to only fetch sdists from the upstream PyPI, rather than wheels? That seems like a pretty simple feature for devpi to implement.

Your PEP’s proposal to extend the no-manylinux feature to cover all versions instead of just one seems fairly harmless (though it’s redundant with the not-yet-implemented PEP 600, which lets you define an arbitrary function to determine whether your environment is manylinux compatible). But the motivation doesn’t quite make sense to me. Why do you want to specifically disable manylinux wheels, while allowing windows wheels and linux wheels and pure-python wheels? It seems like “manylinux” here is a proxy for some other property you want to control.

PEP 600 seems to me to be a sensible approach for Linux distros and other Python providers (e.g. they’d drop in a _manylinux.py to handle known issues), but not for users. There’s also an inherent conflict between the distributor and the user, as both want the _manylinux.py to be the one they’ve selected. Rather than try to handle the conflicts, if we leave the use of _manylinux.py to distributors and give users a simple on-off switch (which seems to be what people asking for such a feature in pip want), this seems to be a better situation to be in.
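For readers unfamiliar with the mechanism, here is a minimal sketch of such a drop-in _manylinux.py, assuming the per-version attributes from PEP 513, PEP 571 and PEP 599 and the function hook from PEP 600. The environment variable name EXAMPLE_NO_MANYLINUX is hypothetical and only illustrates the on-off switch idea; it is not the name used in the draft PEP.

```python
# _manylinux.py -- a sketch of a module placed somewhere on sys.path.
import os


def _disabled():
    # EXAMPLE_NO_MANYLINUX is a hypothetical name, used only for illustration.
    return os.environ.get("EXAMPLE_NO_MANYLINUX") == "1"


# PEP 513 / 571 / 599: installers take bool() of these attributes when they
# exist, so they are only defined when manylinux should be rejected.
if _disabled():
    manylinux1_compatible = False
    manylinux2010_compatible = False
    manylinux2014_compatible = False


def manylinux_compatible(tag_major, tag_minor, tag_arch):
    """PEP 600 hook: False rejects the tag, None defers to the default check."""
    return False if _disabled() else None
```

Because there is only one _manylinux module per environment, a distributor and a user cannot both supply their own, which is the conflict described above.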

I agree with @njs here - as far as I can tell, you’ve never explained why you want to disable manylinux wheels. Without a motivating example of a real-world situation, it’s hard to evaluate this proposal. I’m wary of accepting “it’s a relatively simple change, and we know people would use it” as a motivation - it’s too easy to end up having to maintain a feature that ultimately gets superseded by a better solution that way.

So for me, the first problem is the ordering of manylinux over linux (I see there’s a proposal for a “local” tag with the highest precedence, which fixes this problem). I can do tricks (and have done) with devpi to whitelist only certain packages for PyPI access, but there’s no distinction between wheel and sdist, which makes this somewhat painful. I could look at adding this as a feature to devpi, but then every PyPI caching tool needs to implement something similar to achieve the same result.
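To make the ordering problem concrete, here is a simplified model of how an installer picks between wheels of the same version. It is only a sketch: the priority list is an assumption that mirrors the "manylinux before linux" ordering being discussed, and pick_wheel is a hypothetical helper, not pip's or packaging's code.

```python
def pick_wheel(wheel_platforms, supported_platforms):
    """Return the available platform tag with the highest priority, or None."""
    for tag in supported_platforms:  # earlier entries have higher priority
        if tag in wheel_platforms:
            return tag
    return None


# Simplified priority list: manylinux tags come before the plain linux tag.
default_order = [
    "manylinux2014_x86_64",
    "manylinux2010_x86_64",
    "manylinux1_x86_64",
    "linux_x86_64",
]

# The index offers a manylinux1 wheel from PyPI and a locally built linux wheel.
available = {"manylinux1_x86_64", "linux_x86_64"}

# By default the manylinux wheel wins over the locally built one.
chosen_default = pick_wheel(available, default_order)

# With manylinux disabled, only the plain linux tag is still supported.
no_manylinux = [t for t in default_order if not t.startswith("manylinux")]
chosen_disabled = pick_wheel(available, no_manylinux)
```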

The second is not every manylinux wheel is the same. People like Matthew Brett produce wheels that can be relied on, but some projects like tensorflow don’t, which increases the difficulty in debugging issues.

The third, and the one which really doesn’t have a workaround (beyond using local and being really careful), is that some projects are inherently tied to a C library which has different, incompatible build configurations (mpi4py is one example). Either you don’t compile with the library (e.g. h5py manylinux wheels do not depend on mpi4py for this reason), or you don’t distribute manylinux wheels. Being able to distribute manylinux wheels, with users able to opt out easily as needed, would seem to me to be a better option than having no wheels at all. A more advanced wheel metadata system could handle this, with some kind of tagging for different builds, but that breaks the assumption that the sdist/wheel relation is one to one for a given system.

In terms of maintainability, I suspect the only thing that would supersede this is a replacement for manylinux itself, as the other options I can think of (a white/blacklist for packages and/or versions, additional pip options) imply a higher level of complexity, with more backwards and forwards compatibility issues (what settings should be used on an older pip, or a newer one?). I looked both at making packaging behave like pre-version-20 pip and at what would be needed to implement PEP 600 in packaging. The latter would require a large restructuring of the tag logic, likely followed by changes in dependent packages; this will need doing at some point, but it doesn’t change things for the users this PEP targets. This PEP is by far the simplest option I can think of and, as far as I can see, has limited potential to break anything.

Thanks for explaining your use case - it has made what you’re proposing much clearer.

So that one should be solved by pushing forward the “local” proposal, not by submitting an alternative proposal that will be obsolete once “local” is implemented.

That’s really something that needs to be addressed with the projects producing low-quality wheels. There may be scope somewhere, maybe in devpi, for blacklisting or otherwise hiding files the user deems as “low quality”, but at a standards level, it’s not clear what would be appropriate - PyPI isn’t curated like this and tools should install what the index server says is available.

So there may be a devpi feature request here.

Could this not be addressed with --no-binary=problem_project? Or, if you build wheels suitable for your machine, the local tag?
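As a sketch of that suggestion, the snippet below builds a pip invocation that refuses wheels for a single project while leaving every other package free to use them. The helper name pip_install_from_source is hypothetical, and mpi4py is used only because it is the example project mentioned in this thread.

```python
import sys


def pip_install_from_source(project):
    """Build a pip command that forces a source build for `project` only."""
    return [sys.executable, "-m", "pip", "install", "--no-binary", project, project]


cmd = pip_install_from_source("mpi4py")
# To actually run it: subprocess.run(cmd, check=True)
```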

It sounds to me as if the local wheel tag, plus some filtering options in devpi, would address the issue for you. So while this PEP is, as you say, simpler, it’s still not obvious to me that it adds anything that isn’t more generally solved by other means. So I’m probably -0 on the proposal overall. But input from people more familiar with manylinux than I am would be welcome here.

Link to the “local” proposal: https://github.com/pypa/packaging/issues/239

Also worth noting that the “local” proposal isn’t that much more complex than this one. All it needs now is a PEP to change the spec, and an implementation. And most of the discussion has already happened on the issue, unlike this proposal which is still in the discussion phase.

So the thing that strikes me here is that every one of these problems is a general problem with binary wheels – they apply just as much to Windows and macOS as they do to Linux. So hacking in a Linux-specific solution doesn’t seem like the best long-term approach.

One thing that wasn’t mentioned in the “local” proposal is how it interacts with wheels of different versions (or more generally, how tags of different versions interact). Pip has --prefer-binary (which prefers older wheels over newer sdists). Should pip gain a --prefer-local, how would it interact with --prefer-binary, and, if we create more tags, what behaviour is expected (do more specific tags override less specific ones)? For my use case, I figure I’d want --prefer-local always, but I do like that I have managed to configure things such that when a newer release comes out the wheel is built as needed (which seems a much harder behaviour to configure with “local”).
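The version-versus-tag interaction can be illustrated with a simplified ranking model. This is a sketch loosely based on how --prefer-binary behaves, not pip's actual implementation; sort_key and the candidate layout are assumptions made for the example.

```python
def sort_key(candidate, tag_priority, prefer_binary=False):
    """Rank a (version, platform) candidate; higher keys are preferred."""
    version, platform = candidate  # platform is None for an sdist
    is_wheel = platform is not None
    binary_first = 1 if (prefer_binary and is_wheel) else 0
    # Tag priority only breaks ties between candidates of the same version;
    # an sdist ranks below every supported wheel of that version.
    tag_rank = -tag_priority.index(platform) if is_wheel else -len(tag_priority)
    return (binary_first, version, tag_rank)


tag_priority = ["manylinux1_x86_64", "linux_x86_64"]
candidates = [
    ((1, 0), "manylinux1_x86_64"),  # older wheel from PyPI
    ((1, 0), "linux_x86_64"),       # older locally built wheel
    ((1, 1), None),                 # newer release, sdist only
]

# Default: the newest version wins, so the 1.1 sdist is built as needed.
best_default = max(candidates, key=lambda c: sort_key(c, tag_priority))

# Preferring binaries: an older wheel beats the newer sdist.
best_binary = max(candidates, key=lambda c: sort_key(c, tag_priority, prefer_binary=True))
```

In this model a higher-priority tag never overrides a newer version unless an extra preference is placed ahead of the version in the key, which is the open question about how a “local” tag should behave.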

The best idea I can see to solve this in the general case is to allow some way of specifying an ordering of preferred tags (and some way of blacklisting tags), but given the number of possible tags, unless we start grouping them, this idea doesn’t seem very practical to me (but I’m happy to be convinced otherwise).

Mentioning that the aforementioned “local” tag proposal isn’t Linux specific.

I agree; part of the issue for me is that manylinux wheels override linux wheels, whereas if I were to make macOS or Windows wheels, there wouldn’t be the same issue. In my proposal I explicitly tried to give a configuration that works across all versions of pip (or any other tool that supports the existing manylinux logic), not just new ones, which is why I’m approaching things from this direction rather than through the local proposal. Additionally, installing things from source (especially some of the more unusual things) tends to be harder on Windows and macOS. If there were BSD-based wheel standards, it would be interesting to see what users of those OSes would prefer (would they be asking for the local proposal too?).

(Note, edited in for me)

One thing that would be nice in whatever proposal gets used, would be to be able to switch between manylinux and the non-manylinux wheel fairly frictionlessly, to make it easier to debug why a specific manylinux wheel is playing up.

I don’t have the time currently to pick this up myself and respond to all the questions/queries here.

If anyone else wants to try to solve the “manylinux before linux” issue, I suggest that they go through past discussions that have happened. A good place to start would be in https://github.com/pypa/packaging/issues/160 and the “local” tag is a result of that discussion there. This and other relevant issues are also linked to in pypa/packaging#239 – which I linked to above. If you’re trying to go chronologically, I think the first instance on GitHub of this is https://github.com/pypa/pip/pull/3921.

Overall, there’s a lot of discussion around this that I’ve tried my best to cross link on GitHub itself, after I’d tried solving this issue and realized that it is actually a fair bit more complicated than it looks on the surface. I think the discussions on GitHub are cross-linked well enough that someone can aggregate all the arguments/discussion into a design document (i.e. a PEP) if they’re willing to put in a few hours.

As someone who has actively engaged in the discussion and proposed the “local” tag as a compromise for the various trade-offs and disagreements we’re dealing with here – I think that it is currently the most (only?) fleshed out and agreed upon approach for solving this – seems like everyone agrees that it’s a good-enough solution for this problem at hand, and that’s after a lot of discussion on this topic has already taken place.

+1 on this.

… and this seems like the most general “fix everything at once” solution. But as you say, it’s a lot of work and it’s not immediately obvious that it’s even practical.

So, I’d suggest that we focus on the “local” tag as the main “immediate fix” for problems like this. The next step just needs someone who cares enough and has the time to pick up the discussion and raise the PEP.

I’m happy to have a go at producing a “local” tag PEP. I’m currently looking through the PyPA GitHub org for any manylinux-related issues. I did come across the pywheels discussion, which flags that not all linux wheels on PyPI are manylinux wheels, which I hadn’t realised.

It shouldn’t be necessary. The tags generated by packaging.tags will, I assume, put local tags at the highest level, so its preference will be implicit.
