PEP 600: Future 'manylinux' Platform Tags for Portable Linux Built Distributions

Thanks everyone for the feedback!

OK.

OK.

Good point.

So it’s probably obvious but just to be clear, the actual implementation in pip is up to the pip maintainers – the code in the PEP is only to illustrate which wheels are supposed to be installable on which system, and any code that ends up doing that is fine. (I do wonder if pip might want to stop generating all the tags at some point, since pep425tags.py is getting pretty convoluted and has accumulated a number of dubious edge cases, as @brettcannon has noted. But that’s a separate issue :-).)

Anyway, it should be possible to generate all the supported manylinux tags using this algorithm:

  • fetch the current glibc version (pip already has code for this)
  • enumerate all the versions between some lower bound (let’s say 2.5 = manylinux1) and the current version. So e.g. if the current glibc is 2.29, we’d enumerate: 2.5, 2.6, 2.7, …, 2.28, 2.29
  • fetch the current platform tag (pip already has code for this), e.g. x86_64
  • use these two pieces of information to generate all the candidate tags: manylinux_2_5_x86_64, manylinux_2_6_x86_64, …, manylinux_2_29_x86_64
  • for each candidate tag, run the “manual override” logic
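For illustration, here is a minimal sketch of the enumeration steps above. It is not pip's actual code: the function name `candidate_manylinux_tags` and its parameters are made up for this example, the glibc version and architecture are passed in rather than detected, and the final "manual override" step is left out.

```python
def candidate_manylinux_tags(glibc_version, arch, lower_bound=(2, 5)):
    """Return every manylinux_X_Y_arch tag from lower_bound up to glibc_version.

    Hypothetical helper: glibc_version is a (major, minor) tuple that the
    caller has already detected, and arch is a string such as "x86_64".
    Only a single glibc major version is handled here.
    """
    major, minor = glibc_version
    low_major, low_minor = lower_bound
    if major != low_major:
        raise ValueError("this sketch only handles one glibc major version")
    # Oldest supported version first, matching the order used in the text.
    return [
        f"manylinux_{major}_{m}_{arch}"
        for m in range(low_minor, minor + 1)
    ]


tags = candidate_manylinux_tags((2, 29), "x86_64")
print(tags[0], "...", tags[-1])
```

Each tag in the resulting list would then still need to go through the override check before being treated as supported.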

Comparing this to the text in the PEP, I can see two places where this would break down currently:

  • If we’re running on a hypothetical future system with glibc 3.x installed, then we can’t enumerate all the supported tags without somehow knowing what the maximal glibc 2.x version is. This is kind of an inherent limitation of the “generate all tags” approach. As a hack I’d suggest that if we’re on a glibc 3.x system, then generate all tags up to 2.99, and then 3.0 through 3.x. Since this is just for speculative future-proofing, it’s probably not worth worrying about too much; worst case we’ll just fix things later after the glibc devs actually start making 3.x plans.

  • In the PEP, we currently allow “manual overrides” to declare that systems are compatible with arbitrary manylinux wheels, e.g. a macos-on-ARM system could declare that no really it’s totally compatible with linux-glibc-on-x86-64 wheels. This is kinda silly, and causes problems for the enumeration approach. I edited the PEP to move the manual override checks down below the normal compatibility checks, so that now the manual overrides can only rule out compatibility, not rule it in. That fixes this issue.

    Technically my edit introduces a tiny backwards-compatibility break from how pip works currently. Right now pip only checks the manylinux overrides if the platform is linux_x86_64 or linux_i686, so you can’t declare that a macOS system supports manylinux, or that an ARM system supports manylinux. But until now you could declare that a system with an ancient glibc or musl can install recent manylinux wheels, and my updated text prevents this. That never did anything useful anyway, so I don’t think it matters. In fact, it’s not clear that anyone uses the override system at all, and if they do I’m pretty sure it’s only to disable manylinux wheels entirely (e.g. NixOS used to do this).
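To make the two points above concrete, here is a rough sketch under some assumptions. Both function names are hypothetical, `max_minor=99` is just the "up to 2.99" hack, the caller is assumed to have already established that the system uses glibc, and `override` stands in for whatever the manual override mechanism returns (`True`, `False`, or `None` for "no opinion").

```python
def enumerate_glibc_versions(sys_version, lower_bound=(2, 5), max_minor=99):
    """List (major, minor) pairs from lower_bound up to sys_version.

    For any major version below the system's, we cannot know the real
    last minor release, so we speculatively go up to max_minor.
    """
    sys_major, sys_minor = sys_version
    versions = []
    for major in range(lower_bound[0], sys_major + 1):
        first = lower_bound[1] if major == lower_bound[0] else 0
        last = sys_minor if major == sys_major else max_minor
        versions.extend((major, minor) for minor in range(first, last + 1))
    return versions


def is_compatible(tag_version, tag_arch, sys_version, sys_arch, override=None):
    """Run the normal checks first; the override is only consulted afterwards."""
    if tag_arch != sys_arch or sys_version < tag_version:
        # The override is never reached, so it cannot rule compatibility in.
        return False
    if override is not None:
        verdict = override(tag_version[0], tag_version[1], tag_arch)
        if verdict is not None:
            return bool(verdict)
    return True


# An override that claims everything is compatible cannot rescue a mismatch.
print(is_compatible((2, 17), "x86_64", (2, 29), "aarch64",
                    override=lambda major, minor, arch: True))
```

Because the override runs last, a system can still use it to refuse manylinux wheels entirely, which is the only use I'm aware of.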

The edits I mentioned are here: https://github.com/python/peps/pull/1191

This was exactly the problem we had when we were writing the first manylinux spec. Binary compatibility on Linux is a vast unknown! Nobody knows what dragons lurk there! etc. Fortunately that turned out OK.

What makes me confident now is that we’ve shipped more than 3.2 billion manylinux wheels over the last ~3 years. In that time we’ve found tons of edge cases in wheel building that needed fixes in auditwheel or the build image. We’ve found a few edge cases in system detection that needed fixes in pip (two that come to mind: handling 32-bit python running on a 64-bit kernel, and glibc redistributors who append weird text at the end of the glibc version string). We haven’t found a single issue that called into question the basic approach, and PEP 600 only codifies the basic approach, nothing else.

Also, while I get that it’s impossible to prove a negative, we can make probabilistic estimates about negatives, and I’m confused about why you would think this is a particularly risky transition, even if you aren’t familiar with all that detailed history. Fundamentally the only difference from manylinux1 → manylinux2010 → manylinux2014 is dropping support for old platforms. From the perspective of wheel builders, everything that worked in the old specs is still possible – every manylinux1 wheel is also a manylinux2010 wheel. So it’s hard to imagine how the transition could uncover fundamental problems that invalidate what came before, even in principle.

Oh man I wish that were true; I’d get like a year of my life back. The whole scientific Python stack on Windows is totally dependent on convincing GCC and MSVC to play nicely together via obscure black magic. The first Windows wheels for numpy/scipy took substantially more effort than the first Linux wheels, and that’s including “inventing manylinux wheels” as part of the Linux efforts.

And FWIW, Python 3.8 had to break the “stable ABI” on Windows in order to keep up with a Microsoft-driven deprecation, and this broke PyQt’s wheels. If there was a “manywindowsX” PEP we would have had to update it. This stuff happens sometimes. The best thing is accept that and make it as painless as possible to adapt. Which is the goal of PEP 600 :-).

To be clear here, what I’m saying is that as a pip maintainer, I wouldn’t find the definition in the PEP sufficient. I agree with you that generating a list of all supported tags feels like a bad way to check compatibility, but I had that debate with Daniel when he first developed the wheel specs, and he was clear that there were edge cases where generating the tag list was the only way to get the correct order of priority on the candidates. I don’t recall the details now, but the result is that generating the list is the current way of doing things (from my reading of the compatibility tags PEP it may even be required).

Anyway, you sketched out a possible approach, and I’ve flagged my concern. I’m not going to block the PEP on this, but ultimately someone is going to have to develop a PR for pip to implement this, and as long as we’re clear that doing so may be trickier than it first seems, that’s fine.

Ouch, good point. I should have expressed things differently: on Windows, “compatibility” is implementation-defined by the version of MSVC that Python is built with. Which I guess undermines the argument that it is unreasonable to define manylinux standards using an implementation-defined standard in auditwheel :slightly_frowning_face:

If nothing else comes out of this discussion, it’s that all of this stuff is really hard and there’s a lot of knowledge scattered around in people’s heads that could really do with being captured somewhere, or people will keep reinventing wheels…

Thanks for your patience, I’m happy with your edits to the PEP. I’m now switching back to a “watching the discussion” mode - I don’t have any more points of my own to add.

Is that still the case? My understanding is that the CRT is now binary compatible across all recent MSVC versions.

It’s an over-simplification (and as @njs pointed out, is also wrong :slightly_frowning_face:). My main point was just that there are fewer variables on Windows when it comes to questions of compatibility. But “fewer” != “none”…

Anyway, it was something of a distraction from the main point here, which is PEP 600.

The main answer to this question is the same (IMO) as with a lot of decisions in open source: resources and (relatedly) people’s attention. All of this work is being done on a volunteer basis, and simply having people interested enough to contribute is a significant factor. Right now, we have people engaged and willing to develop, promote and discuss PEP 600. If we leave this for whatever period your suggestion translates to (months, maybe even years?) then we risk losing that interest and seeing PEP 600 abandoned. And then, when the existing standards do start reaching EOL, we have another rush to make a decision.

I’m not saying this is the only (or even the major) factor in the process here, but it is a factor.

As a second point, @njs has claimed a number of times that work is being done under the existing PEPs that could be avoided if PEP 600 gets accepted - so that’s extra resource and effort freed up by a quick decision. I’ve not seen any concrete details on how much effort would be saved, and in all honesty, I wonder if @njs might not be optimistic about the savings here, but I’ve very little idea of the details of what work is involved, so I’m fine with taking his word that there are savings to be gained from a reasonably quick decision. Again, I’m treating this as a minor, but not irrelevant, factor in not delaying too long.

First of all, by its very nature this is entirely a matter of opinion. None of us can know what might happen. But secondly, and more importantly, why is it so disastrous if we do have to revise PEP 600? PEPs (or rather the features they define) get revised and updated all the time. And furthermore, PEP 600 says nothing about transition processes. Nor did the previous manylinux PEPs. Any problems we find in the transition from manylinux1 -> 2010 -> 2014 are likely (again, just IMO) to be transition problems, and so not in the scope of PEP 600 anyway.

To summarise, I hear your points, and acknowledge them. But as the one making the decision on the PEP I don’t plan on delaying a decision for so long that the momentum is lost. I’m not deciding in haste (ask @njs what he thinks, if you don’t believe me :slightly_smiling_face:!) but we’ve had a decent amount of time for discussion now, and I think we need to move past vague “we mightn’t have spotted everything” concerns. We’re not aiming for perfection here, just for agreement on a workable way ahead.

If @njs wants to add some words to the PEP summarising the concern here and adding a response, that would be great, but I’m not insisting on it. Equally, if he has anything further to add in response, that would be good too.

OK, and once again, things have gone quiet.

I don’t think there’s going to be much more of a substantive nature added to the discussion here. I’m not sure I’d describe what we have as “consensus”, sadly, but I do think that “no significant objections remain that haven’t been addressed or responded to” applies. So I’m inclined to approve PEP 600.

One thing I would like to see discussed, though, is how the acceptance of PEP 600 would affect the work going on right now to deliver manylinux 2014. It’s been frustratingly difficult for me to get a real sense of what work is (or will be) going on “behind the scenes” to actually implement the manylinux specs, and there is definitely part of me that simply wants to ignore the problem, and say that if no-one else is interested in exploring the question, then I’ll just assume all is fine and not worry. But I feel that one last attempt to get input is in order.

@dustin, as the author of the manylinux 2014 PEP, do you have any reservations regarding the approval of PEP 600? Are you comfortable that timescales for implementation of perennial manylinux can be managed so as not to derail the work going on currently with 2014? Note that I’m perfectly OK with approving PEP 600 but for its actual implementation to be delayed - approval does not imply “you must implement this right now”.

@njs, I’m sure you’ll have comments on how PEP 600 won’t cause problems for manylinux 2014. Please don’t address them to me - I don’t have the knowledge to evaluate them. Rather, please address any comments or clarifications to @dustin, but do so in this thread so that the information is available to everyone, and not just covered by “we had a chat and it’s OK”.

Anyone else who feels they have anything new to add, please also feel free to chip in. But if it’s something that’s already been discussed or pointed out, please don’t bother. I have re-read all the discussions and frankly have a low tolerance for going over the same ground yet again.

(Caveat: I haven’t read most of this thread or the new PEP 600 draft.)

I think that PEP 600 is probably moving us in the right direction, however I don’t quite see what the rush is.

While I think waiting for some adoption level of manylinux2014 to be achieved is probably unnecessary, I do think delaying any implementation until the manylinux2014 rollout (which is moving along just fine) is more or less finished might help us focus on one thing at a time, and give everyone involved (including myself) a little more perspective on what actually needs to change before we start changing things.

FWIW, the idea of accepting PEP 600, and then following on with actually implementing it after https://github.com/pypa/manylinux/issues/338 (the manylinux2014 rollout) has been resolved makes sense.

As long as the implementation happens some time in the next 12-18 months, that should be soon enough to address any strong demand to target the CentOS 8+ era of distros.

So, where are we now with the manylinux2014 rollout? The only non-optional action item on the referenced issue (that’s not related to transition from earlier manylinux versions) seems to be the final release of auditwheel 3.0.0.

As far as I am concerned, there is nothing remaining to be discussed on PEP 600, in terms of what the proposal states. No-one has asked for further changes to the PEP, and leaving it sitting here “in limbo” is doing no-one any favours.

So therefore, in the next few days, I plan to accept PEP 600, leaving it to the manylinux developers to plan the timescale for implementation.

OK. I’ve thought long and hard about the decision here, and I am going to accept PEP 600. Congratulations to @njs for getting the proposal through to acceptance.

One reason I found this a difficult decision to make was that I remain concerned that the way the PEP specifies compatibility, in terms of a manylinux_tag_is_compatible_with_this_system function, does not match well in practice with how existing installers (i.e., pip and packaging.tags) currently handle compatibility. So I’d strongly recommend that implementing PEP 600 in packaging.tags be a high priority, in case it exposes any issues with the PEP.

Ultimately, though, the implementation plans are down to the manylinux developers, and I’m sure they will do a great job.

Is there a tracking issue for the PEP 600 rollout so we can see which tools already support it and which tools need to add support? I see that people have opened https://github.com/pypa/manylinux/issues/501 and https://github.com/pypa/packaging/issues/280, but neither of those feels like a systematic tracking issue similar to the one for manylinux2010.

Nope, there isn’t.

@mattip thank you for creating a tracking issue! And thanks @mayeut and @pradyunsg for editing it.

@takluyver @njs I suggest you take a look, improve it, etc.

With almost all check boxes ticked in the tracking issue, what would be the process to mark the PEP Final (or Active?)?
I also opened a PR to reflect the following statement from PEP 600:

When this PEP is accepted, the previous manylinux PEPs will receive a final update noting that they are no longer maintained and referring to this PEP.

On reflection, is it right to mark the older PEPs as superseded? A tag of manylinux2014 is still valid, and what it means isn’t stated in PEP 600. So “Superseded” seems incorrect.

A tag of manylinux2014 is still valid, and what it means isn’t stated in PEP 600.

You’re right, IMHO, PEP 600 should be amended to also mention manylinux2014 as it does very clearly for manylinux1/manylinux2010.

If that was your only concern, you can skip the rest of this post, which explains why those older PEPs should, IMHO, be superseded.

PEP 513 is “broken” because it refers to “libncursesw.so.5”, which contradicts PEP 600: a package linked against that library will install, but can’t be loaded on not-so-recent systems.

Auditwheel now breaks rules from PEP 571 / PEP 599 because it allows a new whitelisted library which is legit per PEP 600 and requested by users of manylinux2010/manylinux2014. This might extend to other libraries (zlib comes to mind immediately) in the near future.

The full intent of PEP 600 regarding older PEPs:

Previously, we had an open-ended and growing commitment to keep updating every manylinux PEP whenever a new Linux distro was released, for the rest of time. By making this PEP normative for the older tags, that obligation goes away.
When this PEP is accepted, the previous manylinux PEPs will receive a final update noting that they are no longer maintained and referring to this PEP.

I hadn’t noticed that the other manylinux tags were (re-)defined in PEP 600 (it’s been a while since I reviewed it :slightly_smiling_face:) so thanks for the reminder. I’d just picked manylinux2014 to check at random, and luckily spotted the omission.

But yes, given those definitions, PEP 600 needs an update to redefine manylinux2014 in terms of PEP 600 tags (or PEP 599 needs to remain active, but that seems to undermine the intent of PEP 600).

Pull Request created.

I feel like that’s a significant enough clarification that I’d like confirmation from the PEP authors, @njs and @takluyver that they approve it. It should probably also be confirmed by the manylinux2014 author, @dustin.

For the purposes of that review, could you summarise the proposed change here, so people don’t have to get it from the PR?

Proposed changes:

  • Add PEP 599 in the mentioned list of previous manylinux PEPs (in addition to PEP 513 & PEP 571)
  • Add manylinux2014 aliases in the Legacy manylinux tags section. There are a few more since manylinux2014 also defines platform tags for aarch64, armv7l, ppc64, ppc64le, s390x.
  • Add a note about manylinux2014_compatible module boolean attribute for compatibility with previous specifications in the Package installer section.
  • Add manylinux2014 aliases to the manylinux_tag_is_compatible_with_this_system example.
  • Add manylinux2014_(x86_64|i686|aarch64|armv7l|ppc64|ppc64le|s390x) to the list of regexes package indexes are recommended to accept.
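As a rough illustration of what the alias changes amount to, the sketch below normalizes legacy tags to their PEP 600 form. The names `LEGACY_GLIBC`, `LEGACY_ARCHES` and `normalize_legacy_tag` are invented for this example and are not taken from the PR. The glibc versions assume manylinux1 = 2.5 (as mentioned earlier in the thread), manylinux2010 = 2.12 and manylinux2014 = 2.17, per my reading of PEP 571 and PEP 599.

```python
import re

# Assumed glibc baseline for each legacy tag family.
LEGACY_GLIBC = {
    "manylinux1": (2, 5),
    "manylinux2010": (2, 12),
    "manylinux2014": (2, 17),
}

# Architectures each legacy family was defined for.
LEGACY_ARCHES = {
    "manylinux1": ("x86_64", "i686"),
    "manylinux2010": ("x86_64", "i686"),
    "manylinux2014": (
        "x86_64", "i686", "aarch64", "armv7l", "ppc64", "ppc64le", "s390x",
    ),
}

LEGACY_TAG_RE = re.compile(r"(manylinux1|manylinux2010|manylinux2014)_(.+)")


def normalize_legacy_tag(tag):
    """Map a legacy tag to manylinux_X_Y_arch; return other tags unchanged."""
    match = LEGACY_TAG_RE.fullmatch(tag)
    if match is None:
        return tag
    family, arch = match.groups()
    if arch not in LEGACY_ARCHES[family]:
        # Not a defined alias (e.g. manylinux1 never covered aarch64).
        return tag
    major, minor = LEGACY_GLIBC[family]
    return f"manylinux_{major}_{minor}_{arch}"


print(normalize_legacy_tag("manylinux2014_ppc64le"))
```

A package index could use the same table to decide which legacy names to accept alongside the `manylinux_X_Y_arch` form.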