The next manylinux specification

Yes, but where? Under the current system it’s the PEP index (indirectly). That’s essentially the point - perennial manylinux needs to clarify where the canonical list of “valid flavours” exists, as it’s no longer the PEPs.

The more I see imprecise answers like this, the less comfortable I am with perennial manylinux. I’m fine with its idea of “don’t require the PEP process for all the details”, but not with the fact that it doesn’t actually say what its proposed alternative is, other than “read the code”. For someone like me, who has no experience building extensions on Linux, builds a simple C extension, and just wants to “make it available to Linux users”, that isn’t a realistic option (I looked at https://github.com/pypa/auditwheel and had no idea where to start :frowning:).


My updates to the draft a couple of hours ago point to https://packaging.python.org/ as the place where flavours will be documented. This fits in with a general move to have packaging specifications there rather than scattered across PEPs.

I was being vague about it partly because I was still looking for people to agree on the idea of ‘document this somewhere other than PEPs’, and partly because I don’t feel I have the authority to say what should and shouldn’t be included in PyPUG.

I’m unable to read all of the communication here and on the perennial-manylinux PEP PR so please excuse my ignorance.

Steve Dower mentioned a concern I also have early on in this thread:

Overall I don’t personally have a lot of thoughts on this, but I’d like to at least warn that having flexible support for libc just pushes the same problem to the next dependency, whatever that happens to be for a particular wheel.

For example, browsing through the Anaconda package index for scipy and VTK I find package names such as:

I think this illustrates the issue quite clearly. There will be more constraints than just glibc.

Are we confident that the perennial-manylinux PEP will be able to avoid this scenario through the compatibility profiles it introduces? Can we learn something from conda?

In brief, wheels have to bundle all libraries apart from a small selection of very stable libraries which are present on many Linux distributions. For a less brief description, see PEP 513 and auditwheel.
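To make that concrete, the rule can be sketched roughly like this. Note that the whitelist below is an abbreviated subset of the full PEP 513 list, and the dependency data is invented for illustration:

```python
# Sketch of the manylinux external-library rule: a wheel may link against
# only a short whitelist of very stable system libraries (a subset of the
# PEP 513 policy is shown); everything else must be bundled into the wheel,
# which is the grafting step auditwheel performs.

ALLOWED_EXTERNAL = {
    "libc.so.6", "libm.so.6", "libdl.so.2", "libpthread.so.0",
    "librt.so.1", "libstdc++.so.6", "libgcc_s.so.1", "libglib-2.0.so.0",
}

def unbundled_violations(external_deps):
    """Return the shared libraries a wheel relies on but isn't allowed to."""
    return sorted(set(external_deps) - ALLOWED_EXTERNAL)

# Hypothetical extension module: it depends on libblas, which is not on
# the whitelist, so auditwheel would have to copy it into the wheel.
deps = ["libc.so.6", "libm.so.6", "libblas.so.3"]
print(unbundled_violations(deps))  # -> ['libblas.so.3']
```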

This isn’t always straightforward, of course, but scipy and vtk are among the libraries already distributing manylinux wheels. So we know this mechanism works enough of the time to be useful.

The ideas being discussed here don’t make any fundamental change to how that works. The main point of contention is how and where we should maintain the list of very stable libraries which wheels can rely on being on the system.


I assume you mean the statement “For each profile defined on https://packaging.python.org/” at https://github.com/pypa/manylinux/pull/304/files#diff-5e28e1d6d6c93938e4f223a76a71cc64R127?

That seems a little brief and doesn’t really cover any details, like the process for adding a profile, how users are informed when new profiles become available, etc, etc.

Again, if I were to provide Linux wheels for my code (that I develop and build on Windows) how would I know if & when I needed to create new Linux builds or update my Linux build toolchain? IMO the proposal seems way too focused on the perspective of users who are deeply immersed in the whole “building wheels on Linux” ecosystem/community, and doesn’t really take into account making it easy for users with limited experience and essentially no prior knowledge.

It’s not that manylinux2014 is that much better, in general packaging binaries on Linux seems incredibly overwhelming from an outsider perspective, but at least there’s a master document as a starting point for each version.

I feel like the goalposts are shifting, and in an impractical direction. Yes, we should absolutely document manylinux better for users who aren’t deeply immersed in packaging. But a PEP is not the place to do that, nor even to decide how to do that.

This discussion is already long and contentious enough without swerving into the woods of improving documentation.

If you want processes, I’ll fall back to my earlier suggestion: change to glibc-based versioning so that pip doesn’t need updates for each flavour, and keep on defining wheel compatibility in PEPs. I’m not invested enough to define a separate set of processes for creating and disseminating wheel compatibility profiles.

Fair point. I think I did acknowledge in a previous post that this was a digression, so let’s drop it.

TBH all I want is consensus :slight_smile: But we’ve had a question about how perennial manylinux affects the way package maintainers have to decide which platforms to build for. We also still have the question of “how will the platform definitions be documented”. Both of these have been responded to in the thread, but I haven’t seen anyone say “oh, OK, that’s covered then” yet, nor has there been any substantial change to the PEP (your recent edits were pretty minimal).

I’ll step back though, and let others respond. But I’m still assuming that if we don’t get consensus on a perennial manylinux proposal by the end of the month, then it’s better to give it more time to develop and go with manylinux2014 for the immediate next version. And consensus needs a bit more than the assumption that silence equals acceptance, IMO.

I’d be very surprised if additional options helped the situation at this point :slight_smile:

On Linux, it can never be quite that simple, because a toolchain that runs on a current version of a GNU/Linux-based distribution will only be able to generate binaries usable with equally current distributions.[1] You have to get a “build environment,” which is some kind of VM or container image (currently I only know of Docker images for this, but they could just as easily be created for other virtualization mechanisms) containing a sufficiently old distribution, and run pip wheel within that environment, and then run auditwheel on the result to detect and/or fix up common problems.

Your hypothetical non-expert extension packager is going to do basically the same thing under both proposals: find an appropriate build environment and use it as above. The big headache for them, under both proposals, is going to be deciding which of several available manylinuxWHATEVER tags is the right choice. As long as there’s more than one tag whose output is accepted by PyPI at any given time, this headache exists; usually the right answer will be “the oldest one with a $LANGUAGE compiler that accepts your code”, but there could be complications.

As I understand it, most of the work in developing either a new manylinux${YEAR} under the existing methodology or a new manylinux_${bikeshed_version} under the “perennial” proposal, is not in going through the PEP process, but in defining what “sufficiently old” means for the new tag, preparing a suitable build environment, and patching both pip and auditwheel to understand what to do with the new tag. Under perennial, pip only has to be patched once and then it understands all future manylinux’es, but I don’t see any inherent reason why pip couldn’t be patched once to grok all future YEARs if we stuck to the existing methodology. And anyway patching pip is the easiest part.
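To illustrate the “patch pip once” property: under perennial, the tag itself carries the glibc requirement, so one parser handles every future tag. This is a rough sketch, not pip’s actual code; the tag syntax follows the draft’s manylinux_${major}_${minor} form:

```python
import re

def perennial_tag_compatible(tag, system_glibc):
    """Sketch of an install-time check for perennial manylinux tags.

    The glibc baseline is encoded in the tag name, so this one function
    covers manylinux_2_23, manylinux_2_28, and any future baseline
    without the installer needing a new release.
    """
    m = re.fullmatch(r"manylinux_(\d+)_(\d+)", tag)
    if not m:
        return False
    required = (int(m.group(1)), int(m.group(2)))
    return system_glibc >= required  # tuple comparison: major, then minor

# A system with glibc 2.24 can install manylinux_2_23 wheels...
print(perennial_tag_compatible("manylinux_2_23", (2, 24)))  # True
# ...but not wheels targeting a newer baseline.
print(perennial_tag_compatible("manylinux_2_28", (2, 24)))  # False
```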


[1] I say “GNU/Linux-based distribution” here because this is not a property of Linux-the-kernel; it’s a consequence of the maintainers of GCC and GNU libc deciding that it’s only feasible for them to support one-way backward compatibility — binaries compiled with an old toolchain will almost always work with a newer libc.so.6, but not the other way around. Both Microsoft and Apple have put in the extra work required to make sure that you can, for instance, compile a program on a current Windows 10 dev box and get an executable that will run correctly on Vista, if that’s what you want. There’s no inherent reason why that work couldn’t be done for (GNU/)Linux, it’s just that nobody’s being paid to do it and all the volunteers have other priorities.


This may be the crux of the dispute: perennial sheds a chunk of work that the current maintainers of tools and build environments find to be inconvenient and redundant, but the manylinux1 PEP is currently doing the job of answering the high-level question that package maintainers have: what do I use to build binary wheels for Linux? If we stop issuing manylinuxXXXX PEPs, something else needs to take that role.

Having said that, right now we have people being confused over whether to use manylinux1 or manylinux2010, so there’s already a gap in available documentation. Nor do I mean to dismiss the tools maintainers’ goals—on the “optimize for fun” principle, we should be looking for ways to shed tasks that people find inconvenient and redundant.

I don’t imagine there would be a new build environment for every new upstream release of glibc, incidentally; those come out every six months, it would be way too much churn. Probably people would prefer something more like one new environment per LTS release of CentOS, every three to five years.

A procedural point here. The manylinux2014 PEP is in the process of getting checked into the PEPs repository as a draft. @takluyver you should be looking to do the same with the perennial manylinux proposal - when we reach the point where a decision is to be made between the two proposals, they should both be registered, with proper PEP numbers, and marked as Draft, but in the state they are looking to be evaluated on.

In particular, I’ll be expecting both proposals to have sections addressing any questions/points raised in this thread. They can be in a “Rejected Alternatives” section, or a “Questions Raised”, if appropriate, but they shouldn’t be ignored (nor should “silence implies consent” be assumed - if in doubt add the question and the response from this discussion into the PEP).

And finally, I’d like to see a little more technical discussion (both here and in the PEPs). For example, the question of C++ compatibility (i.e. the crashing issues) should be addressed by both PEPs - either there should be a technical solution in there, or there should be a discussion of how the lack of an explicit technical solution will impact use of next-manylinux wheels.

There are probably other technical issues to review, but unfortunately the 2014-vs-perennial debate has overshadowed any such discussion. I assume that perennial manylinux will take the view that this is OK, because technical details aren’t needed as part of the PEP, but it is something that could impact manylinux 2014, and I’d like to know that sufficient due diligence has been done (something along the lines of “this spec has been reviewed in the following places…” is probably enough).

As I say, this is basically just procedural points - when I look to make a decision, these are things I’ll be looking at to assess how “ready” the two proposals are.

But I’d still much prefer to see consensus - it’s never going to be ideal to have to pick one proposal over the other, some group will end up feeling that they “lost” in that case. But I fear that everyone is now too burned out to progress on any sort of consensus proposal, so choosing one is the only remaining option. So that’s what I’m assuming will happen.

The reason pip can’t be patched to grok all future YEARs is because pip needs to have a lookup table from YEAR -> glibc version, and we can’t predict when future RHEL versions will be released or what glibc versions they’ll use.
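Concretely, that table looks something like the sketch below (not pip’s actual implementation; the glibc baselines are the ones from the existing manylinux PEPs):

```python
# The hand-maintained YEAR -> glibc mapping pip needs under the
# manylinuxYYYY scheme. Each new YEAR tag requires adding a row here and
# shipping a new pip release, because future RHEL/CentOS release dates
# and glibc versions can't be predicted in advance.
MANYLINUX_GLIBC = {
    "manylinux1": (2, 5),      # CentOS 5
    "manylinux2010": (2, 12),  # CentOS 6
    "manylinux2014": (2, 17),  # CentOS 7
}

def year_tag_compatible(tag, system_glibc):
    required = MANYLINUX_GLIBC.get(tag)
    return required is not None and system_glibc >= required

print(year_tag_compatible("manylinux2014", (2, 17)))  # True
# An unknown future tag is simply rejected until pip itself is upgraded:
print(year_tag_compatible("manylinux2020", (2, 28)))  # False
```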

And patching pip is easy, but waiting for users to upgrade to new pips turns out to be a major blocker for actual deployment, and it’s unavoidable in the manylinuxXXXX approach.

Unfortunately, there’s nothing to discuss here. No-one knows why these crashes are happening, and as far as we know it has nothing to do with the manylinux spec. It’s definitely a bug that should be analyzed and fixed, of course, but I don’t think it would improve the PEP to add a section saying “btw, there’s a bug that we don’t understand and is unrelated to the rest of this PEP. Just thought you should know”.

One thing this thread has definitely unearthed is that we need better documentation. That’s not a big surprise, but it’s good to have highlighted like this. It sounds like all your concerns about perennial manylinux at this point are concerns about lack of documentation that provides package-maintainer-oriented help with building wheels: tutorials, recommendations, etc. Is that right?

I think we have rough consensus on perennial manylinux, in the RFC 7282 sense – i.e., all issues have been considered, understood, and addressed. In some cases the solution is “yeah, someone needs to write some docs”, which isn’t the most satisfactory answer given that we don’t know who “someone” is or when it will actually happen, but that’s an inherent limit on any volunteer-driven project, and outside the scope of what a PEP can fix. Maybe those corporations who care about this can donate some technical writer time?

We don’t have rough consensus on the manylinux2014 approach. IMO it has serious technical problems, including:

  • putting a specification of how Linux distributions should work inside a PEP is fundamentally misguided, because we don’t control Linux distributions;
  • it creates useless busywork for volunteers;
  • it makes deploying new versions harder and slower;
  • it makes supporting new architectures harder;
  • and the PEP isn’t even a useful tutorial for users.

Thanks for your other comments, they are all very useful and I’ll need to consider them - mostly I think you’re right - but there’s nothing I directly need to respond to here. I also need to read that link to RFC 7282 - we’ve not typically been that formal in the Python PEP process, but given where we are in this discussion I think it’s worth trying to be clear on what constitutes consensus (for everyone’s sake).

(Hmm, one point I should mention -

from a process point of view, “addressed” here should be taken to mean “written up in the PEP”, so I’ll be looking at the PEPs to confirm that’s happened).

But the C++ compatibility issue is something I want to come back to.

I thought there was some reasonably plausible evidence presented here that C++ initialisation “stuff” (runtime static initialisers) could well be involved? (Disclaimer - I know about C++ semantics, but not so much about how Linux compilers implement them.) Regardless of whether that allows us to define a “solution” at this stage, I don’t think it’s unreasonable for the PEPs to note that manylinux2010 suffered from issues because the compatibility requirements didn’t cover this, and that as a result, spec-compatible wheels could in practice turn out to be incompatible. They could then go on to say something like (for manylinux2014) “this PEP does not address this issue, so in practical terms claiming manylinux2014 compatibility is not a guarantee of interoperability unless the standard build toolchain is used”, or (for perennial) “this PEP does not directly address this issue, but by deferring the precise definition of compatibility to the implementation of the standard build toolchain and auditwheel, it places the responsibility for dealing with the problem on those tools”.

(Offtopic) It seems to me that it’s also probably worth looking into whether the crashes are only triggered if wheels that are not built with the standard build toolchain are involved - and depending on the conclusion, considering whether the standards need to go further and mandate the toolchain (the wording I gave above was intended to imply that using the standard toolchain was a good way of ensuring best possible compatibility, but to fall short of mandating it at this point).

Done: https://github.com/python/peps/pull/1125

I’ve mentioned it there, but to emphasise: I am about to be offline for a week, and then probably digging myself out of a pile of email for another week. I’m also kind of exhausted with this. If it needs changes, someone else will need to step in to do them.

I also won’t be upset if you choose to accept manylinux2014 over this. I wrote up the proposal because I think it deserves a fair airing, but I don’t feel like it’s my fight.

I have added such a section for what seems to me the main point of contention and how I think we’ve resolved it. But this has been a long thread with a lot of people weighing in - I don’t know if everyone will feel their concerns have been addressed.

Perhaps it should, but I know zilch about it, so if it needs addressing in the perennial PEP, someone else will have to do this.

This fits in with a discussion I’m currently reading in the IETF “Rough Consensus” document that @njs kindly linked, that the key question here isn’t about what people agree with, so much as it’s about what people can’t agree with. So on that basis:

Are there any participants in this discussion who would be unable to work with either of the two PEPs being accepted? If so, can you provide details of what specifically would cause a problem for you?

To give people an idea of what I mean, my personal position is:

  • At the most basic level, I can accept either proposal, as I don’t use Linux myself.
  • I have a problem with the PEPs not having any comment on the “C++ compatibility issue”, because it has the potential to impact me where it undermines user confidence in the compatibility tag mechanism, if supposedly-compatible wheels can cause crashes. I want to see some indication that the problem is being worked on (at the level of “when the standards say ‘compatible’, I can trust that they mean it”).
  • I have a problem with the details of what constitutes a compatible wheel not being independently documented, because it risks people claiming that the standards process isn’t sufficiently transparent, or that it “blesses” particular tools making it impossible for people to write competing tools/processes. I want to see a commitment to ensuring that the checks we make are documented in a way that doesn’t require “reading the code” to understand them.

Both of these points have been discussed and responded to here, so I’d have to say that I can live with either proposal (which is essentially what “rough consensus” requires, if we follow the IETF guidelines).

Are there any other issues that people need to register before they would be willing to live with either of the PEPs? @njs has mentioned a number of issues he has with manylinux2014 - I think they have all been acknowledged and responded to (even if the responses leave @njs unconvinced) but is there anything more that should be added?

Please note, my comments below are not intended as responses or a defense of manylinux2014, but rather as a summary of what I think has been said - I’m happy to accept corrections (with pointers) to places where I’ve misrepresented something, but I’m not the person you should address if you disagree with the content of a given response :slight_smile:

This has been responded to - we’re not saying how Linux distributions should work, we’re defining a (virtual) platform, that Linux distributions can be assessed against. And as we’re defining the platform, we must specify the details of what it provides.

This has been responded to - the work of documenting the requirements is needed in either case. Beyond that this is more of a comparative comment - the implication is that perennial manylinux requires less “busywork” - and while we should be choosing the proposal that imposes the least amount of administrative work on already-stretched volunteer teams, all other things being equal, “can I live with manylinux2014” should be looking at that proposal in isolation, not in comparison with the perennial proposal.

I think this has been covered - it’s the process that manylinux1 and manylinux2010 used, so manylinux2014 is taking the approach of using an existing process - “the devil you know”, in effect. When you say “harder and slower”, that’s comparing with the perennial proposal, and while that’s important for deciding between the two proposals, it’s not relevant to the question of “is manylinux2014 fundamentally acceptable”.

This is, as far as I can tell, a new objection and should be clarified (and responded to). But I do think it needs clarification first. I may be missing something here, though.

I’m not sure why that would be needed. Personally, I’d be inclined to say that this would need clarification before I’d worry about it as an issue that manylinux2014 needs to respond to.

So in summary, I think there’s one key issue here that still needs a response - the question of supporting additional architectures. That needs discussion, but initially I think it needs a bit of clarification, to explain how it makes supporting new architectures harder, and why that’s a problem with the proposal.

Both the manylinux2014 and perennial manylinux proposals have now been posted as PEPs:

PEP 599 (manylinux2014): https://www.python.org/dev/peps/pep-0599/
PEP 600 (perennial manylinux): https://www.python.org/dev/peps/pep-0600/

(I only just merged the PEPs into the repo, so it may take a while for the cached HTTP 404 to expire)

Regarding which one I think should be accepted: I think we should eventually accept both of them. But I think we should accept PEP 599 (manylinux2014) immediately to unblock the folks that are waiting for it (with one caveat - see below), and then continue iterating on the perennial manylinux idea to bring in a concept of human-centric “build profile aliases” that tools can standardise on in their human-facing interfaces, even while we switch to the more forward-compatible “named heuristic with tuning parameters” approach at the file naming layer.

My rationale for that is that whichever one we implement, there’s going to be at least one final lag waiting for pip and other installers to be updated to support the new tag. Since that’s unavoidable for manylinux2014, we may as well go with the lowest risk option of a single static tag, with a meaning defined by a PEP.

However, I also think the perennial manylinux proponents are right that we’d be in a much better position if build profiles could encode heuristic tuning parameters right into the file name, such that installers only need to be updated when new heuristics are defined, not when we adjust the tuning parameters on an existing heuristic.

At the same time, I still think it’s far more human friendly to let people say “I want to build against manylinux2014” than it is to make them go “I want to target recent’ish distros, anything more recent than 2014 or so, and glibc 2.17 was pretty common around then, so I want to target manylinux_glibc_2_17”.

So if we went with that approach, the sequence of events would hopefully work out something like this:

  1. Accept manylinux2014, and roll it out the same way we did for manylinux2010 (but hopefully much faster this time, since folks have worked through the necessary details before)
  2. In parallel, flesh out a notion of “build profile names” as part of PEP 600, using manylinux2019/manylinux_glibc_2_28 as the first named build profile example (based on CentOS 8) where the wheel filename directly encodes the install-time heuristic, rather than the build profile name that corresponds with that heuristic
  3. Well before CentOS 7 goes End of Life, add support for the manylinux_glibc installation heuristic to pip (et al), and for the manylinux2019 build profile to the publishing tools

And then after that, for as long as the glibc heuristic remains adequate at install time for most purposes, installers won’t need to be updated when new build profiles targeting that heuristic get defined.

(The caveat on manylinux2014 acceptance I mentioned above: in line with the previous manylinux PEPs, PEP 599 includes the sentence “It should not attempt to verify the compatibility of manylinux2014 wheels.” in the section about PyPI. I think we should change that to say “If technically feasible, it should attempt to verify the compatibility of manylinux2014 wheels, but that capability is not a requirement for adoption of this PEP.”)

I am not wholly convinced by this argument. For instance, pip has to talk to the network anyway, so why couldn’t it update its lookup table based on some canonical file maintained inside PyPI?
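For illustration, such a mechanism might look like the sketch below. Note that the idea of a canonical file, its location on PyPI, and the JSON schema are all entirely hypothetical; nothing like this exists today:

```python
import json

# Built-in fallback table, as shipped with the installer.
MANYLINUX_GLIBC = {"manylinux1": (2, 5), "manylinux2010": (2, 12)}

def merge_canonical_table(json_text, table):
    """Merge a canonical YEAR -> glibc mapping into the built-in table.

    In a real implementation, json_text would be fetched from some
    (hypothetical) well-known URL on PyPI and cached between runs, so
    new YEAR tags would work without a new pip release.
    """
    for tag, version in json.loads(json_text).items():
        table[tag] = tuple(version)
    return table

# Simulated response body from the hypothetical canonical file:
fetched = '{"manylinux2014": [2, 17], "manylinux2020": [2, 28]}'
merge_canonical_table(fetched, MANYLINUX_GLIBC)
print(sorted(MANYLINUX_GLIBC))
# -> ['manylinux1', 'manylinux2010', 'manylinux2014', 'manylinux2020']
```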

I am totally prepared to believe this; however, in an offline conversation, @pf_moore said he thought pip’s auto-self-upgrade mechanism ought to be sufficient to deal with it. So I’d like to ask both of you to respond specifically to this point.

@njs, you have concrete evidence that users don’t get upgraded to new pips? Can you say more about that? What sort of work are those users doing with Python, what vintage of CPython and of Linux are they using, and is anything specifically known about why they are still using old versions of pip?

@pf_moore, when you told me that pip’s automatic self-upgrade ought to deal with this, was that based on anything more concrete than your intuition as a pip maintainer? If so, can you talk about that? If not, can you say something about what kind of evidence would convince you whether @njs is right to be concerned about this?

I am looking into these crashes in collaboration with the TensorFlow maintainers. I don’t have anything to report yet; however, my working hypothesis for the root cause is what I said in the perennial PEP PR:

If there is more than one copy of libstdc++ loaded into a process, the behavior of the entire program becomes undefined, in the sense in which the C and C++ standards use that term.

If this is correct, the fix will involve the manylinux spec. It will go something like this:

  • The core interpreter will need to be linked with -lgcc_s. This needs to happen anyway, for unrelated reasons, so I went ahead and filed bpo-37395 for it.

  • The core interpreter may need to find a way to load libstdc++.so.6 in the global ELF namespace, if and only if any extension module requires it. I hope this part won’t actually be necessary, because I don’t think there’s a good way to do it right now. Even if I put my glibc maintainer hat on and add one, that won’t help with older distributions.

  • libstdc++.so.6 and libgcc_s.so.1 will need to be added to the list of libraries that are not to be included in a manylinux wheel. (As a consequence, wheels using C++ will fail to load if the C++ runtime is not installed as a system package. This is unavoidable.)

  • Each future version of manylinux will need to specify a particular version of the C++ compiler, to be used for all extensions containing C++, directly or indirectly.

The last bullet is the most important one and, unfortunately, I fear it may wreck the perennial versioning scheme. There is no reason why “glibc 2.34” should necessarily imply “g++ 11.0” or vice versa; the mapping will have to be maintained by hand, and now we’re back to what I understand is your (@njs’s) most concrete objection to continuing with manylinux_YEAR.

(In case anyone is curious: no, you cannot mix C++ code compiled by LLVM with C++ code compiled by GCC, either.)

I don’t feel I have standing to raise a formal objection in this group, but in my opinion, the C++ issues above must be resolved before perennial can move forward. Adding “all C++ code must be compiled with G++ [version that shipped with CentOS 7]” to manylinux2014 is a one-line edit. The analogous edit to perennial would need to explain how to choose the appropriate version of G++ for each new tag, and I don’t think we even know where to begin with that yet.

And given that, I am in agreement with Nick when he says

although I have an additional caveat, which is that I think manylinux2010 has dropped the ball on rollout, and specifically on packager takeup (see here). So I would ask the people working on manylinux2014 to present a concrete plan and timeline (not as part of the PEP) for rollout.

Looking back over the discussion in the perennial-manylinux PEP PR, I see that @njs did already try to address my concern about the poor coupling between glibc X and g++ Y:

we don’t care about the leading edge, only the trailing edge, and the trailing edge is much more stable and boring. So e.g. consider a manylinux_23, targeting systems with glibc 2.23 and newer. glibc 2.23 was released in 2016. If we look around at all the distros currently shipping with glibc 2.23 or later, and compute the minimum libstdc++ version, it’s probably going to be something from 2016-ish. The situation where we’d get in trouble is if suddenly a new distro appears that’s shipping an older version of libstdc++ than any of the distros we already looked at – so you’d need a new 2019 distro that just started shipping a libstdc++ from 2015-ish. That never happens in practice.

Sure, this means that it’s actually impossible to know what the definition of manylinux_X is until some time after glibc 2.X has been out in the wild – the standard says that manylinux_X wheels have to be compatible with all distros shipping glibc 2.X, and that’s not determined until all those distros have shipped. But that’s unavoidable, and not really a problem since our goal is to achieve broad real-world compatibility…
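For concreteness, the “trailing edge” computation described in that quote could be sketched like this. All the distro data below is invented purely for illustration; real numbers would have to come from a survey of actual distributions:

```python
# Sketch of the trailing-edge computation: among distros shipping
# glibc >= the tag's baseline, the libstdc++ baseline for the tag is
# the *minimum* libstdc++ ABI level any of them provides. Every entry
# here is made-up example data, not a real survey.
DISTROS = [
    # (name, glibc version, hypothetical libstdc++ GLIBCXX ABI level)
    ("distro-a", (2, 23), 21),
    ("distro-b", (2, 24), 22),
    ("distro-c", (2, 23), 22),
    ("distro-d", (2, 27), 25),
]

def libstdcxx_baseline(min_glibc, distros):
    """Minimum libstdc++ ABI level across distros meeting the glibc bar."""
    eligible = [abi for _, glibc, abi in distros if glibc >= min_glibc]
    return min(eligible)

print(libstdcxx_baseline((2, 23), DISTROS))  # -> 21
```

The worry in the quote is then visible in the model: adding a new distro row with an old glibc cutoff but an even older libstdc++ level would retroactively lower the computed baseline.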

I’m not convinced by this argument because, under perennial as-is, nothing stops people from uploading wheels tagged manylinux_23 as soon as they have their hands on a development environment including glibc 2.23, with whatever random G++ that environment includes. Yeah, they ought to wait at least for the auditwheel profile, but given that we know we have a people-power problem here, they might be waiting a long time and they might get fed up and start uploading anyway — as people already have with the existing 1/2010/2014 situation.

Through this lens, needing to patch pip actually serves a useful gatekeeping function: nobody can use wheels tagged manylinux2020 before there is an agreement on what “manylinux2020” means, and therefore nobody uploads them and makes a mess in PyPI.

Right now, today, perennial manylinux is a better, more pragmatic solution than manylinux2014. If you have ambitions to do something even fancier, then that’s cool, I look forward to your PEP :-). Human-centric aliases are an interesting idea, and I don’t think there’s anything particularly Linux-specific about them – for example, macOS wheel tags also have semi-opaque versions embedded in them, and it’s often unclear which version you should be targeting.

But perennial manylinux is basically just a cleanup and simplification of our existing process, and there’s no reason to block that while waiting on some future fancier thing.

I suppose it could, but that means implementing a whole second version of the self-upgrade mechanism, and the alternative is to just… not do that and get the same result?

The pypi download stats have info on which versions of pip are used. I don’t have time to dig into the data right now, but it’s fairly straightforward if someone wants to look at it.

My fuzzy impression is that people do upgrade pip, but as long as, say, 10% of users can use manylinuxX but not manylinuxY, package maintainers will be reluctant to switch to manylinuxY - and getting below that threshold takes maybe 6 months after a pip release.

The manylinux2010 rollout may be especially slow here because it landed in pip at the same time as the pep 517 changes, which had a really rocky rollout and caused a lot of people to hold off on upgrading. Hopefully that won’t happen again. But it might – relying on pip’s release cycle means there’s inherently a risk of this kind of coupling.

It’s much worse than that. If what you’re saying is true, it wrecks Linux wheels entirely. Even if manylinuxX and manylinuxY each mandate a specific C++ compiler, people can still install manylinuxX and manylinuxY wheels into the same environment.

I really don’t understand why libstdc++ is so special that it’s impossible to support loading independent copies into isolated ELF namespaces that never interact with each other. It’s totally fine if one extension can’t catch the other’s exceptions or whatever, because C++ exceptions don’t pass between Python extensions. I know being a language runtime is a kinda special thing, but what specifically is the issue?

If they do, then their wheels won’t be compliant with the spec. (Remember, the spec requires that a manylinux_2_23 wheel has to work on all distros that ship glibc 2.23+.) And we do have options: we can apply social pressure, or PyPI can block such uploads using technical measures.

Not really, I’m afraid. I don’t know much about how Linux developers handle their working environment, so I’m working mostly on assumptions here. However, pip does remind people to upgrade whenever they run an older version, and (for me, at least) the reminder is sufficiently annoying that I tend to upgrade. Also anecdotally, we get a lot of rapid feedback when a new version breaks existing workflows, which implies that a fair number of people do upgrade as soon as a new release comes out.

Contrasting with that, people using a system supplied pip should definitely not upgrade until their distro provides a new version, so if it’s common practice to use the system supplied pip, then there’s definitely a problem.

I’m willing to believe that getting people to update could be a problem, although ideally, I’d like to see some description of a plausible workflow that makes it difficult for the developer to use a new version of pip (generally, upgrading pip should be painless and backward compatible). What I’m less convinced about is that upgrading pip is the significant issue here - I’d like to see an example of someone who specifically was all ready to go with manylinux2010 but was only waiting for the latest version of pip to be available in their development environment.

But I would point out that when comparing manylinux2014 and perennial, both need a pip update. The difference is solely that perennial avoids needing future pip updates (assuming no big issues like the C++ point you note). And that’s self-evidently a point in favour of perennial - all we’re really arguing about is the weighting of that point (and to be honest, fine weightings of individual differences aren’t really what I’m looking at right now).

So doesn’t that mean that the PEPs are both lacking, as they allow wheels to be built with either compiler? At a minimum, they’d need to state something like “C++ code must be built with a compiler that generates initialisation code that is runtime compatible with (whatever G++ version the standard build toolchain uses)” surely? (Even if this isn’t the direct cause of the crashes we’re seeing, it’s still in principle a C++ compatibility requirement).

I do have that standing, and while I’m not going to say that perennial manylinux is dead unless it addresses this, I will say that without a response to this objection, we most definitely do not have a rough consensus on perennial manylinux - far from it. (I’m taking “rough consensus” here in the IETF meaning that @njs linked to - all substantial objections need to have been considered, and properly responded to).

I’d also say that I’m concerned that so much energy is going into the “perennial vs 2014” debate - energy that would be better spent refining and dealing with the technical definition of the profile (which is needed by both proposals, regardless of where they say it will be formally specified). No matter which proposal gets accepted, I think there will still be a round of specification work before the final version can be rolled out.

Agreed. And personally, I’d go further and say that they should be offering a specific review of what happened with 2010 and how they intend to avoid making the same mistakes. So far, all that’s really been said is “we have people willing to work on this”, but simply throwing manpower at the issue isn’t likely to be enough by itself.

It’s actually arguable that for precisely this reason, perennial manylinux has a much stronger need for PyPI validating uploads using auditwheel than manylinux2014 does (although both would benefit, obviously). But as far as I know there’s not yet any real timescale for when that sort of validation might happen.

My understanding (from knowing about C++ implementation mechanisms and the C++ standard in general) is that it’s not about the language runtime support library. It’s about initialisation code, which is user code that has to run before main() starts, and which the compiler has to include in the binary in such a way that it’s triggered to run in a certain (standards-mandated, I believe) order.

So, if I have

static int foo = calculate_foo();

int main() {
    static int bar = calculate_bar();
}

then the calls to both calculate_foo and calculate_bar must be embedded in the final binary in a way that means they get run before main, and in a specific order. And if foo and bar are defined in different compilation units, the compilers used must agree on the precise (implementation defined!) details of how that running order is managed. (In reality, there are far worse edge cases that I don’t understand at all - static members of class templates is the sort of word salad I’d expect to see. I know someone who is a real C++ standards expert, and I could ask him for further detail if anyone wants me to, but I’ve been sort of assuming that the people close to this issue already knew this sort of stuff).

This is a nasty problem, and one that “normal” linker technology couldn’t really handle when C++ was first defined (which is why C++ had a reputation for needing a smart linker). Linkers can deal with stuff like this now, but it’s still all very implementation-specific - and I think newer C++ standards keep introducing new features that make such games even harder - necessitating extra shenanigans. So there’s a real risk that a compiler which implements C++ 2014 can use a different initialisation sequence from one that implements C++ 2017.

I will say that the above is all theoretical - it may explain the crashes people are seeing, and it may explain why it’s not about libstdc++ but rather about the actual compile toolchain, but equally it may be unrelated (although even then, it’s still something that might happen, even if we’ve not seen it be an issue yet).

To be clear, I don’t think perennial manylinux should be accepted without user-friendly build profile aliases as an inherent part of the specification. I can understand “manylinux2014 targets distros released since 2014 or so, and excludes older distros”, but as a human attempting to make publishing compatibility trade-offs rather than a computer trying to install packages, “manylinux_glibc_2_17” doesn’t tell me anything useful. I think that readability trade-off would be an OK one to make in archive filenames, but not in tooling interfaces.

The idea of moving the build profile definitions out of pip core in order to ease future rollouts is an interesting one, though. That could be done in a couple of different ways:

  • amending the simple API to include an extra metadata file (potentially tricky to evolve over time, but theoretically no harder than evolving a Python API. Definitely harder to roll out though, since we’d need to update all simple API implementations, not just PyPI)
  • creating and publishing a new “wheel_build_profiles” package that serves as a perennially updated map from wheel naming conventions to installation target checking heuristics.

Updating “wheel_build_profiles” to a new version could then be as routine a task as updating to a new timezone database, such that even highly conservative distros would be willing to keep it up to date.
