This is something of a digression, but it does relate - has it been confirmed that the “crashing issue” isn’t simply a case of code triggering undefined C++ behaviour in a way that would happen even if the exact same compiler was used throughout? I’m thinking specifically of the static initialization order ‘fiasco’ which can cause crashes in completely statically linked programs compiled with a single compiler.
We’re getting to a point here where we are presuming that “something needs to be included” in the specs for this issue. But if there’s still a realistic possibility that this is just badly written C++ code that triggers undefined behaviour completely within the C++ language sense (i.e., even if we ignore dynamic linking and libstdc++), then we may be expecting too much from the PEPs.
The C++ standard says nothing about dynamic linking, or how runtime support is loaded into a program, so we’re way into the area of undefined behaviour here. It also says nothing about mixing compilers, and C++ is known not to guarantee ABI-level compatibility between compilers, so at a theoretical level, any version of manylinux would have to mandate that the same compiler is used for all compatible wheels. But we’re looking for a practical standard here, not a theoretically perfect one…
You know, that’s fair – I think we do all agree that the manylinux2014 approach can be made to work. It’s the same one we’ve been doing all along. The problem isn’t that it can’t work, it’s that if you look at the consequence of choosing one versus the other, then perennial has multiple significant advantages and no real disadvantages.
It’s just another version of the point about rolling out new versions: adding support for a new architecture is basically the same process as adding support for a newer baseline version, so anything that helps one helps the other.
I mentioned this because one of the persistent complaints about the perennial approach is that the PEP doesn’t contain as much documentation about how Linux platforms work.
Upon further consideration… I think both of us are overestimating the problem here, and it’s actually not that big a deal.
If C++ libraries only use the distro-provided libstdc++, then none of these problems happen. The potential problems arise when you want to use new C++ features while targeting distros that shipped with an older libstdc++. In that case, you have to do some kind of vendoring or static linking shenanigans. For the current manylinux build images, we install a special version of g++ that Red Hat maintains, and it handles this static linking automatically.
If the static linking doesn’t work, then that’s a shame – it would mean Red Hat’s compilers are broken and we have to stop using them. But the only consequence would be that package maintainers who want to use new C++ features can’t target old Linux distros.
It does emphasize that each manylinux tag has to refer to a whole constellation of library versions (glibc version, libstdc++ version, etc.). But, that was already true and already accounted for in all the different proposals we’re talking about. The perennial approach does theoretically limit us slightly compared to the manylinuxXXXX approach: in perennial the glibc version + the distros that actually exist determine the libstdc++ version, while in manylinuxXXXX we could in principle choose those two versions independently. For example, we could have separate PEPs for manylinux2014a and manylinux2014b, that are identical except for which libstdc++ version they require.
In practice I don’t think this flexibility is actually useful – there’s no point in having a spec for an environment containing old glibc + new libstdc++, because there are no distros that ship that combination. And even if those distros existed, AFAICT there’s no way for pip to reliably detect when it’s running in an environment like that, so it couldn’t install those wheels anyway.
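For contrast, the glibc half of the equation *is* reliably detectable at runtime, because glibc itself exposes its version; there’s just no analogous query for libstdc++. A minimal sketch of the glibc check (pip’s real implementation lives in its own codebase and differs in detail; `gnu_get_libc_version` is genuine glibc API, the helper function is illustrative):

```python
import ctypes

def glibc_version():
    """Return the running glibc version as a string, or None if the
    C library is not glibc (e.g. musl, or a non-Linux platform)."""
    try:
        # Load the C library already linked into this process and
        # ask it for its version directly.
        libc = ctypes.CDLL(None)
        gnu_get_libc_version = libc.gnu_get_libc_version
    except (OSError, AttributeError, ValueError):
        return None  # no dlopen(NULL), or the symbol isn't there: not glibc
    gnu_get_libc_version.restype = ctypes.c_char_p
    return gnu_get_libc_version().decode("ascii")

print(glibc_version())  # e.g. "2.17" on CentOS 7
```

A wheel installer can compare that number against a tag’s required minimum; there is simply nothing equivalent it could do for the libstdc++ side.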
So we should definitely include a “wheels MUST NOT cause other unrelated wheels to segfault” requirement in whatever PEP ends up winning, but I don’t think there’s anything else to do here.
That part’s fine – remember we’re talking about independent Python extension modules here. When the interpreter dlopens one extension module, part of the dynamic loader’s job is to run the shared library’s initialization code, including C++ constructors. So when you import the first module, it gets dlopened and its constructors run, and then when you import the second module, the same thing happens for it.
The crash we’re seeing is something much more gnarly and subtle. IIRC the stack-traces involved some kind of combination of std::once and thread-local storage, to give you a sense of how deep the rabbit hole goes…
It’s not end users that I’m thinking about, it’s package maintainers. The scenario is: some end users don’t think that keeping pip up to date matters, or they’re running it as part of some automated process that doesn’t pay attention to the “you should upgrade!” messages. So it takes them a while to upgrade. If there are enough users like this, then package maintainers feel obligated to keep supporting old versions of pip, which means they have to keep shipping manylinux1 wheels instead of manylinux2010.
For a concrete example, I just checked with the pyca/cryptography devs, and they said that they haven’t shipped any manylinux2010 wheels yet, and in their next release they’re going to start shipping both manylinux1 + manylinux2010, and then they’re going to watch the manylinux1 download numbers and wait for them to fall below some threshold before they stop building manylinux1 wheels.
Why is this suddenly mandatory for Linux, but not for other platforms?
Literally no-one knows which version of Windows or macOS they should be targeting in their build profile either. Quick, without looking anything up, is macOS 10.11 EOL or should you still support it?
(Answer: It’s a trick question, Apple refuses to tell anyone which versions of macOS are EOL. But you can make some guesses by looking at their past behavior, and by looking at the statistics different people have gathered on how common different versions are in actual deployments. The number itself doesn’t really tell you anything useful though – it’s just a starting point for research.)
Yeah, it is definitely annoying that you have to cross-reference that number with some external documentation to figure out which glibc platform you should be targeting, I totally get you there. I have spent way too much of my life looking at tables of metadata about different distros. But realistically, this is unavoidable no matter what we do. Should you use manylinux2010 or manylinux2014? In practice, the way you answer that question is either (a) ask someone – a friend, or some recommendation on packaging.python.org – or else (b) go do a bunch of reading to figure out which distro releases are supported by which build profiles, and then do some more reading to figure out which of those distro releases are EOL and which are in use.
For example: is Debian jessie supported by manylinux2014, and as a package maintainer, should you care?
Answer: after some web-searching, I found a wiki page saying it was frozen in 2014, same as RHEL 7, so it’s a flip of a coin which has the older glibc. But checking distrowatch it turns out Jessie does have a slightly newer glibc (2.19 versus RHEL 7’s 2.17), so manylinux2014 wheels will install on Jessie. But, you probably shouldn’t care about this, because the Debian project declared Jessie EOL a few years ago. Except, there are some other volunteers outside the project providing long-term-support, so maybe you do care, depending on exactly who your package’s target audience is.
If we instead asked “is Debian jessie supported by manylinux_2_17?”, then we still have to do all that analysis, except we get to skip the part at the beginning where we have to map the wheel tag back to a glibc version.
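To make that concrete, the part we get to skip is trivial to automate: a perennial-style tag encodes the minimum glibc directly, so the compatibility check is just a version comparison. A hypothetical sketch (the helper names are mine; the glibc numbers are the ones from the jessie example above):

```python
def tag_glibc(tag):
    """Parse a perennial-style tag like 'manylinux_2_17' into (2, 17)."""
    _, major, minor = tag.rsplit("_", 2)
    return (int(major), int(minor))

def compatible(tag, distro_glibc):
    """A manylinux_X_Y wheel needs glibc >= X.Y at runtime."""
    return distro_glibc >= tag_glibc(tag)

# Debian jessie ships glibc 2.19; RHEL 7 (the manylinux2014 baseline) has 2.17.
print(compatible("manylinux_2_17", (2, 19)))  # True: jessie can install it
print(compatible("manylinux_2_24", (2, 19)))  # False: jessie is too old
```

The EOL research, of course, still has to happen by hand either way – that’s the point.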
Also… reading your post again, are you actually asking for anything besides extra tags on the docker images, so you can do `docker run quay.io/manylinux/manylinux2019_x86_64` and it’s an alias for `docker run quay.io/manylinux/manylinux_2_28_x86_64`? If the “human readable” tags don’t appear in the actual wheels, then there’s no need for a PEP to specify them, because there aren’t any interoperability issues.
I think we should suspend the C++ aspect of this discussion until we have a better understanding of what the actual problem is. I’m actively pursuing that in concert with the TensorFlow maintainers and hopefully we can get somewhere with it in the next couple weeks.
I believe we have a meta-problem here: clearly you believe this, but as far as I can tell, nobody else is convinced. Everyone else seems to see perennial as a desirable future goal, but one which is not yet fully baked and therefore should not stop us from proceeding with manylinux2014 to unblock the people who need newer build environments already.
I don’t know what everyone else is thinking exactly, but I can tell you why I think perennial isn’t fully baked. We know from the manylinux2010 experience that we have serious problems with the overall process of rolling out new Linux wheel tags. manylinux2014 doesn’t address any of those problems, but that’s OK because it’s a stopgap: it’s literally “oh crud, we took so long with 2010 that its base distribution is nearing end-of-life, s/CentOS 6/CentOS 7/ and ship it”. Perennial, on the other hand, aims to fix the process.
I believe you when you say that perennial fixes the parts of the process that are pain points for you, as a maintainer of auditwheel. However, perennial does not appear to tackle the parts of the process that are pain points for package maintainers, except possibly the pip deployment issues:
availability of “official” build environments
build utility support (cibuildwheel, etc)
knowing when to start building manylinux_[NEWER] wheels
knowing when to stop building manylinux_[OLDER] wheels
knowing whether a manylinux_[VERSION] wheel will work on your linux-of-choice
…? I don’t claim to know the full list
That would be fine if the process issues were decomposable, so that we could say that clearing up the pain points for the auditwheel maintainers was an unambiguous step forward. But I don’t think we can say that. I think perennial, as-is, may be making the above set of problems for package maintainers worse by replacing the existing short list of tags (1, 2010, and 2014) with an open set of uncertain size.
BTW, I agree with this. @dustin can the relevant text be changed to say that:
Authors must not upload non-compliant wheels to PyPI
PyPI should remove wheels found to be non-compliant
PyPI should check wheels for compliance, either on upload (with failure meaning the upload is rejected) or in the background (with non-compliant wheels being removed), but is not required to if technical constraints make this infeasible.
We’ve been talking about running auditwheel on uploads for a while now. I’m not comfortable with approving a PEP that prevents that happening, and I’d much rather that the PEP acknowledges the intention explicitly.
Maybe I’m misunderstanding what you mean here, but I’m interpreting this as “PyPI should audit all existing wheels and remove any that are non-compliant”.
If we ever gain the ability to audit manylinux wheels on PyPI, I’d be strongly opposed to removing distributions that have been published and then are later found to be non-compliant. This would very likely cause more user headache than the existence of non-compliant wheels currently does.
I intended it as “PyPI should remove non-compliant wheels”. Not that they have to proactively look for them, but if reported, or located some other way, they should be removed. (Edit: whether and how PyPI should audit is one of the other points, but this point is just about housekeeping non-compliant wheels).
I thought this was (relatively) self-evident. If we don’t want non-compliant wheels on PyPI (i.e., people should not upload them), then surely we want to remove them when they are found to have slipped through the net? What else would you propose to do with them? I guess “leave them until their non-compliance is established to be an issue” is an option, but how do we establish that? (I’m thinking specifically of the long-running uncertainty over the tensorflow non-compliant wheels).
But if this is controversial, then I’m happy to leave that one out of the PEP and have the discussion elsewhere. I don’t know how to split this into a separate thread, so if someone who does could do so, it would save me the need to re-type it and link back.
Yeah, I think yanking would be a better approach, and I agree that that level of detail isn’t necessary in the PEP. I’ve added a commit to https://github.com/python/peps/pull/1130/ which should address this.
I don’t have a horse in this race, but when @pf_moore said that level of detail wasn’t necessary, I thought he was referring to the yank / removal distinction. But the updated PR is now silent on whether non-compliant wheels can / should be removed (using any method like yanking or flat-out removal).
It isn’t suddenly mandatory - it was a major topic of discussion when naming manylinux2010 (with that name winning over manylinux2 by virtue of conveying more information that’s useful to publishers), and it was one of the first concerns raised by folks hearing about the perennial manylinux proposal for the first time.
My understanding is that folks generally answer this question by inheriting their build flags from the corresponding CPython binary release, thus outsourcing the question to the CPython binary build maintainers (currently Ned Deily for Mac OS X, and Steve Dower for Windows). Apple and Microsoft take care of naming the available target ABIs, and defining what they each mean.
That answer doesn’t work for Linux, since there aren’t any distro-independent CPython binaries published via the PSF to establish a standard baseline, and each different distro defines their own independent ABI, so the PyPA ends up having to take on all three tasks of defining the target ABI, naming the target ABI, and communicating the relationship between the two.
With the CalVer naming, folks that are willing to trust our naming don’t need to look any further than “What year was the oldest major distro version I want to support first published?”, on the assumption that if there was a major distro release in 2014 that was incompatible, we wouldn’t have called the baseline manylinux2014 in the first place - we’d have called it manylinux2015 or manylinux2016 as appropriate.
It’s only if folks are thinking “Ah, but what if they made a mistake in naming this version?” that they’ll need to go check our work by comparing the major library versions used by the distros at the time (in the case of manylinux2014, that would be RHEL/CentOS 7, Ubuntu 14.04, and Debian 8. For a hypothetical manylinux2019 spec, the major distros of interest would be RHEL/CentOS 8, Debian 10, and Ubuntu 20.04, skipping over Debian 9 and Ubuntu 18.04 due to the slower RHEL/CentOS update cycle, just as we’ve skipped versions in going from manylinux1 to manylinux2010 and now to manylinux2014).
I do hope that accepting manylinux2014 won’t kill the motivation towards finding a better way to handle defining manylinux2019 - I just don’t want to put the design work for that evergreen solution on the critical path for handling the immediate requirement to allow publishers to target manylinux2014.
OK, it’s the end of July, and there has been no real further discussion here in the last few days, so I think it’s time to bring the debate to a conclusion.
I’m going to approve manylinux2014 as the next version of manylinux. Congratulations @dustin and thanks to everyone who participated in the debate.
There are some caveats, however. In spite of a number of questions being asked about “how do we know this will be delivered more quickly than manylinux2010?” I’ve seen no real response. Having people ready to work on the proposal is not enough - if it’s anything like other situations I’ve seen, it’s easy to get people to work on the technical stuff, and nearly impossible to get them to work on documentation, planning, looking at the bigger picture etc. So, to the extent that I can demand anything, I want to see the manylinux2014 “team” publish a review of what the sticking points were with manylinux2010 deployment, and how they intend to address them. I’m looking very specifically here at @dustin and the people he’s got waiting to work on manylinux2014 - it’s not fair or reasonable to expect the people who worked on 2010 to do this themselves. If no-one, from the people willing to commit to working on manylinux2014, is able to do this, then I fear that manylinux2014 will have the same problems as 2010.
I’m also assuming that there will still be technical work going on with the spec. I’ve seen the vsyscall discussions, and I know the investigation into the TensorFlow crashes is still ongoing. The discussion here around perennial manylinux has made it clear that this is a normal part of maintaining the specs. But I would insist that the spec is kept up to date and any non-trivial changes are at a minimum publicised here. After all, the main point of the perennial debate was that “maintaining the PEPs is an unnecessary overhead” - if the manylinux2014 supporters dismiss that objection and then fail to do that spec maintenance, that’s cheating.
I don’t want to see another manylinux20xx proposal after this one. In my view, the perennial manylinux proposal raised some important sustainability questions which we must answer. To that end, I’d expect the next manylinux specification after this one to be some form of perennial approach. Whether it’s the existing perennial proposal, updated to address the concerns and issues raised during this discussion, or an independently developed proposal, I don’t mind, but we need something that gives us an ongoing solution.
In particular, I’d like discussion to start relatively soon. I know that people are burned out by now, and the work to actually implement manylinux2014 will be a distraction, so we all need a break, but it’s not like distribution EOL dates come as a surprise. I wasn’t particularly comfortable with us being under pressure to find a solution this time because “we need something quickly”, and I won’t accept that argument in future.
Thanks to everyone who participated in the discussion. It’s often hard to get people involved in this type of debate, so thanks everyone for the work you put in.
In particular, thanks to @njs for arguing for the perennial manylinux proposal. In spite of the fact that I ultimately approved manylinux2014, your comments were important and resulted in a much more useful discussion. I don’t see my decision as in any way rejecting the “perennial” idea, but more as a way to give it the time it needs to be fully developed, without unwanted pressure that “we need something right now”.
Yeah, it’s a really difficult topic: it has an incredibly obscure and intricate set of domain-specific technical details, there’s a ton of potential for scope creep, and most of our general packaging experts aren’t experts in this specific domain – yet are stuck trying to steer the proposal despite that.
When I first came up with the idea of manylinux, and when we were writing the original PEP, there was intense skepticism and push-back and we had to fight hard to get it through with a reasonable scope. It’s sort of ironic that now it’s the original PEP that everyone takes for granted and the idea of doing something different that makes people nervous, but I guess that’s how these things go.
I’ve actually never maintained auditwheel. My main role in manylinux has been as a kind of technical lead, dealing with the overall vision, PEP process, project management, and acting as a problem-solver-of-last-resort. The perennial proposal is scoped the way it is because AFAICT it’s the best available architecture for the ecosystem as a whole, not because I’m trying to selfishly save myself some work.
Not sure what you mean here. Creating and maintaining the build environments is certainly a pain point, in the sense that it would be nicer if they just magically existed without anyone having to do any work. But unfortunately a PEP cannot wish a build environment into existence :-). And the actual build environment maintenance and availability is identical across every possible proposal I can think of.
Yes, it’d be nice to have better support for more manylinux versions in more tools. But again, that’s beyond the powers of a PEP; either someone will do the work or they won’t.
There are lots of ways to solve this – we can squint at download statistics, we can survey maintainers, we can collect together relevant information and put it in our documentation – but I don’t see how any of them involve the PEP process.
This is pretty similar for all the proposals we’ve seen: manylinux20XX makes it a bit easier to get a rough guess about whether it will work, and perennial manylinux makes it a bit easier to get a definitive answer. Either way, our main focus is on providing wheels that just work on as broad a range of platforms as possible.
I don’t see how it’s possible to do better, because there simply isn’t any widely-understood versioning scheme shared across different Linux distros.