The next manylinux specification

dstufft · June 7, 2019, 11:27pm

FWIW this in large part is why pip and Warehouse try very hard not to diverge from the behavior defined in any relevant PEPs, and if we need to our first step is to always update the PEP, and then work backwards from there. Due to history that wasn’t always done, but I think we’re pretty good about it now. It sounds like auditwheel should be taking a similar stance, that it implements the PEP and if the PEP is wrong step 1 is to fix the PEP (or the specification living on packaging.python.org or whatever).

njs · June 8, 2019, 12:14am

IIUC, there are two separate issues with Tensorflow:

1. They intentionally chose not to support older distros that are included in the manylinux1 tag, but continued to label their wheels as manylinux1. (I think they ran them through auditwheel to handle vendoring and the other checks it does, and then after it reported they weren’t manylinux1 compliant they renamed them manually.) This is pretty clearly spec-non-compliant, and if warehouse starts checking uploads then this will stop happening, regardless of which approach we take for manylinux-next.
2. There’s some not-well-understood bug that causes crashes when you mix Tensorflow wheels and PyArrow wheels in the same process (or maybe other wheels using C++). This shouldn’t be possible, based on what I know about how everyone is building their wheels, but apparently there’s something I don’t know. Since we don’t understand it, it’s not 100% clear whose fault it is. It might be a problem with the toolchain we use in the manylinux1 image (which ultimately comes from Redhat). It might disappear when everyone switches to manylinux2010, where the toolchain needs fewer hacks to support modern C++. If it doesn’t disappear, then I guess sooner or later someone will figure out what it is, and then we’ll figure out how to fix it. Whatever it is, none of the PEPs currently block it, and auditwheel doesn’t check for it, because we don’t know how.

To the larger point: we’ve always been pretty intentional about saying manylinux wheels are supposed to coexist peacefully with other packages. For example, this is why we mangle the sonames of vendored libraries – if we didn’t then any one package would work fine, but two packages might crash. Currently none of the manylinux PEPs or PEP drafts say anything about this explicitly, except that I guess you could argue “a system that already has pyarrow installed” counts as a manylinux_X compatible system under the perennial manylinux definition, and so tensorflow should work on it? It might make sense to say something explicit about this. I don’t think it affects the perennial-vs-2014 question.
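To make the soname point concrete, here is a minimal sketch of the idea, assuming a scheme that appends a short content hash to the library name. The function name and the exact naming format are hypothetical and not taken from auditwheel's source; the point is only that two packages vendoring different builds of the same library end up with different sonames, so the dynamic loader never confuses them.

```python
import hashlib


def mangled_soname(soname, content):
    """Return a soname made unique by a short hash of the library's bytes.

    Hypothetical helper for illustration; the naming format is an assumption.
    """
    # "libfoo.so.1" -> base "libfoo", rest ".1"
    base, _, rest = soname.partition(".so")
    digest = hashlib.sha256(content).hexdigest()[:8]
    return f"{base}-{digest}.so{rest}"


# Two packages that each vendor their own build of libfoo.so.1
name_a = mangled_soname("libfoo.so.1", b"bytes of build A")
name_b = mangled_soname("libfoo.so.1", b"bytes of build B")
```

Without the hash, both packages would ship a library called libfoo.so.1, and whichever one was loaded first would be used by both.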

I’m not too worried about finger pointing. If we had a patch for the bug I’m sure it’d be accepted; none of these projects are trying to cause segfaults.

Yeah, that’s been the goal; I’m just saying I’m not sure if we’ve been successful :-). And it’d be nice to stop worrying about this, because it’s makework. The PEPs that warehouse and pip implement are generally interoperability PEPs, where other people are writing software that relies on pip/warehouse doing what the PEP says they do. The manylinux PEPs are different; the only public interface that people rely on is “when I install a wheel it doesn’t segfault”, and the detailed requirements are just documentation for how we accomplish that. No-one else’s software relies on auditwheel matching the PEP.

gunan · June 8, 2019, 6:20am

Hi, Gunhan from TensorFlow team again. Just wanted to weigh in from our perspective on the perennial vs 2014 problem.
Manylinux 2010 has been a welcome change, and we are working hard to ensure manylinux 2010 compatibility. We have resolved our NVIDIA library dependencies and are creating our CentOS 6 build environment. I am optimistic that we will finally be uploading compliant packages to PyPI very soon.
However, our work has shown that while manylinux2010 is a welcome change, it is still difficult for us to maintain compliance. A standard based on a newer operating system will be crucial for us to achieve and maintain compliance easily.
Our discussion so far on this thread has shown me that while perennial manylinux is the way to go in the long term, there are many different ways to approach it and we do not seem to be converging yet. I am afraid the discussion may take months to finalize, and by the time it is implemented CentOS 6 may have reached EOL, at which point I won’t be allowed to run it anywhere. That would mean TensorFlow is either locked out of PyPI or forced out of compliance with manylinux 2010.

Therefore, I strongly believe that before we think about long-term solutions, we need a path that will help packages like ours stay compliant past 2020. I think the manylinux 2014 proposal authored by Dustin gives us that path. Moreover, having manylinux 2014 does not mean we should stop our work on a futureproof solution.
I do not think we should see this as a perennial-vs-2014 problem. We urgently need a standard that lets us use an OS more recent than 9 years ago. I think while perennial manylinux is the right long term solution, we need a solution that we can land before the OS we are building on is 10 years old. I think manylinux2014 is the answer to the short term problem.
Finally, in case there is a worry about the work that needs to be done: we, the TensorFlow community, would like to help land manylinux 2014 as quickly as possible.

takluyver · June 8, 2019, 12:34pm

I saw just as I was going to bed last night that @pf_moore mentioned me as potentially being able to clarify some things here. As I still don’t have a strong preference one way or the other, let’s see if I can bring clarity on a couple of points.

I think calling auditwheel profiles ‘implementation defined’ makes the idea sound worse than it is, although it’s perfectly true in a literal sense. Auditwheel exists as a way to define these library & symbol profiles in a format usable in code, along with some utilities that make use of them. The definitions exist as JSON within the package. Maybe it would be clearer if the definitions were packaged separately from the code to check & repair wheels against them, but in either case, it’s not ‘implementation defined’ in the same way that, for instance, the behaviour of setup.py is.
I think Nathaniel’s saying that the functional definition of what external libraries a wheel can rely on is “those that exist and are stable on mainstream Linux distributions with at least glibc 2.x.” We can come up with a list as a hint, but if we discover that e.g. Fedora is missing something on the list, we remove it from the list rather than saying “manylinux X wheels don’t work on Fedora”. So the list is advisory, not definitive. But this would mean that you can never definitively know that a wheel is manylinux X compatible, only that it satisfies the current recommendations. I’m not entirely comfortable with this, which is why my draft of the PEP is framed around auditwheel profiles as definitive.
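For readers who haven't looked at a profile, here is a rough sketch of what "library & symbol profiles in a format usable in code" can mean. The dictionary layout, the key names, and the `violations` function are all hypothetical and much simpler than auditwheel's real JSON; the values are illustrative only (GLIBC 2.12 is the manylinux2010 limit, the library list is a small sample).

```python
# Hypothetical, simplified profile; auditwheel's real JSON schema differs.
PROFILE = {
    "name": "example-profile",
    # Highest symbol version a binary may reference, per symbol family.
    "max_symbol_versions": {"GLIBC": (2, 12)},
    # External shared libraries a wheel may link against without vendoring.
    "allowed_libs": {"libc.so.6", "libm.so.6", "libpthread.so.0", "libdl.so.2"},
}


def violations(profile, needed_libs, versioned_symbols):
    """List reasons a binary falls outside the profile (empty list = conforms).

    needed_libs: sonames the binary links against.
    versioned_symbols: (family, version tuple) pairs, e.g. ("GLIBC", (2, 14)).
    """
    problems = []
    for lib in sorted(set(needed_libs) - profile["allowed_libs"]):
        problems.append(f"links against {lib}, which is not in the profile")
    for family, version in versioned_symbols:
        limit = profile["max_symbol_versions"].get(family)
        if limit is not None and version > limit:
            problems.append(f"uses {family} {version}, newer than {limit}")
    return problems


ok = violations(PROFILE, ["libc.so.6"], [("GLIBC", (2, 5))])
bad = violations(PROFILE, ["libc.so.6", "libssl.so.10"], [("GLIBC", (2, 14))])
```

Under the "definitive" reading, a wheel is compatible exactly when a check like this returns nothing. Under the "advisory" reading, the profile is only a hint and may be changed after the fact if a mainstream distro turns out to lack something on the list.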

In light of @gunan’s post, I’ll also plug once more my idea for a compromise. Why don’t we define a new explicit profile in a PEP (satisfying those who want to make the next manylinux available ASAP), but switch to a versioning scheme based on glibc version numbers, so we can work towards pip recognising future tags without needing an update for each profile.

njs · June 11, 2019, 12:05am

Hi @gunan, welcome!

So just to make sure I understand, it sounds like as far as you’re aware, there’s no particular benefit of manylinux2014 over perennial manylinux – either could solve your problems equally well if implemented – you’re just eager to see something move forward ASAP?

soumith · June 11, 2019, 1:26am

I would like to point out one great advantage of a perennial manylinux PEP.
With a perennial PEP, at some future date we don’t have to be worried about whether everyone will upgrade their pip to the latest version.
Currently, it is a worry / risk (even) with manylinux2010 that the default pip in recent distros such as CentOS 7, Ubuntu 16.04, etc. won’t be upgraded until a much later date (or at worst never) to a version that can recognize these wheels.

gunan · June 11, 2019, 5:52am

Hi @njs
Your summary captures my point of view pretty well.
As far as I can tell, manylinux 2014 is ready to move forward today. However, perennial manylinux is something we may want to take the time to do right.

angerson · June 11, 2019, 7:50pm

Hey all, I work with Gunhan on the TensorFlow team. I also help run TF’s SIG Build, the group of TF developers and community members that Dustin has mentioned (I’m happy to help answer any questions about it).

I don’t fully represent SIG Build’s many members, but I can echo Gunhan’s sentiment that many of us would like to do what we can to assist with landing a manylinux successor. Personally, I share Gunhan’s view that manylinux2014 looks like a good way to resolve issues shorter-term, whereas perennial manylinux looks like a great path for the future.

njs · June 11, 2019, 10:29pm

@gunan @angerson Thanks, that’s really helpful!

So it sounds like there’s broad agreement that the perennial approach is better technically, and the obvious choice if it can be done quickly.

The core ideas are solid and haven’t changed in months, so there’s no technical obstacle to doing it quickly. I think the only potential hold-up is that we’re not sure which words we should use to describe those ideas in the PEP that will be acceptable to @pf_moore and others.

Does that sound right?

brettcannon · June 11, 2019, 11:27pm

So what’s “quickly” for people? If we can get a perennial Linux definition done by the end of the year is that fast enough? Is that enough time that if it looks like it’s taking too long we can pull off a manylinux2014 in a pinch?

h-vetinari · June 12, 2019, 6:00am

One of the main advantages being cited for the perennial approach is that pip does not need to be upgraded. In other words, we could teach pip today to deal with the manylinux_glibc_2_X_x86_64 wheels of the future. Maybe I’m thinking too simplistically, but couldn’t pip just as well be taught to deal with future r'manylinux20[0123]\d' wheels in a similar way?

Then the only difference between the proposals remains how the profiles are defined, i.e. roughly CentOS-based, or roughly glibc-based. The CentOS-CalVer-style (which has more support so far, AFAICT) could then be continued and still enjoy some of the most important advantages of the perennial approach.
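As a rough illustration of the pattern-matching idea, the sketch below recognises calendar-versioned tags with the regex from above. The capture groups, the architecture suffix, and the `parse_year_tag` name are additions for illustration and not part of any proposal.

```python
import re

# Year pattern from the post; the groups and the "_<arch>" suffix are assumptions.
YEAR_TAG = re.compile(r"manylinux(20[0123]\d)_(\w+)")


def parse_year_tag(tag):
    """Return (year, arch) for a calendar-versioned tag, or None if it isn't one."""
    match = YEAR_TAG.fullmatch(tag)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


known = parse_year_tag("manylinux2014_x86_64")
future = parse_year_tag("manylinux2027_aarch64")
legacy = parse_year_tag("manylinux1_x86_64")
```

Note that this only tells an installer that a tag is well formed and which year it names. It does not say anything about what a system must provide to be compatible with that year.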

takluyver · June 12, 2019, 7:53am

I’ve thought about this, but pip’s heuristic for “is this system compatible with this manylinux tag” is “does it have glibc version 2.x or greater”. If the tag name includes x, that’s all the information pip needs. If the tag name is based on a year, it needs a separately defined mapping, e.g. 2014 → glibc 2.17. So changing the numbering scheme is really the core of the perennial manylinux proposal.
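A minimal sketch of that difference, assuming a tag shape of manylinux_<major>_<minor>_<arch> for the perennial scheme (the exact name is still under discussion, and the function names here are hypothetical). The year-based tags need the hand-maintained table; the glibc-based tags carry the requirement in the name.

```python
import re

# Year-based tags need a separately defined mapping to a minimum glibc version.
LEGACY_GLIBC = {
    "manylinux1": (2, 5),
    "manylinux2010": (2, 12),
    "manylinux2014": (2, 17),
}

# Assumed perennial tag shape: manylinux_<major>_<minor>_<arch>.
PERENNIAL = re.compile(r"manylinux_(\d+)_(\d+)_(\w+)")


def required_glibc(tag):
    """Return the minimum glibc (major, minor) a platform tag implies, or None."""
    match = PERENNIAL.fullmatch(tag)
    if match:
        # Everything the installer needs is in the tag itself.
        return int(match.group(1)), int(match.group(2))
    # Otherwise fall back to the table, which only knows tags that existed
    # when this installer was released.
    return LEGACY_GLIBC.get(tag.split("_", 1)[0])


def is_compatible(tag, system_glibc):
    """Apply the 'system glibc is at least the required version' heuristic."""
    need = required_glibc(tag)
    if need is None:
        return False
    return system_glibc[0] == need[0] and system_glibc[1] >= need[1]
```

With a system glibc of (2, 28), `is_compatible("manylinux_2_24_x86_64", (2, 28))` succeeds even though this code has never heard of that profile, while a hypothetical `manylinux2030_x86_64` is rejected until the table is updated and the installer is re-released. On a real system the version tuple could be derived from something like `platform.libc_ver()`.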

ncoghlan · June 15, 2019, 5:05pm

Exactly. However, it occurs to me that we can combine the two draft PEPs, such that we make it clearer that projects can’t use arbitrary glibc version numbers in their build tags (others have suggested these points separately, so I’m mostly just summarising here):

- in the perennial Linux concept, focus only on removing the need to update pip and other installers. This means keeping the “tunable heuristic” concept, but going back to requiring a PEP to define new tag versions, and keeping PyPI itself strict as to which manylinux tags it accepts for uploads
- keep the calendar versioning in the PEPs, but drop it from the filenames. So manylinux2014 would be a descriptive alias for manylinux_bp_glibc_2_17, but none of the build tools would emit that alias, and none of the distribution tools would consume it (build tools should accept it as a target environment name, though, rather than requiring humans to remember the exact build profile heuristic settings that correspond to the manylinux2014 spec)
- the manylinux2014 spec would need to define the new file naming scheme, but future iterations could incorporate it by reference

That way we keep the good parts of the calendar versioning scheme (i.e. roughly describing the era of Linux distros that any given manylinux definition targets, and helping to make it clear that it describes a set of ABI compatibility constraints, not just one), while also gaining the benefits of the tunable heuristic based file naming scheme on the wheel archives themselves.

pf_moore · June 15, 2019, 5:20pm

One (very brief) caveat here. We have a manylinux2014 spec already posted, with people waiting to work on it. Let’s not waste that by digressing into “fix the world” proposals. If perennial manylinux can be got to a state where we can start implementation work quickly, then fine, but if it’s likely to need discussion on details, maybe we should accept manylinux2014, and then make perennial manylinux the next version?

One question, specifically regarding this option - are there any fundamental issues with manylinux2014 as proposed, which need fixing before it could be accepted on the same basis as manylinux2010 and manylinux1? If there are, then the option of releasing manylinux2014 while we work on perennial manylinux isn’t possible.

takluyver · June 15, 2019, 5:41pm

I think it really only amounts to a naming change. It will ultimately want some changes to pip as well, but for now we could just add specific tags like manylinux_2_17_x86_64 to the list pip knows about. So there shouldn’t be any extra implementation work, IIUC.

Of course, there’s some risk we spend a long time bikeshedding over the exact format of the name - whether it’s Nathaniel’s terser manylinux_17, or Nick’s more explicit manylinux_bp_glibc_2_17, or something in between. But if we agree on the direction we’re going, I hope we’ll be able to bring that to a conclusion fairly swiftly.

njs · June 16, 2019, 4:34am

I think it’s great that we’ve really dug into this and gathered a lot of perspectives – that’s what the PEP process is for! But I think we’ve hit the point of diminishing returns, where continuing to go back and forth isn’t adding much value.

Paul, do you want to just BDFL-delegate this one to me? I’m pretty sure I understand all the relevant issues at this point. The main things I’d do are to work with @takluyver to address your concerns about how the current draft can be read as being “circular” and clarify the logic, and with @ncoghlan to make sure I understand his concerns about the tag-name bikeshed before we make a final call on that. I think we can sort both of those out pretty quickly, and then go straight to implementing.

ncoghlan · June 16, 2019, 10:09am

From a technical perspective, I think work on manylinux2014 can start as is, as the vast majority of that work is unaffected by the perennial manylinux idea.

However, the one thing needed to turn manylinux2014 into a perennial manylinux design would be to change the current “installers must maintain a mapping from manylinux version numbers to install time heuristic checks” into “manylinux2014 build tools must encode the installer heuristic name and settings into the wheel archive name”.

As long as we do the latter within the next year or so, that should be soon enough to avoid an installer level rollout delay for the version after manylinux2014, and there’s no avoiding that delay for manylinux2014 itself.

pf_moore · June 16, 2019, 10:50am

I think you’re probably right: actual discussion of significant points seems to have reached its limit. However, I think there is still progress that can be made, which will be of value.

First of all, both proposals need to be turned into formal PEPs, which means pull requests against the PEP repo, which are then merged. That doesn’t mean the proposals are “set in stone”, but it does mean that we have a reference text that people can easily refer to, and there’s a common understanding of what’s the “latest state” and what’s still under discussion. At the moment we have manylinux2014 which seems to open a “create a pull request” page for me, and perennial manylinux which is full of review comments making it hard to read in its entirety.

I’m really keen that we don’t bog the process down in additional bureaucracy, but this seems key to me - is there no way that we can separate the common ideas into a subset proposal that can be agreed and can move forward?

I imagine the split being something like:

1. A proposal for building wheels that conform to compatibility level XXX (essentially “manylinux 2014 level” right now, with future levels to be covered in future PEPs).
2. A proposal (or maybe 2 competing proposals) “Compatibility tags for Linux distributions” that says what tags installers must use, and how they should determine if a system supports those tags.

We’d need a new PEP of type (1) each time we defined a new compatibility level, but the type (2) PEP could be a single document. I see two possible ways of writing it:

- “Current manylinux” - a list of mappings from tags to heuristics, covering what manylinux 1, 2010 and 2014 currently say. This list would be updated (via a PEP revision) whenever a new type (1) PEP was agreed, and would need installer updates based on such revisions.
- “Perennial manylinux” - documents manylinux 1 and 2010 for completeness, but then defines an algorithm that installers can implement once and for all. This proposal would have to note that future type (1) PEPs must define rules that allow the different compatibility levels to be distinguished under the algorithm defined here, but would not need installer updates.

The existing manylinux2014 proposal could be converted into a type 1 PEP and work begun on it pretty much immediately, with the perennial manylinux discussion being focused on agreeing which form of “type 2” PEP we prefer.

I still think there’s value in having a BDFL-delegate with no strong preference between the two proposals, and I hope I’m keeping up with the key issues. So unless you (or anyone else) are worried I’m missing something key, I think I’m happy keeping the decision making role for now.

However, I will be away for two weeks starting at the end of next week, so I would appreciate it if you could guide the discussion and keep things moving in my absence. I think @dustin is returning just as I disappear, so if you could get his input and incorporate that into the discussion that would be great.

Please go ahead with those anyway. They sound like key tasks to get the perennial manylinux proposal on an equal footing with manylinux2014 in terms of being ready for implementation. Those are detail level stuff that I definitely don’t want getting held up because I’m trying to understand the issues. On the “circular definition” point, I’m pretty sure we understand each other now, so I don’t think I’m a blocker there (also, @dstufft had similar concerns here, so I’m sure he can review any updates in my absence).

njs · June 18, 2019, 1:25am

Hmm, I don’t think we have any radical disagreement, but I think we actually do have a different understanding of where things are right now, so let me try to be super explicit.

I always had a tentative, provisional preference for the “perennial” proposal – that’s why I brought it to distutils-sig in the first place, because I thought it might be a good idea :-). But I didn’t ask for it to be pronounced immediately; and if you’d made me BDFL-delegate back then, I wouldn’t have pronounced immediately, because IMO a tentative, provisional preference isn’t good enough. A BDFL-delegate needs to make sure to have collected input from all sides, make sure all the issues are flushed out and examined, etc. It’s a different way of looking at things than as an individual participant.

And what I’m saying is: when looking at the current discussion through a BDFL-delegate lens, and based on my experience with both running processes like this and my knowledge of the technical details of manylinux, I’ve just now reached the point where I believe that all the substantive issues have been understood and addressed, and I’d be comfortable pronouncing in favor of perennial manylinux in principle, with a few details to be worked out as noted.

Do you see any possible path to acceptance for manylinux2014? What is it? Even its proponents have said that the only reason they’re pushing for it is because they’re worried that perennial might be delayed, but the only reason for perennial to be delayed is because people are pushing manylinux2014. Encouraging people to work on manylinux2014 is actually encouraging them to work against their own interests.

So at this point I think the best use of our energy is on polishing off the last few details on perennial manylinux and moving onto implementation ASAP.

I am convinced there is zero value in going through the PEP process for future compatibility levels, and requiring it would end up being pure busywork. We should save that energy for more important issues.

njs · June 18, 2019, 1:30am

tl;dr: I’m not saying “let’s cut off debate in favor of my preferred solution!”, I’m saying that AFAICT all the issues have been addressed and we’re now just going through the motions; there is no more debate to be had.

If I’m wrong and someone has more issues they want to raise then please do, but if not then let’s wrap this up.