The next manylinux specification

Yes, but where? Under the current system it’s the PEP index (indirectly). That’s essentially the point - perennial manylinux needs to clarify where the canonical list of “valid flavours” exists, as it’s no longer the PEPs.

The more I see imprecise answers like this, the less comfortable I am with perennial manylinux. I’m fine with its idea of “don’t require the PEP process for all the details”, but not with the fact that it doesn’t actually say what its proposed alternative is, other than “read the code”. For someone like me, who has no experience building extensions on Linux, builds a simple C extension, and just wants to “make it available to Linux users”, that isn’t a realistic option (I looked at https://github.com/pypa/auditwheel and had no idea where to start :frowning:).


My updates to the draft a couple of hours ago point to https://packaging.python.org/ as the place where flavours will be documented. This fits in with a general move to have packaging specifications there rather than scattered across PEPs.

I was being vague about it partly because I was still looking for people to agree on the idea of ‘document this somewhere other than PEPs’, and partly because I don’t feel I have the authority to say what should and shouldn’t be included in PyPUG.

I’m unable to read all of the communication here and on the perennial-manylinux PEP PR so please excuse my ignorance.

Steve Dower mentioned a concern I also have early on in this thread:

Overall I don’t personally have a lot of thoughts on this, but I’d like to at least warn that having flexible support for libc just pushes the same problem to the next dependency, whatever that happens to be for a particular wheel.

For example, browsing through the Anaconda package index for scipy and VTK I find package names such as:

I think this illustrates the issue quite clearly. There will be more constraints than just glibc.

Are we confident that the perennial-manylinux PEP will be able to avoid this scenario through the compatibility profiles it introduces? Can we learn something from conda?

In brief, wheels have to bundle all libraries apart from a small selection of very stable libraries which are present on many Linux distributions. For a less brief description, see PEP 513 and auditwheel.
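
To make the whitelist idea concrete, here’s a rough sketch (just an illustration with a few entries from PEP 513’s list; the real logic lives in auditwheel):

```python
# A handful of libraries from PEP 513's manylinux1 whitelist; the real list is longer.
ALLOWED_EXTERNAL_LIBS = {
    "libc.so.6", "libm.so.6", "libpthread.so.0",
    "libdl.so.2", "libgcc_s.so.1", "libstdc++.so.6",
}

def libs_to_bundle(external_deps):
    """Given the shared libraries a wheel links against, return the ones it
    would have to vendor inside the wheel (auditwheel repair automates this)."""
    return sorted(set(external_deps) - ALLOWED_EXTERNAL_LIBS)

# e.g. a wheel linking against BLAS would have to bundle it:
print(libs_to_bundle(["libc.so.6", "libm.so.6", "libblas.so.3"]))  # ['libblas.so.3']
```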

This isn’t always straightforward, of course, but scipy and vtk are among the projects already distributing manylinux wheels. So we know this mechanism works enough of the time to be useful.

The ideas being discussed here don’t make any fundamental change to how that works. The main point of contention is how and where we should maintain the list of very stable libraries which wheels can rely on being on the system.


I assume you mean the statement “For each profile defined on https://packaging.python.org/” at https://github.com/pypa/manylinux/pull/304/files#diff-5e28e1d6d6c93938e4f223a76a71cc64R127?

That seems a little brief and doesn’t really cover any details, like the process for adding a profile, how users are informed when new profiles become available, etc, etc.

Again, if I were to provide Linux wheels for my code (which I develop and build on Windows), how would I know if & when I needed to create new Linux builds or update my Linux build toolchain? IMO the proposal is far too focused on the perspective of users who are deeply immersed in the whole “building wheels on Linux” ecosystem/community, and doesn’t do much to make things easy for users with limited experience and essentially no prior knowledge.

It’s not that manylinux2014 is that much better; in general, packaging binaries on Linux seems incredibly overwhelming from an outsider’s perspective. But at least there’s a master document as a starting point for each version.

I feel like the goalposts are shifting, and in an impractical direction. Yes, we should absolutely document manylinux better for users who aren’t deeply immersed in packaging. But a PEP is not the place to do that, nor even to decide how to do that.

This discussion is already long and contentious enough without swerving into the woods of improving documentation.

If you want processes, I’ll fall back to my earlier suggestion: change to glibc-based versioning so that pip doesn’t need updates for each flavour, and keep on defining wheel compatibility in PEPs. I’m not invested enough to define a separate set of processes for creating and disseminating wheel compatibility profiles.

Fair point. I think I did acknowledge in a previous post that this was a digression, so let’s drop it.

TBH all I want is consensus :slight_smile: But we’ve had a question about how perennial manylinux affects the way package maintainers have to decide which platforms to build for. We also still have the question of “how will the platform definitions be documented”. Both of these have been responded to in the thread, but I haven’t seen anyone say “oh, OK, that’s covered then” yet, nor has there been any substantial change to the PEP (your recent edits were pretty minimal).

I’ll step back though, and let others respond. But I’m still assuming that if we don’t get consensus on a perennial manylinux proposal by the end of the month, then it’s better to give it more time to develop and go with manylinux2014 for the immediate next version. And consensus needs a bit more than the assumption that silence equals acceptance, IMO.

I’d be very surprised if additional options helped the situation at this point :slight_smile:

On Linux, it can never be quite that simple, because a toolchain that runs on a current version of a GNU/Linux-based distribution will only be able to generate binaries usable with equally current distributions.[1] You have to get a “build environment”: some kind of VM or container image (currently I only know of Docker images for this, but they could just as easily be created for other virtualization mechanisms) containing a sufficiently old distribution. You then run pip wheel within that environment, and run auditwheel on the result to detect and/or fix up common problems.
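
Concretely, the whole dance looks something like the sketch below, driving Docker from Python. The image name is one of the build environments pypa actually publishes, but the project path and interpreter choice are made up, and this assumes auditwheel is available inside the image:

```python
# A minimal sketch of the workflow described above, using the manylinux2010 build
# image that pypa publishes. The project path and Python version are hypothetical.
import subprocess

PROJECT_DIR = "/path/to/my-extension"         # hypothetical source checkout
IMAGE = "quay.io/pypa/manylinux2010_x86_64"   # "sufficiently old" build environment

BUILD_SCRIPT = """
set -e
# Build a wheel with one of the interpreters shipped in the image...
/opt/python/cp37-cp37m/bin/pip wheel /io -w /io/dist/
# ...then let auditwheel vendor any non-whitelisted libraries and retag the wheel.
auditwheel repair /io/dist/*-linux_x86_64.whl -w /io/wheelhouse/
"""

subprocess.run(
    ["docker", "run", "--rm", "-v", f"{PROJECT_DIR}:/io", IMAGE, "bash", "-c", BUILD_SCRIPT],
    check=True,
)
```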

Your hypothetical non-expert extension packager is going to do basically the same thing under both proposals: find an appropriate build environment and use it as above. The big headache for them, under both proposals, is going to be deciding which of several available manylinuxWHATEVER tags is the right choice; as long as there’s more than one whose output is accepted by PyPI at any given time, this headache exists. (Usually the right answer will be “the oldest one with a $LANGUAGE compiler that accepts your code”, but there could be complications.)

As I understand it, most of the work in developing either a new manylinux${YEAR} under the existing methodology or a new manylinux_${bikeshed_version} under the “perennial” proposal, is not in going through the PEP process, but in defining what “sufficiently old” means for the new tag, preparing a suitable build environment, and patching both pip and auditwheel to understand what to do with the new tag. Under perennial, pip only has to be patched once and then it understands all future manylinux’es, but I don’t see any inherent reason why pip couldn’t be patched once to grok all future YEARs if we stuck to the existing methodology. And anyway patching pip is the easiest part.
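
To illustrate what “patched once” could mean in practice, here’s a sketch of a glibc-versioned compatibility check; the exact tag spelling is my guess at how a perennial tag might look, not anything the PEP has settled on:

```python
# Sketch of a perennial, glibc-versioned compatibility check. The tag format
# (manylinux_<major>_<minor>_<arch>) is an assumption, not the settled spec.
import ctypes
import re

def current_glibc_version():
    # Ask the running C library for its version string, e.g. "2.17" (glibc only).
    libc = ctypes.CDLL("libc.so.6")
    libc.gnu_get_libc_version.restype = ctypes.c_char_p
    major, minor = libc.gnu_get_libc_version().decode("ascii").split(".")[:2]
    return (int(major), int(minor))

def tag_is_compatible(tag):
    # Any manylinux_X_Y tag is installable if the system glibc is at least X.Y,
    # so the installer never needs a new hard-coded entry when a new profile appears.
    m = re.fullmatch(r"manylinux_(\d+)_(\d+)_x86_64", tag)
    return bool(m) and current_glibc_version() >= (int(m.group(1)), int(m.group(2)))

print(tag_is_compatible("manylinux_2_12_x86_64"))
```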


[1] I say “GNU/Linux-based distribution” here because this is not a property of Linux-the-kernel; it’s a consequence of the maintainers of GCC and GNU libc deciding that it’s only feasible for them to support one-way backward compatibility: binaries compiled with an old toolchain will almost always work with a newer libc.so.6, but not the other way around. Both Microsoft and Apple have put in the extra work required to make sure that you can, for instance, compile a program on a current Windows 10 dev box and get an executable that will run correctly on Vista, if that’s what you want. There’s no inherent reason why that work couldn’t be done for (GNU/)Linux; it’s just that nobody’s being paid to do it and all the volunteers have other priorities.


This may be the crux of the dispute. The perennial proposal sheds a chunk of work that the current maintainers of tools and build environments find inconvenient and redundant, but the manylinux1 PEP is currently doing the job of answering the high-level question that package maintainers have: what do I use to build binary wheels for Linux? If we stop issuing manylinuxXXXX PEPs, something else needs to take over that role.

Having said that, right now we have people being confused over whether to use manylinux1 or manylinux2010, so there’s already a gap in the available documentation. Nor do I mean to dismiss the tool maintainers’ goals: on the “optimize for fun” principle, we should be looking for ways to shed tasks that people find inconvenient and redundant.

I don’t imagine there would be a new build environment for every new upstream release of glibc, incidentally; those come out every six months, which would be far too much churn. Probably people would prefer something more like one new environment per LTS release of CentOS, every three to five years.