PEP 656: Platform Tag for Linux Distributions Using Musl

Continuing the discussion in Wheels for musl (Alpine), here’s my draft to propose musllinux. Sponsor needed.

Rendered reStructuredText version (GitHub).

The version below is converted to Markdown with pandoc. I did not read the result and it may contain rendering errors.


PEP: 9999
Title: Platform Tag for Linux Distributions Using Musl
Author: Tzu-ping Chung <uranusjr@gmail.com>
Sponsor: TBD
PEP-Delegate: TBD
Discussions-To: TBD
Status: Draft
Type: Informational
Content-Type: text/x-rst
Created: TBA

Abstract

This PEP proposes a new platform tag series musllinux for binary
Python package distributions for a Python installation linked against
musl on a Linux distribution. The tag works similarly to the “perennial
manylinux” platform tags specified in PEP 600, but targets platforms
based on musl instead.

Motivation

With the wide use of containers, distributions such as Alpine Linux
[alpine] have been gaining more popularity than ever. Many of them are
based on musl [musl], a different libc implementation from glibc, and
therefore cannot use the existing manylinux platform tags. This means
that Python package projects cannot deploy binary distributions on PyPI
for them. Users of such projects demand build constraints from those
projects, putting unnecessary burden on project maintainers.

Rationale

The logic behind the new platform tag largely follows PEP 600, and
wheels using this tag are required to make similar promises. Please
refer to that PEP for more details on the rationale and reasoning
behind the design.

Specification

Tags using the new scheme will take the form:

musllinux_${MUSLMAJOR}_${MUSLMINOR}_${ARCH}

Distributions using the tag make similar promises to those described in
PEP 600, including:

  1. The distribution works on any mainstream Linux distribution with
    musl version ${MUSLMAJOR}.${MUSLMINOR} or later.
  2. The distribution’s ${ARCH} matches the return value of
    sysconfig.get_platform() on the host system.
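
The two promises above can be read as a simple compatibility check. The following is only a minimal sketch of that reading, not part of the proposal: the function names are hypothetical, the host musl version is passed in as a `(major, minor)` tuple, and the host architecture is assumed to have already been derived from `sysconfig.get_platform()` by the caller.

```python
import re

# Tag layout taken from the Specification: musllinux_${MUSLMAJOR}_${MUSLMINOR}_${ARCH}
_TAG_RE = re.compile(r"^musllinux_(\d+)_(\d+)_(.+)$")


def parse_musllinux_tag(tag):
    """Split a musllinux tag into (major, minor, arch). Hypothetical helper."""
    match = _TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"not a musllinux tag: {tag!r}")
    return int(match.group(1)), int(match.group(2)), match.group(3)


def is_compatible(tag, host_musl, host_arch):
    """Apply the two promises: same architecture, and host musl at least as new.

    host_musl is a (major, minor) tuple; host_arch is assumed to be the
    architecture part the caller derived from sysconfig.get_platform().
    """
    major, minor, arch = parse_musllinux_tag(tag)
    return arch == host_arch and (major, minor) <= tuple(host_musl)
```

For example, under this reading a `musllinux_1_1_x86_64` wheel would be accepted on an x86_64 host running musl 1.2, while a `musllinux_1_2_x86_64` wheel would be rejected on a host running musl 1.1.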

Backwards Compatibility

There are no backwards compatibility concerns in this PEP.

Rejected Ideas

Create a platform tag based specifically for Alpine Linux

Past experience on the manylinux tag series shows this approach would
be too costly time-wise. The author feels the “works well with others”
rule both is more inclusive and works well enough in practice.

References

Copyright

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.

[alpine] https://alpinelinux.org/

[musl] https://musl.libc.org

2 Likes

For convenience: PEP 600 -- Future 'manylinux' Platform Tags for Portable Linux Built Distributions | Python.org

2 Likes

muslinux? Save a character?

Regarding “mainstream Linux distributions”, does that include OpenWRT, which has no GTK/Gnome and X libraries at all in their repository?

These libraries aren’t in the Alpine container image, but can be installed via apk.

According to Nathaniel’s summary here on Alpine (Wheels for musl (Alpine) - #36 by njs), none of those libraries should be expected from a mainstream musl-based distribution. Manylinux wheels should not rely on the system package manager providing dependencies, and neither should musllinux wheels. As an analogy, PyQt5 wheels for manylinux do not depend on Qt libraries being available, even though they are installable from package managers. Those libraries are vendored into the wheel instead.

1 Like

There are two open PRs related to musl libc. The first one defines a platform triplet for shared libraries ABI on musl system. Could you please take a look?

1 Like

musllinux_${MUSLMAJOR}_${MUSLMINOR}_${MUSLSECONDMINOR}_${ARCH}

A binary built on 1.1.24 isn’t guaranteed to work on 1.1.23, though as mentioned in other threads, it’s unlikely that compiled Python extensions will be using things just recently added to the library.

I would say “musl-based container” instead, because the distros themselves definitely provide those libraries. IMO it would be reasonable (at a later point in time) to define a “batteries-provided” container spec that includes more than just musl and zlib.

1 Like

Is this specifically about 1.1.23 and 1.1.24, or how musl works in general? The tag can be made to include the micro/patch part if needed, but that’s probably not worth it if this is specific to this version combination.

“Container” would be a weird wording IMO since containerisation has nothing to do with this. A Linux distribution is a Linux distribution, it does not matter if it’s a container or not.

Quoting PEP 600:

The word “mainstream” is intentionally somewhat vague, and should be interpreted expansively.

If “musl-based container” has a meaning to the community other than a base Alpine installation, it’s up to the people to decide what a “mainstream musl-based distribution” means. With that said, if you feel it’s more beneficial to define a battery-included platform instead, feel free to write a PEP for it. I’ve seen how painful the previous manylinux iterations were, and want to stick to the conclusion they eventually reached (PEP 600), but my opinion should not by any means stop you.

1 Like

That shouldn’t be much of an issue - the manylinux approach has been from the start to “build with the lowest common denominator”, which is then still compatible with newer versions at runtime. So unless there are critical bugs to avoid, musllinux_1_1 / musllinux_1_2 should (would) probably use the lowest sensible patch level to avoid just this problem.
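
To illustrate the “build low, run on newer” idea, here is a rough sketch of how an installer might list the tags a host accepts, most specific first. The function name is hypothetical, and stopping at the host’s own major version is a conservative assumption of this sketch rather than something the draft states.

```python
def compatible_musllinux_tags(host_major, host_minor, arch):
    """Yield the musllinux tags a host could accept, newest first.

    Assumption: only minor versions within the host's major version are
    considered, counting down from the host's own minor version to 0.
    """
    for minor in range(host_minor, -1, -1):
        yield f"musllinux_{host_major}_{minor}_{arch}"
```

A host with musl 1.2 on x86_64 would then accept `musllinux_1_2_x86_64`, `musllinux_1_1_x86_64` and `musllinux_1_0_x86_64`, so a wheel built against the lowest sensible version reaches the widest set of hosts.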

1 Like

That was just an example. As you can see in the “new features” item for each release in musl’s WHATSNEW file, multiple 1.1.x and 1.2.x releases included new features (be they functions or options to pass to functions).

Oh, I think if it ever exists, it would definitely be in the future; I hadn’t even thought of OpenWrt, which is a distro that won’t even have the GUI related libraries available. The current approach seems reasonable, sorry if I made it seem otherwise :slight_smile:

Sorry, that was just a nit on the wording of that comment, since most musl distros can pretty easily install those other libraries (and I generally read “available” as something that can be installed, not as something that’s present by default). But getting something off the ground with simpler requirements makes a lot of sense to me.

1 Like

I like the “manylinux2010” name since it gives an idea of the ABI age, and it’s simple to remember (single number). Why not reuse this scheme for musl? I expect the ABI would include way more library versions than just the musl version. For example, it can also include a specific C++ ABI version through the compiler.

manylinux2010 gives an explicit list: PEP 571 -- The manylinux2010 Platform Tag | Python.org

Do you need to care about the “compatibility with kernels that lack vsyscall” issue that glibc had with recent Linux kernels? Does musl use that?

Which architectures do you target? Architecture - Alpine Linux lists:

  • x86
  • x86_64
  • armhf
  • armv7
  • aarch64
  • ppc64le
  • s390x

I understand that manylinux2010 only supports x86_64 and i686.

It would be nice to give a link to manylinux2010: PEP 571 -- The manylinux2010 Platform Tag | Python.org

1 Like

Why not reusing this scheme for musl? I

The new spec should follow PEP 600. The next manylinux* tag will be a PEP 600 compliant manylinux_2_24. That is a step forward, since it more aptly describes what the spec is about: the glibc version it supports. The versions of the other libraries in the explicit list are more of a “minimum viable product”.

It would be nice to give a link to manylinux2010

Maybe you meant manylinux2014 (PEP 599), which does support the architectures you list (except armv7 and armhf).

1 Like

This scheme is based on PEP 600, which from my understanding is the preferred approach going forward. It’s definitely possible to use the year-based approach if you are willing to take the responsibility to update the year number periodically for the community. I’m not doing that.

I see little benefit in using a year-based scheme in practice either (for musl specifically). The most popular musl-based distribution by far is Alpine, the base image of which only contains one additional library (libz), so a year-based scheme is little more than a mapping that translates musl version to year number.

Year-based manylinux platforms limit architectures to i686 and x86_64 because that’s what CentOS has. A libc-version-based scheme does not have the same constraint, and applies to any architecture that runs musl and Python.

2 Likes

Why are year-based manylinux platforms limited to X86 and X86_64? manylinux2014 and CentOS 7 also support aarch64, ppc64le, and s390x.

1 Like

Sorry, the comment was based on manylinux2010 and I missed that 2014 added additional architectures.

What I was trying to say is, year-based manylinux platforms need to explicitly specify the architecture because they are limited to what the “base” (CentOS) can offer. A libc-version-based platform does not have this limit because it’s defined by functionality, and any architecture that all the “playing well” libraries can be built on automatically qualifies. Conversely, if any of those common libraries cannot be built, the architecture is automatically out. PEP 600 does not contain an architecture list either.

2 Likes

Update: This draft has been merged as PEP 656 and published at PEP 656 -- Platform Tag for Linux Distributions Using Musl | Python.org

2 Likes

Great! Thanks @uranusjr!

When it gets accepted, apart from support in packaging, is there anything else needed to get the support into Pip?

2 Likes

In the strictest sense, packaging.tags support is all that’s needed for pip. Practically though, a few more things need to happen before we can actually see people start publishing musllinux wheels:

  • PyPI needs to start allowing those wheels.
  • We need something to convert a linux_{arch} wheel into a valid musllinux_{arch}, like auditwheel for manylinux. (It’s probably a good idea to include this in auditwheel as well.)
  • (Optionally) Expand cibuildwheel to include musllinux.
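
As a very rough sketch of the naming side of the second item only: the function below (a hypothetical name) rewrites a wheel filename from a `linux_{arch}` platform tag to a `musllinux` one. A real tool along the lines of auditwheel would also have to inspect the binaries inside the wheel and update its metadata, none of which is shown here.

```python
def retag_wheel_filename(filename, musl_major, musl_minor):
    """Return the wheel filename with linux_{arch} replaced by a musllinux tag.

    Assumption: the platform tag is the last dash-separated field of the
    filename and is a single linux_{arch} tag.
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    *head, platform = filename[: -len(".whl")].split("-")
    prefix = "linux_"
    if not platform.startswith(prefix):
        raise ValueError(f"expected a linux_{{arch}} platform tag, got {platform!r}")
    arch = platform[len(prefix):]
    new_platform = f"musllinux_{musl_major}_{musl_minor}_{arch}"
    return "-".join(head + [new_platform]) + ".whl"
```

For instance, `retag_wheel_filename("demo-1.0-cp39-cp39-linux_x86_64.whl", 1, 1)` gives `demo-1.0-cp39-cp39-musllinux_1_1_x86_64.whl`, where `demo` is a made-up project name.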

I would be out of my depth for these (especially the latter two), so any help would be much appreciated.

1 Like

Thank you for pushing musl support forward!

The proposal seems to currently assume that musl is being dynamically linked (which is how Alpine does things, with a /lib/ld-musl-x86_64.so.1 that ~all binaries link against). However, musl can be statically linked. I believe this warrants distinction in the platform tag somehow because the runtime requirements and capabilities of a dynamically linked musl Python are different from a statically linked Python!

For example, a statically linked musl possibly (always?) isn’t a dynamic executable. This means it can’t use dlopen(). This means it can’t load file-based extension modules (extensions have to be statically linked into the binary).
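
One possible way for a tool to tell the two cases apart, sketched under assumptions: a dynamically linked executable normally carries a `PT_INTERP` program header naming its loader (such as the `/lib/ld-musl-x86_64.so.1` path mentioned above), while a fully static one normally does not. The function name `find_interpreter` is hypothetical, and the parser reads only the ELF fields it needs.

```python
import struct

PT_INTERP = 3  # program header type that names the dynamic loader


def find_interpreter(data):
    """Return the PT_INTERP path of an ELF image, or None if there is none."""
    if len(data) < 16 or data[:4] != b"\x7fELF":
        return None
    is_64bit = data[4] == 2                # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"  # EI_DATA: 1 = little, 2 = big
    if is_64bit:
        (e_phoff,) = struct.unpack_from(endian + "Q", data, 0x20)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 0x36)
        word, p_offset_at, p_filesz_at = "Q", 8, 32
    else:
        (e_phoff,) = struct.unpack_from(endian + "I", data, 0x1C)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 0x2A)
        word, p_offset_at, p_filesz_at = "I", 4, 16
    # Walk the program header table looking for the loader entry.
    for index in range(e_phnum):
        base = e_phoff + index * e_phentsize
        (p_type,) = struct.unpack_from(endian + "I", data, base)
        if p_type != PT_INTERP:
            continue
        (p_offset,) = struct.unpack_from(endian + word, data, base + p_offset_at)
        (p_filesz,) = struct.unpack_from(endian + word, data, base + p_filesz_at)
        raw = data[p_offset:p_offset + p_filesz]
        return raw.rstrip(b"\x00").decode("utf-8", "replace")
    return None
```

For the running interpreter one could pass in the bytes of the file at `sys.executable`. A result containing `ld-musl` would suggest a dynamically linked musl build, and `None` would suggest a static binary, though this heuristic is only an illustration and not something the PEP specifies.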

If there were a single platform tag for dynamic and static musl variations, we could run into a situation where a static musl Python attempts to find/load extensions with a dynamic musl library dependency. And of course statically linked musl wheels wouldn’t be anything like binary wheels today: because dlopen() doesn’t work, you’d need to include the ELF object files or static library archive (instead of a shared library) and link a new binary to incorporate the extension. That’s a radical departure from how today’s shared library based wheels work and likely a tough road to official adoption/support. I only bring it up to reinforce that there are substantial implications for musl wheels depending on how musl is linked.

I’ll note that while Alpine uses dynamic musl linking, the musl Python distributions produced by indygreg/python-build-standalone (GitHub) statically link musl, enabling those binaries to run on ~every Linux system in existence. PyOxidizer uses these musl distributions to produce single file Linux executables that can be copied to any Linux machine and just work. So statically linked musl Python exists in the wild (although probably isn’t used heavily at this time).

Something else we may want to consider is what happens if you try to mix statically and dynamically linked musl in the same process. I’m sure there’s a way to coerce that into existing. Assuming we only care about dynamically linked musl wheels at this time, perhaps there should be a recommendation that all musl symbols in the ELF be marked as undefined [and serviced by a shared library]? I’m unsure about musl, but when loading symbols from two copies of glibc into the same process, ugly things happen. So any checks we deploy to prevent this could prevent some ugly bugs.

1 Like

Thanks for the feedback! The static linking issue was also previously discussed in the “Wheels for musl (Alpine)” thread (around here). While it is indeed an issue, as mentioned by Nathaniel, a statically linked Python interpreter cannot meaningfully use the extension module in a wheel (musllinux or otherwise), so a dynamically linked interpreter is more or less assumed by a platform tag definition for wheels. With that said, I would have no problem specifying explicitly that the platform tag defined in PEP 656 only applies to Python interpreters dynamically linked against musl, or even to the wheel format specifically. What do you think would be the best way to achieve this? Would it be enough to include “Wheel” in the PEP title? If we need to add some specific description in the text, do you have a suggestion?

1 Like

It’s ok to limit the scope of your PEP to common cases and not support statically linked interpreters or mixed builds with glibc. You could include something like

This PEP only applies to Python interpreters that are dynamically linked to the musl shared library. Statically linked interpreters or mixed builds with glibc are out of scope and not supported by musllinux platform tags.

1 Like