Wheels for musl (Alpine)

Edit: thanks to @uranusjr there’s now a pre-PEP for this.

Please correct me if I mix something up; I don’t have in-depth knowledge of the topic.

Lots of people use CI/CD images based on Alpine. This distribution uses a different implementation of libc: musl (see Alpine Linux has switched to musl libc | Alpine Linux).

For this reason, many packages need to use a compilation step when installed on Alpine images. Normally this would not be an issue, as it would only affect the installation time.

This changed today, when lots of users were “introduced” to cryptography by surprise. While there was a good reason for the update, it seems to have exposed a missed opportunity: some of the fallout could’ve been easily avoided by releasing built distributions that target Alpine.

As far as I understand, there’s currently no way of doing that.

Given the demonstrable popularity of Alpine, wouldn’t it be a good idea to introduce a way of providing built distributions that target it?

The glibc-based distributions had a champion who proposed PEPs standardising how wheels are built: PEP 513 for manylinux1, PEP 571 for manylinux2010, PEP 599 for manylinux2014, and PEP 600 for the perennial manylinux standard. The standard uses glibc as a baseline for compatibility across various Linux distributions such as Debian and CentOS. Someone from the musl community needs to come forward and propose an equivalent as a PEP that can be evaluated, discussed, approved, and implemented.

I’d like to add that besides Alpine there are other envs and it’s quite reasonable to explore BSD-like OSs as well.

A PEP is needed that would declare which runtime dependencies wheels are allowed to link against. One of the main differences is that Alpine doesn’t ship glibc, while all manylinux tags assume glibc specifically, so a PEP for Alpine would need to consider musl as the alternative baseline for the standard. One of the requirements, though, is that it’d only allow linking against things that are usually present on the system by default, without requiring users to do any apk add before pip install.

There was a thread on this a while ago:

There was no follow-up, but from the linked musl mailing list response, it seems like most of the technical difficulties around detecting musl have been cleared. The main obstacle is finding someone to champion the effort: research the related distributions, and then either propose a manylinux equivalent or a way to incorporate them into the manylinux specification (if musl turns out to share an ABI with glibc as far as manylinux is concerned).

Just want to chime in to say that as a packager of a reasonably popular Python package, which has non-trivial dependencies, we’d love to ship Alpine/musl wheels!

I was afraid that framing it as “support for non-glibc” rather than “support for musl” might be a bit much. Having wheels for Alpine would mean that you now have a <10MB Linux distro that most packages can be installed on blazingly fast, and that’s the value proposition that I was targeting with this post.

Does it have to be a systematic approach for all of such cases, or could it be just “support for Alpine-like stuff”?

FWIW, that discussion was basically about musl support and what that would entail. :slight_smile:

And the answer is what has been stated above: someone needs to figure out the details of how it’d need to work, write a design document describing that (i.e. write a PEP about it) and then we can go from there.

I added BSD specifically because that’s where we faced problems today. This can be a separate standard, I guess.

It can be Alpine-like as long as it’s actually similar, so that it doesn’t end up with too much feature creep.

I think it can be scoped however the person making the proposal wants. If you’re only interested in Alpine, feel free to propose a standard for that case only. More tightly-scoped proposals are less likely to get bogged down in debate and scope creep - and someone with a clear and focused vision is likely to have a much better chance than someone with a broader, but less clear idea.

And IMO Alpine/musl is an important enough target (thanks to Docker) that it warrants consideration for a manylinux-style standard.

Does there need to be a different cibuildwheel CIBW_MANYLINUX_X86_64_IMAGE or just e.g. a different CIBW_REPAIR_WHEEL_COMMAND in order to build wheels on Alpine with musl libc?
https://cibuildwheel.readthedocs.io/en/stable/options/#repair-wheel-command

It depends on:

  1. Whether musl can fake itself well enough to pass manylinux’s glibc checks.
  2. Whether Alpine’s system libs qualify for a manylinux spec.

If not, Alpine can’t be manylinux and a new image is needed. In either case, either auditwheel gains additional logic to fix musl, or a separate tool is developed to do that (and be used instead of auditwheel in CIBW_REPAIR_WHEEL_COMMAND).


Edit: I quickly checked and musl does not pass the glibc checks. So cibuildwheel will need a few new columns to support it.
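For context on why musl fails those checks: installers generally decide manylinux compatibility by asking the C library for a glibc version string, and a musl-based Python has nothing to report there. A minimal sketch of that kind of probe, assuming the `os.confstr` approach; the helper names are hypothetical and this is not pip’s or auditwheel’s actual code:

```python
import os
import re


def glibc_version_string():
    """Return e.g. "2.17" when glibc is detected, otherwise None."""
    try:
        # CS_GNU_LIBC_VERSION is a glibc extension. The assumption here is
        # that a musl-based Python either lacks the name or returns nothing.
        value = os.confstr("CS_GNU_LIBC_VERSION")
    except (AttributeError, OSError, ValueError):
        return None
    if not value:
        return None
    parts = value.split()
    if len(parts) != 2 or parts[0] != "glibc":
        return None
    return parts[1]


def parse_version(version):
    """Turn "2.17" into (2, 17); return None if it is not a version."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def passes_glibc_check(required, version=None):
    """True if the detected (or supplied) glibc version is >= `required`."""
    if version is None:
        version = glibc_version_string()
    if version is None:
        return False  # no glibc found, which is the musl case
    parsed = parse_version(version)
    return parsed is not None and parsed >= required
```

On a glibc system `passes_glibc_check((2, 17))` reflects the installed version; on Alpine the expectation is that it returns False because no glibc version is found at all.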

Note that PEP 600 explicitly says:

Adding extra words to the tag string : Another proposal we considered was to add extra words to the wheel tag, e.g. manylinux_glibc_2_17 instead of manylinux_2_17. The motivation would be to leave the door open to other kinds of versioning heuristics in the future – for example, we could have manylinux_glibc_$VERSION and manylinux_alpine_$VERSION.

But “manylinux” has always been a synonym for “broad compatibility with mainstream glibc-based distros”; reusing it for unrelated build profiles like alpine is more confusing than helpful. Also, some early reviewers who aren’t steeped in the details of packaging found the word glibc actively misleading, jumping to the conclusion that it meant they needed a system with exactly that glibc version. And tags like manylinux_$VERSION and alpine_$VERSION also have the advantages of parsimony and directness. So we’ll go with that.

Re-reading PEP 600, the recurring idea behind it is “play well with others”, which is an implicit understanding between glibc-based Linux distributions even before the PEP. If we continue that idea, an Alpine-compatible platform tag (something like alpinelinux_{musl_version}_{arch}) could simply be defined as “anything that looks sufficiently like a base Alpine distribution using >={musl_version}”.
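To make the “>={musl_version}” idea concrete, here is a rough sketch of how an installer might decide whether such a wheel is usable. Everything in it is an assumption for illustration: the tag spelling `alpinelinux_{major}_{minor}_{arch}`, the function name, and the rule that a newer musl can always load wheels built against an older one.

```python
import re

# Hypothetical tag shape: alpinelinux_{major}_{minor}_{arch}
_TAG_RE = re.compile(r"^alpinelinux_(\d+)_(\d+)_(.+)$")


def wheel_tag_is_usable(tag, system_musl, system_arch):
    """Return True if a wheel with `tag` should work on this system.

    `system_musl` is a (major, minor) tuple for the installed musl and
    `system_arch` is a string such as "x86_64".
    """
    match = _TAG_RE.match(tag)
    if match is None:
        return False  # not an alpinelinux-style tag at all
    required = (int(match.group(1)), int(match.group(2)))
    arch = match.group(3)
    # Same architecture, and the system musl is at least the tagged version.
    return arch == system_arch and system_musl >= required
```

For example, under these assumptions a wheel tagged `alpinelinux_1_1_x86_64` would be accepted on an x86_64 system with musl 1.2, while one tagged `alpinelinux_1_2_x86_64` would be rejected on musl 1.1.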

Thank you for working on this!

Correct.

I think Issue 43112: SOABI on Linux does not distinguish between GNU libc and musl libc - Python tracker needs to be acknowledged and fixed first. It will be difficult to add support for binary wheels unless upstream Python recognizes that the musl vs glibc ABI difference is more than the calling convention.

we could have manylinux_glibc_$VERSION and manylinux_alpine_$VERSION.

I don’t think manylinux_alpine_$VERSION makes sense. It makes much more sense to do manylinux_musl_$VERSION. I don’t think we should use alpine in there at all, since there are other musl-based distros out there, like Void Linux, Gentoo, Adélie Linux, OpenWrt and Sabotage Linux.

I don’t think it needs to be fixed; there are other ways to detect whether a Python executable is linked against musl instead of glibc. It’d certainly make things easier, though, if Python could encode that information at compile time.

As mentioned above, manylinux_musl_$VERSION is also not appropriate here, given the decision on PEP 600. An appropriate platform tag should not contain the string manylinux at all.

I also have a way to detect musl libc at runtime in a PR for find_library. However I think it would be much nicer if it could be detected/configured at compile time.
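One runtime approach that needs no compile-time support is to read the dynamic loader path (the PT_INTERP entry) from the running interpreter’s ELF headers and see whether it names a musl loader. The following is a rough sketch of that idea, not the code from the find_library PR; the function names are made up, and the “path contains musl” test is a heuristic that assumes loaders named like /lib/ld-musl-x86_64.so.1.

```python
import struct
import sys

PT_INTERP = 3  # program header type that names the dynamic loader


def read_elf_interpreter(f):
    """Return the PT_INTERP path from a binary file object, or None."""
    ident = f.read(16)
    if len(ident) < 16 or ident[:4] != b"\x7fELF":
        return None
    is_64bit = ident[4] == 2              # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if ident[5] == 1 else ">"  # EI_DATA: 1 = little-endian
    # Remainder of the ELF header; address fields are wider in ELF64.
    header_fmt = endian + ("HHIQQQIHHHHHH" if is_64bit else "HHIIIIIHHHHHH")
    raw = f.read(struct.calcsize(header_fmt))
    if len(raw) < struct.calcsize(header_fmt):
        return None
    header = struct.unpack(header_fmt, raw)
    e_phoff, e_phentsize, e_phnum = header[4], header[8], header[9]

    # Program headers also differ in layout between ELF32 and ELF64.
    ph_fmt = endian + ("IIQQQQQQ" if is_64bit else "IIIIIIII")
    ph_size = struct.calcsize(ph_fmt)
    for index in range(e_phnum):
        f.seek(e_phoff + index * e_phentsize)
        raw = f.read(ph_size)
        if len(raw) < ph_size:
            return None
        fields = struct.unpack(ph_fmt, raw)
        if fields[0] != PT_INTERP:
            continue
        if is_64bit:
            p_offset, p_filesz = fields[2], fields[5]
        else:
            p_offset, p_filesz = fields[1], fields[4]
        f.seek(p_offset)
        return f.read(p_filesz).rstrip(b"\0").decode("utf-8", "replace")
    return None


def interpreter_uses_musl(path=None):
    """Heuristic: True if the executable's loader path mentions musl."""
    path = path or sys.executable
    if not path:
        return False
    try:
        with open(path, "rb") as f:
            interp = read_elf_interpreter(f)
    except OSError:
        return False
    return interp is not None and "musl" in interp
```

A statically linked interpreter has no PT_INTERP entry, so this sketch would report False there; that case would need a different signal.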

It has to be based on something, right? My current intuition on the matter is that it would be the easiest to point at some particular version of Alpine and say: this is what you need to limit yourself to - in the same way manylinux does with CentOS. Is there a better candidate for it than Alpine?

The PEP effectively says “manylinux == manylinux_glibc, but the latter is confusing so let’s not do that.” However, this is based on the historical details and so-far exclusive relevance of that tag.

If indeed the various distros that @ncopa mentions are similar enough to warrant having a many* standard for, then I think those previous considerations have much less weight vis-à-vis having a broadly usable image (rather than an alpine-specific one). Whether it’s manymusl or manylinux_musl is then only bikeshedding (as long as it doesn’t interfere with existing packages and infrastructure).

I believe what was meant was that the manylinux_alpine name is bad. musl is a libc used across quite a few Linux distros. Using alpine (the distro) as a base is reasonable, but naming the specification as <something>_alpine isn’t.

They should be. I am working from the manylinux2014 specification, to try and see what is similar enough between these environments.

Supported architectures are all covered by musl:

x86_64
i686
aarch64
armv7l
ppc64
ppc64le
s390x <-- Void Linux doesn't provide this, but Alpine and Gentoo do

Regarding libraries:

# provided by modern GCC
libgcc_s.so.1
libstdc++.so.6
# provided by libc.so on musl, but all functionality is available
libm.so.6
libdl.so.2
librt.so.1
libc.so.6
libnsl.so.1
libutil.so.1
libpthread.so.0
libresolv.so.2 <-- this is the last one provided by libc.so
# provided by Xorg and related projects, haven't changed ABI in a long time
libX11.so.6
libXext.so.6
libXrender.so.1
libICE.so.6
libSM.so.6
# provided by glvnd, very stable
libGL.so.1
# provided by GLib, also very stable
libgobject-2.0.so.0
libgthread-2.0.so.0
libglib-2.0.so.0

Adding OpenSSL libraries to this list might even be possible (depending on how the OpenSSL 3 release is going to work), given that Alpine and Gentoo default to it, and Void Linux is moving to it as well. But since manylinux2014 doesn’t include it, there’s no reason for us to include it either.

Creating a package list for them should be pretty simple, once a PEP rolls out. Since musl doesn’t support symbol versioning, that part isn’t a concern to us.

Given that this is going to be a modern standard, caring only about recent Python 3 versions also seems reasonable to me.

All that said, I believe the biggest concern is the changes between musl 1.1.x and 1.2.x, specifically for 32-bit devices. I would say this matters, since armv7l is likely (?) to be a supported platform. musl 1.2.0 implemented the time64 transition, which changed time_t to be 64 bits on all architectures. While ABI compatibility was maintained for older binaries, this change means that any binary built on musl>=1.2.0 is unlikely to work with musl<1.2.0, and, more importantly, libraries that use time_t somewhere in their external API can end up subtly miscompiled. Therefore, in the interest of future-proofing (we all want things to work post 2038, I think :P), I would argue for this new standard requiring musl>=1.2.0.
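Enforcing a minimum like this needs a way to learn the installed musl version. One option, sketched below, is to run the musl loader with no arguments and parse the banner it prints. The banner format shown in the docstring is an assumption based on what the loader is commonly observed to print, and both function names are hypothetical.

```python
import re


def parse_musl_version(loader_output):
    """Extract (major, minor) from the musl loader's banner text.

    Assumed banner shape (typically printed to stderr):
        musl libc (x86_64)
        Version 1.2.2
        Dynamic Program Loader
    Returns None if the text does not look like that.
    """
    lines = [line.strip() for line in loader_output.splitlines() if line.strip()]
    if len(lines) < 2 or not lines[0].startswith("musl"):
        return None
    match = re.match(r"Version (\d+)\.(\d+)", lines[1])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def has_time64(musl_version):
    """True for musl 1.2 and later, where time_t is 64 bits everywhere."""
    return musl_version >= (1, 2)
```

The text itself could be captured with `subprocess.run([loader_path], stderr=subprocess.PIPE, text=True).stderr`, where `loader_path` is something like the PT_INTERP path of the running interpreter.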

Do you happen to know when Alpine transitioned to musl 1.2.x? Making that a version requirement sounds like a good technical decision, but it wouldn’t be very helpful if most people out there can’t use it.