PEP 656: Platform Tag for Linux Distributions Using Musl

Great! Thanks @uranusjr!

When it gets accepted, apart from support in packaging, is there anything else needed to get the support into Pip?

2 Likes

In the strictest sense, packaging.tags support is all that’s needed for pip. Practically though, a few more things need to happen before we can actually see people start publishing musllinux wheels:

  • PyPI needs to start allowing those wheels.
  • We need something to convert a linux_{arch} wheel into a valid musllinux_{arch}, like auditwheel for manylinux. (It’s probably a good idea to include this in auditwheel as well.)
  • (Optionally) Expand cibuildwheel to include musllinux.
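To make the "convert a linux_{arch} wheel" step concrete, here is a minimal sketch of how a tool might spot wheels that still carry the generic linux_{arch} platform tag. The helper names `platform_tags` and `needs_repair` are hypothetical and are not part of pip, packaging, or auditwheel. The sketch relies only on the wheel filename convention, where the platform tag is the last dash-separated field and several tags may be joined with dots.

```python
from typing import List


def platform_tags(wheel_filename: str) -> List[str]:
    """Return the platform tags encoded in a wheel filename (hypothetical helper)."""
    if not wheel_filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {wheel_filename!r}")
    stem = wheel_filename[: -len(".whl")]
    # Layout: name-version[-build]-python-abi-platform, so the platform is last.
    platform_field = stem.rsplit("-", 1)[-1]
    # A compressed tag set joins several platform tags with dots.
    return platform_field.split(".")


def needs_repair(wheel_filename: str) -> bool:
    """True if the wheel still carries a generic linux_{arch} platform tag."""
    return any(tag.startswith("linux_") for tag in platform_tags(wheel_filename))


if __name__ == "__main__":
    print(needs_repair("demo-1.0-cp39-cp39-linux_x86_64.whl"))
```

A real repair tool would also have to inspect the binaries inside the wheel. Checking the filename only tells you which wheels have not been through that process yet.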

I would be out of my depth on these (especially the latter two), so any help would be much appreciated.

1 Like

Thank you for pushing musl support forward!

The proposal seems to currently assume that musl is being dynamically linked (which is how Alpine does things, with a /lib/ld-musl-x86_64.so.1 that ~all binaries link against). However, musl can be statically linked. I believe this warrants distinction in the platform tag somehow because the runtime requirements and capabilities of a dynamically linked musl Python are different from a statically linked Python!

For example, a statically linked musl possibly (always?) isn’t a dynamic executable. This means it can’t use dlopen(). This means it can’t load file-based extension modules (extensions have to be statically linked into the binary).
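One way to tell the two cases apart is to look for a PT_INTERP program header in the interpreter's ELF file: a dynamically linked executable names its loader there (for example /lib/ld-musl-x86_64.so.1), while a fully static one normally has no such entry. The sketch below is only an illustration of that idea. The function names are hypothetical, and treating "musl" in the loader path as proof of musl is a heuristic assumption rather than a guarantee.

```python
import struct
from typing import Optional

PT_INTERP = 3  # program header type that holds the dynamic loader path


def read_interpreter(data: bytes) -> Optional[str]:
    """Return the PT_INTERP path of an ELF image, or None if there is none."""
    if len(data) < 16 or data[:4] != b"\x7fELF":
        return None
    is_64bit = data[4] == 2              # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"  # EI_DATA: 1 = little-endian
    if is_64bit:
        header_fmt, ph_fmt = endian + "HHIQQQIHHHHHH", endian + "IIQQQQQQ"
        offset_idx, size_idx = 2, 5      # p_offset, p_filesz positions
    else:
        header_fmt, ph_fmt = endian + "HHIIIIIHHHHHH", endian + "IIIIIIII"
        offset_idx, size_idx = 1, 4
    try:
        fields = struct.unpack_from(header_fmt, data, 16)
        e_phoff, e_phentsize, e_phnum = fields[4], fields[8], fields[9]
        for index in range(e_phnum):
            ph = struct.unpack_from(ph_fmt, data, e_phoff + index * e_phentsize)
            if ph[0] == PT_INTERP:
                start, size = ph[offset_idx], ph[size_idx]
                raw = data[start:start + size].rstrip(b"\0")
                return raw.decode("utf-8", "replace")
    except struct.error:
        return None  # truncated or malformed headers
    return None


def is_dynamically_linked_musl(data: bytes) -> bool:
    """Heuristic: a loader is present and its path mentions musl."""
    interpreter = read_interpreter(data)
    return interpreter is not None and "musl" in interpreter
```

Typical use would be something like `is_dynamically_linked_musl(open(sys.executable, "rb").read())`. A statically linked interpreter would return False here, which is the case an installer would want to exclude.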

If there were a single platform tag for dynamic and static musl variations, we could run into a situation where a static musl Python attempts to find/load extensions with a dynamic musl library dependency. And of course statically linked musl wheels wouldn’t be anything like binary wheels today: because dlopen() doesn’t work, you’d need to include the ELF object files or static library archive (instead of a shared library) and link a new binary to incorporate the extension. That’s a radical departure from how today’s shared library based wheels work and likely a tough road to official adoption/support. I only bring it up to reinforce that there are substantial implications for musl wheels depending on how musl is linked.

I’ll note that while Alpine uses dynamic musl linking, the musl Python distributions produced by GitHub - indygreg/python-build-standalone: Produce redistributable builds of Python statically link musl, enabling those binaries to run on ~every Linux system in existence. PyOxidizer uses these musl distributions to produce single file Linux executables that can be copied to any Linux machine and just work. So statically linked musl Python exists in the wild (although probably isn’t used heavily at this time).

Something else we may want to consider is what happens if you try to mix static and dynamically linked musl into the same binary process. I’m sure there’s a way to coerce that into existing. Assuming we only care about dynamically linked musl wheels at this time, perhaps there should be a recommendation that all musl symbols in the ELF be marked as undefined [and serviced by a shared library]? I’m unsure about musl, but when loading symbols from 2 glibc into the same process, ugly things happen. So any checks we deploy to prevent this could prevent some ugly bugs.

1 Like

Thanks for the feedback! The static linking issue was also previously discussed in the “Wheels for musl (Alpine)” thread (around here). While it is indeed an issue, as mentioned by Nathaniel, a statically linked Python interpreter cannot meaningfully use the extension module in a wheel (musllinux or otherwise), so a dynamically linked interpreter is more or less assumed by a platform tag definition for wheels. With that said, I would have no problem explicitly specifying that the platform tag defined in PEP 656 only applies to Python interpreters dynamically linked against musl, or even to the wheel format specifically. What do you think would be the best way to achieve this? Would it be enough to include “Wheel” in the PEP title? If we need to add some specific description in the text, do you have a suggestion?

1 Like

It’s ok to limit the scope of your PEP to common cases and not support statically linked interpreters or mixed builds with glibc. You could include something like

This PEP only applies to Python interpreters that are dynamically linked to the musl shared library. Statically linked interpreters or mixed builds with glibc are out of scope and not supported by musllinux platform tags.

1 Like

However, it might be reasonable to require installers to not claim compatibility with the musllinux tags if they statically link the library. I don’t think there’s any precedent on this - I know of no other system where static linking the C library is common.

2 Likes

Do those binaries work with Linux Kernel 2.2 from 1999? :sweat_smile:

Sorry, I couldn’t resist. I’m just wondering, what’s the minimum kernel ABI for a static binary? For example, does musl still support SYS_socketcall multiplex dispatching on the x86 ABI?

2 Likes

I’ve added a paragraph to limit the scope to dynamically linked Python interpreters running on musl libc.

2 Likes

This is somewhat off topic, but there is of course a limit to how old of a Linux a statically linked binary will run on :slight_smile: musl libc - Supported Platforms says Linux kernel >= 2.6.39 with undefined support for older. And when we talk about machines that are that old, we need to talk about instruction set compatibility (e.g. no SSE/AVX).

To move this discussion forward, maybe the platform tag should somehow denote Linux ABI compatibility like the existing manylinux tags do? I’m skeptical we’ll ever need to version it given the backwards compatibility commitments from musl and Linux on the interfaces used. But the long arc of time has a way of invalidating assumptions.

1 Like

IMO this is one of the things that fall into the “play well with others” rule introduced by PEP 600. Wheels can only make use of newer kernel features when the version is generally shipped by default by all popular musl-based distros, and once all major musl-based distros ship a certain kernel version by default, users that rely on older kernel versions are automatically removed from the coverage of a musllinux tag.

I am honestly not familiar with how popular musl-based distros handle their kernel version support, nor do I know how the “play well” rule is going to work out (since AFAIK no projects are currently distributing PEP 600 wheels yet). But I feel it’s probably better for the community as a whole if both glibc and musl distros can use this same rule, so the experience gained by either can advance things for both sides (and for other libc implementations if they need their own platform tag one day).

1 Like

Discussion seems to have quieted down. Are there further issues I need to address?

2 Likes

My only comment is that the passage

Logic behind the new platform tag largely follows PEP 600, and require wheels using this tag make similar promises. Please refer to the PEP for more details on rationale and reasoning behind the design.

could be read as if PEP 656 provides more details, when I think you mean PEP 600 does. I would say “Please refer to that PEP” to be explicit.

Also, I think “and require wheels using this tag” should be “and requires wheels using this tag”.

1 Like

I have submitted a change to explain more about the perennial design (a rewrite of this comment above). The new paragraph is a reiteration of things already covered by PEP 600, but reading this thread again, many comments seem to still want to add ideas from previous manylinux specs intentionally not covered by PEP 600 (and therefore this PEP), so I figured it’s best to repeat the rules again in PEP 656.

The PR also contains the wording clarification and typo fix suggested by the immediately above comment. Thanks!

2 Likes

musllinux support in packaging is ready for review: PEP 656 musllinux support by uranusjr · Pull Request #411 · pypa/packaging · GitHub

2 Likes

Another week has passed without further comments, and it has been ~1 month since the last meaningful update. With the intention to move things forward, may I ask for a pronouncement from the BDFL-Delegate @pfmoore?

The implementation for installation-time tag compatibility detection has been proposed to packaging (linked in the comment immediately above), and I plan to move on to the wheel generation part. The first thing I’ll do is open an issue on auditwheel to discuss whether it’s a good fit for this, and decide whether to expand auditwheel or develop an entirely standalone tool. Either way, cibuildwheel will then be able to use a command to “repair” (using auditwheel terminology) a wheel into a musllinux-compatible one.
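For readers wondering what installation-time tag compatibility detection boils down to, here is a rough sketch. It assumes the tag layout from the accepted PEP, musllinux_{major}_{minor}_{arch} (earlier posts in this thread write it as musllinux_{arch}), and the rule that a wheel is usable when the architecture and musl major version match and the wheel's minor version is not newer than the system's. The names `parse_musllinux_tag` and `is_compatible` are hypothetical. This is not the code in the packaging pull request.

```python
import re
from typing import NamedTuple, Optional, Tuple


class MusllinuxTag(NamedTuple):
    major: int
    minor: int
    arch: str


_TAG_RE = re.compile(r"musllinux_(\d+)_(\d+)_(.+)")


def parse_musllinux_tag(tag: str) -> Optional[MusllinuxTag]:
    """Split a musllinux platform tag into its parts, or return None."""
    match = _TAG_RE.fullmatch(tag)
    if match is None:
        return None
    return MusllinuxTag(int(match.group(1)), int(match.group(2)), match.group(3))


def is_compatible(wheel_tag: str, system_musl: Tuple[int, int], system_arch: str) -> bool:
    """Assumed rule: same arch, same major, wheel minor <= system minor."""
    parsed = parse_musllinux_tag(wheel_tag)
    if parsed is None:
        return False
    system_major, system_minor = system_musl
    return (
        parsed.arch == system_arch
        and parsed.major == system_major
        and parsed.minor <= system_minor
    )


if __name__ == "__main__":
    print(is_compatible("musllinux_1_1_x86_64", (1, 2), "x86_64"))
```

How the installer learns the system's musl version is a separate question that this sketch leaves out.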

1 Like

I’ll try to take a look in the next few days. If I’ve not responded by the end of the week, please ping me again.

1 Like

It is with pleasure that I formally accept PEP 656. Congratulations @uranusjr and thanks for moving this through to a successful conclusion.

As with perennial manylinux, I don’t claim to be an expert in the technical details of musl, so I’m taking the view that the technical experts in the community have had the opportunity to flag any concerns or issues, and the lack of any such feedback indicates that we have consensus that the proposal is technically sound.

The approach of requiring wheels claiming musllinux compatibility to work on all “mainstream” distributions, and to “play nice with others” matches the approach in PEP 600, and I expect it to work as well here as it did in that case. But I would advise people working on the specification to take a cautious view over what gets admitted as “available everywhere”, and in particular to focus on distributions used in containers as the “lowest common denominator”, as containers are the driving motivation for this specification.

One concern I have with the specification is that the necessity for wheels to bundle “private” copies of any libraries they use may result in an increased size for images relying on many pre-built musllinux wheels, as opposed to custom-built binaries that share libraries. This may be an issue for containers (where overall size is important) and I encourage the people working on the practical aspects of the musllinux toolchain to monitor the impact of this.
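As a starting point for monitoring that impact, the sketch below adds up the uncompressed size of the libraries bundled inside a wheel. It assumes the bundled copies live in a top-level directory whose name ends in `.libs`, which is where auditwheel places them for manylinux wheels. Whether musllinux tooling follows the same layout is an assumption, and `bundled_library_bytes` is a hypothetical helper.

```python
import zipfile


def bundled_library_bytes(wheel_file) -> int:
    """Sum the uncompressed size of files under top-level '*.libs' directories.

    `wheel_file` may be a path or a binary file-like object.
    """
    total = 0
    with zipfile.ZipFile(wheel_file) as wheel:
        for info in wheel.infolist():
            if info.is_dir() or "/" not in info.filename:
                continue
            top_level = info.filename.split("/", 1)[0]
            # Assumed convention: bundled shared libraries sit in '<name>.libs/'.
            if top_level.endswith(".libs"):
                total += info.file_size
    return total
```

Summing this over all wheels installed in an image gives a rough upper bound on what sharing libraries could save.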

10 Likes

Thanks Paul! I’ve submitted a PR to change the status: PEP 656: Accepted by uranusjr · Pull Request #1928 · python/peps · GitHub

(Edit: The PR was merged before I could finish this post. Wow, that’s quick.)

The implementation for musllinux identification in packaging is also close to ready, so it even has a chance to be included in pip 21.1 (due this month).

This leaves the last missing piece: actually producing and validating those wheels. I have a mind to look into this, but please do feel free to do it if you know what to do. I haven’t even figured that out yet, and don’t really have much time to move things forward at the moment.


100% spot on, this is also my biggest concern going forward. My hope is there’s going to be some kind of community with people coming from different Linux distributions (Alpine obviously, Arch has a lot of users as well), but from what I can tell there doesn’t seem to be a lot of communication going on; most of what I hear about them is people complaining to the general Python user community that something doesn’t work, and those people don’t seem to talk to each other at all. I would be very interested in seeing some sort of user circles happening (although I’m probably not in a position to start one myself).

3 Likes

There’s a talk at the language summit about packaging Python for Linux: PyCon 2021 - Python Language Summit 2021. I have tried to get the Linux distros into a common spot, but have so far been unsuccessful in convincing them to follow through on the suggestion.

2 Likes

I’d really like a better reference than a short email on the backwards compatibility issue.

Binary ABI compatibility is way outside my territory, so I don’t know if or how relevant this is, but I just found a musl ABI tracker which indicates when symbols have been added, modified, or removed.

According to it, the only time symbols have disappeared in the last 10 years was when 1.0.1 dropped __syscall(), __syscall_cp() and __syscall_ret() (anyone heard of those?). Looking at the backwards compatibility warnings, they all appear to be irrelevant or false positives; mostly function argument renames (which doesn’t touch ABI) or occasionally a function will go from returning void to a real return value.

So that does appear to back up the claim in the email - that symbol backwards compatibility is guaranteed but the exact behaviour of those symbols is not.
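The kind of check the ABI tracker performs can be illustrated with a simple set comparison of exported symbol names between two releases. The symbol lists below are made up for illustration, apart from the three `__syscall*` names mentioned above, and `diff_symbols` is a hypothetical helper. Note that this only catches added or removed names, not changes in behaviour.

```python
from typing import Dict, Iterable, List


def diff_symbols(old: Iterable[str], new: Iterable[str]) -> Dict[str, List[str]]:
    """Report symbol names that were removed from or added to a library."""
    old_set, new_set = set(old), set(new)
    return {
        "removed": sorted(old_set - new_set),  # these break backwards compatibility
        "added": sorted(new_set - old_set),    # these are only a forward-compat concern
    }


if __name__ == "__main__":
    # Illustrative symbol lists, not real exports of any musl release.
    older = ["malloc", "free", "__syscall", "__syscall_cp", "__syscall_ret"]
    newer = ["malloc", "free", "reallocarray"]
    print(diff_symbols(older, newer))
```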

1 Like