Manylinux backwards ABI compatibility guarantees

RHEL 6, RHEL 7 and RHEL 8 define several library compatibility levels that apply across current and future versions of the distribution. The ones relevant to manylinux are:

  1. Compatibility level 1: APIs and ABIs are stable across three major releases
  2. Compatibility level 2: APIs and ABIs are stable within one major release

The libraries listed in the legacy (i.e. pre-perennial manylinux) policies fall into the following categories:

Level 1:

  • libgcc
  • libstdc++
  • glibc
  • mesa-libGL

Level 2:

  • ncurses-libs (manylinux1 only)
  • libX11
  • libXext
  • libXrender
  • libICE
  • libSM
  • glib2

In theory, this means that RHEL 9 could drop versioned symbols for the GNU libc version shipped in RHEL 6, or even that RHEL 7 could drop the versioned symbols from glib2 which were present in RHEL 6. I think this is pretty unlikely as it would affect a lot of existing binary wheels, but it is conceivable that this could happen on some Linux distros for very old library versions.

I was wondering if we should perform backwards compatibility ABI checks on the symbols to detect this across distros. As an example, knowing what symbols and symbol versions were present in manylinux_2_12, we could check that all those are present on newer manylinux distro versions. We could probably do this in @mayeut’s PEP 600 compliance tool.
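A minimal sketch of the check described above, assuming the exported symbols of each distro image have already been extracted into sets of (library, symbol, version) entries. The `VersionedSymbol` type, the `missing_symbols` helper and the sample data are hypothetical and hand-written for illustration; they are not part of pep600_compliance or any existing tool.

```python
from typing import Iterable, List, NamedTuple


class VersionedSymbol(NamedTuple):
    """One exported symbol, e.g. memcpy@GLIBC_2.2.5 in libc.so.6."""
    library: str   # SONAME of the providing library
    name: str      # symbol name
    version: str   # version tag, or "" if the symbol is unversioned


def missing_symbols(
    baseline: Iterable[VersionedSymbol],
    candidate: Iterable[VersionedSymbol],
) -> List[VersionedSymbol]:
    """Return the baseline symbols that the candidate no longer exports."""
    return sorted(set(baseline) - set(candidate))


# Illustrative data only: a real baseline would be generated from the
# libraries of an actual manylinux_2_12 image, not written by hand.
baseline = {
    VersionedSymbol("libc.so.6", "memcpy", "GLIBC_2.2.5"),
    VersionedSymbol("libc.so.6", "fopen", "GLIBC_2.2.5"),
    VersionedSymbol("libglib-2.0.so.0", "g_malloc", ""),
}
newer_distro = {
    VersionedSymbol("libc.so.6", "memcpy", "GLIBC_2.2.5"),
    VersionedSymbol("libc.so.6", "memcpy", "GLIBC_2.14"),
    VersionedSymbol("libc.so.6", "fopen", "GLIBC_2.2.5"),
}

for sym in missing_symbols(baseline, newer_distro):
    suffix = f"@{sym.version}" if sym.version else ""
    print(f"{sym.library}: {sym.name}{suffix} is missing")
```

New symbols or new versions of an existing symbol in the newer distro do not count as breakage here; only entries that disappear from the baseline are reported.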

I’m not quite sure what to do with the results, but ideally tools like pip should be able to reject wheels which are incompatible with the runtime platform due to missing versioned symbols.

Is this a valid concern? I couldn’t find anything concrete on this issue in the manylinux PEPs.

It is absolutely a valid concern. I’m one of the primary authors of Red Hat’s “Application Compatibility Guide” and can provide some context here. I can guarantee that from major release to major release most distributions will do SONAME bumps or incompatible library upgrades that stand to break python binary wheels. In order to avoid that, something has to be checking the binary wheel against a curated ABI baseline. While PEP 600 avoids defining the curated baseline, something needs to turn manylinux_2_12 into a curated ABI baseline. Within Fedora we have been using libabigail (The ABI Generic Analysis and Instrumentation Library) to do ABI comparison via rpminspect tooling. This means we can watch for ABI breaks and prevent them.

If you have any questions I’d be happy to help discuss glibc, ABIs, guarantees, and those things being provided by downstreams with experience deploying ELF binaries across long lifetimes.

That’s quite interesting. If Fedora already has a mechanism to detect and block this kind of breakage, does that mean that CentOS and RHEL (which as far as I know base the package ecosystem on Fedora) should not have ABI issues? Is the code for the libabigail checks publicly available?

I wonder if we could make this ABI comparison work for all manylinux-compatible targets. I think the ideal place for this would be the PEP 600 compliance tool. Any thoughts on this @mayeut?

We do some similar checks on CPython using libabigail already:

@codonell Just to clarify, are you performing these checks on manylinux wheels, or only between rpms built for Fedora?

libabigail is open source and available here: sourceware.org Git - libabigail.git/summary
In Fedora we use rpminspect to drive libabigail-based checks.

You can see the Fedora ABI policy checks here:

There you can see a superset of ACG CL1 (compatibility level 1 of the Application Compatibility Guide) being checked.

We are only performing these checks on differential rpm builds.
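For anyone who wants to script a similar comparison outside of rpminspect, here is a minimal sketch of interpreting the exit status of libabigail's `abidiff` command. It assumes the bit-field meanings given in the abidiff manual (1 = error, 2 = usage error, 4 = ABI change, 8 = incompatible ABI change); please verify them against the libabigail version you have installed. The `classify_abidiff` helper and its result labels are hypothetical names, not something rpminspect or Fedora uses.

```python
# Bit values assumed from the abidiff manual; check your installed version.
ABIDIFF_ERROR = 1
ABIDIFF_USAGE_ERROR = 2
ABIDIFF_ABI_CHANGE = 4
ABIDIFF_ABI_INCOMPATIBLE_CHANGE = 8


def classify_abidiff(returncode: int) -> str:
    """Map an abidiff exit status to a coarse verdict (hypothetical labels)."""
    if returncode & (ABIDIFF_ERROR | ABIDIFF_USAGE_ERROR):
        return "tool-error"      # the comparison itself did not run cleanly
    if returncode & ABIDIFF_ABI_INCOMPATIBLE_CHANGE:
        return "incompatible"    # e.g. a removed symbol
    if returncode & ABIDIFF_ABI_CHANGE:
        return "changed"         # ABI differs, but not flagged as incompatible
    return "unchanged"


# The status would normally come from something like
#   subprocess.run(["abidiff", old_library, new_library]).returncode
for status in (0, 4, 12):
    print(status, classify_abidiff(status))
```

Checking the error bits first is a deliberate choice: a failed run says nothing about compatibility, so it should not be reported as either a pass or an ABI break.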

As an upstream steward and senior developer for the glibc project, let me make some stronger statements here that might put at ease some of you who are looking at this and wondering if glibc would some day break compatibility.

Consensus in the upstream glibc project is to keep backwards compatibility for a very long time. This includes keeping around old symbols for GLIBC_2.5 and GLIBC_2.12, because this is the expectation and understanding we have with our users. In some extreme cases, for security purposes, symbols have moved to new shared objects that need to be explicitly preloaded, e.g. malloc hook removal (Securing malloc in glibc: Why malloc hooks had to go | Red Hat Developer)

To give a concrete example: RHEL 9 is based on glibc 2.34, and that release retains legacy symbols from the GLIBC_2.5 version set, which covers RHEL 5 (glibc 2.5), and from the GLIBC_2.12 version set, which covers RHEL 6 (glibc 2.12) and onwards. Those symbols remain available to run applications that were linked against glibc 2.5, glibc 2.12, or later.

There are other symbol sets you’re going to care about like GCC_X.Y.Z, CXXABI_X.Y.Z and GLIBCXX_X.Y.Z, which are tied to gcc (libgcc_s, libstdc++). Like glibc, gcc has not removed legacy symbols in a release. gcc also wants to ensure that older applications continue to keep working.
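To make the version-set naming concrete, here is a minimal sketch that takes the version tags a binary references (for example GLIBC_2.12 or GLIBCXX_3.4.9), finds the highest version required per prefix, and compares it against a baseline. The helper names are hypothetical, and the baseline only contains the GLIBC 2.12 limit discussed in this thread; limits for GCC, CXXABI and GLIBCXX would have to come from the relevant manylinux policy.

```python
import re
from typing import Dict, Iterable, Tuple

Version = Tuple[int, ...]

# Matches tags such as GLIBC_2.12, GCC_4.2.0, CXXABI_1.3 or GLIBCXX_3.4.9.
_TAG = re.compile(r"^([A-Z]+)_(\d+(?:\.\d+)*)$")


def highest_required(tags: Iterable[str]) -> Dict[str, Version]:
    """Return the highest version seen for each prefix (GLIBC, GCC, ...)."""
    result: Dict[str, Version] = {}
    for tag in tags:
        match = _TAG.match(tag)
        if match is None:
            continue  # skip tags without a numeric version, e.g. GLIBC_PRIVATE
        prefix = match.group(1)
        version = tuple(int(part) for part in match.group(2).split("."))
        if prefix not in result or version > result[prefix]:
            result[prefix] = version
    return result


def fits_baseline(tags: Iterable[str], baseline: Dict[str, Version]) -> bool:
    """True if every required prefix is in the baseline and within its limit."""
    required = highest_required(tags)
    return all(
        prefix in baseline and version <= baseline[prefix]
        for prefix, version in required.items()
    )


# Illustrative baseline: only the glibc limit implied by manylinux_2_12.
baseline = {"GLIBC": (2, 12)}

print(fits_baseline(["GLIBC_2.2.5", "GLIBC_2.12"], baseline))
print(fits_baseline(["GLIBC_2.2.5", "GLIBC_2.14"], baseline))
```

Versions are compared as integer tuples rather than strings so that 2.12 sorts after 2.5. A prefix missing from the baseline is treated as a failure, which is the conservative choice for a compatibility check.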

I don’t have a crystal ball, but ABI stability has been going well, and the consensus is, like with the Linux kernel syscall interfaces, that we do not want to break user applications.

Thanks a lot @codonell, that’s great to hear! I still think that ideally we should have automation to validate ABI compatibility across distros monitored by pep600_compliance, but knowing that RHEL won’t be a problem for the next few versions is very reassuring, and great news to the community. I created an issue in pep600_compliance to kick off a discussion to see if it makes sense to implement an automated check there.