"Apple Silicon" and packaging

Apple announced yesterday that they will switch the CPU architecture for macOS over the next couple of years. During the transition there will once again be binaries with multiple architectures (x86_64 and arm64 this time).

I’m wondering how to deal with this in the naming of wheels. In the past we’ve introduced custom machine names for these fat binaries (for example “universal” for “i386 and ppc”), but that was before the introduction of explicit support for multiple compatibility tags in the packaging ecosystem.

There seem to be two alternatives:

  1. Add a new custom “machine” type “universal2” for “x86_64 and arm64”. This extends the existing support for fat binaries in macOS and the name matches Apple’s name for the new combination.

  2. Don’t add a new “machine”, but use multiple compatibility tags in wheels that support both architectures.

As usual there are arguments for both sides:

  • The first option makes it easier to ensure that you get binaries that support the same set of architectures as the Python binary, which is convenient when you want to collect wheels for use on a set of machines.
  • The first option probably requires less invasive changes: the changes to the stdlib are basically an update to a single function (_osx_support.get_platform_osx).
  • The second option makes it easier to find binaries that support your particular machine, even if there are no fat binaries yet.

BPO-41090 contains a rough patch implementing the first option.

EDIT: Fixed name of the new architecture from amd64 to arm64; added reference to a BPO-item with a rough patch.

s/amd64/arm64/ :wink:

You’re right, I need to drink more tea before posting.

Switching to AMD instead of Intel would have been a lot easier for us :grinning:

Hi, I think there’s some confusion here. Apple computers will be switching to ARM (v8 I think?), not to be confused with amd64, which is x86_64. I don’t think having a wheel that bundles binaries for both architectures is a good idea, since

  1. It’s twice as heavy to download
  2. We’d need some new standard to build and install it

I am optimistic that building wheels with extension modules will just work out of the box, as the build process already takes care of the architecture. On GNU/Linux, for example (where pretty much every architecture is supported), the abbreviation of the arch is just inserted into the wheel filename, and that has been tested thoroughly on at least the popular architectures like amd64 and armv7/armv8. manylinux2014 even explicitly declares support for a bunch of other archs.

AFAIK, standardization of wheels has more to do with their reproducibility, which is not a problem on macOS and Windows since there’s only one compiler and libc on each. IMHO the current standard will survive the mentioned change just fine.

Edit: I got the wrong context, was thinking about package packaging instead of packaging of CPython itself.

amd64 was a typo, not enough tea in the morning :wink:

macOS already supports binaries with multiple architectures, and that’s something supported by CPython and setuptools for i386, x86_64, ppc and ppc64. What I’m looking into is adding support for this new architecture as well (but only when combined with x86_64; the other architectures are no longer relevant).

The primary reason for looking into this is CPython itself: this would enable having a single installer that works for all users. Another reason is packaging apps for use on different machines (such as GUI apps). Having multiple app bundles is confusing for users, esp. when Apple explicitly asks developers to ship fat binaries.

Can you point to a reference for “Apple explicitly asks developers to ship fat binaries”? Was that for the i386/x86_64 platform which could run either binary or for the new platform? I would be interested to read about how the new platform enables running executables compiled for both x86_64 and arm64 by toggling a flag.

This was mentioned in the Platform State of the Union at WWDC this year, but the messaging was similar during the transition from PowerPC to Intel (and later when introducing 64-bit support on the Intel platform). The goal with using fat binaries is to reduce friction for users. The cost of this for full applications should be acceptable because a large fraction of an application’s size is additional resources and not the binaries themselves.

Technically this is fairly straightforward:

  • The MachO binary format supports fat binaries, which basically allows for combining multiple binaries (separately compiled) into a single file. The dynamic loader knows which part of the file should be used.
  • The posix_spawn(3) API has an attribute that can be used to prefer some architectures when launching a binary (see posix_spawnattr_setbinpref_np)
  • IIRC the Finder had a per-application option to disable one of the architectures
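
To make the first bullet concrete, here is a minimal Python sketch that lists the architectures recorded in a fat binary's header. It assumes the classic 32-bit fat header layout (a big-endian magic number and slice count, followed by one 20-byte record per slice); the function name `fat_architectures` and the small CPU-type table are illustrative only and not part of any existing tool.

```python
import struct

FAT_MAGIC = 0xCAFEBABE          # magic number of a (32-bit) fat header
CPU_ARCH_ABI64 = 0x01000000     # flag bit marking the 64-bit variant of a CPU type

# Small, incomplete table of Mach-O CPU types (illustrative).
CPU_NAMES = {
    7: "i386",
    7 | CPU_ARCH_ABI64: "x86_64",
    12: "arm",
    12 | CPU_ARCH_ABI64: "arm64",
    18: "ppc",
    18 | CPU_ARCH_ABI64: "ppc64",
}


def fat_architectures(data):
    """Return the architecture names listed in a fat header, or [] if not fat.

    Caveat: Java class files start with the same magic number, so this
    check alone cannot tell the two formats apart.
    """
    if len(data) < 8:
        return []
    magic, count = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        return []
    archs = []
    for index in range(count):
        # Each record: cputype, cpusubtype, offset, size, align.
        cputype, _subtype, _offset, _size, _align = struct.unpack_from(
            ">iiIII", data, 8 + 20 * index
        )
        archs.append(CPU_NAMES.get(cputype, hex(cputype)))
    return archs
```

On a Mac, something like `fat_architectures(open(path, "rb").read(4096))` should then report both slices for a Universal 2 binary; a thin (single-architecture) binary gives an empty list.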

The technically impressive part for both this transition and the one from PowerPC to Intel is that the system will perform emulation of the older architecture on newer systems with acceptable performance (YMMV). That emulation works for whole binaries, not a shared library (or Python extension) loaded into a host using a different architecture.

I vote for option 2. Compressed tags exist for this exact scenario where a wheel supports more than one specific tag set.

You will also have to update packaging.tags ala https://github.com/pypa/packaging/blob/master/packaging/tags.py#L380 and friends to make this work, especially if you go with option 1.

I prefer option 1: doing this with compatibility tags gets ugly even with compressed tags, because those tags don’t compress enough.

If I read PEP 425 correctly a “universal 2” distribution of pyobjc-core would end up being named:

pyobjc_core-6.2.1-cp36-cp36m-macosx_10_9_arm64.macosx_10_9_x86_64.whl

Compared to this for option 1:

pyobjc_core-6.2.1-cp36-cp36m-macosx_10_9_universal2.whl
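
For comparison, a rough Python sketch of how a compressed tag set in a wheel filename expands into individual tags under PEP 425. The helper name `expand_wheel_tags` is made up for this example, and it only handles the filename, not the metadata inside the wheel.

```python
from itertools import product


def expand_wheel_tags(filename):
    """Return the set of 'python-abi-platform' tags a wheel filename declares."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[: -len(".whl")].split("-")
    # Layout: name-version[-build]-python-abi-platform
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: %r" % filename)
    # Each of the last three components may be a '.'-separated compressed set.
    pythons, abis, platforms = (part.split(".") for part in parts[-3:])
    return {"-".join(combo) for combo in product(pythons, abis, platforms)}


option_2 = expand_wheel_tags(
    "pyobjc_core-6.2.1-cp36-cp36m-macosx_10_9_arm64.macosx_10_9_x86_64.whl"
)
option_1 = expand_wheel_tags(
    "pyobjc_core-6.2.1-cp36-cp36m-macosx_10_9_universal2.whl"
)
```

With option 2 the filename expands to two tags, one per architecture. With option 1 there is a single tag, and installers have to know that universal2 covers both machines.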

I’m also not sure what will need to be changed to build universal wheels with a compressed tag set, while the patch in bpo-41090 for option 1 is fairly trivial and a change for packaging.tags would also be trivial.

A reason to favour easy patches is somewhat selfish: Ned and I would like to get support for building Universal 2 binaries in 3.9, a trivial patch should be easier to get past the RM :slight_smile:

Luckily, all of these changes live in third party packages. Setuptools is currently in the process of forking distutils completely and will stop using the standard one, so we can deprecate it (make it internal only). So you don’t need to race for 3.9, unless you’ve got users who won’t use the PyPA ecosystem.

Making sure the 3.9 build itself works is obviously in core, but Ned can sneak that by the RM easily :wink: (maybe I should push to release the Windows ARM64 build at the same time? Lack of ecosystem is the reason I’ve been holding back…)

See the bpo issue I mentioned earlier. That’s 99% of the work needed in CPython (other than testing on actual hardware…), the only other bit needed is a patch to Mac/Tools/pythonw.c to support arm64 there as well.

Ah, I see how that’ll flow into what packaging picks up, yeah.

Is “universal2” a name that you made up or is there some precedent for it from Apple? If they’re using it, then I’d say go for it, because something has to be returned there.

That bug probably should also be backported for people who build their own, but I’m not going to argue for that if you disagree.

Apple uses “Universal 2” as the marketing name for this combination; technically this is the same fat binary mechanism that was inherited from NeXT.

Yeah, the mechanism isn’t what’s being represented here, so I think universal2 is the right name.

That’s what I thought as well.

If we do #1, there’s nothing stopping people from also publishing individual arm64 or intel wheels right? Like specific tags targeting just that arch will exist?

If that’s the case, then I think it’s fine to use universal2 or whatever. People who want to ship wheels targeting a specific arch can do that, or if they want to use the compressed tag for some reason instead of universal2 they’d also be able to do that.

I suspect you’d see a lot of universal2 at the start, and then in some number of years you’d see less of those and more arm64 wheels instead.

That’s correct, universal2 doesn’t preclude anything else. This just extends the support for fat binaries on macOS we already have to a new combination of architectures.
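
As a rough illustration of how the two kinds of wheels could coexist, here is a hypothetical Python sketch of an installer preferring an arch-specific wheel and falling back to universal2. The function names, the preference order, and the fixed `10_9` version are assumptions made for the example. Real tag generation (for instance in packaging.tags) also walks older macOS versions and the older fat machine names.

```python
def acceptable_platform_tags(arch, macos_version="10_9"):
    """Platform tags a given machine could install, most specific first.

    Hypothetical ordering: the exact architecture, then the fat
    universal2 combination that contains it.
    """
    if arch not in ("arm64", "x86_64"):
        raise ValueError("universal2 only covers arm64 and x86_64")
    return [
        "macosx_%s_%s" % (macos_version, arch),
        "macosx_%s_universal2" % macos_version,
    ]


def pick_platform_tag(available, arch, macos_version="10_9"):
    """Return the best available platform tag for this machine, or None."""
    for tag in acceptable_platform_tags(arch, macos_version):
        if tag in available:
            return tag
    return None


# A project that published an Intel-only wheel and a universal2 wheel.
published = {"macosx_10_9_x86_64", "macosx_10_9_universal2"}
```

With these published wheels, an Intel machine would pick the x86_64 wheel and an arm64 machine would fall back to the universal2 one.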

I hope to see universal2 wheels when Apple starts shipping hardware to customers and as long as we keep supporting Python on Intel Macs.

Then yea, I say let’s do universal2 and individual projects can choose the method that works best for them.

Just an FYI here: @ronaldoussoren and I are now working on this on the CPython side based on the first Big Sur developer preview and with no access to real hardware yet (that’s coming). With an expected first official release of Big Sur sometime in the fourth quarter of 2020 (so at least three months from now), we expect to have full support for it in a future 3.8.x bugfix release and likely with the first release of 3.9.0. The final bugfix release of 3.7.x, 3.7.8, is about to be released, so it will not fully support Big Sur, in particular running on Apple Silicon hardware. For CPython branches already in the security-fix phase of their release cycles, currently 3.6 and (for a couple of months) 3.5, we generally do not support new operating system platform releases. (All other branches have already reached their end-of-life, including of course 2.7.)

In my opinion, alternative 1 is better, because it follows the existing practices that were used in the days when Macs were either Intel or PPC. It also allows for wheels to be distributed as either x86_64 only, arm64 only, or “universal2”.