PEP 425 Platform tag

The platform tag in PEP 425 is defined as being the value of distutils.util.get_platform(). Given that there is work ongoing to remove distutils from the standard library, we should think about updating the PEP to not rely on distutils.
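For reference, PEP 425 derives the tag from that value by replacing every hyphen and period with an underscore. A minimal sketch of that step follows; `normalize_platform` is a hypothetical helper name, and the use of `sysconfig.get_platform()` as a stand-in for the distutils function is an assumption that may not hold on every platform.

```python
import sysconfig


def normalize_platform(raw: str) -> str:
    """Turn a get_platform()-style string into a PEP 425 platform tag."""
    # PEP 425: hyphens and periods are replaced with underscores.
    return raw.replace("-", "_").replace(".", "_")


# Assumption: sysconfig.get_platform() yields the same kind of string
# as distutils.util.get_platform() on this interpreter.
raw = sysconfig.get_platform()
print(raw, "->", normalize_platform(raw))
```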

I assume the best option would be to copy the implementation from distutils into packaging and update the PEP to link to packaging. Documenting the algorithm in the PEP isn’t really practical, as it’s pretty complicated.

Some notes:

  1. I’m discounting the option of getting the information from setuptools - making everyone that wants to use tags take a dependency on setuptools seems unreasonable.
  2. This is likely to force the issue of the wheel project needing a dependency on packaging, as the stdlib will no longer have a way of getting the platform tag for a wheel.
  3. It’s not impossible that the “remove distutils” PEP could be updated to allow for moving get_platform() to elsewhere in the stdlib. I’ve not explored that possibility.
  4. This is not an immediate issue. The plan for removal of distutils is still in its very early stages; it won’t happen before Python 3.12 at the absolute earliest. But it’s probably a good idea to think about this anyway: having the standard rely on distutils, which is unmaintained, is not ideal.

It’s possible I’ve missed something here. Linux has manylinux, and MacOS seems to have multiple “platforms” at once, and I don’t really know how they fit in with the above. Windows is pretty trivial in comparison. Maybe an alternative would be to simply specify replacement rules that don’t use distutils at all? I’d love someone who understands the subject better than me to comment on whether the distutils complexity is even relevant any more.

(As I said above, there’s no rush on this. But I wanted to record the issue now, before I forget about it).

Considering how the last couple of changes to distutils’s get_platform have been approached, this was a very weak definition in the first place :slight_smile:

Packaging cares about different meanings of platform to what core does (such as manylinux, which captures operating system info rather than CPython build info), so defining this somewhere in packaging would make sense.

IMO get_platform() should be moved to or re-implemented in a stdlib module e.g. sysconfig.

If I understand the implementation (admittedly I may not), packaging.tags (and its predecessor pep425tags) fall back to distutils.util.get_platform() when the current platform is “unknown” (in packaging terms, IOW not Windows, macOS, or manylinux). get_platform() returns a useful name for the current platform, based on how the platform describes itself (via compile-time configuration). The value should be fine most of the time, and if some platform does not like the value, it has the option of patching distutils to return what it wants. Moving the get_platform() implementation into packaging would lose them this route, and hard-wire the platform tag logic to sys.platform (or platform.system(); I forgot). There is unfortunately no standard for how an unknown platform should declare itself to packaging tools, so we need to provide a replacement for this patch-the-stdlib approach when we remove distutils.

BTW, it does not make sense to me to make packaging.tags depend on setuptools. setuptools already vendors (a part of) packaging and re-implemented some of its internals to depend on it; making packaging depend on setuptools would be the wrong way around.

I just checked the implementation:

  • platform.system() of “Darwin” or “Linux” triggers special handling. Neither the spec on packaging.python.org nor PEP 425 mentions that platform.system() is involved.
  • Anything else (including Windows) goes via distutils.util.get_platform().
  • I assume packaging follows the manylinux specs for Linux, but I didn’t check that.
  • The special logic for MacOS is basically undocumented.
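The dispatch described in the list above can be sketched roughly as follows. This is only an illustration of the branching, not the packaging source: the function names and the returned labels are hypothetical, and `sysconfig.get_platform()` stands in for the distutils call.

```python
import platform
import sysconfig


def generic_platform_tag() -> str:
    """Fallback: the get_platform() string with '-' and '.' replaced by '_'."""
    return sysconfig.get_platform().replace("-", "_").replace(".", "_")


def platform_tag_route(system: str) -> str:
    """Return a (hypothetical) label for the branch that handles `system`."""
    if system == "Darwin":
        return "macos"    # special, largely undocumented logic
    if system == "Linux":
        return "linux"    # manylinux handling
    return "generic"      # everything else, including Windows


route = platform_tag_route(platform.system())
print("route:", route)
if route == "generic":
    print("tag:", generic_platform_tag())
```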

Are we starting to drift into a situation where the tag definition is ending up implementation defined (in the packaging library)?

The environment variable _PYTHON_HOST_PLATFORM does this for distutils.util.get_platform() (except on Windows…). It is, of course, undocumented.
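A small experiment to observe that override, assuming `sysconfig.get_platform()` honours the same variable on POSIX builds. Since the variable is undocumented, this behaviour could change between Python versions; `get_platform_with_override` is a hypothetical helper and "example-platform" is a made-up value.

```python
import os
import sysconfig

VAR = "_PYTHON_HOST_PLATFORM"


def get_platform_with_override(value: str) -> str:
    """Call sysconfig.get_platform() with VAR temporarily set to `value`."""
    previous = os.environ.get(VAR)
    os.environ[VAR] = value
    try:
        return sysconfig.get_platform()
    finally:
        # Restore the environment exactly as it was found.
        if previous is None:
            del os.environ[VAR]
        else:
            os.environ[VAR] = previous


# On POSIX this is expected to echo the override; on Windows it is not.
print(get_platform_with_override("example-platform"))
```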

That’s certainly a possibility. FWIW, get_platform() special-cases sunos, aix and cygwin out of the POSIX platforms that aren’t Linux or MacOS. I don’t really want the packaging community to have to carry the maintenance cost for that - we don’t have the expertise, and it seems more reasonable for those cases to track core Python support for the respective platforms.

An additional complexity is that the distutils platform is sometimes used for the host platform and sometimes for the target platform (when cross-compiling, these are not the same). We straightened some out when looking at ARM support on Windows, but it’s a long way from being actually usable.

I’d quite like packaging.tags to support a site override (i.e. some kind of data file in sys.prefix so that e.g. Ubuntu could put a file in there self-identifying as “ubuntu2004” or whatever).
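A minimal sketch of what such a site override might look like. The file name `platform-tag.txt`, its location directly under the prefix, and the `platform_tag` function are all hypothetical; nothing like this exists in packaging.tags today.

```python
import sys
import sysconfig
from pathlib import Path
from typing import Optional

# Hypothetical file a distributor could drop into sys.prefix.
OVERRIDE_FILENAME = "platform-tag.txt"


def platform_tag(prefix: Optional[str] = None) -> str:
    """Return the distributor-provided tag if present, else a generic one."""
    root = Path(prefix if prefix is not None else sys.prefix)
    override = root / OVERRIDE_FILENAME
    if override.is_file():
        text = override.read_text(encoding="utf-8").strip()
        if text:
            return text
    # Lazy fallback: the get_platform() string, normalised as in PEP 425.
    return sysconfig.get_platform().replace("-", "_").replace(".", "_")


print(platform_tag())
```

With an override in place, the fallback can stay simple, because anyone who knows a more specific answer can supply it.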

Otherwise, I’m not totally upset about platform detection being implementation defined. We’ve already seen through the manylinux experience (and the moves towards a more implicit definition) that it’s basically inevitable. And we’ve already got too many platform identifiers in the core runtime. Since we don’t need this one (in a post-distutils world), it may as well live outside too. That leaves:

  • sys.platform to identify the compile-time platform
  • os.name to identify the POSIX emulation layer
  • platform.platform to provide user/logging-friendly platform identification
  • platform.* functions for some other detections
  • a range of third party options depending on what you need to know
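To compare the identifiers listed above on a given machine, a quick script like the following prints them side by side. It uses only stdlib calls; `sysconfig.get_platform()` is included as the closest remaining stdlib relative of the distutils value.

```python
import os
import platform
import sys
import sysconfig

identifiers = {
    "sys.platform": sys.platform,                 # compile-time platform
    "os.name": os.name,                           # e.g. "posix" or "nt"
    "platform.platform()": platform.platform(),   # human-friendly description
    "platform.system()": platform.system(),
    "platform.machine()": platform.machine(),
    "sysconfig.get_platform()": sysconfig.get_platform(),
}

for name, value in identifiers.items():
    print(f"{name:26} {value}")
```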

By “implementation defined” I didn’t mean letting the platform/implementation choose what tag they wanted to identify as, I meant that the algorithm for tools to work out what the platform is for this interpreter isn’t documented. At the moment that algorithm is fairly complex, and should be documented. If someone wants to change the algorithm to something like "the value of site.packaging_platform" then it’s trivially easy to document, but there’s still no harm in doing so.

The harm in that is then the platform can’t change in past Python releases, unless your algorithm is "complicated but documented, and fall back on core.attribute", in which case you may as well make it just a little bit more complicated and not rely on any fallbacks. Ideally, the platform tag should be completely independent of the runtime anyway.

Allowing distributors to override the platform tag means you can get away with lazier fallbacks. Going straight to “linux_x64” is fine when anyone who knows they can be more specific can override it. But when you want to have complete control over the algorithm, yeah, you’ve got to deal with all the edge cases now, because you aren’t letting anyone else participate :slight_smile:

I’m not sure I follow but I don’t think it matters. All I really care about is that people should be able to reimplement what packaging.tags does just from reading the specs, without needing to read the source of packaging.tags. Otherwise we’ve just ended up in a situation where the standards don’t define behaviour clearly enough to be usable.

It does (@mattip did a bunch of work to keep it ahead of perennial manylinux).

Not starting; always have been. :wink: All of that logic came from pep425tags so it’s been that way for quite some time.

Yep, we are already there. The fact I had to reverse-engineer so much to implement packaging.tags proved that point.

Honestly, the trick with documenting any of this is the interpreter-specific logic of interleaving tag priorities, etc. Now if people are willing to break backwards compatibility in terms of tag priority order, then it could get straightened out pretty easily in terms of an algorithm (although the verbiage alone around macOS will be a pain to write).

Coincidentally I just stumbled on https://github.com/pypa/pip/issues/6121. The following comments seem particularly relevant:

The documentation on how these work together is pretty sorely lacking:

  • what is the valid list of platforms and what is the logic for how they match?
  • the provided examples about macosx use the wrong specifier, AFAICT, using a dash instead of an underscore.
  • the pep that is referred to is far too long to grok and it also seems out of date

After this discussion, I was sorely tempted to say that trying to specify that you want to apply tags for a system other than the currently running interpreter was never an intended use case, and is in principle not possible to do accurately…

Unless you go to that system and ask it what tags it is compatible with :slight_smile:
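Asking the running interpreter which tags it accepts can be sketched with `packaging.tags.sys_tags()`. This assumes the third-party packaging library is installed on that system; `supported_tags` is a hypothetical wrapper that returns an empty list when it is not.

```python
from itertools import islice
from typing import List


def supported_tags(limit: int = 10) -> List[str]:
    """Return up to `limit` tags, highest priority first, as strings."""
    try:
        from packaging.tags import sys_tags
    except ImportError:
        # packaging is not part of the stdlib; nothing to report without it.
        return []
    return [str(tag) for tag in islice(sys_tags(), limit)]


for tag in supported_tags():
    print(tag)
```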

I think being able to predict what tags an arbitrary system will accept should be out of scope, mostly because it would spoil my desires for a particular install to accept arbitrary tags (one concrete example, there could be value in the Store package of CPython accepting an additional tag, because it has access to system APIs that the regular install does not, despite being nearly identical bits on disk).

I just created https://github.com/pypa/pip/issues/8857 as a place to discuss reviewing pip’s UI for tags.
