Add `sys.abi_features` to make information about the interpreter ABI more accessible

I’d like to draw attention to Issue 133143, which adds a new attribute to the sys module of the standard library.

This suggested change grew out of the perceived need to make information about ABI characteristics of the interpreter more easily available.
For example, currently it is not possible to determine whether the interpreter is built for a 32-bit or 64-bit architecture on platforms where both kinds of binaries are supported, such as Windows.
Furthermore, determining some aspects of the ABI, like free-threadedness or the presence of debug features, is cumbersome and can vary between platforms.
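To illustrate how scattered these checks currently are, here is a minimal sketch that uses only existing standard-library calls. The helper names `is_free_threaded_build` and `is_debug_build` are made up for this example, and treating a missing `Py_GIL_DISABLED` config variable as "not free-threaded" is an assumption.

```python
import sys
import sysconfig


def is_free_threaded_build():
    """Best-effort check for a free-threaded build (hypothetical helper)."""
    # The config variable is 1 on free-threaded builds; it may be 0 or
    # missing (None) elsewhere, which this sketch treats as "not free-threaded".
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))


def is_debug_build():
    """Best-effort check for a debug build (hypothetical helper)."""
    # sys.gettotalrefcount only exists when CPython is compiled with
    # reference-count debugging, which debug builds enable.
    return hasattr(sys, "gettotalrefcount")


print("free-threaded:", is_free_threaded_build())
print("debug:", is_debug_build())
```

Each property is read from a different place and by a different mechanism, which is the inconvenience the proposal is meant to remove.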

To address these issues and make such characteristics easy to determine, the new attribute is introduced.


More precisely, it’s not possible to distinguish between AMD64 and ARM64 on platforms where both are supported, such as Windows.[1]

I think this is a good idea, but want to see the values properly specified (a couple to start with, and policies/guidance on adding/removing them later).


  1. The current way is to inspect sys.winver, which is fine, but it’s not really formalised.

I think I know the reason, but, could you explain why these are flags, rather than key-value pairs?
Under the current scheme, a (non-CPython) implementation could use 16-bit or 128-bit. One could argue that a better model would use a dict entry (or attribute) like bitness = 32 or pointer_bits = 64.

I’m a bit worried that the scope isn’t well-defined.
Does little_endian/big_endian (sys.byteorder) belong here?

Or some more obscure ABI-affecting flags, like HAVE_FORK? (The real question here being: why not?)

For context, I’ve been thinking about adding some abi_info object with a somewhat bigger scope. One possibility would be that we add that, and packaging specs could cherry-pick pieces to export as environment marker flags. This would also be more backwards compatible: if something would need to be added later, packaging tools could easily use a non-abi_info workaround on older Pythons.

Hmm, writing that made me lean more toward CPython exposing the needed info in a straightforward and supported way, but not necessarily as a set of flags.


Isn’t sys.maxsize idiomatically used for this?

bits = 64 if sys.maxsize > 2**32 else 32

That doesn’t work for all interpreters, e.g. GraalPy, which is a 64-bit VM but (due to the Java base) has a maximum of 31 bits for its array sizes.

Packaging has changed their detection code to match what’s in platform.py using struct.calcsize("P") == 4.

I feel that’s obscure and hard enough to get right to warrant a better place to query this.
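For comparison, the two idioms mentioned above can be put side by side. This is only a sketch; the variable names are chosen for illustration.

```python
import struct
import sys

# Size of a C pointer (void *) in bytes, as seen by this interpreter.
pointer_size = struct.calcsize("P")
pointer_bits = pointer_size * 8

# The older idiom: infer the width from the largest container index.
# This can under-report on implementations whose sys.maxsize is
# smaller than the native pointer range.
maxsize_bits = 64 if sys.maxsize > 2**32 else 32

print("pointer bits (struct):", pointer_bits)
print("pointer bits (sys.maxsize idiom):", maxsize_bits)
```

On CPython both lines usually agree; the point of the struct-based check is that it does not depend on how an implementation limits its container sizes.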


@pitrou, you are right, that is done; I was thinking of the packaging environment markers that don’t have access to that information. However, as @timfelgentreff points out, this is not a universal way and I do think that it would be good to make this information more readily available.

@encukou, I agree with your suggestion to make this information more general. Indeed, an initial attempt at a reference implementation gathered feedback from @FFY00 and others that points in the same direction.
Keeping in mind your proposal for module slots, I think it would be best to have a static registry of ABI information, active and inactive features, that can be queried both for compatibility checks and environment marker generation.

Perhaps something like

struct PyAbiFeature {
    int index;
    const char *name;
    const char *tag;
    const bool available;
};

static struct PyAbiFeature Py_ABI_FEATURES[] = {
    {0, "free-threading", "t", Py_GIL_DISABLED},
    {1, "gil-enabled", "", !Py_GIL_DISABLED},
    {2, "debug", "d", Py_DEBUG},
};

possibly wired into sys in one way or another?

My Py_mod_abi proposal doesn’t do what you want: it describes an extension, not the interpreter.
So we need something else.

I don’t think a C API for this makes much sense here. If the “bitness” is wrong, you can’t use any C API at all; if others don’t match you might be OK, depending on what you call.
So, let’s focus on something in sys that’s useful before the DLL is loaded – e.g. to choose which DLL to build/install/load?

I don’t think the info should be based on flags. “Bitness” (sizeof(void*)) is a number – 4 or 8 bytes.
Also, we want to differentiate between a feature not being present/active, and the feature being unknown/unchecked. This will become important as future Python versions add new “flags”.
So, instead of a set of strings, I think this should be an object with attributes, in the style of sys.int_info and sys.thread_info. For example:

  • sys.abi_info.pointer_size (4 or 8), or pointer_bits (32 or 64)
  • sys.abi_info.Py_GIL_DISABLED (True/False)
  • sys.abi_info.Py_DEBUG

And perhaps all the other feature macros that define the build, e.g.:

  • sys.abi_info.Py_TRACE_REFS
  • sys.abi_info.Py_REF_DEBUG
  • sys.abi_info.Py_ENABLE_SHARED

And maybe more that describe the platform. How far do we want to go?

  • sys.abi_info.PY_BIG_ENDIAN
  • sizes of C primitive types, Py_ssize_t, wchar_t
  • sys.abi_info.HAVE_FORK
  • sys.abi_info.MS_WINDOWS
  • etc.
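The difference between the two models can be sketched as follows. Everything here is illustrative: the flag names, the attribute names and the values are placeholders, and `lookup` is a hypothetical helper, not part of any proposed API.

```python
from types import SimpleNamespace

# Flag-style model: a set of strings (feature names are illustrative only).
flag_style = {"64bit", "free-threading"}

# Attribute-style model, loosely following sys.int_info / sys.thread_info.
# The attribute names and values below are placeholders, not a real API.
attr_style = SimpleNamespace(pointer_bits=64, Py_GIL_DISABLED=True, Py_DEBUG=False)


def lookup(info, name):
    """Return the value of an entry, or None if this interpreter does not report it."""
    return getattr(info, name, None)


# With flags, "debug is off" and "debug is not reported" look the same.
print("debug" in flag_style)                 # False in both cases

# With attributes, the three states are distinguishable.
print(lookup(attr_style, "Py_DEBUG"))        # False: known, and off
print(lookup(attr_style, "Py_REF_DEBUG"))    # None: not reported at all
print(lookup(attr_style, "pointer_bits"))    # 64: a number, not a flag
```

A tool running on an older Python could then fall back to another detection method whenever it sees None, instead of wrongly concluding that a feature is disabled.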

I’ll have a closer look at the references and suggestions, but wanted to clarify one thing.

I did understand that you were talking about extensions. My reasoning was that, if we put a central catalog of ABI features in the interpreter, extensions can refer to that for the compatibility check, i.e. the same source of truth about the ABI can be used for several purposes if designed carefully.

What are the purposes?

Extensions have no need for checking the “bitness” of the interpreter at runtime. If the pointer size is wrong, the extension can’t be loaded.
AFAIU, this thread is about installation metadata, and perhaps tags in filenames – things used before an extension is loaded – in build/install tools, and perhaps in the import machinery. We don’t need C API for any of that.

My Py_mod_abi proposal is for a rough check, to get a clean exception in case the packaging/installation tools got things wrong. For example: you compiled a C file manually and aren’t using any wheel stuff at all. Or: you’re using an older tool that doesn’t implement some recent detail.

There’s a bit of overlap between the use cases, but, I don’t think it’s enough to justify e.g. defining a C API & ABI for representing all the details.


BTW, you might want to add the variant wheels draft from Wheel Next to the reading list: Wheel Variant Support - WheelNext


Thanks for the pointers, @encukou! Based on your suggestion, I came up with an implementation for sys.abi_info.

For now, I have only added the core three properties, though adding further ones is rather straightforward.

Regarding the Wheel Next variants, I think it could be a good idea to align with this by declaring a special namespace, say builtin, interpreter, or, indeed, abi_info, which would allow us to refer to the pieces of ABI information as, e.g., builtin :: pointer_size :: 32, and similarly for other entries, for purposes of packaging.
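A minimal sketch of what such a mapping could look like, assuming the `namespace :: name :: value` notation used above. The helper `variant_properties` and the entry names are hypothetical, and the exact syntax required by the Wheel Next draft is not reproduced here.

```python
def variant_properties(namespace, info):
    """Render entries as 'namespace :: name :: value' strings (hypothetical helper)."""
    return [f"{namespace} :: {name} :: {value}" for name, value in sorted(info.items())]


# Placeholder values; a real tool would read them from the interpreter.
example_info = {"pointer_size": 32, "free_threaded": False}

for line in variant_properties("builtin", example_info):
    print(line)
```

Sorting the entries keeps the output stable, which matters if the strings end up in metadata that gets compared or hashed.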

Thanks, @encukou, for the reviews. Technically, the PR is in place and ready. To me, adding Py_TRACE_REFS and Py_REF_DEBUG makes sense; the others are, imho, either easily accessible by other means, or too obscure.

What do you think?

@timfelgentreff, is there something else that you would like to see from a GraalPy perspective?

Since we’re consolidating info that’s available elsewhere, I’d put a copy of sys.byteorder in this struct too.

I guess policy could be driven by packaging needs. Is there some discussion on that side about this being useful/necessary?


The ABIFLAGS config var is essential, but the main reason people wanted this new field is because they prefer to use sys.abiflags (which is for smuggling the config var on POSIX, and so not consistent on all platforms).

There’s a separate discussion about transitioning sys.abiflags over a couple of releases to become consistent with sysconfig (and more immediate changes to make sysconfig’s value meaningful, since there are fewer back-compat risks there). Adding something entirely new with properly defined semantics (not just “whatever the makefile decided”) is the right way forward.
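The two existing sources can be compared with a short sketch. Reading `sys.abiflags` through `getattr` reflects the platform inconsistency described above; the variable names are chosen for illustration.

```python
import sys
import sysconfig

# sys.abiflags has historically been defined only on POSIX builds,
# so it is read defensively here.
runtime_flags = getattr(sys, "abiflags", None)

# The build-time value recorded by sysconfig; it may be None where the
# build system does not define it.
config_flags = sysconfig.get_config_var("ABIFLAGS")

print("sys.abiflags:", repr(runtime_flags))
print("sysconfig ABIFLAGS:", repr(config_flags))
```

Code that has to run everywhere ends up handling a missing attribute, a missing config variable, and a possible mismatch between the two.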

It seems to me that the consensus here is still:

So let me throw a draft in:

sys.abi_info should include information that affects the CPython ABI in ways that require a specific build of the interpreter chosen from variants that can co-exist on a single machine. For example, it does not encode the base OS (Linux or Windows), but does include pointer size since some systems support both 32- and 64-bit builds.
The available entries are the same on all platforms; e.g. pointer_size is available even on 64-bit-only architectures.

New entries should be added when needed for a supported platform, or (for enabling an unsupported one) by core dev consensus.
Entries should be removed following PEP 387. (Most likely the relevant part would be: “If the expected maintenance overhead and security risk of the deprecated behavior is small […], it can stay indefinitely”)


Going through my brainstorming list above, and the current PR, under this policy we should:

  • keep the pointer_size
  • keep Py_GIL_DISABLED
  • remove Py_DEBUG (it does not itself affect the ABI)
  • not add Py_TRACE_REFS (does not affect the ABI)
  • add Py_REF_DEBUG (alters behaviour of static inline functions)
  • not add Py_ENABLE_SHARED (does not affect ABI)
  • add PY_BIG_ENDIAN (some CPUs support both endians, very similarly to “bitness”)
  • not add sizes of C primitive types (that’s part of the OS, given the pointer size)
  • not add HAVE_FORK & sys.abi_info.MS_WINDOWS (part of OS)

(Removing Py_DEBUG and adding Py_REF_DEBUG is just changing a detail of what exactly sys.abi_info.debug exposes.)


This feels like something that should be generated from abi_info, rather than be part of it (though we could expose it as a special attribute on the same object despite the different semantics).
While encoding the info as a string might be useful, I’d prefer avoiding the need to parse text back to structured information.
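Generating a string from structured values, rather than parsing one, could look roughly like this. The helper `flags_from_info` is hypothetical; the "t" and "d" tags are the ones used for free-threading and debug earlier in the thread, and the ordering is an assumption.

```python
def flags_from_info(free_threaded, debug):
    """Build a short flag string from structured values (hypothetical helper)."""
    flags = ""
    if free_threaded:
        flags += "t"  # tag used for free-threading earlier in the thread
    if debug:
        flags += "d"  # tag used for debug builds earlier in the thread
    return flags


print(repr(flags_from_info(free_threaded=True, debug=False)))   # 't'
print(repr(flags_from_info(free_threaded=True, debug=True)))    # 'td'
print(repr(flags_from_info(free_threaded=False, debug=False)))  # ''
```

Going in this direction is lossless, whereas recovering the structured values from the string requires knowing every tag that may ever appear.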


It does on Windows.

So does this, though we don’t support it there (at least until we get enough PRs to support it).

Agreed on the rest.

It’s calculated at compile time and read directly out of the generated Makefile. So it should be possible to infer it from abi_features/abi_info, but it’s more important that it match the compile-time setting (in case someone overrides it during build).
