I’d like to draw attention to Issue 133143, which adds a new attribute to the sys module of the standard library.
This suggested change grew out of the perceived need to make information about ABI characteristics of the interpreter more easily available.
For example, currently it is not possible to determine whether the interpreter is built for a 32-bit or 64-bit architecture on platforms where both kinds of binaries are supported, such as Windows.
Furthermore, determining some aspects of the ABI, like free-threadedness or the presence of debug features, is cumbersome and can vary between platforms.
To address these issues and make such characteristics easy to determine, the new attribute is introduced.
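For reference, here is a minimal sketch of the workarounds available today. The helper names are my own, and the debug check is only a heuristic that relies on sys.gettotalrefcount being present in debug builds of CPython; none of this is the proposed API.

```python
import struct
import sys
import sysconfig


def pointer_bits() -> int:
    """Width of a C pointer (void *) in bits for the running interpreter."""
    return struct.calcsize("P") * 8


def is_free_threaded() -> bool:
    """True if the build configuration reports Py_GIL_DISABLED."""
    # The config var can be None where it is not defined, so coerce to bool.
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))


def is_debug_build() -> bool:
    """Heuristic: sys.gettotalrefcount exists only in debug builds of CPython."""
    return hasattr(sys, "gettotalrefcount")


print(pointer_bits(), is_free_threaded(), is_debug_build())
```

Each piece of information comes from a different module and a different mechanism, which is the inconvenience described above.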
More precisely, it’s not possible to determine between AMD64 and ARM64 on platforms where both kinds are supported, such as Windows.[1]
I think this is a good idea, but want to see the values properly specified (a couple to start with, and policies/guidance on adding/removing them later).
[1] The current way is to inspect sys.winver, which is fine, but it’s not really formalised.
I think I know the reason, but, could you explain why these are flags, rather than key-value pairs?
Under the current scheme, a (non-CPython) implementation could use 16-bit or 128-bit. One could argue that a better model would use a dict entry (or attribute) like bitness = 32 or pointer_bits = 64.
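To illustrate the difference, here is a small sketch of the two models. The flag strings and dictionary keys are hypothetical and chosen only for this example.

```python
# Flag model: a string being present means "true"; every value needs its own flag.
flags = frozenset({"64-bit", "free-threaded"})

# Key-value model: one key can hold any value.
info = {"pointer_bits": 64, "free_threaded": True}


def bits_from_flags(flag_set):
    """Recover the pointer width from flags; only the flags listed here are known."""
    known = {"32-bit": 32, "64-bit": 64}
    for name, bits in known.items():
        if name in flag_set:
            return bits
    return None  # an unfamiliar flag such as "128-bit" is not understood


print(bits_from_flags(flags))                   # 64
print(bits_from_flags(frozenset({"128-bit"})))  # None
print(info["pointer_bits"])                     # 64
```

With flags, a consumer has to know every possible spelling in advance; with a key-value entry, a new width is just a new number.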
I’m a bit worried that the scope isn’t well-defined.
Does little_endian/big_endian (sys.byteorder) belong here?
Or some more obscure ABI-affecting flags, like HAVE_FORK? (The real question here being: why not?)
For context, I’ve been thinking about adding some abi_info object with a somewhat bigger scope. One possibility would be that we add that, and packaging specs could cherry-pick pieces to export as environment marker flags. This would also be more backwards compatible: if something would need to be added later, packaging tools could easily use a non-abi_info workaround on older Pythons.
Hmm, writing that made me lean more toward CPython exposing the needed info in a straightforward and supported way, but not necessarily as a set of flags.
@pitrou, you are right, that is done; I was thinking of the packaging environment markers that don’t have access to that information. However, as @timfelgentreff points out, this is not a universal way and I do think that it would be good to make this information more readily available.
@encukou, I agree with your suggestion to make this information more general. Indeed, an initial attempt at a reference implementation gathered feedback from @FFY00 and others that points in the same direction.
Keeping in mind your proposal for module slots, I think it would be best to have a static registry of abi information, active and inactive features, that can be queried both for compatibility checks and environment marker generation.
My Py_mod_abi proposal doesn’t do what you want: it describes an extension, not the interpreter.
So we need something else.
I don’t think a C API for this makes much sense here. If the “bitness” is wrong, you can’t use any C API at all; if others don’t match you might be OK, depending on what you call.
So, let’s focus on something in sys that’s useful before the dll is loaded – e.g. to choose which DLL to build/install/load?
I don’t think the info should be based on flags. “Bitness” (sizeof(void*)) is a number – 4 or 8 bytes.
Also, we want to differentiate between a feature not being present/active, and the feature being unknown/unchecked. This will become important as future Python versions add new “flags”.
So, instead of a set of strings, I think this should be an object with attributes, in the style of sys.int_info and sys.thread_info. For example:
sys.abi_info.pointer_size (4 or 8), or pointer_bits (32 or 64)
sys.abi_info.Py_GIL_DISABLED (True/False)
sys.abi_info.Py_DEBUG
And perhaps all the other feature macros that define the build, e.g.:
sys.abi_info.Py_TRACE_REFS
sys.abi_info.Py_REF_DEBUG
sys.abi_info.Py_ENABLE_SHARED
And maybe more that describe the platform. How far do we want to go?
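As a rough sketch of the shape I have in mind, the following builds a stand-in object from information that is already available. The abi_info name here is a plain local variable, not sys.abi_info, the attribute names are just the suggestions above, and the Py_DEBUG value is a heuristic based on sys.gettotalrefcount.

```python
import struct
import sys
import sysconfig
from types import SimpleNamespace

# Hypothetical stand-in for the proposed object; not a final API.
abi_info = SimpleNamespace(
    pointer_bits=struct.calcsize("P") * 8,
    Py_GIL_DISABLED=bool(sysconfig.get_config_var("Py_GIL_DISABLED")),
    Py_DEBUG=hasattr(sys, "gettotalrefcount"),  # heuristic for debug builds
)


def query(name):
    """Return the entry's value, or None if this Python does not know the entry."""
    return getattr(abi_info, name, None)


print(query("pointer_bits"))   # 32 or 64
print(query("Py_TRACE_REFS"))  # None: unknown here, which differs from False
```

The point of the attribute style is the last line: a missing attribute means "unknown/unchecked", while False means "known to be off".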
I’ll have a closer look at the references and suggestions, but wanted to clarify one thing.
I did understand that you were talking about extensions. My reasoning was that, if we put a central catalog of abi features in the interpreter, extensions can refer to that for the compatibility check, i.e. the same source of truth about abi can be used for several purposes if designed carefully.
Extensions have no need for checking the “bitness” of the interpreter at runtime. If the pointer size is wrong, the extension can’t be loaded.
AFAIU, this thread is about installation metadata, and perhaps tags in filenames – things used before an extension is loaded – in build/install tools, and perhaps in the import machinery. We don’t need C API for any of that.
My Py_mod_abi proposal is for a rough check, to get a clean exception in case the packaging/installation tools got things wrong. For example: you compiled a C file manually and aren’t using any wheel stuff at all. Or: you’re using an older tool that doesn’t implement some recent detail.
There’s a bit of overlap between the use cases, but, I don’t think it’s enough to justify e.g. defining a C API & ABI for representing all the details.
For now, I have only added the core three properties, though adding further ones is rather straightforward.
Regarding the wheel next variants, I think it could be a good idea to align with this by declaring a special namespace, say builtin, interpreter, or, indeed, abi_info, which would allow us to refer to the pieces of abi information as, e.g. builtin :: pointer_size :: 32 and similar for purposes of packaging.
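A minimal sketch of how such strings could be produced from a mapping of entries. The function name is hypothetical, and the "namespace :: key :: value" layout simply mirrors the example above rather than any finalized variant specification.

```python
def variant_properties(namespace, entries):
    """Format each entry as 'namespace :: key :: value', sorted by key."""
    return [
        f"{namespace} :: {key} :: {value}"
        for key, value in sorted(entries.items())
    ]


print(variant_properties("builtin", {"pointer_size": 32}))
# ['builtin :: pointer_size :: 32']
```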
Thanks, @encukou, for the reviews. Technically, the PR is in place and ready. To me, adding Py_TRACE_REFS and Py_REF_DEBUG makes sense; the others are, imho, either easily accessible by other means, or too obscure.
The ABIFLAGS config var is essential, but the main reason people wanted this new field is because they prefer to use sys.abiflags (which is for smuggling the config var on POSIX, and so not consistent on all platforms).
There’s a separate discussion about transitioning sys.abiflags over a couple of releases to become consistent with sysconfig (and more immediate changes to make sysconfig’s value meaningful, since there are fewer back-compat risks there). Adding something entirely new with properly defined semantics (not just “whatever the makefile decided”) is the right way forward.
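To see the inconsistency on a given machine, a quick check like the following prints both values. It assumes only that sys.abiflags may be absent on some platforms and that the ABIFLAGS config var may be undefined, so both lookups are guarded.

```python
import sys
import sysconfig

# sys.abiflags has not been available on every platform, so guard the lookup.
runtime_flags = getattr(sys, "abiflags", None)

# ABIFLAGS comes from the build configuration; None where it is not defined.
config_flags = sysconfig.get_config_var("ABIFLAGS")

print(f"sys.abiflags       = {runtime_flags!r}")
print(f"sysconfig ABIFLAGS = {config_flags!r}")
```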
sys.abi_info should include information that affects the CPython ABI in ways that require a specific build of the interpreter chosen from variants that can co-exist on a single machine. For example, it does not encode the base OS (Linux or Windows), but does include pointer size since some systems support both 32- and 64-bit builds.
The available entries are the same on all platforms; e.g. pointer_size is available even on 64-bit-only architectures.
New entries should be added when needed for a supported platform, or (for enabling an unsupported one) by core dev consensus.
Entries should be removed following PEP 387. (Most likely the relevant part would be: “If the expected maintenance overhead and security risk of the deprecated behavior is small […], it can stay indefinitely”)
remove Py_DEBUG (it does not itself affect the ABI)
not add Py_TRACE_REFS (does not affect the ABI)
add Py_REF_DEBUG (alters behaviour of static inline functions)
not add Py_ENABLE_SHARED (does not affect ABI)
add PY_BIG_ENDIAN (some CPUs support both endians, very similarly to “bitness”)
not add sizes of C primitive types (that’s part of the OS, given the pointer size)
not add HAVE_FORK & sys.abi_info.MS_WINDOWS (part of OS)
(Removing Py_DEBUG and adding Py_REF_DEBUG is just changing a detail of what exactly sys.abi_info.debug exposes.)
This feels like something that should be generated from abi_info, rather than be part of it (though we could expose it as a special attribute on the same object despite the different semantics).
While encoding the info as a string might be useful, I’d prefer avoiding the need to parse text back to structured information.
So does this, though we don’t support it there (at least until we get enough PRs to support it).
Agreed on the rest.
It’s calculated at compile time and read directly out of the generated Makefile. So it should be possible to infer it from abi_features/abi_info, but it’s more important that it match the compile-time setting (in case someone overrides it during build).