Use the limited C API for some of our stdlib C extensions

pitrou · September 5, 2023, 8:30am

I wish this would happen too, but work seems to have stalled since Feb 2022. While NumPy is an attractive target, perhaps it’s really too complicated and HPy should have started with a simpler project?

malemburg · September 5, 2023, 1:50pm

Right, but this effort is not going to go away by using the stable ABI.

Maintainers will still have to test their packages with the new Python release and fix any issues they find. Note that packages typically include Python code as well, which doesn’t magically continue to work just because the included C extension uses the stable ABI.

FWIW, I don’t think it’s a good idea to tell users: hey, look, you can continue to use the packages released for Python 3.11 with Python 3.12, since the C extension uses the stable ABI, but without actually testing the package with 3.12.
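To make the distinction concrete, here is a minimal sketch of what "the C extension uses the stable ABI" means at install time. It assumes the standard wheel filename layout (name, version, optional build tag, Python tag, ABI tag, platform tag). The helper name `abi3_installable_on` is hypothetical and is not part of pip or any packaging library.

```python
import re

# Hypothetical helper for illustration only; real installers use far more
# complete tag-matching logic than this.
_WHEEL_RE = re.compile(
    r"^(?P<name>[^-]+)-(?P<version>[^-]+)(-(?P<build>\d[^-]*))?"
    r"-(?P<python>[^-]+)-(?P<abi>[^-]+)-(?P<platform>[^-]+)\.whl$"
)


def abi3_installable_on(wheel_filename, minor):
    """Return True if the wheel is tagged abi3 and was built for a
    CPython 3.x version no newer than 3.<minor>."""
    match = _WHEEL_RE.match(wheel_filename)
    if match is None:
        raise ValueError(f"not a wheel filename: {wheel_filename!r}")
    # Compressed tag sets are separated by dots, e.g. "cp38.cp39".
    if "abi3" not in match["abi"].split("."):
        return False
    for tag in match["python"].split("."):
        cp = re.fullmatch(r"cp3(\d+)", tag)
        if cp and int(cp.group(1)) <= minor:
            return True
    return False


# A wheel built against the limited API on 3.11 is installable on 3.12 ...
print(abi3_installable_on("example-1.0-cp311-abi3-manylinux_2_17_x86_64.whl", 12))
# ... while a version-specific wheel for 3.11 is not.
print(abi3_installable_on("example-1.0-cp311-cp311-manylinux_2_17_x86_64.whl", 12))
```

Note that a `True` result only says the binary can be installed and loaded on the newer Python. It says nothing about whether the package was ever tested there, which is exactly the concern above.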

Likewise, users should be made aware that packages they are installing with pip install may actually not be tested with the just-released new Python version.

Making it easier to port C extension packages to new Python releases sounds like a much better plan, since then the effort for the maintainers is materially reduced and not just postponed.

Which is why I think effort on the core dev side is better spent on projects such as your compatibility tooling, rather than maintaining two variants of the same API.

I have already stated my opinion on this: the desktop application maintainers are in charge here. For Linux distributions, esp. the paid ones, the distribution companies should do this kind of packaging and relieve the maintainers from these tasks.

I know that handling OpenSSL upgrades is painful (we maintained a client-server product using Python and OpenSSL for many years), but that’s mostly due to the OpenSSL side of things, not so much because Python makes this difficult.

If there are many such desktop applications, perhaps the maintainers of these could join forces and create a distribution of Python which is geared towards making embedding easy and painless for them. I don’t think this is something the core dev team should be taking on.

merwok · September 5, 2023, 2:36pm

This is an interesting point. People with experience know to pin their dependencies, test their apps automatically, wait for x.y.1 release before upgrading, but someone getting started may find the official download page and get the most recent release when the paint is still fresh on it. What do people think about reorganizing the page slightly so that the current version is above a very recent version?

steve.dower · September 5, 2023, 3:00pm

As a point of reference, the python stub that ships with Windows doesn’t switch to the latest release until there’s “broad community support” for it, which is a deliberately vague definition to allow us to judge the situation around each release (and avoid having people try and game it).

Generally it’s been switching over around 3.x.2.

EpicWink · September 6, 2023, 10:22pm

I suspect it only shifts the goalposts. Right now, package maintainers are able to release compatible binary wheels as soon as Python RC1 is released (and made available to build tooling), so X.Y.0 already has some buffer. Making the default switch over later only increases the buffer (which there may be an argument for, but then I would say it’s easier to extend the RC period).

I think one problem here is that package maintainers aren’t aware RC releases are out simply because they have better things to do than keep up with Python pre-releases, and the first they are reminded of a new release is when users ask about support for it, which might be when X.Y.0 is out. My suggestion would be for us to be more proactive in notifying maintainers (I have an idea).

PS: I think this discussion is off-topic with respect to the limited C API usage in the standard library, and could be split off into a new thread

oscarbenjamin · September 6, 2023, 10:44pm

I will not personally do that. I will prepare for the new release of CPython and have it tested through the alpha, beta, etc. stages in CI. However, I will not push out a release of my package claiming compatibility with CPython X.Y until I can see the build complete and tests pass with the final release of CPython X.Y.0. The time at which I issue the release to support a new CPython version is always going to be after CPython issues its release. How long it takes depends on a bunch of factors, because there are almost always some other things that need to be considered at the time, even if everyone involved is not just busy with entirely different things.

steve-s · September 7, 2023, 9:53am

You were looking at the wrong branch:

The last change was on May 11. Since then, we’ve worked on upstreaming the necessary changes to HPy and making it work with GraalPy and PyPy. We presented our results at EuroPython this year. It’s not stalled.

Indeed, HPy could solve many of the problems discussed here. For CPython, it can remain a separate PyPI package, acting as a shim translating HPy to the CPython C API. The only thing necessary for a new CPython release would be to ensure this shim continues to work. 3rd party packages using HPy would need no recompilation. HPy is intended to be a smaller subset of the whole CPython C API and is more abstract, so we believe that its binary compatibility can be easily maintained. Moreover, its design is tailored for providing long-term ABI compatibility. There will always be packages that require a lower level or too CPython-specific APIs, and that’s fine; they will continue using the CPython C API. However, the vast majority of packages can function with the HPy API (we believe that porting NumPy and a few other smaller packages showcases that).

Additionally, I think a smaller “clean room” API would be a good target for better “standardization.” That is, a specification of the contract of the API, detailing what is supported and what is not, documentation and tooling. The HPy design allows one to run the same binary in a “regular” mode and in a “check that I am not breaking the contract” mode. The idea is that unless your code works in the “I am not breaking the contract” mode, you cannot count on ABI compatibility, so one can add even relatively expensive checks that prevent abusing the API and seal it for future development without affecting the “production” performance.

encukou · September 7, 2023, 12:57pm

We’ve reached different personal opinions, so I’ll only answer a part of this post:

Yet, that’s what happens for most pure-Python libraries. Why should native extensions be different?
PEP 387 applies regardless of stable ABI: if something breaks without deprecation, it’s a bug in CPython (or an explicit exception).

FWIW, a hypothetical PyPI build service won’t help much with this issue. We might need a test service. And/or perhaps metadata that would allow pip to warn “this package wasn’t tested on this version of Python”.
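As a rough illustration of the metadata idea, the sketch below checks a distribution's Trove classifiers for a given Python version. It is an assumption about how such a warning could work, not a description of anything pip does today. The function names `declares_python_version` and `untested_warning` are hypothetical. Classifiers are also only a maintainer's declaration, not proof that tests were run.

```python
from importlib import metadata


def declares_python_version(classifiers, major, minor):
    """Return True if the classifiers explicitly list Python major.minor."""
    wanted = f"Programming Language :: Python :: {major}.{minor}"
    return wanted in {c.strip() for c in classifiers}


def untested_warning(dist_name, major, minor):
    """Return a warning string for an installed distribution, or None.

    Hypothetical helper: pip has no such check built in.
    Raises importlib.metadata.PackageNotFoundError if dist_name
    is not installed.
    """
    classifiers = metadata.metadata(dist_name).get_all("Classifier") or []
    if declares_python_version(classifiers, major, minor):
        return None
    return (
        f"{dist_name} does not declare support for Python {major}.{minor}; "
        "it may not have been tested on this version"
    )


example = [
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3.11",
]
print(declares_python_version(example, 3, 11))  # True
print(declares_python_version(example, 3, 12))  # False
```

A real implementation would have to decide what to do with packages that list only `Programming Language :: Python :: 3`, or no classifiers at all.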

brettcannon · September 8, 2023, 6:28pm

3 posts were split to a new topic: Have pip warn when installing for a Python version that is not covered by the Trove classifiers?

vstinner · October 5, 2023, 9:05pm

Interesting use case of the stable ABI: Wenzel Jakob has been developing nanobind as a successor project (at least for my needs) to pybind11. As of Python 3.12, nanobind uses the limited API and so can be used to create stable-ABI bindings of large C++ projects. See: Proposal to add `Py_IsFinalizing()` to the limited API/stable ABI · Issue #110397 · python/cpython · GitHub

pitrou · October 5, 2023, 9:55pm

And for reference, the nanobind project lives at GitHub - wjakob/nanobind: nanobind: tiny and efficient C++/Python bindings

storchaka · October 6, 2023, 9:44am

Sorry, posted to wrong topic.

vstinner · October 11, 2023, 12:21pm

UPDATE: I gave a talk at the Python core dev sprint in Brno, my slides: Python C API (PDF). I elaborated on the benefits of the limited C API, on why I think the stable ABI will be key to the success of the C API in the coming years, and on how using the limited C API for some stdlib C extensions will increase code coverage and test coverage of the limited C API.

I created a new PR to build the _stat C extension with the limited C API: PR #110711.

wjakob · November 10, 2023, 9:16am

@vstinner looks like there will be a bunch of stable ABI packages via nanobind in the future. These are projects which transitioned from prior pybind11-based implementations. For example, JAX (a machine learning framework by Google) ported the bindings of the C++ component and specifically referenced stable ABI support as a reason to do so. Similarly, FEniCS (a popular finite element solver) and Google-Benchmark just went through similar porting efforts.

JAX: Switch jaxlib to use nanobind instead of pybind11. · google/jax@70b7d50 · GitHub
FEniCS: Switch to nanobind by garth-wells · Pull Request #2820 · FEniCS/dolfinx · GitHub
Google-Benchmark: Switch bindings implementation to `nanobind` by nicholasjng · Pull Request #1526 · google/benchmark · GitHub

vstinner · March 17, 2024, 9:27pm

Status as of March 17th, 2024. The following 16 C extensions are now built with the limited C API:

_ctypes_test
_multiprocessing.posixshmem
_scproxy
_stat
_statistics
_testimportmultiple
_testlimitedcapi
_uuid
errno
fcntl
grp
md5
pwd
resource
termios
winsound

Moreover, C API tests are now split in 3 extensions:

_testlimitedcapi: limited C API (Py_LIMITED_API)
_testcapi: public C API
_testinternalcapi: internal C API (Py_BUILD_CORE)

Since August 2023, Argument Clinic (AC) has been enhanced to generate more efficient code for the limited C API. Code generated by AC for the _statistics extension is now as efficient, or even slightly more efficient, since the code is even inlined! The METH_FASTCALL calling convention is now used by the limited C API as well.

Other C extensions use the internal C API for various reasons or are using functions which are lacking in the limited C API. Remaining issues should be analyzed on a case by case basis.

This work shows that non-trivial C extensions can be written using only the limited C API version 3.13.
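For readers wondering what "limited C API version 3.13" looks like in practice: an extension opts in by defining the `Py_LIMITED_API` macro to the hex-encoded minimum version before including `Python.h`. The small sketch below computes that value. The helper name `limited_api_macro` is made up for illustration.

```python
def limited_api_macro(major, minor):
    """Return the Py_LIMITED_API value for Python major.minor as a hex string.

    The layout follows PY_VERSION_HEX: major in the top byte, minor in the
    next byte, and the remaining two bytes left as zero.
    """
    if not (0 <= major <= 0xFF and 0 <= minor <= 0xFF):
        raise ValueError("major and minor must each fit in one byte")
    return f"0x{(major << 24) | (minor << 16):08X}"


print(limited_api_macro(3, 13))  # 0x030D0000
print(limited_api_macro(3, 8))   # 0x03080000
```

With setuptools, a value like this is typically passed through `define_macros=[("Py_LIMITED_API", "0x030D0000")]` together with `py_limited_api=True` on the `Extension`, so that the resulting wheel gets the abi3 tag.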