Is there a plan to make the following optimization/trick available to extension authors?
    result = _Py_TryXGetRef(...);
    if (!result) {
        Py_BEGIN_CRITICAL_SECTION(owner);
        result = /* get it properly */;
        Py_END_CRITICAL_SECTION(owner);
    }
    return result;
I realise it’s likely the sort of implementation detail that you’re now increasingly keen to avoid exposing. But I think it could probably be wrapped in a function that takes a PyObject** and an owner PyObject* (or an offset and an owner PyObject*) so it probably isn’t really exposing too much - just the concept that sometimes PyObjects own other PyObjects.
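To make the shape of such a wrapper concrete, here is a minimal sketch in plain C11 that does not use any CPython API. Everything in it (obj_t, owner_t, try_incref, get_ref) is a hypothetical stand-in invented for illustration: the atomic_flag spin lock plays the role of the owner's critical section, and the sketch assumes a writer only replaces the slot while holding that lock. It also ignores safe memory reclamation and the re-check of the slot after a successful incref, both of which a real implementation would need.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical refcounted object; NOT CPython's PyObject. */
typedef struct {
    atomic_long refcnt;
} obj_t;

/* Hypothetical owner holding one strong reference in `slot`. */
typedef struct {
    atomic_flag lock;          /* stands in for the owner's critical section */
    _Atomic(obj_t *) slot;
} owner_t;

static void owner_init(owner_t *owner, obj_t *initial)
{
    atomic_flag_clear(&owner->lock);
    atomic_init(&owner->slot, initial);
}

/* Fast path: take a new reference only while the count is still nonzero. */
static bool try_incref(obj_t *o)
{
    long n = atomic_load(&o->refcnt);
    while (n > 0) {
        /* On failure the CAS reloads the current count into n. */
        if (atomic_compare_exchange_weak(&o->refcnt, &n, n + 1)) {
            return true;
        }
    }
    return false;              /* object is already being torn down */
}

/* Takes the owner plus (implicitly) the slot, as the post suggests. */
static obj_t *get_ref(owner_t *owner)
{
    assert(owner != NULL);

    obj_t *o = atomic_load(&owner->slot);
    if (o != NULL && try_incref(o)) {
        return o;              /* lock-free fast path succeeded */
    }

    /* Slow path: "get it properly" under the owner's lock. */
    while (atomic_flag_test_and_set(&owner->lock)) {
        /* spin; a real lock would park the thread instead */
    }
    o = atomic_load(&owner->slot);
    if (o != NULL) {
        atomic_fetch_add(&o->refcnt, 1);
    }
    atomic_flag_clear(&owner->lock);
    return o;
}
```

The point of the split is that uncontended readers never touch the owner's lock; only a failed optimistic attempt pays for the critical section.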
(You will also all notice that I’m asking nicely, and resisting the temptation to just include the internal headers. And that’s only partly because there’s a bunch of details that I’m worried I don’t really understand.)
I think we’re at the point where it would be broadly helpful to publish a library extension authors can depend on that provides tools needed in C to safely write extensions that scale well.
Recently I had to convert a subsystem in NumPy’s ufunc implementation to C++ to make use of std::shared_mutex to avoid a scaling issue seen using the critical section API.
In NumPy it worked out that we are already planning to use C++ more, but other libraries probably need to stay in C for whatever reason. It would also be nice to retain the feature of PyMutex that if a thread is blocked on acquiring a lock, it detaches the thread state (e.g. releases the GIL) in whatever locking primitive gets exposed to help users avoid deadlocks. Extension authors who naively use C/C++/Rust standard library locking primitives in code that needs to work with PyObjects and touch the C API can easily create deadlocks with the GIL or other global synchronization events.
PyMutex itself is a very bare-bones API, which makes it hard to use in situations where a non-blocking, failable try_lock would be useful. Unfortunately, C17 standard library threads.h support isn’t universal, so C extension authors probably all face similar problems if they want to write portable, scalable multithreaded code, and we can coordinate as a community to solve them collectively.
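As a rough illustration of the kind of primitive I mean, here is a sketch of a failable try-lock built on C11 atomic_flag. The try_mutex name and its functions are hypothetical, not part of PyMutex or any existing library, and the sketch assumes a compiler that ships <stdatomic.h>, which is exactly the portability gap such a helper library would need to paper over.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical non-blocking lock; not a CPython type. */
typedef struct {
    atomic_flag held;
} try_mutex;

static void try_mutex_init(try_mutex *m)
{
    assert(m != NULL);
    atomic_flag_clear(&m->held);
}

/* Returns true if the lock was acquired, false if someone else holds it.
 * Never blocks, so the caller decides what to do on failure. */
static bool try_mutex_trylock(try_mutex *m)
{
    return !atomic_flag_test_and_set_explicit(&m->held, memory_order_acquire);
}

static void try_mutex_unlock(try_mutex *m)
{
    atomic_flag_clear_explicit(&m->held, memory_order_release);
}
```

Because the acquire attempt never waits, a caller that fails to get the lock can fall back to other work, or detach its thread state before retrying, rather than sitting blocked while still attached.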
I also think that at least in C it will likely be broadly useful to have a portable atomics header. I had to create an internal header in NumPy that adopts a small subset of the pyatomic header. Hopefully in a few years when C17 support is more broadly available this is less of a concern.
Unfortunately I don’t have the bandwidth to take this on but I’d happily help out if anyone finds this interesting. I’m not totally sure how you’d distribute it, since it would be most useful as both a build-time and runtime dependency, but that might be tricky in practice.
I’d strongly recommend static linking and/or header-only for this. I’m not a fan of bundling generic libraries into CPython’s own API, and will (continue to) oppose it when it comes up, and trying to bundle dynamic libraries in wheels is a massive pain (as you well know).
Having a separate helper library like this would be a great project though, and I’d love to see someone take it on. It could easily be bundled into a build backend, if depending on a PyPI package isn’t transparent enough, or just copy how pybind11 does it.
I definitely have sympathy for the “Python shouldn’t try to bundle everything” opinion. My original topic was a little more focused on Python reference counting (which is something that’s more justifiably in the remit of Python). But I also know I was asking for an optimization rather than a necessity.
As a general rule I’d also prefer to use standard libraries where possible. The sticking point tends to be MSVC’s C support (so if anyone who works at Microsoft could leave some subtle hints for the MSVC team that might help us all out). C++ standard library support is generally pretty good though.
I don’t think any library needs to stay in C. Some might want to stay in C “for whatever reason” indeed, but I would question the seriousness of their reasons. We’re very long past the point when C++ support was less widespread than C support, even on obscure platforms (and how likely is it that your library is used on obscure platforms anyway?).
That’s true. The library I have in mind could at least host implementations and allow experimentation for C API functions we would like to propose exposing. Something that might need to use internal headers for 3.13, but hopefully can just use public API on 3.14.
This is fine, assuming you’re looking at things that only impact free-threaded builds. The entire build is subject to change (hence “experimental”), so using internal headers neither protects you nor incriminates you.