A new API for ensuring/releasing thread states

This actually comes up quite easily when you have some code that can run either synchronously (in an existing Python thread) or in the background / in parallel (in a thread that may not have a Python thread state yet), depending on execution specifics that the function itself doesn’t control.

For example in Arrow C++ we have IO abstractions that can be implemented for different backends:
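To give a rough idea of the shape of such an abstraction, here is a minimal sketch. The names `RandomAccessSource` and `MemorySource` are hypothetical stand-ins made up for this example, not the actual Arrow C++ classes, and the real interfaces report errors rather than returning bare integers.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical, simplified interface: each backend (local file, memory
// buffer, Python file-like object, ...) provides its own implementation.
class RandomAccessSource {
 public:
  virtual ~RandomAccessSource() = default;
  // Current position in the stream, in bytes.
  virtual int64_t Tell() const = 0;
  // Read up to nbytes into out; returns the number of bytes actually read.
  virtual int64_t Read(int64_t nbytes, char* out) = 0;
};

// One possible backend: an in-memory buffer.
class MemorySource : public RandomAccessSource {
 public:
  explicit MemorySource(std::string data) : data_(std::move(data)) {}

  int64_t Tell() const override { return pos_; }

  int64_t Read(int64_t nbytes, char* out) override {
    const int64_t remaining = static_cast<int64_t>(data_.size()) - pos_;
    // Clamp the request to what is left (and to zero for negative requests).
    const int64_t n = std::max<int64_t>(0, std::min(nbytes, remaining));
    std::memcpy(out, data_.data() + pos_, static_cast<size_t>(n));
    pos_ += n;
    return n;
  }

 private:
  std::string data_;
  int64_t pos_ = 0;
};
```

Code that consumes a `RandomAccessSource` has no idea which backend it is talking to, and the backend has no idea which thread it is being called from.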

And in PyArrow (the Python bindings for Arrow C++) we have an implementation of these IO abstractions that delegates to a Python file-like object. This is so that PyArrow users can use Arrow C++ functionality with arbitrary file-like objects, including their own (for example you could probably call the Arrow CSV or JSONL reader on a ZipFile entry).

Since the C++ IO abstractions can be called in any context to read or write data, whether they run on a Python thread or on, say, a C++ thread pool thread depends entirely on what functionality is being called and how. The IO routines themselves have no control over this.

So, for example, the Tell() implementation for Python file-like objects:

wraps its underlying functionality in the SafeCallIntoPython wrapper that ensures that Python APIs can be safely called from that point:

… and PyAcquireGIL there is just a RAII wrapper around PyGILState_Ensure/PyGILState_Release 🙂

(yes, at some point we noticed that taking the GIL is not sufficient and you also need to ensure you don’t have an error status set)
