PEP 788: PyInterpreterRef: Interpreter References in the C API

ZeroIntensity · May 28, 2025, 3:24pm

I’ve made several changes to PEP 788 based on the initial round of discussion. Most importantly:

Strong and weak interpreter references are now their own type instead of being implicitly held in interpreter pointers and IDs.
Interpreter references are now a property of an interpreter, rather than a property of a thread. This means there’s no more “non-daemon thread states”.

Get it while it’s hot:

vstinner · June 2, 2025, 12:02pm

I prefer this new API.

Holding a strong interpreter reference is a visible way to prevent the interpreter shutdown: the shutdown cannot occur before PyInterpreterRef_Close().

I also like that PyThreadState_Ensure() doesn’t consume a strong interpreter reference.
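The two properties above can be illustrated with a toy model. This is only a sketch, not the CPython implementation: every `toy_*` name is made up for illustration, and the real shutdown would block rather than report failure.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an interpreter; NOT the CPython struct. */
typedef struct {
    int strong_refs;   /* number of open strong references */
    bool finalized;    /* set once shutdown has completed */
} toy_interp;

/* Hypothetical analogue of acquiring a strong reference.
   Returns -1 if the interpreter is already finalized. */
static int toy_ref_open(toy_interp *interp) {
    if (interp->finalized) {
        return -1;
    }
    interp->strong_refs++;
    return 0;
}

/* Hypothetical analogue of PyInterpreterRef_Close(). */
static void toy_ref_close(toy_interp *interp) {
    assert(interp->strong_refs > 0);
    interp->strong_refs--;
}

/* Hypothetical analogue of attaching a thread state: it requires an open
   reference but does not consume it. */
static int toy_ensure(toy_interp *interp) {
    return (interp->strong_refs > 0 && !interp->finalized) ? 0 : -1;
}

/* Shutdown attempt. The toy reports whether shutdown could proceed
   instead of blocking until the count reaches zero. */
static bool toy_try_shutdown(toy_interp *interp) {
    if (interp->strong_refs > 0) {
        return false;
    }
    interp->finalized = true;
    return true;
}

/* Returns 0 if the model behaves as described, -1 otherwise. */
static int toy_demo(void) {
    toy_interp interp = { 0, false };
    if (toy_ref_open(&interp) < 0) return -1;
    if (toy_ensure(&interp) < 0) return -1;
    if (toy_try_shutdown(&interp)) return -1;  /* refused: a reference is open */
    if (interp.strong_refs != 1) return -1;    /* ensure did not consume it */
    toy_ref_close(&interp);
    return toy_try_shutdown(&interp) ? 0 : -1; /* now shutdown can proceed */
}
```

In this model the caller still owns the reference after `toy_ensure`, so closing it stays the caller's job.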

pitrou · June 30, 2025, 2:34pm

Just like @vstinner, I really like this new API. Thank you!

encukou · June 30, 2025, 2:50pm

Thank you! I like this proposal :)

Two bigger questions:

I wonder if interpreters can switch to being destroyed when their ref-count falls to 0, rather than have a separate Close operation that blocks until the refcount falls to 0. Was something like this considered?
- Thread states count as held references
- Py_NewInterpreter* functions create a strong interpreter reference, and Py_EndInterpreter closes that reference.
- PyInterpreterRef_Close, rather than Py_EndInterpreter, is what destroys the interpreter.
- Maybe a new PyInterpreterRef_NewFromConfig function would return a reference to a new interpreter, and not create & attach a tstate yet.
Should PyThreadState_Ensure return the previous thread state (or an opaque value), and PyThreadState_Release take it? The current proposal creates an implicit stack that grows when switching between two interpreters; I think a state argument passed between “enter/exit”-style functions is a better design.

Requests for clarification:

Is the reference returned by PyInterpreterRef_Dup an equivalent of the original? In other words, can I do this:
```
ref_b = PyInterpreterRef_Dup(ref_a);
PyInterpreterRef_Close(ref_a);
PyInterpreterRef_Close(ref_a);
```
Is PyInterpreterWeakRef guaranteed to be pointer-sized?
Will PyThreadState_Ensure/PyThreadState_Release prevent the “old” interpreter (the one PyThreadState_Release switches to) from being destroyed between the calls?

Name bikeshedding:

Consider PyInterpreterRef_Main → PyInterpreterRef_GetMain (put the operation in the name)

Editorial:

The title is unnecessarily clickbaity. Something like “PyInterpreterRef: Reference-counted interpreters” would be much more descriptive and searchable, and cover the intent and the majority of the specification.
Could you add source links to the quotes in “Daemon Threads are not the Problem”?
“Weak references” – IMO it would be good to note here that: In case the interpreter has been destroyed, the promotion will fail cleanly.
“Deprecation of the GIL-state APIs”: it’s not really a “plethora” of issues, right? AFAICS there are 2-4 issues, depending on how you count: infallibility, TOCTOU, guessing an interpreter, “GIL” in the name.
“Deprecation of GIL-state APIs”: Could you add arguments to the calls, and explain how to get the “extra” arguments the functions need? (Or put this in another section. The docs will need this kind of “porting guide” anyway.)
“Example: A Library Interface” – in the last paragraph, “your library” → “using PyGILState_Ensure”. The next example could also be clearer about which version of it has the issues.
PyInterpreterWeakRef_AsStrong: Consider explicitly mentioning that an exception is not set.
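As an illustration of the "promotion fails cleanly" point, here is a toy model of weak-to-strong promotion. It is a sketch under assumptions: all `toy_*` names are hypothetical, the interpreter memory is assumed to outlive the weak reference, and failure is signalled only through the return value, with no error state set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an interpreter; not the CPython struct. */
typedef struct {
    int strong_refs;
    bool finalized;
} toy_interp;

/* A weak reference points at the interpreter without counting in strong_refs. */
typedef struct { toy_interp *target; } toy_weakref;
/* A strong reference counts in strong_refs until it is closed. */
typedef struct { toy_interp *target; } toy_strongref;

static toy_weakref toy_weak_new(toy_interp *interp) {
    toy_weakref weak = { interp };
    return weak;
}

/* Promotion: returns 0 and fills *out on success. If the interpreter is
   already finalized, returns -1 and sets out->target to NULL. */
static int toy_weak_as_strong(toy_weakref weak, toy_strongref *out) {
    if (weak.target->finalized) {
        out->target = NULL;
        return -1;
    }
    weak.target->strong_refs++;
    out->target = weak.target;
    return 0;
}

static void toy_strong_close(toy_strongref ref) {
    ref.target->strong_refs--;
}

/* Finalize only when no strong reference is open. */
static bool toy_try_finalize(toy_interp *interp) {
    if (interp->strong_refs > 0) {
        return false;
    }
    interp->finalized = true;
    return true;
}

/* Returns 0 if the model behaves as described, -1 otherwise. */
static int toy_demo(void) {
    toy_interp interp = { 0, false };
    toy_weakref weak = toy_weak_new(&interp);
    toy_strongref strong;

    if (toy_weak_as_strong(weak, &strong) < 0) return -1;  /* still alive */
    if (toy_try_finalize(&interp)) return -1;              /* refused */
    toy_strong_close(strong);
    if (!toy_try_finalize(&interp)) return -1;             /* now finalized */
    if (toy_weak_as_strong(weak, &strong) == 0) return -1; /* must fail */
    return strong.target == NULL ? 0 : -1;
}
```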

ZeroIntensity · June 30, 2025, 3:18pm

I think that’s an interesting approach, and it’s something we could still do after this PEP. Right now, I’m trying to focus on getting an API that works with the current world. We can totally add a variation of Py_NewInterpreter and Py_EndInterpreter that take interpreter references later on.

My main concern with it is that releasing an interpreter reference becomes a much heavier operation, which might not be great for applications that just want to quickly use a callback or something like that. You’ll have to worry about all sorts of re-entrancy and lock-ordering deadlocks because a bunch of finalizers could run.

In addition, when calling PyInterpreterRef_Close, you typically won’t have an attached thread state for that interpreter (such as when you just called PyThreadState_Release). To finalize, Python will have to make another thread state for the interpreter, which is wasteful, and it’s also probably a bit buggy to finalize an interpreter with a non-main thread state. Subinterpreters crash if you do this right now.

No preference here, I’m happy to hear if others would like it this way. It would complicate the usage, but simplify the implementation.

Currently, yes. But I don’t think we should guarantee that for the future’s sake. Is it useful to use PyInterpreterRef_Dup as an incref API?

Yeah, forgot to put that in there, thanks!

It’s not its job to do that, because the thread state doesn’t own the interpreter reference. The interpreter can only shut down once the caller of PyThreadState_Ensure/PyThreadState_Release releases the reference.

@vstinner suggested this before too, but I’m still in favor of keeping it as PyInterpreterRef_Main. It’s meant to be consistent with the naming of PyInterpreterState_Main, which hopefully makes it clearer that the two are very similar.

Yeah, in hindsight, we could have called it something a little clearer. I’m worried that changing it now will just cause confusion, because everything currently refers to it as “Reimagining native threads”.

No other comments about the rest of the editorial concerns, I’ll fix them.

encukou · July 1, 2025, 8:24am

Makes sense.
Now that you have the idea in your head, it’s less likely that some design choice blocks it as a future possibility :)

It looks like it’s not that much more complicated in most cases (where Ensure and Release are in the same function), and in the others, tying the two together could bring some clarity to the user code.
I think it’s worth reducing the implementation complexity, and avoiding the need to put the stack-related overhead in the thread state. (Note that implementations that would put the stack there, e.g. something like HPy’s debug ABI, could check for correct nesting.)
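To show why passing the previous state between the enter and exit calls removes the need for a stack, here is a single-threaded toy model. The `toy_*` names are hypothetical, and a plain global stands in for what would be thread-local storage in a real implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a thread state. */
typedef struct { int id; } toy_tstate;

/* The only bookkeeping kept on the runtime side: the attached state. */
static toy_tstate *toy_current = NULL;

/* Attach `next` and hand the previously attached state to the caller. */
static void toy_ensure(toy_tstate *next, toy_tstate **prev) {
    *prev = toy_current;
    toy_current = next;
}

/* Re-attach whatever the matching toy_ensure() call returned. */
static void toy_release(toy_tstate *prev) {
    toy_current = prev;
}

/* Returns 0 if nested enter/exit pairs restore states in order. */
static int toy_demo(void) {
    toy_tstate a = { 1 }, b = { 2 };
    toy_tstate *prev_outer, *prev_inner;

    toy_ensure(&a, &prev_outer);   /* nothing attached -> a */
    toy_ensure(&b, &prev_inner);   /* a -> b */
    if (toy_current != &b || prev_inner != &a) return -1;

    toy_release(prev_inner);       /* back to a */
    if (toy_current != &a) return -1;

    toy_release(prev_outer);       /* back to nothing attached */
    return toy_current == NULL ? 0 : -1;
}
```

The nesting information lives in the callers' local variables, so the runtime side only tracks the currently attached state.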

I don’t think incref semantics are useful. IMO it would make sense to mention that we don’t guarantee them, so we set the right expectations for Dup.

Ah. I emphasize that they’re not similar: PyInterpreterState_Main gives you a borrowed reference. PyInterpreterRef_Main is more like PyInterpreterRef_Get (PyInterpreterRef_GetCurrent?).
Anyway, just bikeshedding.

It should be fine to change the title. This is why PEPs have those immutable serial numbers.

ZeroIntensity · July 1, 2025, 1:12pm

Ok, how should this look? I’m leaning towards a PyThreadState ** parameter, but maybe we want an opaque version similar to PyGILState_Ensure? (Think PyThreadRef, in case we ever wanted to do something like this for thread states.)

Here’s a mockup:

```
PyThreadState *old;
if (PyThreadState_Ensure(ref, &old) < 0) {
    PyInterpreterRef_Close(ref);
    return -1;
}

/* ... */

PyThreadState_Release(&old);
PyInterpreterRef_Close(ref);
```

Well, that’s sort of what I was going for; PyInterpreterRef_Main is the “strong” version of PyInterpreterState_Main.

If we’re going to change it, how about something like “Protecting the C API from finalization”? I’m not a fan of emphasizing PyInterpreterRef in the title, because it makes PyThreadState_Ensure seem out of place.

encukou · July 2, 2025, 7:01am

An opaque ref can:

be changed to something else in the future
smuggle a bit of other data (AFAIK, a “delete when done” flag would be useful)

And zero can still be reserved for errors:

```
PyThreadRef old = PyThreadState_Ensure(ref);
if (!old) {
    PyInterpreterRef_Close(ref);
    return -1;
}
/* ... */
PyThreadState_Release(old);
PyInterpreterRef_Close(ref);
```

Thread-safe C API for attaching thread states?
IMO, putting the API name in the title makes a PEP much easier to find. (And PyThreadState_Ensure does take PyInterpreterRef as argument.)
Personally, I see reference-counting the interpreters as the main point of the PEP – it’s the solution the PEP ended up with.

pitrou · July 2, 2025, 7:44am

If there’s a zero, then it’s either an integer or a pointer, right? Do you suggest adding something like typedef uintptr_t PyThreadRef?

ZeroIntensity · July 2, 2025, 3:21pm

Sounds like an opaque reference is the way to go.

Hm, I’m not sure about returning the thread reference directly. What’s wrong with a PyThreadRef * input and output?

encukou · July 3, 2025, 6:30am

Nothing, it’s just a bit ugly.
Yeah, I guess it’s better as PyThreadRef * output argument with a fully opaque struct, rather than uintptr_t.

ZeroIntensity · July 9, 2025, 1:14pm

I’ve renamed the PEP to “Interpreter References”, and updated PyThreadState_Ensure/PyThreadState_Release to take a new PyThreadRef parameter.

What does everyone think?

pitrou · July 9, 2025, 2:21pm

The title should probably reflect that it’s about the C API and that it makes its use more reliable? I could propose something like “C API for reliable thread states” or “C API for reliable interpreter references”.
I don’t think it makes sense for PyThreadState_Release to take a PyThreadRef parameter, since it can’t fail anyway. What would it do if the thread ref isn’t the expected one?

ZeroIntensity · July 9, 2025, 2:54pm

Hm, maybe. I think the new title is nice and short, and you can figure out that it’s about the C API by reading the first line of the abstract. I’m not sure we need to clarify that it’s about “reliability” (ideally, all C APIs should be reliable). I’d be happy with “Interpreter References in the C API”, but that also feels a little redundant.

It would reattach the wrong thread state, and probably end up crashing (because the Ensure() counter will be wrong). It’s pretty similar to what happens if you pass PyGILState_Release the wrong PyGILState_STATE.

My main reasoning for putting a reference parameter in PyThreadState_Release is that we can safely add similar reference counting to thread states later if we need to (with PyThreadState_Release acting as a decref). We already do this during interpreter finalization, and it doesn’t seem too far-fetched to add something like that as a public API.

pitrou · July 9, 2025, 3:01pm

That’s not redundant since it’s the title.

You might want to add this explanation to the PEP, then.

ZeroIntensity · July 9, 2025, 3:13pm

I think it’d make sense for the “What’s New in Python 3.15” page, but it would be redundant in discussions about the C API.

But anyways, before I change it, let’s come to a consensus on what it should be so we don’t need to change it for a fourth time: does “Interpreter References in the C API” work for everyone?

Ok, will do.

pf_moore · July 9, 2025, 3:35pm

Looking at the topic list, I don’t see any indication this is about the C API. My first thought was that it was to do with the new Python-level interpreter feature.

+1 from me for “Interpreter References in the C API”.

encukou · July 10, 2025, 10:33am

In those discussions, just shorten the title :)

My bikeshed colour is “PyInterpreterRef: C API for reference-counting interpreters”. It would be great if people call it “the PyInterpreterRef PEP” :)

ZeroIntensity · July 10, 2025, 12:40pm

Maybe we could combine the two? “PyInterpreterRef: Interpreter References in the C API”

ZeroIntensity · July 11, 2025, 9:10pm

The PEP has officially been renamed!

Now that we’re done with bikeshedding the title, are there any other concerns/suggestions about the API?