I’ve made several changes to PEP 788 based on the initial round of discussion. Most importantly:
Strong and weak interpreter references are now their own type instead of being implicitly held in interpreter pointers and IDs.
Interpreter references are now a property of an interpreter, rather than a property of a thread. This means there are no more “non-daemon thread states”.
Holding a strong interpreter reference is a visible way to prevent interpreter shutdown: the shutdown cannot occur before PyInterpreterRef_Close().
I also like that PyThreadState_Ensure() doesn’t consume a strong interpreter reference.
I wonder if interpreters can switch to being destroyed when their ref-count falls to 0, rather than have a separate Close operation that blocks until the refcount falls to 0. Was something like this considered?
Thread states count as held references
Py_NewInterpreter* functions create a strong interpreter reference, and Py_EndInterpreter closes that reference.
PyInterpreterRef_Close, rather than Py_EndInterpreter, is what destroys the interpreter.
Maybe a new PyInterpreterRef_NewFromConfig function would return a reference to a new interpreter, and not create & attach a tstate yet.
Should PyThreadState_Ensure return the previous thread state (or an opaque value), and PyThreadState_Release take it? The current proposal creates an implicit stack that grows when switching between two interpreters; I think a state argument passed between “enter/exit”-style functions is a better design.
Requests for clarification:
Is the reference returned by PyInterpreterRef_Dup an equivalent of the original? In other words, can I do this:
Is PyInterpreterWeakRef guaranteed to be pointer-sized?
Will PyThreadState_Ensure/PyThreadState_Release prevent the “old” interpreter (the one PyThreadState_Release switches to) from being destroyed between the calls?
Name bikeshedding:
Consider PyInterpreterRef_Main → PyInterpreterRef_GetMain (put the operation in the name)
Editorial:
The title is unnecessarily clickbaity. Something like “PyInterpreterRef: Reference-counted interpreters” would be much more descriptive and searchable, and cover the intent and the majority of the specification.
“Weak references” – IMO it would be good to note here that: In case the interpreter has been destroyed, the promotion will fail cleanly.
“Deprecation of the GIL-state APIs”: it’s not really a “plethora” of issues, right? AFAICS there are 2-4 issues, depending on how you count: infallibility, TOCTOU, guessing an interpreter, “GIL” in the name.
“Deprecation of GIL-state APIs”: Could you add arguments to the calls, and explain how to get the “extra” arguments the functions need? (Or put this in another section. The docs will need this kind of “porting guide” anyway.)
“Example: A Library Interface” – in the last paragraph, “your library” → “using PyGILState_Ensure”. The next example could also be clearer about which version of it has the issues.
PyInterpreterWeakRef_AsStrong: Consider explicitly mentioning that an exception is not set.
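To illustrate what “fail cleanly” could mean for promotion, here is a minimal sketch in plain C. It is a toy model, not the PEP’s implementation: toy_interp, toy_weak_as_strong and toy_strong_close are hypothetical names, and the struct stands in for a control block that is assumed to outlive the interpreter itself. Promotion only succeeds while the strong count is still above zero, and on failure it returns -1 without setting any error state.

```c
#include <assert.h>
#include <stdatomic.h>

/* Toy stand-in for an interpreter's control block (hypothetical;
 * this is NOT CPython's PyInterpreterState). */
typedef struct {
    atomic_long strong;   /* number of live strong references */
} toy_interp;

/* Try to promote a weak reference to a strong one.
 * Returns 0 on success, -1 if the strong count already hit zero.
 * Nothing resembling an exception is set on failure. */
static int
toy_weak_as_strong(toy_interp *interp)
{
    long count = atomic_load(&interp->strong);
    while (count > 0) {
        /* Increment only if the count is still the value we observed.
         * On failure, `count` is refreshed and the loop re-checks it. */
        if (atomic_compare_exchange_weak(&interp->strong, &count, count + 1)) {
            return 0;
        }
    }
    return -1;
}

/* Drop one strong reference (finalization is left out of this toy). */
static void
toy_strong_close(toy_interp *interp)
{
    atomic_fetch_sub(&interp->strong, 1);
}
```

The compare-and-swap loop is what avoids the TOCTOU problem: checking “is it alive?” and taking the reference happen as one step, so a count that has reached zero can never be revived.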
I think that’s an interesting approach, and it’s something we could still do after this PEP. Right now, I’m trying to focus on getting an API that works with the current world. We can totally add a variation of Py_NewInterpreter and Py_EndInterpreter that take interpreter references later on.
My main concern with it is that releasing an interpreter reference becomes a much heavier operation, which might not be great for applications that just want to quickly use a callback or something like that. You’ll have to worry about all sorts of re-entrancy and lock-ordering deadlocks because a bunch of finalizers could run.
In addition, when calling PyInterpreterRef_Close, you typically won’t have an attached thread state for that interpreter (such as when you just called PyThreadState_Release). To finalize, Python will have to make another thread state for the interpreter, which is wasteful, and it’s also probably a bit buggy to finalize an interpreter with a non-main thread state. Subinterpreters crash if you do this right now.
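A rough sketch of the destroy-on-zero variant, to show where the cost lands. This is a single-threaded toy with hypothetical names (toy_interp, toy_finalize, toy_ref_close), not anything from the PEP: whichever caller drops the last reference ends up running finalization inline, which is the “heavier operation” concern above.

```c
#include <assert.h>

/* Toy interpreter with a plain (non-atomic) reference count.
 * Hypothetical model; real code would need atomics or a lock. */
typedef struct {
    long refcount;
    int finalized;   /* set to 1 once finalization has run */
} toy_interp;

/* Stand-in for interpreter finalization. In a real interpreter this
 * is where arbitrary finalizers could run. */
static void
toy_finalize(toy_interp *interp)
{
    interp->finalized = 1;
}

/* Release one reference. Returns 1 if this particular call ran
 * finalization (it dropped the last reference), 0 otherwise. */
static int
toy_ref_close(toy_interp *interp)
{
    assert(interp->refcount > 0);
    if (--interp->refcount == 0) {
        toy_finalize(interp);
        return 1;
    }
    return 0;
}
```

With a separate blocking close, by contrast, only one well-known caller pays for finalization and every other release stays a cheap decrement.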
No preference here, I’m happy to hear if others would like it this way. It would complicate the usage, but simplify the implementation.
Currently, yes. But I don’t think we should guarantee that for the future’s sake. Is it useful to use PyInterpreterRef_Dup as an incref API?
Yeah, forgot to put that in there, thanks!
It’s not its job to do that, because the thread state doesn’t own the interpreter reference. The interpreter can only shut down once the caller of PyThreadState_Ensure/PyThreadState_Release releases the reference.
@vstinner suggested this before too, but I’m still in favor of keeping it as PyInterpreterRef_Main. It’s meant to be consistent with the naming of PyInterpreterState_Main, which hopefully makes it clearer that the two are very similar.
Yeah, in hindsight, we could have called it something a little clearer. I’m worried that changing it now will just cause confusion, because everything currently refers to it as “Reimagining native threads”.
No other comments about the rest of the editorial concerns, I’ll fix them.
Makes sense.
Now that you have the idea in your head, it’s less likely that some design choice blocks it as a future possibility :)
It looks like it’s not that much more complicated in most cases (where Ensure and Release are in the same function), and in the others, tying the two together could bring some clarity to the user code.
I think it’s worth reducing the implementation complexity, and avoiding the need to put the stack-related overhead in the thread state. (Note that implementations that would put the stack there, e.g. something like HPy’s debug ABI, could check for correct nesting.)
I don’t think incref semantics are useful. IMO it would make sense to mention that we don’t guarantee them, so we set the right expectations for Dup.
Ah. I emphasize that they’re not similar: PyInterpreterState_Main gives you a borrowed reference. PyInterpreterRef_Main is more like PyInterpreterRef_Get (PyInterpreterRef_GetCurrent?).
Anyway, just bikeshedding.
It should be fine to change the title. This is why PEPs have those immutable serial numbers.
Ok, how should this look? I’m leaning towards a PyThreadState ** parameter, but maybe we want an opaque version similar to PyGILState_Ensure? (Think PyThreadRef, in case we ever wanted to do something like this for thread states.)
Well, that’s sort of what I was going for; PyInterpreterRef_Main is the “strong” version of PyInterpreterState_Main.
If we’re going to change it, how about something like “Protecting the C API from finalization”? I’m not a fan of emphasizing PyInterpreterRef in the title, because it makes PyThreadState_Ensure seem out of place.
smuggle a bit of other data (AFAIK, a “delete when done” flag would be useful)
And zero can still be reserved for errors:
PyThreadRef old = PyThreadState_Ensure(ref);
if (!old) {
    PyInterpreterRef_Close(ref);
    return -1;
}
/* ... */
PyThreadState_Release(old);
PyInterpreterRef_Close(ref);
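For anyone who wants to play with the idea, here is a self-contained toy model of the token-passing design. Everything in it is an assumption for illustration: toy_tstate, toy_thread_ref, toy_ensure and toy_release are made-up names, the “current thread state” is a plain global rather than thread-local, and Ensure takes a thread state directly instead of an interpreter reference. The token is the previous state’s pointer with the low bit set, so zero stays free for errors even when nothing was attached before.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy thread state (hypothetical; not CPython's PyThreadState). */
typedef struct { int id; } toy_tstate;

/* Token handed from ensure to release. 0 is reserved for errors. */
typedef uintptr_t toy_thread_ref;

/* What the caller is currently attached to (NULL = nothing).
 * A real implementation would keep this per thread. */
static toy_tstate *toy_current = NULL;

/* Attach `target` and return a token describing the previous state.
 * The low bit is always set, so a valid token is never 0. This relies
 * on toy_tstate pointers being at least 2-byte aligned. */
static toy_thread_ref
toy_ensure(toy_tstate *target)
{
    if (target == NULL) {
        return 0;   /* error: nothing to attach */
    }
    toy_thread_ref token = (uintptr_t)toy_current | 1u;
    toy_current = target;
    return token;
}

/* Restore whatever was attached before the matching toy_ensure call.
 * No internal stack is needed: the caller carries the state. */
static void
toy_release(toy_thread_ref token)
{
    toy_current = (toy_tstate *)(token & ~(uintptr_t)1);
}
```

Nesting works because each caller holds its own token on the C stack; passing the wrong token would simply restore the wrong state, much like a mismatched PyGILState_STATE.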
Thread-safe C API for attaching thread states?
IMO, putting the API name in the title makes a PEP much easier to find. (And PyThreadState_Ensure does take PyInterpreterRef as argument.)
Personally, I see reference-counting the interpreters as the main point of the PEP – it’s the solution the PEP ended up with.
The title should probably reflect that it’s about the C API and that it makes its use more reliable? I could propose something like “C API for reliable thread states” or “C API for reliable interpreter references”.
I don’t think it makes sense for PyThreadState_Release to take a PyThreadRef parameter, since it can’t fail anyway. What would it do if the thread ref isn’t the expected one?
Hm, maybe. I think the new title is nice and short, and you can figure out that it’s about the C API by reading the first line of the abstract. I’m not sure we need to clarify that it’s about “reliability” (ideally, all C APIs should be reliable). I’d be happy with “Interpreter References in the C API”, but that also feels a little redundant.
It would reattach the wrong thread state, and probably end up crashing (because the Ensure() counter will be wrong). It’s pretty similar to what happens if you pass PyGILState_Release the wrong PyGILState_STATE.
My main reasoning for putting a reference parameter in PyThreadState_Release is that we can safely add similar reference counting to thread states later if we need to (with PyThreadState_Release acting as a decref). We already do this during interpreter finalization, and it doesn’t seem too far-fetched to add something like that as a public API.
I think it’d make sense for the “What’s New in Python 3.15” page, but it would be redundant in discussions about the C API.
But anyways, before I change it, let’s come to a consensus on what it should be so we don’t need to change it for a fourth time: does “Interpreter References in the C API” work for everyone?
Looking at the topic list, I don’t see any indication this is about the C API. My first thought was that it was to do with the new Python-level interpreter feature.
+1 from me for “Interpreter References in the C API”.
My bikeshed colour is “PyInterpreterRef: C API for reference-counting interpreters”. It would be great if people call it “the PyInterpreterRef PEP” :)