API for stack switching: save state / restore state?

Feature or enhancement

Greenlet saves and restores threadstate in order to facilitate stack switching:
greenlet threadstate save and restore code
I am working on adding stack switching to Pyodide by copying the code that @jamadden, @brandtbucher, and @vstinner wrote in greenlet into Pyodide:
Pyodide code adapted from greenlet

Pitch

It would be useful to have all this code available in the Python interpreter itself to make adding support for stack switching easier.

Also, would any of you be willing to review the file I linked and check whether it looks reasonable to you? It seems to work in simple tests.

Well, someone has to design an API to ease the greenlet implementation. So far, nobody has proposed such an API. In the past, some pieces of Stackless Python lived directly in CPython. It seems like they are gone. Or maybe that project was fully maintained externally, I’m not sure.

I think the API could be as simple as:

void* state_ptr = PyState_Save();
// ...
PyState_Restore(state_ptr);
PyState_Free(state_ptr);

It would even be okay with me for PyState_Restore to consume state_ptr, though maybe someone could some day have a reason to restore the same state twice.
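
To make the save / restore / free semantics concrete, here is a minimal toy model in plain C. It does not touch CPython at all: ToyThreadState, toy_tstate, and the toy_state_* functions are hypothetical stand-ins invented for illustration, and the fields are only loosely named after thread-state members. The sketch assumes a restore that does not consume the snapshot, so the same state can be restored twice.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for the parts of the thread state that must survive a switch.
   Hypothetical: this is not CPython's PyThreadState. */
typedef struct {
    int recursion_depth;
    int use_tracing;
    void *frame;   /* would point at the current Python frame */
} ToyThreadState;

/* The "live" state of the single toy thread. */
static ToyThreadState toy_tstate = {0, 0, NULL};

/* Hypothetical analogue of PyState_Save(): snapshot the live state.
   Returns NULL on allocation failure. */
void *toy_state_save(void)
{
    ToyThreadState *saved = malloc(sizeof *saved);
    if (saved != NULL) {
        memcpy(saved, &toy_tstate, sizeof *saved);
    }
    return saved;
}

/* Hypothetical analogue of PyState_Restore(): copy the snapshot back.
   It does not consume the snapshot, so it can be restored more than once. */
void toy_state_restore(const void *state_ptr)
{
    assert(state_ptr != NULL);
    memcpy(&toy_tstate, state_ptr, sizeof toy_tstate);
}

/* Hypothetical analogue of PyState_Free(): release the snapshot. */
void toy_state_free(void *state_ptr)
{
    free(state_ptr);
}

/* Save, clobber, restore twice; returns the final recursion depth,
   or -1 if anything went wrong. */
int toy_demo(void)
{
    toy_tstate.recursion_depth = 7;
    void *saved = toy_state_save();
    if (saved == NULL) {
        return -1;
    }

    toy_tstate.recursion_depth = 0;   /* reset to a "no frames" state */
    toy_state_restore(saved);
    int first = toy_tstate.recursion_depth;

    toy_tstate.recursion_depth = 3;   /* clobber again */
    toy_state_restore(saved);         /* same snapshot, second restore */
    int second = toy_tstate.recursion_depth;

    toy_state_free(saved);
    return (first == 7 && second == 7) ? second : -1;
}
```

Keeping free separate from restore costs one extra call per switch, but it leaves the ownership of the snapshot explicit. A consuming restore would be the same model with toy_state_free() folded into toy_state_restore().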

It seems to me that there is pretty complicated logic to keep track of what parts of the stack are evicted after the switch, and this logic is strongly OS / architecture dependent but not dependent on the Python version. And then there is this logic to save and restore the necessary parts of the Python threadstate which is strongly dependent on the Python version but largely OS / architecture independent. It would be nice to see the threadstate code be part of Python itself.

It doesn’t matter how “simple” such an API could appear. The “pretty complicated logic” that is platform and/or version specific internally is the important part. We’ll need an exact definition of its semantics, in terms of what it explicitly does and does not do, without any underspecified or undefined behavior.

I agree that it’d be nice to have this maintained within CPython, given that multiple projects appear to want to do it. This way we could update it as our internals evolve instead of waiting for external projects to all catch up.

The first step is defining what exactly it is even supposed to be.

The first step is defining what exactly it is even supposed to be.

Right. Here’s an attempt to describe what should happen. Maybe not quite a definition.

Prior to entering Python frames, we record the current stack pointer as stack_start. Then we call switch_stack(). switch_stack() records the current stack position as stack_stop. We save the Python thread state with PyState_Save() and reset it to the “no Python frames” ready state so that it can run other code. We copy the range of stack between stack_start and stack_stop into a buffer, then set the stack pointer back to stack_start and call some other Python code. When we’re done, we copy the original stack data from the buffer back onto the call stack and set the stack pointer back to stack_stop. Then we call PyState_Restore(). Now we can return from switch_stack() back into the originally executing Python context without crashing.

The goal is that PyState_Save() and PyState_Restore() save whichever parts of the tstate are needed for this to work.
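
The sequence above can be walked through in a small simulation. This is only a sketch under strong assumptions: the "stack" is a byte array with an index as the stack pointer (growing upward for readability, unlike most real C stacks), and SimState, sim_push(), sim_other_code(), and sim_switch_stack() are hypothetical names invented here. A real implementation needs platform-specific assembly to move the actual stack pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SIM_STACK_SIZE 64

/* Simulated call stack: a byte array with an index as the "stack pointer". */
static unsigned char sim_stack[SIM_STACK_SIZE];
static size_t sim_sp = 0;

/* Hypothetical stand-in for the thread-state fields a switch must preserve. */
typedef struct {
    int recursion_depth;
} SimState;

static SimState sim_tstate = {0};

/* Push n copies of `value`, as if a call had added a frame. */
static void sim_push(unsigned char value, size_t n)
{
    assert(sim_sp + n <= SIM_STACK_SIZE);
    memset(sim_stack + sim_sp, value, n);
    sim_sp += n;
    sim_tstate.recursion_depth += 1;
}

/* "Other code" that runs on the reclaimed stack region and overwrites it. */
static void sim_other_code(void)
{
    sim_push(0xEE, 8);
}

/* Follows the steps in the description; returns 0 on success. */
int sim_switch_stack(size_t stack_start)
{
    size_t stack_stop = sim_sp;                    /* record current position */
    assert(stack_start <= stack_stop);
    size_t len = stack_stop - stack_start;
    unsigned char buffer[SIM_STACK_SIZE];

    SimState saved = sim_tstate;                   /* PyState_Save() analogue */
    sim_tstate.recursion_depth = 0;                /* "no Python frames" state */

    memcpy(buffer, sim_stack + stack_start, len);  /* evict the stack range */
    sim_sp = stack_start;                          /* rewind the stack pointer */

    sim_other_code();                              /* clobbers the old range */

    memcpy(sim_stack + stack_start, buffer, len);  /* copy the range back */
    sim_sp = stack_stop;                           /* restore the stack pointer */
    sim_tstate = saved;                            /* PyState_Restore() analogue */
    return 0;
}

/* Drives one switch; returns 1 if the original context is intact, else 0. */
int sim_demo(void)
{
    sim_sp = 0;
    sim_tstate.recursion_depth = 0;
    sim_push(0x11, 4);                 /* frames below stack_start */
    size_t stack_start = sim_sp;
    sim_push(0xAA, 6);                 /* two "Python frames" */
    sim_push(0xBB, 6);

    if (sim_switch_stack(stack_start) != 0) {
        return -1;
    }

    int ok = sim_sp == 16
          && sim_tstate.recursion_depth == 3
          && sim_stack[0] == 0x11
          && sim_stack[4] == 0xAA
          && sim_stack[10] == 0xBB;
    return ok ? 1 : 0;
}
```

In the simulation the copy buffer is an ordinary local variable, which is safe only because the simulated stack is separate from the real one. In a real switch the buffer has to live somewhere that the evicted range does not cover, such as the heap.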

We see in practice that this includes stuff like cframe, use_tracing, recursion_depth, frame, etc.

But I guess with some asm it would be possible to write a test case that does all of this. And then if this code runs and doesn’t segfault, it would be working.

Note that the code in PyState_Save() and PyState_Restore() is in fact entirely platform independent. It’s everything else that depends on the platform.

@brandtbucher Do you understand any of this? Your name was mentioned. :)

I’m going to try to make a small working code example so we can have something to look at.

To clarify: are you talking about the C stack, or the Python stack?

I personally don’t think Python has any business providing APIs for swapping its own C stack in and out. Assuming you mean the Python stack, there is indeed already a discussion on the bug tracker (and an open PR) for providing this API. I’m sure your input there would be greatly appreciated, since you seem like exactly the sort of user we would define this new API for!

Add “unstable” frame stack api · Issue #91371 · python/cpython (github.com)

The Python stack.

Agreed, it wouldn’t make sense.

Wonderful! Thanks for the reference. I will look into it over there.