An API for controlling and introspecting the JIT

markshannon · December 5, 2024, 10:45am

The new JIT compiler will (probably) be available, but off, by default for 3.14.

We will need a way to turn it on and off, and it would be useful to be able to monitor it.
For example, users might want to know how much memory it is using.

I propose adding a sys.jit namespace, initially consisting of two properties: available and on.

available will be a read-only boolean property, set to True if the JIT is available.
on will be a writeable boolean property. Setting sys.jit.on = True will turn the JIT on and setting sys.jit.on = False will turn the JIT off.

I would also want to add some “private” methods for debugging and monitoring, but I won’t specify them here as they are likely to change as we improve the JIT.

brandtbucher · December 5, 2024, 10:53pm

I’d like to push back a bit on giving users too much control over the way the JIT runs, since we’re still figuring lots of details out ourselves.

Off the top of my head, I can’t think of any good reason why somebody would want to dynamically toggle the JIT while Python is running (the only thing I can think of is tests for CPython itself… we currently just launch subprocesses for things that need the JIT off, which seems to work fine and doesn’t require stopping the world and throwing everything away). Correctly turning the JIT on/off anywhere except startup (using PYTHON_JIT to override the default for the build if needed) is actually quite tricky in practice, since it would likely involve walking over all bytecode in the current process.
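
The subprocess approach mentioned above might look roughly like this. The helper name `run_with_jit_setting` is made up for illustration; the only thing taken from the post is the PYTHON_JIT environment variable, which builds without the JIT are expected to ignore.

```python
import os
import subprocess
import sys


def run_with_jit_setting(code: str, jit: bool) -> str:
    """Run `code` in a fresh interpreter with PYTHON_JIT set to 1 or 0."""
    env = dict(os.environ)
    env["PYTHON_JIT"] = "1" if jit else "0"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,  # raise if the child interpreter exits with an error
    )
    return result.stdout


# The setting is fixed at startup of the child, so nothing in the
# parent process has to be invalidated or thrown away.
output = run_with_jit_setting("print(sum(range(10)))", jit=False)
print(output.strip())
```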

I would be supportive of starting with a read-only API… something like this?

sys._jit.available: bool: Whether or not the JIT is built.
sys._jit.enabled: bool: Whether or not the JIT is enabled.
sys._jit.active: bool: Whether or not the current frame is executing JIT code. (Ironically, branching on this will trace very poorly, so if sys._jit.active: ... probably won’t work the way you expect.)
sys._jit.memory: int: Total memory usage of JIT code and data, in bytes (will be a multiple of page size).
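
A caller could read such a namespace defensively, as in the sketch below. The attribute names are the ones proposed in this post; a given interpreter may expose none of them, or different names, so every lookup falls back to None. The helper `jit_status` is hypothetical.

```python
import sys


def jit_status() -> dict:
    """Read the attributes proposed above; None means "not provided"."""
    namespace = getattr(sys, "_jit", None)
    names = ("available", "enabled", "active", "memory")
    # getattr with a default also works when namespace itself is None.
    return {name: getattr(namespace, name, None) for name in names}


status = jit_status()
print(status)
```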

I can also see a use-case for user-configurable JIT memory limits and tier-up/invalidation thresholds. But again, these are probably things better controlled by environment variables at startup; whoever is actually running the program likely has a better understanding of how they’d like the JIT tuned than, say, a library. And like enabling/disabling the JIT, it’s much easier to set this once at startup (as opposed to something like GC thresholds, which are just a couple of integers on the runtime state and can be changed easily at runtime).
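
For comparison, this is what changing GC thresholds at runtime looks like with the existing gc module. It only illustrates the "couple of integers" point and says nothing about how JIT tuning would be exposed.

```python
import gc

original = gc.get_threshold()  # a tuple of three integers
# Double the first threshold; the other two are left unchanged.
gc.set_threshold(original[0] * 2, original[1], original[2])
changed = gc.get_threshold()
gc.set_threshold(*original)    # restore the previous settings
print(original, changed)
```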

It might be helpful to see what other JIT runtimes (like Cinder, PyPy, YJIT, LuaJIT, or V8) provide?

markshannon · December 6, 2024, 10:43am

As you said yourself, the active property is almost guaranteed to have a negative impact on performance. Not what the user wants. Also, I’m not sure it is that well defined. So let’s drop that for now.

I note you are suggesting _jit rather than jit as the namespace. Seems reasonable, but FTR can you say why?

Determining the amount of memory used may need a calculation, not just a lookup, so let’s make it a function.

In summary, we have:
available: read-only boolean property
enabled: read-only boolean property
memory_used(): Returns the memory used (approximately) in bytes.
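
As a rough model of that summary: two read-only properties plus a function that computes its result on each call. The class `_ReadOnlyJit` and the idea of summing per-region sizes are assumptions made for illustration, not a description of how CPython tracks JIT memory.

```python
class _ReadOnlyJit:
    """Hypothetical model of the read-only API summarised above."""

    def __init__(self, built: bool, enabled: bool, region_sizes) -> None:
        self._built = built
        self._enabled = enabled
        # Assumption: the JIT keeps a list of allocated regions, in bytes.
        self._region_sizes = list(region_sizes)

    @property
    def available(self) -> bool:
        return self._built

    @property
    def enabled(self) -> bool:
        return self._enabled

    def memory_used(self) -> int:
        # A calculation rather than a lookup: sum over the live regions.
        return sum(self._region_sizes)


demo = _ReadOnlyJit(built=True, enabled=True, region_sizes=[4096, 8192])
print(demo.available, demo.enabled, demo.memory_used())
```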

brandtbucher wrote: “It might be helpful to see what other JIT runtimes (like Cinder, PyPy, YJIT, LuaJIT, or V8) provide?”

No harm in looking, although I’d like this to be driven by what Python users need or want.

muratkhan23 · December 11, 2024, 7:21am

This is a great proposal, mate. Adding sys.jit with available and on properties makes JIT management clear and user-friendly. A method for monitoring memory usage, like sys.jit.memory_usage(), could be helpful for debugging; I have personally tried it for one of my clients, and it worked in my case. Private methods for advanced monitoring and debugging sound practical, but should remain flexible initially. Compatibility with existing Python internals will be key, as in some cases it will hinder functionality. I have done a similar project for one of my German clients, who wanted a similar but slightly different function. You can look at that work too, if you’re willing to, at BITRO.

brandtbucher · December 12, 2024, 1:22am

I was thinking something like “set a bit on the interpreter’s entry_frame when entering JIT code, and unset it when leaving”. That way you could quickly and reliably tell whether the topmost frame for any interpreter invocation is executing JIT code or not. There are also games we could play with detecting this property in the optimizer and changing it there, etc. But those seem more complicated.

My main reason for mentioning it was:

It seems useful for tests for the JIT itself.
Lots of people have asked: “How do I know if the JIT is actually working?”

This would be a small, intentional break in the transparency between JIT and non-JIT code for those purposes. But it’s not a huge deal, just seems like a nice-to-have.
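
The "set a bit on entry, unset it on exit" idea can be pictured with a small Python analogy. Everything here is hypothetical: `EntryFrame`, `in_jit`, and `executing_jit_code` are illustrative names, and the real mechanism would live in the interpreter's C code rather than in a context manager.

```python
from contextlib import contextmanager


class EntryFrame:
    """Hypothetical stand-in for an interpreter entry frame."""

    def __init__(self) -> None:
        self.in_jit = False  # the "bit" described above


@contextmanager
def executing_jit_code(frame: EntryFrame):
    frame.in_jit = True       # set when entering JIT code
    try:
        yield frame
    finally:
        frame.in_jit = False  # unset when leaving, even on error


frame = EntryFrame()
observed = []
with executing_jit_code(frame):
    observed.append(frame.in_jit)  # inside "JIT code"
observed.append(frame.in_jit)      # back in the interpreter
print(observed)
```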

I imagine that this will be a bit unstable for some time, so the underscore communicates that until we’re ready to set the API in stone. It also may not be provided by all Python implementations.