(Before diving in here, I want to explicitly state that I am posting this as a JIT contributor and not as a Steering Council member)
As we start putting together a PEP, @kj0 and I wanted to start a thread to solicit team and community perspectives on the JIT in its current form.
To recap, the JIT has been significantly upgraded in 3.15 (see the 3.15 What’s New), compared to previous versions of Python. This is largely due to an updated trace recording design and broader coverage of bytecode operations.
At a high level from an end user’s perspective, the current state of things looks like:
Performance: On the pyperformance benchmark suite, the JIT currently shows an 8–9% geometric mean improvement over the standard interpreter (all optimizations enabled) on x86-64 Linux, and 12–13% over the tail-calling interpreter on AArch64 macOS. Individual benchmarks range from roughly a 15% slowdown to over 100% speedup. You can explore the data from runs on various machines at doesjitgobrrr.com.
Observability: JIT frames are now visible to GDB and GNU backtrace() on supported Linux ELF platforms, which means that native debuggers and tooling can unwind through generated code rather than stopping at it.
Lower footprint: Better machine code generation on x86-64 and AArch64 generally means lower memory usage for generated code versus 3.14.
Build story: What hasn’t changed is that the current JIT implementation still relies on LLVM (now 21) at build-time for stencil generation; end users still never need it.
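For readers less familiar with how a single "geometric mean improvement" is derived from many benchmarks, here is a minimal sketch. The timings are made up for illustration and are not pyperformance results; `geometric_mean_speedup` is a hypothetical helper, not part of any benchmark tool.

```python
import statistics

def geometric_mean_speedup(baseline_times, jit_times):
    """Return the geometric mean of per-benchmark speedup ratios.

    A ratio above 1.0 means the JIT run was faster than the baseline.
    """
    if len(baseline_times) != len(jit_times):
        raise ValueError("both runs must cover the same benchmarks")
    ratios = [base / jit for base, jit in zip(baseline_times, jit_times)]
    return statistics.geometric_mean(ratios)

# Hypothetical timings in seconds (illustration only, not real data).
baseline = [1.00, 2.00, 0.50, 4.00]
with_jit = [0.50, 2.00, 0.55, 3.20]  # third benchmark is a slowdown

gm = geometric_mean_speedup(baseline, with_jit)
print(f"geometric mean speedup: {gm:.3f}x")
```

Note that the aggregate can look healthy while an individual benchmark regresses, which is why the per-benchmark range is quoted alongside the mean.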
That said, we’d be interested in hearing your thoughts on where we are, as well as perspectives on where things should go, especially if you use, package, build on top of, or maintain code adjacent to CPython. Some areas of interest:
Performance: Does the current speed up meet your expectations? What would the JIT need to deliver for it to matter to you?
Compatibility and interop: Have you hit issues with the JIT and profilers, debuggers, free-threading, or C extensions? If you maintain a tool that introspects the interpreter, what guarantees do you need from a JIT-enabled CPython?
Platforms and packaging: If you build or distribute CPython, how has the JIT affected you?
Maintenance and contribution: For core devs and contributors who don’t work on the JIT, how has the JIT affected your work in the project, if at all?
Concrete experiences are especially valuable. “I tried enabling it on X and hit Y” is more useful to us than abstract positions.
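When reporting an experience, it helps to include whether the JIT was actually built in and enabled. A minimal sketch, assuming the private `sys._jit` namespace documented for CPython 3.14 and later (it is CPython-specific and may change); `jit_status` is a hypothetical helper name:

```python
import platform
import sys

def jit_status():
    """Collect details worth pasting into a JIT-related report."""
    # Assumption: sys._jit exists only on CPython 3.14+; fall back to None.
    jit = getattr(sys, "_jit", None)
    return {
        "python": sys.version.split()[0],
        "implementation": platform.python_implementation(),
        "system": platform.system(),
        "machine": platform.machine(),
        "jit_available": jit.is_available() if jit else None,
        "jit_enabled": jit.is_enabled() if jit else None,
    }

status = jit_status()
for key, value in status.items():
    print(f"{key}: {value}")
```

A value of `None` here only means this interpreter does not expose `sys._jit`, not that something is broken.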
To set some expectations for performance: I have seen a lot of talk online (not from the PyPy devs themselves) that PyPy is 2x faster on pyperformance. I think this comes from people doing napkin math to convert PyPy’s own benchmark numbers into pyperformance numbers. That is bad practice, as numbers are not comparable across benchmark suites.
I would like to state that while PyPy is likely >2x faster on their own benchmark suite, on pyperformance PyPy 3.11 is a very respectable 50% faster on macOS Apple Silicon and 80–90% faster on x86-64 Linux than 3.15 alpha, according to my own pyperformance runs.
This is not because PyPy is bad, but because pyperformance’s async benchmarks are notoriously hard for JITs: every JIT I’ve tried (PyPy, GraalPy, CPython 3.15) falls over on them.
Aside from that, CPython cannot break C API compatibility and do what PyPy does with collections. We also cannot abandon reference counting in the interpreter, because people rely on immediate reclamation. Coupled with opaque C-level finalizers for many objects, this means CPython has to treat nearly every object deallocation site as a call that might call eval or do something catastrophic. Even after a refcount elimination pass, some sites cannot be removed if immediate reclamation is to be preserved. This is why I’m suggesting that a 20% pyperformance geometric mean speedup (with individual speedups ranging up to 200%) is a good target given all the constraints. I don’t mean we should require 20% for accepting the PEP, but it should be what we angle our expectations around.
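To make the immediate-reclamation constraint concrete, here is a small sketch of the behavior people rely on. `Resource` is a hypothetical class; the point is only that on CPython the finalizer runs at the exact statement that drops the last reference, and that finalizer can run arbitrary code.

```python
events = []

class Resource:
    """Hypothetical object whose finalizer runs arbitrary Python code."""

    def __init__(self, name):
        self.name = name

    def __del__(self):
        # Anything could happen here: I/O, imports, even exec/eval.
        events.append(f"finalized {self.name}")

def use_resource():
    r = Resource("a")
    events.append("created a")
    r = None  # last reference dropped; on CPython __del__ runs right here
    events.append("after drop")

use_resource()
print(events)  # on CPython: ['created a', 'finalized a', 'after drop']
```

An optimizer that delayed or reordered that decrement would change the observable order of these events, which is why such sites are hard to eliminate.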
JFYI: I have the utmost respect for PyPy. I have also shared the individual benchmark numbers with the PyPy devs to help them speed up PyPy so they can beat us even more.
I think at least 10% geometric mean across the board, but I will happily take the 20% @kj0 is offering as a goal as I expect that level of win is getting harder to reach.
It hasn’t directly, but then again at this point I don’t feel like I’m in a position to contribute to the eval loop itself at all as its overall structure seems way more involved than it once was.
I lost track of the state of the other unwinders on x86 and Arm, but we need to ensure all unwinders work here. @kj0 and @diegor may remember better how things stand.
Also, there is the question of how to reconstruct the hybrid system/Python stack and symbolise it, even if we can unwind.
I don’t have much to add here, but as a Python user in the computational physics space, the number one complaint I hear about Python is always the speed.
Whilst this probably won’t make a huge difference to NumPy code, a 20% mean speedup would still be awesome for prototyping and native Python code.
I’m already running Python with the JIT enabled on my personal computer, and it works great for my use cases. Thanks for all the hard work!
As a former .NET user, one thing I disliked was the initial JIT time, especially on projects with lots of references. For some projects this is fine because subsequent runs use the cache.
Now in Python, my primary use is prototyping and scripting: always changing the code, maybe invalidating the cache. I’m wondering how the benchmarks are measured. Will the JIT slow down the first run compared to the interpreter? I think this is something to consider.
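One rough way to look at this yourself is to time the first call separately from later calls, and run the same script with the JIT on and off. This is only a sketch: `work` is a made-up workload, and this is not how pyperformance measures things.

```python
import time

def work(n):
    """Made-up hot loop standing in for real application code."""
    total = 0
    for i in range(n):
        total += i * i % 7
    return total

def time_rounds(func, arg, rounds=5):
    """Time each call separately so the first round can be compared."""
    timings = []
    for _ in range(rounds):
        start = time.perf_counter()
        func(arg)
        timings.append(time.perf_counter() - start)
    return timings

timings = time_rounds(work, 200_000)
for i, elapsed in enumerate(timings, start=1):
    print(f"round {i}: {elapsed * 1e3:.2f} ms")
```

If the first round is consistently slower than the rest only when the JIT is enabled, that difference is a rough estimate of the warmup cost for this workload.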
20% is a phenomenal geo mean, not to mention the individual speedups. But once this 20% goal is reached, are there plans to increase it even more, or will it just be occasional improvements where possible?
Also, will there be a tier 3 JIT which will benefit frequently executed code more?
I’m not sure if the LLVM requirement is a barrier to Linux distros enabling the Python JIT.
LLVM more generally, or LLVM 21 specifically? There are already llvm-21 packages in Ubuntu 26.04 LTS and Debian stable backports right now. That’s just a couple of examples, but they’re at least some of the more conservative server distributions.
I think the plan is to make “the JIT go brrr” as much as possible (just like Python perf in general), so any targets discussed here are minimum targets in order to keep the JIT in CPython, not a stopping point.
LLVM in general was what I was thinking.
I’m wondering about Fedora in particular.
They have LLVM 21 as well, but I’m not sure if they want to build Python with it.
I’m packaging CPython for Gentoo. We have had optional JIT builds since 3.13.
My early experience with the JIT (somewhere early in 3.13) was that it broke too much, so I switched back to non-JIT on my system and forgot about it.
A few months ago I tried again, with 3.14. I haven’t seen any problems that would be clearly related to the JIT, and I’ve been using the JIT builds since. I’m maintaining most of the Python packages in Gentoo, so I’m running lots of test suites, and I found JIT builds to be stable. That said, I haven’t done any performance comparisons, so I can’t really talk about that.
I’ve also tried building 3.15 with JIT enabled, but the build failed due to assertions in bootstrap Python:
(I’m sorry, I haven’t found time to report it properly.)
That said, the LLVM requirement itself is not that much of a problem for us (given that a lot of software depends on LLVM these days), but pinning to a single LLVM version is. In practice, it means that anyone using Python 3.14 is stuck with LLVM 19 (which went EOL 1.5 years ago), and for Python 3.15 they need LLVM 21 instead (which went EOL over half a year ago). And if someone needs multiple Python versions (which is expected for Python developers), they either need to go non-JIT or install multiple LLVM versions.