AOT instead of JIT

geo_m · April 24, 2024, 9:31pm

Hi, as many of you have heard, there is an ongoing effort to add JIT to CPython as PEP 744 (PEP 744 – JIT Compilation | peps.python.org).

I think that the recent efforts to make Python faster have produced very good results, but the effort behind JIT would be better directed toward AOT. With AOT, runtime facilities can be embedded, eliminating the need for an external VM, which is the modern approach to streamlining deployments. It would also open the door to better optimizations and performance enhancements, even if not in the near future.

Now, I have read some reactions to such ideas here and there, and I would like to reply in advance:

1- “Python is used by many for scripting and automation, not just for applications and systems”: CPython can still serve the REPL and execute scripts; AOT is about the compilation process that produces optimized machine code (e.g., using LLVM) in the form of platform-dependent executables.

2- “Python is a dynamic language, and it should stay true to its spirit…etc”: Python is first and foremost a general-purpose programming language with “dynamic typing,” which means type resolution is at runtime. Most modern programming languages have dynamic typing capabilities—Go, Swift, Rust…etc.—just to name a few. So, I don’t think that dynamic typing is why people use and, in my case, love to use Python. I think Python’s elegant syntax and the processes and the humble culture around it are what made Python attractive, not dynamic typing per se, which many of its users really don’t understand well.

3- “There are already many compilers for Python”: The center of Python’s community is the Steering Council and PEP process. People will hesitate to adopt any solution and integrate it into a production environment if it is not widely known and supported by the community unless it is an urgent necessity and with willingness and expertise to deal with any unexpected issues that may arise.

4- “There are many programming languages out there, use some language x instead”: I am not stuck with Python, and Python is not the only language I know and use, but Python is the one I love to use, and I would like to use it exclusively without the need for workarounds and patchwork things here and there using other programming languages.

So, I think that JIT is too old to be the future, and AOT would be a better investment, with better outcomes in the long run.

MegaIng · April 24, 2024, 9:43pm

Interesting, this strictly contradicts years of research into creating fast programming languages. Good JITs are better than good AOTs since they know more about the program and the hardware it’s being run on than any AOT compiler can hope to know. This is why in some benchmarks JS beats Rust, C and C++, and why Java can easily keep up.

How would an AOT compiler support dataclasses or enum, or any module/package that uses dynamic code generation?

Would it run the code generation at compile time, thereby breaking the semantics of python?
Or would it not compile the corresponding __init__/__str__ function at compile time and either interpret them/compile them at runtime?
Or would it not support such code at all? That would massively limit the amount of code that could be compiled, probably only a small fraction of the stdlib even.
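To make the question concrete, here is a minimal sketch of the kind of dynamic code generation being discussed. It uses only the standard `dataclasses` and `inspect` modules; the field list `fields_from_config` and the class name `Conn` are made up for illustration. The `__init__` below does not exist anywhere in the source text, it is generated when the program runs.

```python
import dataclasses
import inspect

# Hypothetical field names that might only be known at runtime
# (e.g. read from a config file).
fields_from_config = ["host", "port"]

# The class and its __init__/__repr__ are generated while the program runs.
Conn = dataclasses.make_dataclass("Conn", fields_from_config)

conn = Conn("localhost", 8080)
init_params = list(inspect.signature(Conn.__init__).parameters)

print(conn)
print(init_params)
```

An ahead-of-time compiler looking only at this file has no `__init__` body to compile; it would have to run `make_dataclass` itself or defer the work to runtime, which are the options listed above.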

Python is fundamentally dynamic. This is a strong contrast to “Go, Swift, Rust, …”. This is baked into the language design, design philosophy, community culture, existing packages, existing tools, …

Any attempt to AOT-compile Python has to either support only a small subset of use cases or offer very little benefit, especially compared to JIT (which can quite easily handle all the dynamic features). Both of these issues mean that the core language itself should not try to support this.
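As a rough illustration of how a runtime technique can cope with dynamic features, here is a toy "guarded cache" written in plain Python. It is only a sketch of the general guard-and-fall-back idea, not how CPython's JIT is implemented; `CachedCall` and `Greeter` are hypothetical names, and the guard here redoes the lookup it is protecting, whereas a real implementation would use a much cheaper check.

```python
class Greeter:
    def greet(self):
        return "hello"


class CachedCall:
    """Toy cache: remembers a resolved method and re-checks it on each call."""

    def __init__(self, cls, name):
        self.cls = cls
        self.name = name
        self.cached = getattr(cls, name)  # assumption made "at compile time"
        self.deopts = 0

    def __call__(self, obj):
        current = getattr(self.cls, self.name)
        if current is not self.cached:    # guard failed: assumption is stale
            self.deopts += 1
            self.cached = current         # re-specialize on the new method
        return self.cached(obj)


call_greet = CachedCall(Greeter, "greet")
g = Greeter()
first = call_greet(g)

# The program changes the class while it is running.
Greeter.greet = lambda self: "hi"
second = call_greet(g)

print(first, second, call_greet.deopts)
```

Code that had inlined the original `greet` with no guard would keep returning the stale result after the monkeypatch.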

Rosuav · April 24, 2024, 9:50pm

Remember that the perfect shouldn’t be the enemy of the good. I’m talking in generalities here rather than specifically about AOT compilers, but the point is, if you can give a speedup for 90% of Python code without actually penalizing the other 10%, it’s still worthwhile, despite not benefiting certain use-cases. For example, if you could do something that speeds up all code except for the construction of new dataclass elements, that would be extremely beneficial - dataclasses are frequently referenced, but usually only created on initialization. Even __str__() being deoptimized wouldn’t be a huge problem if enough other code could benefit.

This wouldn’t be true if there’s a notable speed penalty on those functions, though. A good JIT compiler shouldn’t have too much impact on non-JITted code, and I would hope that the same would be true for AOT compilers.

MegaIng · April 24, 2024, 9:57pm

I strongly suspect it’s more like a 50%/50% split of what code can usably be AOT compiled, but I might of course be wrong about that.
The problem is that an AOT compiler can’t really make assumptions about Python code (for example “a class statement produces a type from which instances can be created” isn’t true on at least two levels). And the fewer assumptions a compiler can make, the less useful it becomes. A JIT compiler can dynamically adjust to broken/changing assumptions. An AOT compiler can’t (unless the compiler is baked into the executable completely, in which case it’s a JIT).
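Two ways a `class` statement can end up binding something other than an instantiable type are sketched below. These are my own examples and not necessarily the "two levels" meant above; `Weird`, `Config`, and `replace_with_instance` are made-up names.

```python
# 1. The "metaclass" may be any callable, and whatever it returns
#    is bound to the class name.
class Weird(metaclass=lambda name, bases, namespace: 42):
    pass


# 2. A class decorator may replace the class with an arbitrary object.
def replace_with_instance(cls):
    return cls()


@replace_with_instance
class Config:
    debug = True


print(Weird)                     # an int, not a class
print(isinstance(Config, type))  # Config is an instance, not a type
```

Neither outcome is visible from the `class` keyword alone; a compiler has to know what the metaclass or decorator does, which in general means running it.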

MegaIng · April 24, 2024, 10:17pm

And with regard to “perfect shouldn’t be the enemy of the good”: this is an incredibly strong argument against pushing towards AOT instead of JIT, especially now. A JIT already exists (in 3.13), is obviously easier to integrate into Python, and is going to be good enough (if not better!) almost always.

fungi · April 24, 2024, 10:19pm

So, I think that JIT is too old to be the future, and AOT would be
a better investment, with better outcomes in the long run.

JIT compilation is too old? AOT compilation is far older than JIT,
unless my tomes on computer history have failed me. But maybe what’s
old is new again…

Anyway, I wouldn’t recommend using the “oldness” of an idea as a
measure of what’s best for the future, unless you like being a
technology magpie. There’s nothing inherently better about a new
idea than an old one, on balance new ideas are probably more often
worse than old ideas (a recurring theme of the Ideas topic on this
forum, in fact).

Rosuav · April 24, 2024, 10:33pm

That’s true, and I am not trying to pitch either direction regarding JITs and AOTs as I haven’t dug enough into their details. I was merely responding to your complaint about how an AOT compiler would handle dataclasses and enums; the simple answer is: it doesn’t have to.

Rosuav · April 24, 2024, 10:39pm

Yes; or, the way I’d put it: Old ideas that have survived the test of time are better than average. New ideas are a full mix of good, bad, and ugly; as time goes on, the worse decisions tend (tend) to be reversed or bypassed. But for that to keep happening, we do have to take a critical look at old decisions too - for example, we can look at SQL’s bizarre mess of keywords (the standard distinguishes between “reserved” and “non-reserved”, and some contexts behave differently in some engines, so have fun trying to make a syntax highlighter) and decide to do things differently in a new language.

So in effect, the reason we say that older ideas are often better than newer ones is that we consciously and repeatedly affirm the good decisions - and the age of an idea is a reasonable proxy for the number of different contexts in which it’s been affirmed.

Which is a lot of words to say “you’re right, new ideas are often worse than old ideas”

geo_m · April 24, 2024, 10:41pm

I am fully aware of the nature of typing in Python, so I don’t expect exponential performance gain overnight, because even in languages like Go or Rust, if you rely heavily on runtime type resolution, you will impact the performance substantially.

The point of JIT and AOT is essentially about code generation and optimizations, so your program can be executed from optimized machine code directly without too much overhead.

How much optimization can we get with either JIT or AOT? Well, not so much if we consider the amount of dynamic type resolution that has to be done by the runtime, but things can be better optimized over time. With AOT, you have more flexibility because you don’t have to run complex optimization and code-generation algorithms while the program is running. Plus, you have the possibility of embedding the runtime with your dependencies so that you don’t have to package it with your app or ensure that the entire VM with a specific version is installed.

I am not aware of any research that suggests JIT has an advantage over AOT, but I disagree anyway. With AOT, you have more time and resources to do better optimizations ahead of time to ensure the machine code is optimal. Plus, the idea that you need an external VM to run your program is today a really bad idea (in cases of cross-platform deployment, it might be acceptable if you can ensure that the platform-dependent VM is already installed or deployed with your package).

geo_m · April 24, 2024, 10:53pm

JIT compilation is too old? AOT compilation is far older than JIT,
unless my tomes on computer history have failed me. But maybe what’s
old is new again…

The promise of JIT is somewhat akin to being “cross-platform” but optimized on the host.
AOT is not new, but the current infrastructure (LLVM and MLIR) and the optimization algorithms around it are better than anything available for JIT. So, I am not bashing JIT, but the JIT in PEP 744 is based on LLVM, so why not invest in proper compiling infrastructure for Python instead and streamline the dependencies?

Rosuav · April 24, 2024, 10:54pm

Okay, you know what? There’s another VERY important general principle here. Those who aren’t doing anything need to be careful when disagreeing with those who are. A JIT compiler is in the process of being added to CPython. You are welcome to develop an AOT compiler for comparison purposes, but otherwise, you’ll need a lot of supporting evidence for your claims - not just “I am not aware of”.

This is the same thing again. You are very welcome to create your own competing proposal, but if you don’t want to, you’ll need a whole lot of evidence to support your claims.

geo_m · April 24, 2024, 11:00pm

I am not seeking support currently; I’ve shared my opinion to get more opinions. The work that has been done on JIT can be utilized in AOT later anyway.

Nineteendo · April 24, 2024, 11:13pm

Is it possible to have both an AOT & JIT compiler? Then you should be able to combine the best of both worlds.

geo_m · April 24, 2024, 11:22pm

I am concerned with AOT because that is what I care about, and with AOT I am aiming at minimizing dependencies, not just at the compilation strategy itself.

Rosuav · April 24, 2024, 11:23pm

Maybe this doesn’t belong in Ideas then?

geo_m · April 24, 2024, 11:26pm

Of course, because I have an idea, and I want to know your thoughts regarding it.

MegaIng · April 24, 2024, 11:30pm

I was just using those as concrete examples of the dynamic nature of very widely used Python libraries that also happen to be in the stdlib. No, the AOT compiler doesn’t have to handle those libraries specifically, but it has to be able to handle libraries similar to them. And I am of the belief that this is categorically impossible without shipping a JIT compiler or giving up and not compiling large parts of the program, and I have yet to hear a counterargument. Don’t forget: if a class is touched by any kind of dynamic code generation, the AOT compiler can make zero assumptions about the behavior of the class and will therefore be at a massive disadvantage compared to a JIT.

This is an almost completely separate discussion that can be solved with AOT, without AOT, with JIT, without JIT, with both/neither, … This describes something like PyInstaller, just potentially with more support in the core interpreter. That is an interesting suggestion ^[1], but I would recommend decoupling it from talk about machine code generation. Because unless you are writing very simple programs, you are going to have a hard time avoiding shipping an entire Python VM + interpreter + (AOT compiler). Unless you compromise on the semantics of Python and only support a subset. Which is a valid strategy, but not something the language core devs should concern themselves with.

The point of this section of this discussion forum is for potentially actionable and realistically implementable ideas. Just “feeling the water” is not well received (as you can maybe tell).

[1] Well, ok, it isn’t, because you haven’t actually made a suggestion, just a few ideas with nothing anywhere near concrete

ncoghlan · April 24, 2024, 11:32pm

It’s worth studying the JIT techniques being used, as they actually are heavily based on ahead of time compilation.

The AOT compilation step occurs at build time, and makes a suite of machine code templates available to the runtime interpreter. Once the interpreter determines that a particular template is applicable, it patches in the relevant details and starts running the machine code version of that piece of the program.
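The template-and-patch idea described above can be modelled in a few lines of Python. This is a toy sketch only: the "machine code" bytes, the `Template` class, and the operation names are invented for illustration and are not CPython's actual templates or data structures.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    code: bytes       # bytes produced ahead of time, at build time
    hole_offset: int  # where an 8-byte operand gets patched in


HOLE = b"\x00" * 8

# Pretend these were emitted by a compiler when the interpreter was built.
# The leading bytes are made up; they are not real instructions.
TEMPLATES = {
    "LOAD_CONST": Template(code=b"\xA1" + HOLE, hole_offset=1),
    "ADD_CONST": Template(code=b"\xB2" + HOLE, hole_offset=1),
}


def copy_and_patch(trace):
    """Copy each operation's template, then fill its hole with the operand."""
    out = bytearray()
    for op, operand in trace:
        template = TEMPLATES[op]
        start = len(out)
        out += template.code                                   # copy
        struct.pack_into("<q", out, start + template.hole_offset, operand)  # patch
    return bytes(out)


patched = copy_and_patch([("LOAD_CONST", 40), ("ADD_CONST", 2)])
print(patched.hex())
```

The expensive work (producing the templates) happens once at build time; the runtime step is only a copy plus a few writes.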

The integrated JIT work is also fully compatible with the partial AOT compilation used by tools like Numba, as well as the full AOT compilation used by tools like Nuitka (those tools will bypass the integrated JIT, so they won’t gain any benefits from it, but they won’t suffer any negative effects either)

geo_m · April 25, 2024, 12:14am

I appreciate your response. I think it’s a good direction in general, and I appreciate the work done by the core team to make Python more performant. I will study the process in detail again. However, I still find the reliance on the VM as an external dependency to be annoying. I would like to see Python produce self-contained executables. I want to be able to build an executable that doesn’t require external runtime support, similar to Go and Swift, which bundle a lightweight runtime. Since the current implementation depends on LLVM, I suppose that in the near future an AOT compiler should be possible, one capable of fully compiling Python code ahead of time together with an embedded lightweight runtime and the packages used, with a certain level of optimization during the build process. This is what I am looking for.

Rosuav · April 25, 2024, 12:27am

That’s not going to happen without some pretty major restrictions in what you can write in it. It’s also not really a good target to aim at; you can already make a .pyz file with zipapp, and as soon as you try to make something that has actually NO dependencies, you force yourself to publish for every CPU and OS combination that your users need, and you have to update for every interpreter change. Quite frankly, I wish we could just tell people “no, that’s not possible”, because it’s the cause of so many problems.
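For reference, a minimal sketch of the zipapp route using only the standard library. The directory name `app` and the message are placeholders; the resulting `.pyz` still needs a Python interpreter on the target machine, which is the trade-off being described.

```python
import subprocess
import sys
import tempfile
import zipapp
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # A tiny application directory with a __main__.py entry point.
    src = Path(tmp) / "app"
    src.mkdir()
    (src / "__main__.py").write_text('print("hello from pyz")\n')

    # Bundle the directory into a single runnable archive.
    target = Path(tmp) / "app.pyz"
    zipapp.create_archive(src, target)

    # Run it with the current interpreter; the archive is not standalone.
    result = subprocess.run(
        [sys.executable, str(target)],
        capture_output=True,
        text=True,
        check=True,
    )

print(result.stdout.strip())
```

The same archive can be built from the command line with `python -m zipapp app`.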

Even C programs are almost never built to require no dependencies whatsoever. Not because you can’t, but because it’s actually a bad idea. I compiled a very simple Hello, World in C using gcc’s default options, and got a 16KB executable; adding -static bumped that to over 700KB. It would also not benefit from OS upgrades to my libraries, including security updates. It’s a technique done VERY rarely and only when it’s truly worthwhile. Why do it with Python scripts?